Steps To Big Data: Hello, Cascading

In the previous post, Steps To Big Data: Hello, Pail, we defined our fact model and showed how to use Pail to store our data. But what is a fact model?

A Fact Model, as we see it, structures basic knowledge about business operations from a business perspective. “Basic” means that the knowledge it represents cannot be derived or computed from any other knowledge. In that sense, a Fact Model is a crucial starting point for developing more advanced forms of business knowledge, including measures and rules.

As described above, the knowledge in a fact model cannot be derived or computed from any other knowledge. Instead, the fact model is the source from which other data is generated. The process of generating data from the fact model can be illustrated as below.

Cascading lets you create and execute complex data processing workflows on top of Hadoop using any JVM-based language (Java, Scala, JRuby, Clojure, etc.). Most Cascading-related libraries are available in the Maven repository at http://conjars.org/

We are going to use Cascading to count users’ actions.
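
Before bringing in Cascading, it may help to pin down what "counting users' actions" means as a computation: group the facts by user, then count each group. The sketch below does this in plain Java on an in-memory list. It does not use the Cascading API, and the `Action` class, its fields, and the sample data are hypothetical stand-ins for the facts stored in the previous post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class ActionCounts {

    // Hypothetical fact: one user performed one action.
    // The real fact model from the previous post may look different.
    static final class Action {
        final String userId;
        final String type;

        Action(String userId, String type) {
            this.userId = userId;
            this.type = type;
        }
    }

    // Group the actions by user id, then count the actions in each group.
    // A TreeMap keeps the result sorted by user id.
    static Map<String, Long> countByUser(List<Action> actions) {
        return actions.stream()
            .collect(Collectors.groupingBy(a -> a.userId, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Action> actions = Arrays.asList(
            new Action("alice", "login"),
            new Action("bob", "view"),
            new Action("alice", "purchase"));

        System.out.println(countByUser(actions)); // prints {alice=2, bob=1}
    }
}
```

On Hadoop the same group-then-count shape applies, except that the input is too large for one machine's memory, which is where a workflow library such as Cascading comes in.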
