Presentation: "Case Study: Making Hadoop & Cassandra Work Together to process 5+Tb of Data Daily"
Unless you have implemented architectures that use NoSQL databases and frameworks for data-intensive distributed applications, the myriad of available options is probably something of an enigma.
This session will give you insight into what to consider when you need to make a NoSQL column-based database like Cassandra work together with a data-intensive distributed framework like Hadoop.
• The basic product idea
- Why NoSQL?
- The problem: need to collect, process and store 5 TB of data daily
• The solution: a combination of Cassandra, Hadoop and some MySQL
• Best practices for modeling your data in Cassandra
• Best practices for scaling Hadoop clusters
• Lessons learned about Cassandra’s strong and weak areas
• Lessons learned from implementing Map/Reduce
• Anti-patterns (things to avoid)