Presentation: "Case Study: Making Hadoop & Cassandra Work Together to process 5+Tb of Data Daily"
Unless you have implemented architectures that use NoSQL databases and
frameworks that support data-intensive distributed applications, the
myriad options are probably something of an enigma.
This session will give you insight into what to consider when you need to make a column-based NoSQL database like Cassandra work together with a data-intensive distributed framework like Hadoop.
• The basic product idea
- Why NoSQL?
- The problem: need to collect, process and store 5Tb of data daily
• The solution: A combination of Cassandra, Hadoop and some MySQL
• Best practices for modeling your data in Cassandra
• Best practices for scaling Hadoop Clusters
• Lessons learned about Cassandra’s strong and weak areas
• Lessons learned from implementing Map/Reduce
• Anti-Patterns (things to avoid)