Friday, November 11 • 10:10am - 10:30am
Building a High-Performance Database with Scala, Akka, Cassandra, and Spark

#distributedsystems #scala #akka #spark #FiloDB #cassandra

Scala and its large ecosystem of libraries are increasingly being used to build highly scalable and performant data systems. In this talk, I share years of experience building high-performance data systems using Scala, Akka, and Spark, plus recent experience building FiloDB, a high-performance analytics database built on these technologies.

How does FiloDB fit into the modern big data streaming world? How do you leverage all the features of Spark to make a database? How do we balance Scala and functional programming with very high performance demands? What should you watch out for when writing very fast Scala code?

- Introduction to FiloDB and its use cases for analyzing streaming and static data
- How FiloDB fits into the SMACK stack for event storage and deep data analysis / machine learning
- Some interesting use cases, such as streaming support for smart cities / IoT
- Integration of Spark DataFrames and Data Sources
- When to use Futures, Actors, or neither
- Writing a reactive, at-least-once data pipeline with back pressure
- Reactive stack metrics and performance monitoring
- Filo: summing integers at billions of ops per second, taking advantage of processor cache and SIMD with super fast vector operations
- Serialization, GC, and off-heap: how to leverage binary data structures for the win

Evan Chan

Evan loves to design, build, and improve bleeding-edge distributed data and backend systems using the latest in open source technologies. He is the creator of the FiloDB open-source distributed analytical database, as well as the Spark Job Server. He has led the design and implementation of multiple big data platforms based on Storm, Spark, Kafka, Cassandra, and Scala/Akka, including a columnar real-time distributed query engine. He...

Friday November 11, 2016 10:10am - 10:30am
Off by One
