Scalæ By the Bay has ended
Friday, November 11
 

9:00am PST

Keynote: Typelevel in 2016
There has been a huge amount of activity around the Typelevel family of projects this year. The arrival of Cats on the scene marked the beginning of an exciting period of collaboration among the Typelevel projects and of reaching out to the wider Scala community in ways that hadn't been possible before. Now, at the end of 2016, we have had two Typelevel conferences, and numerous other workshops and hack days. Things are going from strength to strength. This talk will give a flavour of what has been going on: the collaborations between Algebra, Spire and Cats; between Cats and shapeless; between shapeless and scodec, doobie, ScalaCheck and Circe; and how all of this is feeding into the rebooted Typelevel Scala. It's also an open invitation to people right across the Scala spectrum to get involved in these projects and see what they can do for them in their own work.

Friday November 11, 2016 9:00am - 9:40am PST
Caching

9:50am PST

Introduction to Big Data and Spark
This is an introductory talk for those who want to get into Big Data and learn about Spark, but don't know where to start. Spark is a fast, easy-to-use, general-purpose cluster computing framework for processing large datasets. It has become the most active open-source big data project. It is hotter than Hadoop was a few years ago. The talk will start with an introduction to Big Data and the challenges associated with it. Next, Mohammed will dive into Spark and talk about how it can be used to solve those challenges. In addition, he will discuss the following:
a) Why Spark has set the Big Data world on fire
b) Why people are replacing Hadoop MapReduce with Spark
c) What kind of applications benefit from Spark
d) How Spark works (high-level architecture)
Finally, he will introduce the key libraries that come pre-packaged with Spark and discuss how these libraries simplify a variety of analytical tasks, including:
a) Batch processing
b) Interactive ad hoc analytics
c) Stream processing
d) Graph analytics
e) Machine learning

Speakers

Mohammed Guller

Principal Architect, Glassbeam
Passionate about building new products, machine learning, and big data analytics. Built several products from the ground up. Author of Big Data Analytics with Spark.


Friday November 11, 2016 9:50am - 10:10am PST
Off by One

9:50am PST

Scala 2.12 and Beyond
I'll talk about what's new in Scala 2.12, how Scala is still more suited for functional programming than Java 8, and give you a quick glimpse of what's next in Scala 2.13. Scala 2.12's backend takes full advantage of the improved support for functional programming on the Java 8 platform. Little has changed on the "outside" since Scala 2.11, but we have given the compiler internals a significant overhaul. Additionally, Scala 2.12 ships with a new optimizer! In this talk, we'll see how Scala functions compile to the same byte code as in Java 8 and explain the simplified encoding of Scala traits as Java 8 interfaces. We'll also look at the cohesive set of language features and design principles that make functional programming such a joy in Scala.

Speakers

Adriaan Moors

Scala Team Lead, Lightbend, Inc


Friday November 11, 2016 9:50am - 10:30am PST
Naming

9:50am PST

Finagle as a proxy: resilient, Twitter-style microservices for polyglot environments
Finagle is an open-source, high-volume RPC client library which powers the application infrastructure of companies like Twitter, Pinterest and SoundCloud. In this talk, we introduce "linkerd", an open source RPC proxy built on Finagle and specifically designed for microservices. We describe how linkerd can be used to “wrap” polyglot multi-service applications in Finagle’s operational model, adding connection pooling, load balancing, failure detection, and failover mechanisms to existing applications with little to no code change. We demonstrate how this model allows polyglot applications to take advantage of Finagle’s production-tested capabilities around scalability, reliability, and fault-tolerance, even in the presence of unpredictable traffic volumes and unreliable hardware. We describe how Finagle was used and extended in the creation of linkerd, and walk through the roadmap of upcoming features.

Speakers

Oliver Gould

CTO, Buoyant
Oliver Gould is a core maintainer of Linkerd, and is the co-founder and CTO of Buoyant, where he leads open source development efforts. Prior to Buoyant, he was a staff infrastructure engineer at Twitter, where he was the tech lead of Observability, Traffic, and Configuration & Coordination...


Friday November 11, 2016 9:50am - 10:30am PST
Caching

10:10am PST

Building a High-Performance Database with Scala, Akka, Cassandra, and Spark
#distributedsystems #scala #akka #spark #FiloDB #cassandra

Scala and its large ecosystem of libraries are increasingly being used to build highly scalable and performant data systems. In this talk, I share years of experience building high performance data systems using Scala, Akka, and Spark, plus recent experience building FiloDB, a high performance analytics database built on these technologies. How does FiloDB fit into the modern big data streaming world? How do you leverage all the features of Spark to make a database? How do we balance Scala and functional programming with very high performance demands? What are some tips to watch out for when building very very fast Scala code?
- Introduction to FiloDB and its use cases for analyzing streaming and static data
- How FiloDB fits into the SMACK stack for event storage and deep data analysis / machine learning
- Some interesting use cases, such as streaming support for smart cities / IoT
- Integration of Spark DataFrames and Data Sources
- When to use Futures, Actors, or neither
- Writing a reactive, at-least-once data pipeline with back pressure
- Reactive stack metrics and performance monitoring
- Filo: summing integers at billions of ops per second, taking advantage of processor cache and SIMD with super fast vector operations
- Serialization, GC, and off-heap: how to leverage binary data structures for the win

Speakers

Evan Chan

Senior Data Engineer, UrbanLogiq
Evan is currently Senior Data Engineer at UrbanLogiq, where he is using Rust, among other tools, in building robust data platforms to help public servants build better communities. Evan has been a distributed systems / data / software engineer for twenty years. He led a team developing...


Friday November 11, 2016 10:10am - 10:30am PST
Off by One

10:40am PST

Scaling the Internet of Things with Scala
In early 2014, when we sat down to start building eero’s cloud backend, we had a big decision to make: which language would we use to get the best performance from our eero WiFi systems? On the backend of eero, we faced a challenge: how would we build a highly available, high performance infrastructure that would be able to communicate with each eero device in customers’ homes? In order to do this correctly, it was important for us to choose an architecture that would scale up and out, without having to constantly rebuild it. We chose Scala as our primary language and Akka for core pieces of our infrastructure. Concurrent requests, long-running workers, and parallelized jobs become very complex, very quickly. And as things scale, the introduction of performance optimizations, like caching, further complicates things. Today, by using Scala, we are able to support asynchronous IO in a high performing, highly concurrent environment. In this talk, I will walk you through eero’s experience using Scala and how it’s helping us scale to serve the Internet of Things.

Speakers

Amos Schallich

Co-founder and VP of Engineering, eero
Amos manages eero’s cloud infrastructure, data analytics, and mobile teams as VP of Engineering. He previously worked at Tagged and BigTent, where he focused on backend server development and scaling. Amos grew up in Mt. Shasta, California, and studied large-scale distributed systems...


Friday November 11, 2016 10:40am - 11:00am PST
Off by One

10:40am PST

Why the Free Monad Isn't Free
Scala developers love to discuss Monads, their metaphors, and their many use cases. Recognizing that monadic design and development patterns have their place, this talk will discuss the price of implementing the Free Monad in your code. Spoiler alert: it's not free. We will define the Free Monad (while proving you don't have to be an expert in category theory to understand it) and give you the confidence to know when it is not the answer in your code. We will also provide alternatives that offer greater maintainability and discuss the tradeoffs in performance and design.

Speakers

Kelley Robinson

Security Developer Advocate, Twilio
Kelley works on the Account Security team at Twilio, helping developers manage and secure customer identity in their software applications. Previously she worked in a variety of API platform and data engineering roles at startups. Her research focuses on authentication user experience...


Friday November 11, 2016 10:40am - 11:00am PST
Naming

10:40am PST

GraphQL: IoC makes its way to HTTP ... and it's great!
GraphQL is a spec out of Facebook describing a new way to write HTTP APIs. Unlike REST, it gives control to the client. Clients query for the data they want, and servers respond by returning exactly -- and only -- the requested data. This has the advantage of letting clients imagine new ways to use data without the need for additional server side development. GraphQL APIs are described by constructing a strongly typed schema against which queries can be both constructed and validated. This strongly typed schema maps quite well to Scala's type system, and there is a fantastic Scala implementation of the GraphQL spec called Sangria. In this talk, I'll show some real life examples from our startup justifying why you might want to use GraphQL instead of standard REST. I'll execute some queries against our own API, which ought to illuminate how it works and why it's great. I'll discuss the ecosystem around GraphQL, and finally walk through a quick tutorial on how to turn your Scala case classes and functions into a full-blown schema and API.

Speakers

Dustin Whitney

CTO, Project September
Co-Founder and CTO, Project September


Friday November 11, 2016 10:40am - 11:00am PST
Caching

11:10am PST

NLP in Action at SalesforceIQ
This talk will showcase an NLP pipeline we have built with Scala and Spark at SalesforceIQ to analyze and derive insights from large amounts (several hundreds of millions of examples) of text data. Our stack utilizes EMR, Spark, S3, Avro, Azkaban, OpenNLP, and many elements from the functional programming paradigm (read: semigroups, monoids, and foldMaps, oh my!) to build a scalable and powerful pipeline for extracting rich information from email content. This pipeline currently powers our suggested follow-up feature, which informs customers when they need to follow up with an important email or conversation, in addition to foundational features for other use cases. As data processing pipelines become ubiquitous, and more people turn to Spark and Scala to build such pipelines, this talk will answer some questions of how to effectively go about the task. Elements from functional programming and libraries such as Twitter's algebird, scalaz, or cats allow for natural and efficient implementations of many core aspects of distributed data processing. We couple this with the OpenNLP library to create a data pipeline for the linguistic analysis of text, primarily in the form of email content. This has given us a solid foundation for engineering text based features as well as training text-based models for a variety of supervised learning tasks that outperforms a similar pipeline in traditional Map/Reduce with Java, in a more maintainable and scalable way.

Speakers

Ascander Dost

Lead Software Engineer, Salesforce
Ascander is a lead engineer at SalesforceIQ, where he works on data processing infrastructure, extracting meaning from email messages, and creative cursing. He received a PhD in Linguistics from UC Santa Cruz a long time ago, but seems to have mostly recovered. He enjoys writing Scala...

Alexis Roos

Sr Engineering Manager, SalesforceIQ
Alexis has over 20 years of software engineering and management experience with emphasis in large scale data science and engineering along with application infrastructure. As an engineering manager at Salesforce, Alexis is managing all back-end engineering for Salesforce IQ CRM which...


Friday November 11, 2016 11:10am - 11:30am PST
Off by One

11:10am PST

Scala on Rails: Yet Another Web Framework as a Scala Gateway
Skinny is a full-stack web app framework optimized for sustainable productivity in Servlet-based web app development. To put it simply, the Skinny framework’s concept is Scala on Rails. If you're already familiar with the concepts that Ruby on Rails popularized, you should be able to start working with it quickly. Skinny is not only easy for Rails developers to understand but also designed to be as friendly as possible to Scala beginners.

Skinny's components are independent of Skinny Web framework and are available as small libraries. Many projects I've worked on and am working on happily use Skinny's components such as Skinny ORM in Web applications built with Play, batch processing systems and scripts to automate tasks.

In this talk, I'd like to briefly introduce Skinny and do a little live coding to demonstrate the usability and productivity of the framework.


Speakers

Kazuhiro Sera

Senior Software Engineer, Salesforce
Scala enthusiast in Tokyo, Japan. Creator of ScalikeJDBC and Skinny Framework. One of the active maintainers of json4s and Scalate.



Friday November 11, 2016 11:10am - 11:30am PST
Naming

11:10am PST

Serving images at Criteo
At Criteo, we serve billions of images every day, and every day this number increases. Our legacy C/C++ solution was not scaling anymore, so we decided to give Finagle a try for a first public-facing service. In this talk we will present how we did it, using Finagle not only to serve these images to our users, but also to fetch them from our partners, all at quite a large scale.

Speakers

Vincent Guerci

Software Engineer, Criteo
Previously a Mobile Developer, now working as a Software Engineer on various technologies for applications @criteo which require scalability...


Friday November 11, 2016 11:10am - 11:30am PST
Caching

11:40am PST

Scio - A Scala API for Google Cloud Dataflow
We will present Scio, a Scala API for Google Cloud Dataflow (incubated as Apache Beam). Apache Beam offers a simple, unified programming model for both batch and streaming data processing while Scio brings it much closer to other high level APIs many data engineers are familiar with, e.g. Spark and Scalding. We will cover design and implementation of the framework, including features like type safe BigQuery macros, REPL and serialization. There will also be a live coding demo.

Speakers

Neville Li

Software Engineer, Spotify
Neville is a software engineer at Spotify who works mainly on data infrastructure and tools for machine learning and advanced analytics. In the past few years he has been driving the adoption of Scala and new data tools for music recommendation, including Scalding, Spark, Storm and...


Friday November 11, 2016 11:40am - 12:20pm PST
Off by One

11:40am PST

Spoiled by higher-kinded types
Scala is one of the few languages that have higher-kinded types; a simple feature with profound implications. This talk will explore uses of higher-kinded types and show why they are an indispensable feature.

Speakers

Adelbert Chang

Lead Data Engineer, Target
Adelbert Chang is a Lead Data Engineer at Target where he works on infrastructure systems for the Data Science and Optimization team. Previously he worked at U.C. Santa Barbara doing research in large-scale graph querying and modeling, and in industry on machine learning systems...


Friday November 11, 2016 11:40am - 12:20pm PST
Naming

11:40am PST

Finagle Your Own Codec
Many users of Finagle use the HTTP or Thrift protocols. We have built a protocol which uses Google Protobuf on the wire. We'll talk about why you might build your own protocol, how to go about building one, and some of the gotchas we encountered along the way. Attendees will come away with information they need to help decide whether to build their own protocols or use existing ones.

Speakers

Chris Phelps

Principal Software Engineer, Splunk
I've been coding in Java since the early days of the language, and in Scala for the last 4 years. My main areas of focus are in microservices and reactive approaches. As our organization is a polyglot development environment, I'd also love to talk to you about adopting and evangelizing...


Friday November 11, 2016 11:40am - 12:20pm PST
Caching

1:10pm PST

Endpoint Security with Complex Data
One of the challenges of modern data science is handling complex data structures like structs, arrays, maps, queues, images, etc. in data warehouse environments (as opposed to flat table representations). In this talk I will cover the existing ways to work with complex data structures in open-source projects like Pig, Hive and Impala, as well as how E8 Security used Scala to simplify data pipelines with complex data structures. I will use an example of endpoint security computations. An endpoint in Enterprise Security is any computing device exposed to the clients or customers that request access to the corporate network, yet which cannot be entirely controlled or administered by the network administrators for one reason or another. The challenge is usually solved by additional monitoring of the devices themselves and of the network traffic to and from the device. In this particular instance, E8 built a machine learning based solution that tracks the footprint of the system and builds threat models based on the changes in the device footprints, which requires extensive use of complex and nested data structures. Currently E8 Security has customers tracking more than 0.5 million endpoints running tens of thousands of different processes.

Speakers

Alex Kozlov

Architect, E8 Security, Inc.
Just a Humble Big Data Architect


Friday November 11, 2016 1:10pm - 1:30pm PST
Off by One

1:10pm PST

Embedded Logic Programming in Scala
Logic (or relational) programming is perhaps underappreciated compared to its declarative sibling, functional programming. In this talk, we'll examine a relational programming language, miniKanren, embedded in Scala as a DSL. Some problems (layout, scheduling, type inference, logic puzzles) are more clearly expressed as a set of constraints and relations. We'll demonstrate this expressive power by live-coding solutions to a few puzzles and show that we can solve them relationally while still writing what feels like idiomatic, functional Scala. Come listen if you want to witness problem-solving in a programming paradigm that can help us discover solutions to mind-bending puzzles in minutes. As a bonus, these ideas can be re-used for Scala type-level programming.

Speakers

Stewart Stewart

Software Consultant, Inner Product LLC
Stewart Stewart is a software developer at Driver, a San Francisco based startup that analyzes tumors and connects cancer patients with personalized medicine. He also helps organize events at SF Scala.


Friday November 11, 2016 1:10pm - 1:30pm PST
Naming

1:10pm PST

Streams for (Co)Free!
There are many popular stream libraries for Scala developers, including Akka Streams, scalaz-stream, fs2, plus others in the Java ecosystem. While all excellent choices for building reactive Scala applications, their reliance on effects makes them particularly difficult to test and reason about. In this talk, long-time Scala functional programmer John A. De Goes takes to the stage to demonstrate a new approach to modeling streams that requires less machinery and has more reasoning power, composability, flexibility, and testability than many other approaches. By attending the talk, you'll learn how the best stream library may be the one you get for (co)free!

Speakers

John A. De Goes

Solution Architect, De Goes Consulting
John A. De Goes has been writing Scala software for more than eight years at multiple companies, and has assembled world-renowned Scala engineering teams, trained new developers in Scala, and developed several successful open source Scala projects. Known for his ability to take very...


Friday November 11, 2016 1:10pm - 1:30pm PST
Caching

1:40pm PST

Simplifying DevOps with SlackBots
The ultimate goal of development and operations engineering is to build fully self-healing production systems. In reality, this is difficult to achieve for a variety of reasons and engineers often have to step in to fix broken processes. Simultaneously, there has been an explosion of content delivery and information management tools to deal with, necessitating potentially many touch-points in order to debug a single failure. This talk will focus on how Slackbots can be used to develop standalone application ecosystems to reduce the complexity of engaging with these different information management tools, and help make devops and engineering teams more effective. I will walk through the implementation of a Slackbot in Scala using Argonaut and Play Framework to monitor a machine learning platform. I will also demonstrate how Slackbots can give teams the ability to monitor and manage their work from mobile devices, enabling them to build very rich sets of features, more easily get real time notifications, updates of system processes, execute commands and re-run jobs on the go. In our experience, we have seen that having Slackbots has helped create a culture that enables developers to work flexibly and collaborate more efficiently thereby improving team morale.

Speakers

Chalenge Masekera

Data Scientist, Salesforce
Building scalable products that are sprinkled with some machine learning fairy dust. Democratizing data science to enable businesses to leverage their data for better efficiency and stronger business performance. Also passionate about using data for ICTDs. Beer aficionado with a bias...


Friday November 11, 2016 1:40pm - 2:00pm PST
Off by One

1:40pm PST

Easy dependency management with coursier
coursier is an attempt at making it easier to deal with Maven / Ivy dependencies, from Scala and the command line. It's a replacement for Ivy or Aether, rewritten from scratch. It also features a working SBT plugin, and can even be used from Scala.js. It follows functional programming practices in both its very core and its user-facing API, and even makes use of algebraic structures like lattices at times. Its command-line programs make it effortless to launch programs from Maven / Ivy artifacts, list dependencies, or update caches. In this talk, we'll review the possibilities offered by its API, its command-line tools, and its SBT plugin.

Speakers

Alexandre Archambault

Software engineer, Teads.tv
Shapeless contributor, author of coursier (dependency management, get-coursier.io), jupyter-scala, argonaut-shapeless, scalacheck-shapeless, ...


Friday November 11, 2016 1:40pm - 2:00pm PST
Naming

1:40pm PST

So, You Want To Be Functional and Reactive? Here is how...
You are a Scala champion, but how about the rest of your organization? Learn from our experience in migrating developers from Java or node.js to the promised land of Scala, Functional Programming and Reactive Architectures. There are many ways to get there, but we were always passionate about finding the path with the least friction and greatest productivity. Come and see what works in building productive Scala developers and teams and what treacherous paths to avoid. Special coverage of Big Data and Spark!

Speakers

Vladimir Bacvanski

Principal Architect, Strategic Architecture, PayPal
Dr. Vladimir Bacvanski's interest is in better and more productive ways to develop highly scalable and reliable software systems. Before joining PayPal, he was the CTO and founder of SciSpike, a company doing custom development and consulting. His recent projects include Big Data...


Friday November 11, 2016 1:40pm - 2:00pm PST
Caching

2:10pm PST

Scala: The unpredicted lingua franca for data science
Until pretty recently, data scientists’ languages of choice to manipulate and make sense out of data were Python, R, or MATLAB, which led to a split in the data science community and duplication of efforts in languages offering similar sets of functionality. Then distributed technologies came out of the blue, most using a convenient and easy-to-deploy platform, the JVM. Data scientists are now part of heterogeneous teams that face many problems and must work toward global solutions together, including a new responsibility to be productive and agile in order to have their work integrated into platforms. This is why technologies like Apache Spark are so important and are gaining this traction from different communities. And even though some bindings are available for legacy languages, all the creative, new ways to analyze data are done in Scala. Using a fully productive and reproducible environment combining the Spark Notebook and Docker, Xavier Tordoir explores what it means to do data science today and why Scala succeeds at coping with large and fast data where older languages fail. Xavier then introduces and summarizes all the new methodologies and scientific advances in machine learning that use Scala as the main language, including Splash, mic-cut problem, OptiML, needle (DL), ADAM, and more, and demonstrates how these programs work for data scientists by enabling interactivity, live reactivity, charting capabilities, and robustness in Scala, things that were still missing from the legacy languages.

Speakers

Xavier Tordoir

Founder, Data Fellas, Inc.
Xavier started his career as a researcher in Experimental Physics and also focused on data processing. Further down the road, he took part in projects in finance, genomics and software development for academic research. During that time, he worked on timeseries, on prediction of biological...


Friday November 11, 2016 2:10pm - 2:50pm PST
Off by One

2:10pm PST

All you ever wanted to know about "for", and more...
We will take a deep dive into one of Scala's more advanced language constructs: the for expression. In this coding-example-heavy talk we will cover topics like:
* For, it's not about looping
* Work on what's inside, ignore what's outside
* To yield or not to yield, there is no question
* Don't mix your types
* Statements in an expression?
* Pattern Matches
* Guards
* In-line assignments
* Fors within fors
* Using a for with your own types
* How it all de-sugars
And much more. This talk aims to have something for everyone. Whether just starting out with Scala or a multi-year Scala veteran, there is likely something you didn't know about for in this talk.

Speakers

Richard Wall

CEO, Escalate Software
Long time Scala developer, trainer and enthusiast. Started possibly the first Scala user group - Bay Area Scala Enthusiasts. Winner of the inaugural Phil Bagwell award for Scala community work. Scalawag and Java Posse podcast co-host. Hiker, biker, music lover, love to travel.


Friday November 11, 2016 2:10pm - 2:50pm PST
Naming

2:10pm PST

Concurrent Join Calculus in Scala
Join Calculus is a little-known programming paradigm for purely functional concurrency. Join Calculus develops upon the Actor model to make concurrent programming less imperative, type-safe, deadlock-free, and even more intuitive. I give an introduction to Join Calculus and present examples such as the "dining philosophers" problem and a concurrent merge sort. I present a prototype implementation of Join Calculus as an embedded Scala DSL, based on previous work of Philipp Haller and Jiansen He.

Speakers

Sergei Winitzki

Senior Software Engineer, Workday Inc.
Theoretical physicist turned software engineer, passionate for functional programming, functional type theory, and declarative domain-specific languages.


Friday November 11, 2016 2:10pm - 2:50pm PST
Caching

3:00pm PST

Complete big data pipeline with Apache Zeppelin
Apache Zeppelin is an interactive data analytics environment for computing systems. It integrates deeply with Apache Spark and many other computing frameworks, and provides a beautiful interactive web-based interface, data visualization, a collaborative work environment, and many other nice features to make your big data pipeline complete. This talk will cover some scenarios and examples with a live demo, and discuss how you can integrate Apache Zeppelin into your data pipeline as well as the future roadmap.

Speakers

Moon

CTO, NFLabs
Moon soo Lee is a creator of Apache Zeppelin and Co-Founder and CTO at NFLabs. For the past few years he has been working on bootstrapping the Zeppelin project and its community. His recent focus is growing the Zeppelin community and driving adoption.


Friday November 11, 2016 3:00pm - 3:40pm PST
Off by One

3:00pm PST

Functor, Monad, Applicative, in plain words
This talk is targeting regular programmers that have not seen anything like this before. I just explain these specific parameterized types, and what they are good for. Tested on our Engineers; they never had a question afterwards.

Speakers

Vlad

contributor, Patryshev
Software developer with experience in categories and toposes. Teaching logic and formal methods at Santa Clara University. Working as a data engineer at Salesforce.



Friday November 11, 2016 3:00pm - 3:40pm PST
Naming

3:00pm PST

The Future of Services
Microservices has become a difficult term to pinpoint as more people use it to describe various approaches to building service-based applications. Many of these approaches have become anti-patterns to scale, such as sharing code between services and traditional monolithic CRUD data storage strategies. This talk will focus on how to build elastic, resilient service-based applications that can handle tremendous amounts of data in real time, and discuss how to identify and decompose individual microservices.

Speakers

Jamie Allen

Director of Engineering, Starbucks
Jamie is Director of Engineering at Starbucks for the Unified Commerce Platform initiative, a project using Scala, Akka, pure FP and a microservice-based architecture to provide high-availability experiences for Starbucks customers. He previously spent over 4 years working at Typesafe/Lightbend...


Friday November 11, 2016 3:00pm - 3:40pm PST
Caching

4:00pm PST

Productive technology at scale with Scala
VMs, Containers, Microservices, CQRS, Actors, Futures, Streams, NoSQL, NewSQL, CRDTs. We are awash in more technology today than ever before. How do you choose what to incorporate as you build your startup? What’s critical, and what’s snake oil? Coursera is a platform for education at scale. Although we first started in 2012, we have grown to serve over 20 million learners from every country on the planet. This talk will trace Coursera’s technology evolution from a PHP and Python monolith (bi-lith, if you will) to a modern microservices-based architecture in Scala. Our frontend technology has evolved from jQuery to React, from integrated to independent. Along the way, we grew 4 native apps (iPhone, iPad, Apple TV, and Android), and evolved from DevOps to No-Ops. We will cover everything from database technology, IO model, build technology, and telemetry systems to scaling and deployment tooling. Come hear about our key successful decisions, our colossal blunders, our mistakes that just happened to work out, and our open sourced projects you can leverage in your own startups!

Speakers
Brennan Saeta

Staff Software Engineer, Technical Committee Chair, Information Security Officer, Coursera Inc
Brennan Saeta leads the team responsible for the development environment, core libraries, and the common infrastructure powering the production site at Coursera. Since joining the company, Brennan has written deployment tooling, developed internal frameworks, built a secure Docker-based...


Friday November 11, 2016 4:00pm - 4:40pm PST
Off by One

4:00pm PST

Unzipping Immutability
As Scala developers, we embrace immutable data structures, but their inner beauty sometimes gets overlooked. In this talk we’ll try to uncover it through interactive visualizations, watching the data react to the changes in our code. Having surfaced immutability’s crucial tricks, we’ll move our focus to lenses and zippers, handy tools that enable convenient navigation and manipulation of custom immutable domain entities.
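
To make the lens idea concrete, here is a minimal hand-rolled sketch. It is not taken from the talk and does not use any lens library; the names `Lens`, `Person`, `Address`, and `LensDemo` are hypothetical and chosen only for illustration. A lens pairs a getter with a copying setter, and two lenses compose to reach a nested field.

```scala
// A minimal, hand-rolled lens (illustrative only, not a library API).
final case class Lens[S, A](get: S => A, set: (S, A) => S) {
  // Apply a function to the focused field and return an updated copy.
  def modify(s: S)(f: A => A): S = set(s, f(get(s)))

  // Compose with a lens that focuses further inside A.
  def andThen[B](other: Lens[A, B]): Lens[S, B] =
    Lens[S, B](
      s => other.get(get(s)),
      (s, b) => set(s, other.set(get(s), b))
    )
}

object LensDemo {
  // Hypothetical domain entities.
  final case class Address(city: String, zip: String)
  final case class Person(name: String, address: Address)

  val address: Lens[Person, Address] =
    Lens[Person, Address](_.address, (p, a) => p.copy(address = a))
  val city: Lens[Address, String] =
    Lens[Address, String](_.city, (a, c) => a.copy(city = c))

  // Focus on the city nested inside a person.
  val personCity: Lens[Person, String] = address.andThen(city)

  val nick: Person = Person("Nick", Address("Lisbon", "1000"))
  // Returns a new Person; the original value is left untouched.
  val moved: Person = personCity.set(nick, "Porto")
}
```

The composed lens replaces the nested `copy` calls that would otherwise be written by hand at every update site.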

Speakers
Nick Stanchenko

Feedzai
Nick is a Lisbon-based Scala enthusiast with background in computer science and human-computer interaction. Previously known in the Scala community as the creator of Macroid, a composable UI DSL for Android, Nick is currently fighting fraud with machine learning at Feedzai. His interests...


Friday November 11, 2016 4:00pm - 5:00pm PST
Naming

4:00pm PST

Akka Streams for Large Scale Data Processing
"With over 50 million members, Credit Karma is the most utilized and trusted personal finance platform in the U.S. To handle tens of millions of Americans’ credit information, we use Akka Streams for high throughput data transfer. We will discuss how we quickly built services using Akka Actors to help us parallelize, parse and send data to our data ingestion service. We then dive into problems we faced and why we chose to move those services to Akka Streams, what we learned, and the tradeoffs along the way. In this panel, Credit Karma shares best practices on how to implement Akka Streams at scale."

Speakers
Zack Loebel-Begelman

Senior Software Engineer, Credit Karma
As a senior software engineer on the data and analytics pipeline, Zack’s work allows Credit Karma to provide tailored recommendations for each individual member’s specific financial situation. Zack joined Credit Karma after two years designing and launching data engines to support...
Dustin Lyons

Engineering Manager, Credit Karma
I work on Big Data supporting over 60M members at Credit Karma.


Friday November 11, 2016 4:00pm - 5:00pm PST
Caching

5:00pm PST

Scaling Scala Teams
This panel will present the state of software engineering teams in industry that use Functional Programming. Such teams can achieve tremendous success when aligned with company architectures and led by managers who fully appreciate technology. We'll see whether various myths about recruiting and complexity are true, and show ways to hire, train, and retain the best talent and leadership required to build the best startups and modern enterprises with thoughtful software engineering principles.

Speakers
Tihomir Bajić

VP Engineering
VP Engineering, LTSE.com
Vitaly Gordon

CEO, Faros AI
Ity Kaul

Tech Lead, Twitter, Inc.
Tim Perrett

Head of Infrastructure Engineering, Verizon
Avid functional programmer, experienced distributed systems engineer and published author. Primarily interested in schedulers, datacenter design, low-latency data access and the application of functional paradigms in large enterprise applications.
Roy Rapoport

Senior DevOps Engineer, Monitoring, Netflix, Inc
Manager, Insights


Friday November 11, 2016 5:00pm - 6:00pm PST
Caching

6:00pm PST

Happy Hour: Sponsored By AOL.
The Happy Hours at Twitter are legendary, and this will be no exception. On Friday, we welcome all the participants and celebrate the Scala 2.12 release, in style, at scale!

Friday November 11, 2016 6:00pm - 8:00pm PST
Aviator -- Main Reception Area
 
Saturday, November 12
 

9:00am PST

Keynote: Exploring the Unknown With Scala

Whatever your experience level, I'm willing to bet that you want to become a better engineer and problem-solver. You have probably already noticed that exploring unfamiliar technologies, patterns, algorithms and approaches helps you become better at your job, and sometimes even identify what job you want to be in. At this point you probably have a voice in your head saying that exploring, while fun and exciting, takes time. Perhaps you often tell yourself that you're too busy getting things done to indulge your curiosity.

I'm going to tell you the story of my personal confrontation with a dilemma facing every human being on earth: exploration vs. exploitation. I'll share with you the lessons I've learned about becoming a better explorer. I'll discuss how my usage of Scala and engineering mindset have changed as I have gone from developing large-scale production applications to conducting a series of experiments that test hypotheses. I hope you walk out of this talk with a renewed enthusiasm for exploration and a framework that you can use to decide when to explore and when to exploit.




Speakers
Julie Pitt

Director, Machine Learning Infrastructure, Netflix
Julie leads the Machine Learning Infrastructure at Netflix, with the goal of scaling Data Science while increasing innovation. She previously built streaming infrastructure behind the "play" button while Netflix was transitioning from domestic DVD-by-mail service to international...


Saturday November 12, 2016 9:00am - 9:40am PST
Caching

9:50am PST

Spark and Protocol Buffers - An Awesome Combination
Have you ever been building a project using JSON, and managing the data format became too complex? Schema changes and type safety become a big pain. I will share one of my favorite solutions we used at Google, that many Scala programmers can appreciate: Protocol buffers. Protocol buffers are Google’s cross-platform language-agnostic mechanism for parsing and serializing structured data. With protocol buffers, you define your data schema, and the compiler generates parsers and serializers for your data in many different languages. ScalaPB is a library and code generator that brings the awesome power of protocol buffers to Scala and Spark. ScalaPB takes your data schema and generates case classes, along with parsers, serializers, and even lenses for convenient field updates within deeply nested structures. In this talk, I’ll give a gentle intro to protocol buffers and ScalaPB. I will show how you can use protocol buffers and ScalaPB as your project’s data exchange format to take the pain away from schema evolution and cross-team data sharing. We will do some live coding and build a Spark application that processes millions of protobufs!

Speakers
Nadav Samet

CTO, TrueAccord
Nadav Samet is the CTO and co-founder of TrueAccord, a Scala-based Fintech startup. He started programming when he was five years old and has been passionate about it ever since. Before TrueAccord, he was working at Google, where he helped stabilize Gmail, and then went on to emerging...


Saturday November 12, 2016 9:50am - 10:30am PST
Off by One

9:50am PST

Scala Scripting
This talk will demonstrate a new script-file format for writing your Scala code. Unlike traditional Scala projects, which are built with SBT or Maven or Ant and edited inside your IDE, Scala Scripts do not need a "project" or "build tool" in order to run. You simply write your code in a single file, and run it. Need code in another script? Simply import it. Need a third-party library? You can import it too. This greatly reduces the barrier to entry of getting started writing Scala code, and allows Scala to be used for common housekeeping work at the command-line, much like Python or Ruby is used today. I will demonstrate this Scala Scripting file format, explain how it works, and what place it could find in the Scala ecosystem today.

Speakers
Li Haoyi

Software Engineer, Dropbox
Haoyi is a software engineer at Dropbox who works on Python/Coffeescript during the day and contributes to the Scala open-source ecosystem at night. He is known for his contributions to the Scala.js project, writing a JVM from scratch in 3000LOC, and doin


Saturday November 12, 2016 9:50am - 10:30am PST
Naming

9:50am PST

Developing microservices with aggregates
The Domain Model pattern is a great way to develop complex business logic. Unfortunately, a typical domain model is a tangled bird's nest of classes. It can’t be decomposed into microservices. Moreover, business logic often relies on ACID transactions to maintain consistency. Fortunately, there is a solution to this problem: aggregates. An aggregate is an often overlooked modeling concept from the must-read book Domain-Driven Design. In this talk you will learn how aggregates enable you to develop business logic for the modern world of microservices and NoSQL. We will describe how to use aggregates to design modular business logic that can be partitioned into microservices. You will learn how aggregates enable you to use eventual consistency instead of ACID. We will describe the design of a Scala microservice that is built using functional aggregates.

Speakers
Chris Richardson

Founder, Eventuate
Chris Richardson is a developer and architect. He is a Java Champion, a JavaOne rock star and the author of POJOs in Action, which describes how to build enterprise Java applications with frameworks such as Spring and Hibernate. Chris was also the founder of the original CloudFoundry.com...


Saturday November 12, 2016 9:50am - 10:30am PST
Caching

10:40am PST

This programmer modeled his code after wooden nesting dolls. What happens next will amaze you.
Recursion has been called the GOTO of FP, yet we use it constantly – every `List` operation, every event loop, and myriad other places – it litters our code. So, how can we tame it? Can we at least contain it, if not eliminate it entirely? This talk introduces a new Typelevel library intended to free us from primitive recursion. Matryoshka contains a family of folds, unfolds, and transformations that can be applied to any recursive structure. We’ll learn how to separate the recursion from our logic and write “flat” operations that are more readily checkable by the compiler. We’ll also look at how the library allows us to fuse many transformations into one, handle short-circuiting, and even annotate data structures for free.
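
As a rough illustration of separating recursion from logic, the sketch below hand-rolls a fixed-point type and a generic fold (a catamorphism) in plain Scala. It does not use Matryoshka and makes no claim about that library's actual API; `Functor`, `Fix`, `cata`, `ExprF`, and `RecursionSketch` are hypothetical names defined here only for the example.

```scala
import scala.language.higherKinds

object RecursionSketch {
  // Minimal functor type class, defined locally for this sketch.
  trait Functor[F[_]] { def map[A, B](fa: F[A])(f: A => B): F[B] }

  // Fixed point of a functor: ties the recursive knot once, generically.
  final case class Fix[F[_]](unfix: F[Fix[F]])

  // Generic fold: the only place where recursion happens.
  def cata[F[_], A](t: Fix[F])(alg: F[A] => A)(implicit F: Functor[F]): A =
    alg(F.map(t.unfix)(child => cata[F, A](child)(alg)))

  // A non-recursive "pattern functor" for arithmetic expressions;
  // the type parameter A marks where sub-expressions would go.
  sealed trait ExprF[A]
  final case class Num[A](value: Int) extends ExprF[A]
  final case class Add[A](left: A, right: A) extends ExprF[A]
  final case class Mul[A](left: A, right: A) extends ExprF[A]

  implicit val exprFunctor: Functor[ExprF] = new Functor[ExprF] {
    def map[A, B](fa: ExprF[A])(f: A => B): ExprF[B] = fa match {
      case Num(v)    => Num[B](v)
      case Add(l, r) => Add(f(l), f(r))
      case Mul(l, r) => Mul(f(l), f(r))
    }
  }

  // "Flat" operations: each handles a single layer and never recurses.
  val eval: ExprF[Int] => Int = {
    case Num(v)    => v
    case Add(l, r) => l + r
    case Mul(l, r) => l * r
  }
  val show: ExprF[String] => String = {
    case Num(v)    => v.toString
    case Add(l, r) => s"($l + $r)"
    case Mul(l, r) => s"($l * $r)"
  }

  // Smart constructors for building a recursive expression.
  def num(n: Int): Fix[ExprF] = Fix[ExprF](Num(n))
  def add(a: Fix[ExprF], b: Fix[ExprF]): Fix[ExprF] = Fix[ExprF](Add(a, b))
  def mul(a: Fix[ExprF], b: Fix[ExprF]): Fix[ExprF] = Fix[ExprF](Mul(a, b))

  val expr: Fix[ExprF] = mul(add(num(1), num(2)), num(4))
  val value: Int = cata[ExprF, Int](expr)(eval)
  val rendered: String = cata[ExprF, String](expr)(show)
}
```

Because `eval` and `show` contain no recursive calls, the compiler can check them for exhaustiveness layer by layer, and the traversal strategy lives in one place.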

Speakers
Greg Pfeil

Senior Software Engineer, Formation
Greg has been working full-time with pure FP in Haskell and Scala for over six years. He currently abuses laziness for Formation, to extract efficient evaluation from exponential algorithms. He’s also known for inflicting recursion schemes on everyone and designing languages that...


Saturday November 12, 2016 10:40am - 11:00am PST
Naming

10:40am PST

Top Mistakes When Writing Reactive Applications
Reactive applications are becoming a de facto industry standard and, if employed correctly, toolkits like Lightbend Reactive Platform make the implementation easier than ever. But the design of these systems can be challenging, as it requires a particular mindset shift to tackle problems we might not be used to. In this talk we’re going to discuss the most common things I’ve seen in the field that prevented applications from working as expected. I’d like to talk about typical pitfalls that might cause trouble, and about trade-offs that might not be fully understood or important choices that might be overlooked, including persistent actor pitfalls, handling of network partitions, proper implementations of graceful shutdown or distributed transactions, trade-offs of microservices or actors, and more. This talk should be interesting for anyone who is thinking about, implementing, or has already deployed a reactive application. My goal is to provide a comprehensive explanation of common problems so that they won’t be repeated by fellow developers. The talk is a little more focused on the Lightbend platform, but an understanding of the concepts we are going to talk about should be beneficial for everyone interested in this field.

Speakers
Petr Zapletal

Lead Consultant, Cake Solutions
Petr is a Software Engineer who specialises in the design and implementation of highly scalable, reactive and resilient distributed systems. He is a functional programming and open source enthusiast and has expertise in the area of big data and machine classification techniques. Petr...


Saturday November 12, 2016 10:40am - 11:00am PST
Caching

10:40am PST

Query Generation Across Multiple Data Stores
In this talk, we’ll discuss how we define and query cubes across multiple data stores for reporting purposes. With a single definition, we are able to decide at query time the best table/data source to answer a given request. We must take into consideration things such as time zone conversion, data availability, supported fact/dim based operations, request granularity, defined constraints, time range of request, and so on. Ultimately, our request is answered using Hive, an RDBMS, or Druid. This allows us to take advantage of the performance characteristics of each data store while also allowing for a single interface for querying. Our goal isn’t to create a unified SQL layer which can be used to query multiple data stores. Our goal is to define a single view of the data where we can define post aggregates or other derived expressions which can later be used to programmatically generate a query for the target data store.

Speakers
Hiral Patel

Technologist, Yahoo Inc
Hiral's been working with Scala for the past 6 years and Big Data for the past 12 years. He's built data platforms, data-intensive applications, and real-time analytics frameworks. Hiral is currently a Senior Principal Architect/Engineer at Yahoo Inc.


Saturday November 12, 2016 10:40am - 11:20am PST
Off by One

11:10am PST

The Essence of Functional Structures in Scala
In this talk, you will get a practical overview of various functional structures that allow building powerful, highly composable and purely functional abstractions. The primary goal of this session is to delve into these structures and gain insight into applying them to real-world problems.

From scratch, you will learn to construct the Scala type classes that represent these structures. Starting with basic constructs such as Semigroups and Monoids, you will continue exploring structures such as Functors, Applicative Functors and Monads. Alongside, from a practical point of view, you will discover the essence of these structures in encoding purely functional programs that are robust, comprehensible and correct by construction.

No prior knowledge of functional programming is assumed in this talk.
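
For readers who want a preview of the first step, the sketch below shows one common way to encode Semigroup and Monoid as Scala type classes. It is an assumption about the general shape of such an encoding, not the speaker's material; all names (`MonoidSketch`, `combineAll`, the instance names) are hypothetical.

```scala
object MonoidSketch {
  // A Semigroup knows how to combine two values associatively.
  trait Semigroup[A] { def combine(x: A, y: A): A }

  // A Monoid is a Semigroup with an identity element.
  trait Monoid[A] extends Semigroup[A] { def empty: A }

  implicit val intAddition: Monoid[Int] = new Monoid[Int] {
    def empty: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  implicit val stringConcat: Monoid[String] = new Monoid[String] {
    def empty: String = ""
    def combine(x: String, y: String): String = x + y
  }

  // Maps form a monoid whenever their values can be combined.
  implicit def mapMonoid[K, V](implicit V: Semigroup[V]): Monoid[Map[K, V]] =
    new Monoid[Map[K, V]] {
      def empty: Map[K, V] = Map.empty
      def combine(x: Map[K, V], y: Map[K, V]): Map[K, V] =
        y.foldLeft(x) { case (acc, (k, v)) =>
          acc.updated(k, acc.get(k).fold(v)(existing => V.combine(existing, v)))
        }
    }

  // One generic fold works for every type that has a Monoid instance.
  def combineAll[A](xs: Seq[A])(implicit M: Monoid[A]): A =
    xs.foldLeft(M.empty)(M.combine)

  val total: Int = combineAll(Seq(1, 2, 3))
  val sentence: String = combineAll(Seq("type", " ", "classes"))
  val wordCounts: Map[String, Int] =
    combineAll(Seq(Map("a" -> 1), Map("a" -> 2, "b" -> 1)))
}
```

The same `combineAll` sums integers, concatenates strings, and merges count maps, which is the kind of composability the abstract refers to.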

Speakers
Adil Akhter

Lead Engineer, ING
Functional Programming, Category Theory, Scala, Haskell.



Saturday November 12, 2016 11:10am - 11:30am PST
Naming

11:10am PST

I built an actor system in Rust. Then I built a company on Akka.
Given the sheer volume of data that the systems we are building at Datalogue need to process, there are a couple of things that are constantly on our minds. 1) How do we write correct, distributable programs while maintaining a level of development productivity that allows us to innovate? 2) How do we leverage existing data infrastructures to keep our costs manageable by only storing the data we need to create delightful user experiences? For a while, I thought that would mean squeezing out every ounce of low-level performance and having the ability to write correct distributed programs. This first hypothesis is why I went from an actor system in Rust to building a company on Akka. My talk will focus on the transition from building an actor system using a new, highly hyped, low-level language like Rust to leveraging the power of an existing framework like Akka and the beauty of Scala.

Speakers
Tim Delisle

Co-founder and CEO, Datalogue
Deep learning geek with a passion for Scala and Rust.


Saturday November 12, 2016 11:10am - 11:30am PST
Caching

11:40am PST

Algebird and the coming Grand Unification
Algebird is Twitter's library for algebra which models many big data algorithms as monoids and semigroups (and more). In this talk we will see how to use Algebird with Scalding and Spark. Finally, we will see how it will integrate with Spire and Cats via the Typelevel Algebra project.

Speakers
Oscar Boykin

Machine Learning Infrastructure, Stripe
Oscar is the creator of Scalding, Summingbird, and Algebird, and is an overall professor and mathematician turned software magician.


Saturday November 12, 2016 11:40am - 12:20pm PST
Off by One

11:40am PST

Toward a Safer Scala
Scala has many, well-known WTF LOLs that result from the language too enthusiastically attempting to help the programmer. Scala can also be written across a wide range of styles from the Typelevel to the Better Java. The Scala ecosystem has static analysis and linting tools which can help avoid confusing behavior, baffling compiler messages, and divergent coding styles. A survey of the available tools will prepare attendees to determine which can be best applied in greenfield and existing projects. A case study of a team using static analysis tools over many years will also be presented. Attendees will leave with steps to immediately improve their production builds and strategies to introduce more sweeping changes with time.

Speakers
Leif Wickland

Software, Rubicon Project
In one of his first jobs, Leif wrote software for livestock auction yards. His boss also owned the town newspaper. When the only reporter fell ill, Leif played journalist for a couple days. He published pieces he's glad don't appear online. His second most noteworthy accomplishment...


Saturday November 12, 2016 11:40am - 12:20pm PST
Naming

11:40am PST

Building High Performance Microservices with Colossus
Colossus is a Scala I/O and microservice framework that has been built from the ground up with a strong focus on performance and simplicity. While primarily aimed at building low-latency services, it can serve as a fully generalized I/O layer for virtually any application. Whether you're doing low-level bit-pushing or making high-level distributed services, Colossus is here to help. In this talk I'll cover the basics of what Colossus can do, how we made it fast, and how we use it at Tumblr to power some of our platform's most popular features.

Speakers
Dan Simon

Staff Engineer, Tumblr
Dan Simon is a Staff Engineer at Tumblr, specializing in distributed systems and platform services. Over the last 5 years Dan has worked on a wide variety of projects across Tumblr's infrastructure and is the lead developer of Colossus, a lightweight service framework that powers...


Saturday November 12, 2016 11:40am - 12:20pm PST
Caching

1:10pm PST

Versatile Scala notebooks with Jupyter + Jupyter Scala
Jupyter Scala is a versatile and modular Scala kernel for Jupyter (https://jupyter.org/). Based on a modified version of Ammonite (https://github.com/lihaoyi/Ammonite), it aims to make it easy to extend the notebook capabilities through an API, allowing easy interaction between user libraries and both the notebook client (for plotting, pretty-printing, ...) and the interpreter internals (for bridges towards Spark, Flink, ...). Thanks to its modularity, it is usable for fast prototyping, data science, as a day-to-day graphical REPL, etc.

Speakers
Alexandre Archambault

Software engineer, Teads.tv
Shapeless contributor, author of coursier (dependency management, get-coursier.io), jupyter-scala, argonaut-shapeless, scalacheck-shapeless, ...


Saturday November 12, 2016 1:10pm - 1:30pm PST
Off by One

1:10pm PST

Building a metadata store with Akka
Akka-http is the (awesome) extension of Akka that enables processing HTTP messages as a stream, and gives functional developers a powerful toolkit for creating web apps. At Salesforce we used akka-http to build a metadata store that enables ingesting data to S3 and custom processing using the actor model. My talk will provide code examples and lessons we learned while working with akka-http.

Speakers
Jean-Marc Soumet

Lead Data Engineer, Salesforce
I studied CS at EPITA in Paris, France. I moved to the US in 2004, to work in Information Security, and Supply Chain Management. In 2014 I completed my MBA at Santa Clara University, and got through the Insight Data Engineering fellowship program. Since then I have been working...


Saturday November 12, 2016 1:10pm - 1:30pm PST
Naming

1:10pm PST

Scaling out a Rails app with Finagle
A lot of startups use Ruby on Rails for fast development, which makes sense -- it’s great for building CRUD apps quickly and cheaply. But building your app with Rails has its own drawbacks: it can get huge and messy, runtime is slow, and concurrency is an afterthought. While that shouldn’t stop you from using the framework, it’s important to think about long-term scalability during development. At Brigade, we have been developing a social network for politics using Rails. However, with time and more users, we decided to migrate more complex and modular realtime computations to Finagle microservices. This talk will discuss the technologies we selected (Scala, Finagle, Thrift) and how we addressed issues of debuggability, data access, and integration in various environments.

Speakers
Stephanie Bian

Software Engineer, Brigade
Stephanie is a data engineer at Brigade, where she works on data infrastructure, analytics, and search & relevance using Scala, Finagle, Spark, Kafka, and Elasticsearch. Previously, she worked on the Trends team at Twitter, where she was initiated into the Scala community.


Saturday November 12, 2016 1:10pm - 1:30pm PST
Caching

1:40pm PST

Spark DataFrames for Data Munging
Have you ever been handed a few hundred gigabytes of data collected by someone else? What’s in that data and how will you analyze it? Data munging is a messy job that most data engineers and data scientists have to deal with. When it needs to be done at scale, one of the best tools for the job is the Spark DataFrame Scala API. DataFrames were first introduced in Spark 1.3, with major improvements in 1.4-1.6, and 2.0.

In this talk, you’ll learn the top reasons why Spark DataFrames, when combined with notebooks, are great for data exploration and data munging:
* Spark is fast, interactive, and scalable.
* Built-in support for semi-structured input, namely JSON.
* Summary statistics and approximate counting for quick overviews of a data set.
* Language-integrated SQL and UDFs for querying the data.
* Numerous utility functions for math, string, and date-time manipulation.
* Datasets, in Spark 2.0, for functional transformations.

This talk will include a live demo of the Spark DataFrame Scala API for data exploration and data munging on a real data set, with a Zeppelin notebook. The data set will be Tweet data in JSON format. The speaker is a data analytics developer who has been data munging with Spark DataFrames since they were first introduced in Spark 1.3. She has over ten years of experience developing analytics and data pipelines at scale.

Speakers
(Susan) Xinh Huynh

Software Engineer, Mesosphere
Susan is a data analytics developer who has been data munging with Spark DataFrames since they were first introduced in Spark 1.3. She has over ten years of experience in analytics, big data, and data science. She is currently working on the Mesos - DC/OS big data stack at Mesosphere...


Saturday November 12, 2016 1:40pm - 2:00pm PST
Off by One

1:40pm PST

Random Data Generation with ScalaCheck
ScalaCheck is a well-known library for property-based testing. However, property-based testing is not always possible when side effects are involved, for example when writing an integration test that involves data being stored in a database. When writing non-property-based tests, we often need to initialise some data and then verify some assertions on it. However, manual data generation can make our data biased and stop us from spotting bugs in our code. Having our data generated randomly would not only make our tests less biased, but also make them a lot more readable by highlighting which parts of our data are actually relevant in our test. In this talk we will discuss how to reuse some of the existing ScalaCheck code to generate random instances of given types and how these can be combined to generate random case classes. We will analyse the properties of a ScalaCheck generator and provide examples of how we can manipulate existing generators to meet our needs. Finally, we will show how random data generation can also be used in development to restore our data-driven application to a particular state.

Speakers
Daniela Sfregola

Blogger and Software Engineer, Daniela Tech LTD
Daniela Sfregola is a Software Consultant based in London, UK. She worked as a Java developer before moving towards Scala. She is an active contributor to the Scala Community and a passionate blogger at danielasfregola.com.


Saturday November 12, 2016 1:40pm - 2:00pm PST
Naming

1:40pm PST

Functional API for defining type safe, reliable Akka actors
Akka Typed is a fairly new module of Akka, providing a statically type safe way to define actors, and interact with them. However, still being an experimental module, its integration with various other components of Akka is not yet complete. For example, it provides no support for defining event sourced (persistent) actors. This talk will present a functional API we built on top of Akka Typed. This API makes it easy to define type safe event sourced actors, and it is also integrated with Akka cluster sharding. The talk will describe how various functional programming techniques allow us to provide a convenient and flexible API offering static type safety. For example, free monads allow us to define a DSL for event sourcing (using an "actor algebra"), and type classes make the system extensible by end users. A sketch of the implementation will also be presented, demonstrating some practical aspects of these techniques. We will also discuss the advantages of our typed, monadic persistence API over the untyped one provided by Akka Persistence, and even look at some possibilities for further improvements.

Saturday November 12, 2016 1:40pm - 2:00pm PST
Caching

2:10pm PST

Beyond Shuffling: Scaling Apache Spark
This session will cover personal & community experiences scaling Spark jobs to large datasets and the resulting best practices, along with code snippets to illustrate. The planned topics are:
* Using Spark counters for performance investigation: Spark collects a large number of statistics about our code, but how often do we really look at them? We will cover how to investigate performance issues and figure out where to best spend our time using both counters and the UI.
* Working with Key/Value Data, replacing groupByKey for awesomeness: groupByKey makes it too easy to accidentally collect individual records which are too large to process. We will talk about how to replace it in different common cases with more memory-efficient operations.
* Effective caching & checkpointing: Being able to reuse previously computed RDDs without recomputing can substantially reduce execution time. Choosing when to cache, checkpoint, or what storage level to use can have a huge performance impact.
* Considerations for noisy clusters.
* Functional transformations with Spark Datasets: How to have some of the benefits of Spark’s DataFrames while still having the ability to work with arbitrary Scala code.
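
The groupByKey point can be illustrated without a cluster. The sketch below uses plain Scala collections as a stand-in for a keyed dataset; it is not Spark code, and the local `reduceByKey` helper is a hypothetical function written for this example that only mimics the idea behind Spark's operation of the same name. The contrast is between materializing every value for a key before aggregating and keeping a single running value per key.

```scala
object KeyedAggregationSketch {
  // Stand-in for a keyed dataset (in Spark this would be an RDD of pairs).
  val events: Seq[(String, Int)] =
    Seq("a" -> 1, "b" -> 2, "a" -> 3, "b" -> 4, "a" -> 5)

  // groupByKey-style: every value for a key is collected into one
  // in-memory group first, and only then summed.
  val grouped: Map[String, Int] =
    events.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).sum }

  // reduceByKey-style: values are folded into one accumulator per key,
  // so no per-key collection of all values is ever built.
  def reduceByKey[K, V](pairs: Iterable[(K, V)])(f: (V, V) => V): Map[K, V] =
    pairs.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
      acc.updated(k, acc.get(k).map(f(_, v)).getOrElse(v))
    }

  val reduced: Map[String, Int] = reduceByKey(events)(_ + _)
}
```

Both approaches produce the same totals here; the difference is the peak amount of data held per key, which is what matters when a single key has very many values.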

Speakers
Holden Karau

Developer Advocate, Google
Holden Karau is a transgender Canadian open source developer advocate at Google focusing on Apache Spark, Beam, and related big data tools. Previously, she worked at IBM, Alpine, Databricks, Google (yes, this is her second time), Foursquare, and Amazon. Holden is the coauthor of Learning...


Saturday November 12, 2016 2:10pm - 2:50pm PST
Off by One

2:10pm PST

Finding Simple in Scala
You have a mission to write software. You have chosen to use Scala, or perhaps, Scala has been chosen for you. Scala's abundance of features offers many potential paths to your goal. How do you find your way? In this talk, Bill Venners will draw on his almost ten years of Scala programming to give insights into using it effectively. First, he'll try to convince you to strive for the elusive property of "simplicity" in your code and designs. Then, he'll show you how you can achieve simplicity in design if Scala is the language in which you must express your design.

Speakers
Bill Venners

Principal, Artima
Bill Venners is president of Artima, Inc., publisher of Scala consulting, training, books, and developer tools. He is the lead developer and designer of ScalaTest, an open source testing tool for Scala and Java developers, and Scalactic, a library of utilities related to quality...


Saturday November 12, 2016 2:10pm - 2:50pm PST
Naming

2:10pm PST

Implement a scalable statistical aggregation system using akka
At Symantec's email security group, a common problem we face is aggregating multiple metrics with different time granularities in real time from hundreds of millions of emails per day. Various existing solutions try to address the problem by using batch and/or streaming algorithms. Often such approaches require the use of many different technologies and are expensive to run. Another approach is to use statistical data structures such as Count-Min Sketch that can greatly reduce the overheads of storage and processing at the cost of accuracy. However, implementing such algorithms at large scale poses several problems. In this talk, we introduce Algegate (algebra + aggregate), a purely statistical, distributed platform implemented using Akka. It is designed to be fault-tolerant, back-pressure compliant, and easy to scale out to multiple nodes.
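
For context on the data structure mentioned above, here is a minimal, immutable Count-Min Sketch in plain Scala. It is a generic textbook-style sketch, not Algegate's implementation, and every name in it (`CountMinSketch`, `CountMinSketchDemo`, the chosen depth and width) is an assumption made for illustration. Estimates can overcount because of hash collisions but never undercount, and two sketches of the same shape merge by cell-wise addition, which is what makes the structure easy to distribute.

```scala
import scala.util.hashing.MurmurHash3

// depth = number of hash rows, width = counters per row.
final case class CountMinSketch(depth: Int, width: Int, table: Vector[Vector[Long]]) {
  // Row index doubles as the hash seed; the result is forced to be non-negative.
  private def bucket(item: String, row: Int): Int = {
    val h = MurmurHash3.stringHash(item, row)
    ((h % width) + width) % width
  }

  // Increment one counter in every row.
  def add(item: String, count: Long = 1L): CountMinSketch =
    copy(table = table.zipWithIndex.map { case (cells, row) =>
      val i = bucket(item, row)
      cells.updated(i, cells(i) + count)
    })

  // The minimum across rows is the least collision-inflated estimate.
  def estimate(item: String): Long =
    (0 until depth).map(row => table(row)(bucket(item, row))).min

  // Sketches with identical dimensions combine by adding counters.
  def merge(other: CountMinSketch): CountMinSketch = {
    require(depth == other.depth && width == other.width, "dimension mismatch")
    copy(table = table.zip(other.table).map { case (a, b) =>
      a.zip(b).map { case (x, y) => x + y }
    })
  }
}

object CountMinSketchDemo {
  def empty(depth: Int, width: Int): CountMinSketch =
    CountMinSketch(depth, width, Vector.fill(depth, width)(0L))

  // Two partial sketches, as if built on different nodes.
  val nodeA: CountMinSketch = empty(4, 64).add("spam").add("spam").add("ham")
  val nodeB: CountMinSketch = empty(4, 64).add("spam", 3L)
  val merged: CountMinSketch = nodeA.merge(nodeB)
}
```

In this example `merged.estimate("spam")` is at least 5, the true count; widening the table reduces the chance that collisions push it higher.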

Speakers
Stanley Nguyen, Vu Ho

Software Engineer, Symantec
Stanley Nguyen is a software developer in Email Security group at Symantec where he helps to build a high availability big data platform, write high performance backend services and develop interactive visualisation interfaces. He writes a majority of code in Scala, Go and NodeJS. Vu...


Saturday November 12, 2016 2:10pm - 2:50pm PST
Caching

3:00pm PST

One-Click Deploy Spark ML + TensorFlow AI Models
In this completely 100% Open Source demo-based talk, Chris will be addressing an area of machine learning and artificial intelligence that is often overlooked: the real-time, end-user-facing "serving" layer in a hybrid-cloud and on-premise deployment environment using Jupyter, NetflixOSS, Docker, and Kubernetes.
Serving models to end-users in real-time in a highly-scalable, fault-tolerant manner requires not only an understanding of machine learning fundamentals, but also an understanding of distributed systems and scalable microservices.
Chris will combine his work experience from both Databricks and Netflix to present a 100% open source, real-world, hybrid-cloud, on-premise, and NetflixOSS-based production-ready environment to serve your notebook-based Spark ML and TensorFlow AI models with highly-scalable and highly-available robustness.

Speakers

Chris Fregly

Solution Architect, AI and machine learning, AWS


Saturday November 12, 2016 3:00pm - 3:20pm PST
Caching

3:00pm PST

Processing 100's of TB of Genomic Data With ADAM And Toil
Modern genome sequencing projects capture hundreds of gigabytes of data per individual. In this talk, we discuss recent work where we used the Spark-based ADAM tool to recompute genomic variants from 70TB of reads from the Simons Genome Diversity dataset. ADAM presents a drop-in, Spark-based replacement for conventional genomics pipelines like the GATK. We ran this computation across hundreds of nodes on Amazon EC2 using Toil, a novel cluster orchestration tool. Toil was used to automatically scale the number of nodes used, and to seamlessly run large single node jobs and Spark clusters in a single workflow. By combining ADAM and Toil, we are able to improve end-to-end pipeline runtime while taking advantage of the EC2 Spot Instances market. Additionally, Toil is designed for scientific reproducibility, and our entire workflow was run using Docker containers to ensure that there is a static set of binaries that could be used to reproduce the pipeline at a later date. ADAM and Toil are both freely available Apache 2 licensed tools.

Speakers

Frank Austin Nothaft

Research Assistant, UC Berkeley AMPLab
Frank is a PhD student at UC Berkeley, working in the AMP and ASPIRE labs with David Patterson and Anthony Joseph. Frank's research is focused on scalable systems for processing genomic data. He works on the ADAM/Big Data Genomics project which seeks to build open source tools for...


Saturday November 12, 2016 3:00pm - 3:40pm PST
Off by One

3:00pm PST

Scala.meta: the past, the present, and the future
Scala.meta has delivered on its promise to become a foundation for next-generation tooling in Scala. The recently released v1.0 is capable of capturing Scala code exactly as it is written - with all the original formatting and attention to minor syntactic details - opening unique possibilities for tool authors. Codacy’s Scala engine and Scalafmt are already taking full advantage of scala.meta’s features, and we expect more tools to follow. Join us for a whirlwind tour of existing functionality and a sneak peek into our plans for the future.

Saturday November 12, 2016 3:00pm - 3:40pm PST
Naming

3:30pm PST

Reactive Resumes
We are developing a resume processing pipeline enabling real-time, high-scale resume parsing and HR intelligence: we analyze resume files (doc/pdf) in bulk, identifying key relevant data points and metrics (candidate code, portfolio, abilities, education, experience), and employ this context to provide rich market insights and actionable items in the HR process (identifying the most qualified candidates for given job descriptions, providing salary forecasts for each resume, identifying candidates similar to each other, exposing latent candidate insights). Our mission is to speed up and improve the hiring, capacity planning, and contingency workforce management processes by helping recruitment-focused entities identify relevant and fresh talent in the market: an insights-driven noise filter in HR.

Speakers

Adrian Mihai

co-founder / cto, opening.io
Automation, Reactive systems, AI, HR, Partnerships & Investment


Saturday November 12, 2016 3:30pm - 4:00pm PST
Caching

4:00pm PST

Robust Stateful Stream Processing with Apache Flink
Jamie Grier will give a talk and hands-on demonstration of some of the advanced features available in Apache Flink. The talk will focus on features unique to Flink that allow one to achieve truly robust stateful computation over data streams. Topics covered will be:
* Event Time vs Processing Time and why it matters
* Robust handling of state
* Handling failures
* Handling code or cluster upgrades without losing state
* Apache Flink's Savepoints
* Handling data re-processing (after code changes, bug fixes, etc.)

Speakers

Jamie Grier

Director of Applications Engineering, data Artisans
Jamie Grier is now Director of Applications Engineering at data Artisans where he’s extremely excited to be able to help others realize the potential of Flink in their own projects. His goal is to help others design systems to solve challenging problems in the real world. Jamie...


Saturday November 12, 2016 4:00pm - 4:40pm PST
Off by One

4:00pm PST

The False Economy of OOP
For the better part of the last three decades, object-oriented programming has served as the go-to strategy for building complex software systems. Yet for much of this time a far more ancient model, functional programming, has been confined to academia. Then, at the turn of the century, something peculiar happened: FP experienced a resurgence in the industry. Yes, for many, it was just a fad, a passing curiosity, if you will. But why the sudden interest? And even more curiously: as FP has gained popularity, why do we have some companies embracing it with enthusiasm, while others have tossed it aside, rushing back to the familiar world of OOP? The answer is that we are entering the most turbulent time for FP, a time when both its challenges and its promises are gaining more attention. Suddenly, we are being forced to confront what would seem to be completely esoteric questions, such as "what does it mean to build software?" And with aging OOP software implementations, and much of the industry clinging to them, the false economy of OOP (that is, its failed promises and innate source of incidental complexity) is becoming more pronounced. In this talk we will explore the core issues with OOP and exactly why functional programming is becoming the preferred alternative.

Speakers

Ryan Delucchi

Principal Engineer, AOL
Ryan Delucchi has worked in the software industry for 12 years at various companies, including TiVo, Hotwire and Netflix. Furthermore, he has built a scripting engine for processing search results and mass transit route validation at startups Mobile Content Networks and Urban Mapping...


Saturday November 12, 2016 4:00pm - 5:00pm PST
Naming

4:00pm PST

Finagle/Finatra QnA
Twitter's Core System Libraries team will be hosting a QnA session/round table on Finagle/Finatra and the rest of Twitter's RPC stack. Join us to discuss the past, the present, and the future of the OSS RPC layer powering the infrastructure of companies like Twitter, SoundCloud, Pinterest, and many more.

Moderators

Vladimir Kostyukov

Software Engineer, Twitter, Inc
Finagle contributor. Finch maintainer.

Speakers

Bryce Anderson

Software Engineer, Twitter
Since 2016 Bryce has been with Twitter's Core System Libraries team working predominantly on the Finagle RPC library. Bryce enjoys long walks through RFCs and analyzing the potential for graph-wide meltdowns in service-mesh load balancers.

Christopher Coco

Staff Software Engineer, Twitter, Inc.
Co-creator and maintainer of the Finatra Scala services framework.

Jillian Crossley

Software Engineer, Twitter
I work on Finagle @Twitter

Moses Nakamura

Software Engineer, Twitter

Ryan O'Neill

Senior Software Engineer, Twitter

Kevin Oliver

Twitter Inc


Saturday November 12, 2016 4:00pm - 5:00pm PST
Caching

5:00pm PST

Modern Software Architectures and Data Pipelines
Throughout our four-year history, Scala and Scale By the Bay has led the way in evangelizing and understanding modern software architectures. We have the best set of them here, including Akka, Kafka, Spark, Finagle, Lagom, and so on. How do they come together in a SMACK / MIND Stack? What are the best practices to follow and pitfalls to avoid? This panel of experienced practitioners will discuss and illuminate it all.

Speakers

Helena Edelson

CEO, The Axis Initiative
Helena is using AI and complex adaptive systems to study and help endangered species under climate change, biodiversity loss, human-wildlife conflict and illegal wildlife trade. Bridging academia and industry, she is a member of the Environmental Intelligence team of the Interagency...

Chris Fregly

Solution Architect, AI and machine learning, AWS

Calvin Jia

Software Engineer, Alluxio
Calvin Jia is the top contributor of the Alluxio project. He has been involved as a core maintainer and release manager since the early days when the project was known as Tachyon. Calvin has a B.S. from the University of California, Berkeley.

Rajesh Muppalla

Co-Founder and Director of Engineering, Indix
Rajesh leads the data platform team at Indix that's building the world's largest source of structured product information. Over the last 5 years, his team has built several data pipelines using Akka, Scalding, Kafka and more recently Akka-Streams and Spark. He has also blogged and...

Nikita Shamgunov

Chief Technology Officer and Co-Founder, MemSQL
Nikita Shamgunov co-founded MemSQL and has served as CTO since inception. Prior to co-founding the company, Nikita worked on core infrastructure systems at Facebook. He served as a senior database engineer at Microsoft SQL Server for more than half a decade. Nikita holds a bachelor’s...

Reynold Xin

Chief Architect, Databricks
Reynold Xin is a co-founder and Chief Architect at Databricks, where he oversees the company's Spark development. He was the release manager for Spark's 2.0 release, and the driver behind most of the major recent changes in Spark, e.g. DataFrame API, Project Tungsten. Prior to Databricks...


Saturday November 12, 2016 5:00pm - 6:00pm PST
Caching

6:00pm PST

Happy Hour: Sponsored By 47 Degrees
The second and final Happy Hour of Scalæ By the Bay 2016 follows the Data Pipelines panel.

Saturday November 12, 2016 6:00pm - 8:00pm PST
Aviator -- Main Reception Area
 
Sunday, November 13
 

9:00am PST

Keynote: Apache Kafka, Stream Processing, and Microservices
Microservices are rightfully associated with REST, but a lot of what a company does isn't easily modeled by a blocking request and response. What is the right way to build this type of asynchronous service? This talk will make the case for Apache Kafka as a platform for event-driven apps. It will show the relationship between this type of service and the emerging paradigm of stream processing, and will introduce Kafka Streams, a powerful distributed stream processing client that is part of Apache Kafka.

Speakers

Jay Kreps

Jay (@jaykreps) is co-founder and CEO at Confluent. Prior to Confluent, Jay Kreps was the initial developer on several open source projects, including Apache Kafka, Apache Samza, and Voldemort. He was the lead architect for data infrastructure at LinkedIn.


Sunday November 13, 2016 9:00am - 9:40am PST
Caching

9:50am PST

Twitter Heron on YARN/REEF
Twitter Heron is the next-generation streaming system that has been in production for more than 2 years. Heron is used for diverse use cases including but not limited to ETL, BI, machine learning and media processing. In this talk, we will talk about how Microsoft is improving the Heron real-time streaming engine. First, we will discuss how one can deploy Heron with the widely used YARN scheduler to seamlessly integrate with the popular Hadoop solutions stack. Second, we will present the ongoing efforts aimed at making Heron use hardware resources more efficiently by optimizing performance and resource utilization of Heron topologies. Finally, we will round off the talk with a peek at exciting upcoming work.

Speakers

Ashvin Agrawal

Microsoft
Ashvin is a software engineer with more than 10 years of work experience. He specializes in developing large-scale distributed systems. Ashvin is currently a Senior Research Engineer at Microsoft, where he works on streaming systems and contributes to the Twitter Heron project. Ashvin...

Avrillia Floratou

Avrillia is a Senior Scientist at Microsoft's Cloud and Information Services Lab, where her research is focused on scalable real-time stream processing systems. She is also an active contributor to Heron, collaborating with Twitter. Prior to her current role, she was a research scientist...

Karthik Ramasamy

Engineering Manager, Twitter
Karthik is the engineering manager for Real Time Compute at Twitter and co-creator of Heron. He has two decades of experience working in parallel databases, big data infrastructure and networking. He cofounded Locomatix, a company that specializes in realtime streaming processing...


Sunday November 13, 2016 9:50am - 10:30am PST
Off by One

9:50am PST

Functional Database Strategies
As a functional programmer, your collections and global caches are off-limits to mutation. But what about your database? If you're still working with mutable table rows, it's time to learn how to make even the stodgiest, legacy-filled relational database functional. In this talk we'll look at why & how to use immutable tables and event-sourced persistence. We'll cover strategies that'll work with your relational database and some straightforward NoSQL solutions that won't require a team of admins.

Speakers

Jason Swartz

Software Developer, Twitch
Building the next generation of scalable edge services at Twitch. Author of Learning Scala (O'Reilly Media, 2014)


Sunday November 13, 2016 9:50am - 10:30am PST
Naming

9:50am PST

Compositional Streaming with FS2
In recent years, a number of open source Scala libraries have appeared that support working with data streams. In this talk, we’ll look at Functional Streams for Scala (FS2), the library formerly known as Scalaz Stream, and explore its unique take on stream processing. We’ll look at working with impure data sources in a pure world, data transformations, and patterns for stream based program design.

Speakers

Michael Pilquist

Distinguished Engineer, Comcast
Michael Pilquist is the author of Scodec, a suite of open source Scala libraries for working with binary data, and Simulacrum, a library that simplifies working with type classes. He is a committer/maintainer on a number of other projects in the Scala ecosystem, including Cats and...


Sunday November 13, 2016 9:50am - 10:30am PST
Caching

10:40am PST

Distributed Commit Logs with Apache Kafka
Apache Kafka was created at LinkedIn as a resilient and scalable distributed commit log providing a traditional publish / subscribe interface. Now open source through Apache, Kafka is being used by numerous large enterprises for a variety of use cases. This session will introduce the basics of Kafka and walk through some code examples that will show how to begin using it.

Speakers

James Ward

Developer Advocate, Google Cloud
James Ward is a nerd / software developer who shares what he learns with others through presentations, blogs, demos, and code. After over two decades of professional programming, he is now a self-proclaimed Typed Pure Functional Programming zealot but often compromises on his ideals...


Sunday November 13, 2016 10:40am - 11:00am PST
Off by One

10:40am PST

Beyond GoF: functional programming patterns in Scala
In this talk, the one-to-one mappings of the classic GoF patterns to their functional counterparts will be demonstrated. Where a direct mapping is not possible, conceptual analogs will be proposed. Most of the examples will be accompanied by real-life production code. This is a beginner/intermediate level talk.

Speakers

Kostiantyn Lukianets

IT Specialist Scrum Master, ING
Java/Scala developer, Scrum Master and all around IT enthusiast.


Sunday November 13, 2016 10:40am - 11:00am PST
Naming

10:40am PST

Finding the Free Way
Free Monads are quickly being adopted as the best technique for developing in a pure functional style. Unfortunately, the details of how best to apply them are often left as "an exercise for the reader." Recently my team began using Free Monads to build Web Services within the Play Framework. We wanted to use Free Monads in an easy-to-follow way with minimum boilerplate, while still slotting naturally into the Play Framework. In this talk I'll outline how we took some wrong turns, hit a few potholes, but ultimately found a way to use Free that works for us.

Speakers

David Cleaver

Senior Principal Engineer, Comcast
Dave Cleaver is a Senior Principal Engineer at Comcast designing and implementing scalable Web Services and Platforms. He has spent the last two years developing and championing solutions in Scala. His interests include AI planning, distributed systems, programming languages, and...


Sunday November 13, 2016 10:40am - 11:00am PST
Caching

11:10am PST

Meta Data Science: When all the world's data scientists are just not enough
Due to privacy concerns and the nature of SaaS businesses, platforms like CRM systems often have to provide intelligent data-driven features that are built from many different unique, per-customer machine-learnt models. In the case of Salesforce, this entails building hundreds of thousands of models tuned for as many distinctly different customers for any given data-driven application. In this talk I will describe our home-grown Scala and SparkML-based machine learning platform that has the following characteristics:
- Automated feature engineering resulting in much quicker modeling turnarounds and higher accuracy than general purpose modeling libraries such as scikit-learn.
- Automatic hyper-parameter optimization, feature selection and model selection resulting in a very good model for every specific customer of the product.
- Modular workflows and transformations that complement systems like SparkML and KeystoneML.
- Huge scale that enables training thousands of models per day.
This talk will give the audience a good idea of which parts of the typical machine learning pipeline are easier to automate, and which are harder.

Speakers

Shubha Nabar

Senior Director, Data Science, Salesforce Einstein
Shubha Nabar is a senior director of data science at Salesforce Einstein, where she and her team enable the hundreds of thousands of Salesforce-driven businesses to make smarter decisions by providing advanced AI capabilities through the Salesforce platform. She has 8 years of experience...


Sunday November 13, 2016 11:10am - 11:30am PST
Off by One

11:10am PST

The Client-Side Apocalypse
Imagine building performant, dynamic web applications with pure Scala. With the release of Lift 3.1, you can! Where Scala.js delivers on the promise of pure Scala, Lift 3.1 can deliver the same but without burdening the browser with large code downloads. Unlike with client-side rendering where the application needs a minimum of three round trips to the server before page load is complete, server-side rendering can load pre-rendered HTML immediately. In previous versions of Lift, the developer had to hand-write JavaScript or use a framework to achieve dynamic views. Now virtual-DOM diffing has been added to Lift’s powerful, secure framework, allowing you to write only enough code to render application state into a page. In this way, you keep your application code purely functional without writing routines that mutate the state of the DOM. Not only is the burden on the browser lighter, but also, by only updating the DOM after changes are committed on the server, the user learns to trust that the application actually did what the user sees.

Speakers

Joe Barnes

Senior Software Engineer, AOL / go90
Joe Barnes is currently a Senior Software Engineer at AOL where he develops backend analytics applications and devops for go90. He has spent most of the last decade developing applications on the JVM, with Scala taking focus in late 2012. His contributions to the development community...


Sunday November 13, 2016 11:10am - 11:30am PST
Naming

11:10am PST

Deep Learning Around Us
We are going to investigate 4-5 real-life examples of ordinary companies achieving extraordinary results by deploying deep learning technologies. We are going to look at their initial motivation, what worked well, what was challenging and the results achieved. This talk is designed for people without deep learning experience.

Speakers

Alex Ermolaev

Strategic Alliances, Nvidia
I work with visionary engineers building future artificial intelligence applications and platforms.


Sunday November 13, 2016 11:10am - 11:30am PST
Caching

11:40am PST

Building a Machine Learning Orchestration Framework in Scala
A scalable and adaptable machine learning platform is essential for an organization to harness the full potential of its data. This talk introduces Meson, a General Purpose Workflow Orchestration and Scheduling framework. Meson is a Scala application that manages heterogeneous workloads integrating Spark, Python, R, Docker and other toolkits. We will demonstrate how a simple user interface and a DSL allow data scientists to design and run machine learning pipelines that seamlessly exchange data and artifacts between disparate jobs. We are working on open sourcing Meson and by this talk it may very well be open sourced.

Speakers

Kedar Sadekar

Sr. Software Engineer, Netflix Inc
I am a Senior software engineer on the Personalization Infrastructure team at Netflix that builds scalable, distributed computing systems for the algorithmic engineers that help improve member personalization. Interested in speaking about ML infrastructure challenges and solutions...

Davis Shepherd

Senior Software Engineer, Netflix
I started working with Scala when I was writing applications on top of Apache Spark. I spent 4 years building reinforcement learning systems with these tools. I now work at Netflix, developing their next generation of ML workflow management software, all in Scala.


Sunday November 13, 2016 11:40am - 12:20pm PST
Off by One

11:40am PST

Don’t Blow Your Stack: Recursive Functions for Beginners in Scala
Recursion is a fundamental building block of algorithmic design and a cornerstone of many functional data structures. The Scala compiler has direct support for tail recursion and indirect support for mutually recursive function calls with a compiler plugin. So what makes a function recursive and what shape must it have to allow for tail call elimination? For that matter, what is tail call elimination? What does it mean to be mutually recursive? How does all this get translated into working, non-stack-blowing code at compilation time? In this beginner-friendly look at all things recursion we’ll show how to take an imperative function and transform it into a recursive function. Through live code examples, attendees will see how to keep code referentially transparent without the need for external mutable variables while correctly handling state between “loops.” Finally, we’ll introduce a new compiler plugin that adds mutual tail recursion and demonstrate how it can simplify complex, twisted logic.
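To make the imperative-to-recursive transformation described above concrete, here is a minimal sketch. The function names are hypothetical, and the mutual-recursion part uses the standard library's scala.util.control.TailCalls trampoline as a stand-in, since the compiler plugin the talk refers to is not named here.

```scala
import scala.annotation.tailrec
import scala.util.control.TailCalls.{TailRec, done, tailcall}

object RecursionSketch {
  // Imperative starting point: a loop over external mutable variables.
  def sumLoop(xs: List[Int]): Long = {
    var acc = 0L
    var rest = xs
    while (rest.nonEmpty) { acc += rest.head; rest = rest.tail }
    acc
  }

  // Same logic with the loop state passed as parameters. The recursive
  // call is the last action, so scalac rewrites it into a loop;
  // @tailrec makes compilation fail if that rewrite is not possible.
  def sum(xs: List[Int]): Long = {
    @tailrec
    def go(rest: List[Int], acc: Long): Long = rest match {
      case Nil          => acc
      case head :: tail => go(tail, acc + head)
    }
    go(xs, 0L)
  }

  // Mutually recursive calls are not eliminated by scalac on its own.
  // TailCalls trampolines them on the heap instead of the call stack.
  def isEven(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(isOdd(n - 1))

  def isOdd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(isEven(n - 1))
}
```

Calling `RecursionSketch.isEven(100000).result` runs the trampoline; a naive mutually recursive version of the same pair would risk a StackOverflowError at that depth.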

Speakers

Owein Reese

MediaMath
A hands-on manager who continues to write code at work and in his spare time. His group is distributed across the country and he loves being able to work and hire anyone from anywhere. At MediaMath, they are responsible for globally deployed systems capable of handling peak traffic...


Sunday November 13, 2016 11:40am - 12:20pm PST
Naming

11:40am PST

Numbers every Scala Programmer should know
Writing highly performant code has always been a challenge for programmers. The correct techniques to create the best performance change and evolve in response to changes in the underlying hardware. Before the late 90s, creating faster code was often about crafting more efficient CPU instructions. For instance, it was faster to XOR a register with itself than to set a register to 0. Now, this is no longer the case and instead creating fast code is mostly about improving L2 and L3 cache hit rates, aligning data in memory and allowing hardware attached to the I/O bus to move data to main memory concurrently with the CPU. With the advent of Scala, use of immutable data has been touted as the "best practice" for how to write concurrent code. However, to systems programmers, this guideline seems strange. This talk is about why using immutability might be the right choice for some programming teams and the wrong choice for others: when you should follow the easier models of systems that rely on immutability versus when you might need to brave the difficulties of using synchronization and locks, and finally, when *gasp* blocking is the right thing to do, even inside of an actor.

Benchmarking code used to produce numbers referenced in this talk: https://github.com/creditkarma/scala-performance-benchmarks 

Speakers

Hunter Payne

Staff Software Engineer, Credit Karma
Hunter is a staff software engineer focused on data engineering and AI at Credit Karma. He specializes in natural language processing systems, systems engineering and distributed systems. Before joining Credit Karma, Hunter developed knowledge management frameworks, JDBC drivers...


Sunday November 13, 2016 11:40am - 12:20pm PST
Caching

1:10pm PST

Wikipedia RecSys
This talk will be delivered in two sessions. Firstly, we will analyze various Wikipedia datasets and showcase a recommendation engine for Wiki-editors. We will primarily use Scala, Spark 2.0, and Spark Community Edition. Secondly, we will focus on stock market prediction with H2O and Spark 2.0 on the Spark-notebooks platform.

Speakers

Deepesh Chaudhari

Data Scientist
I am passionate about turning data into meaningful stories. Currently I'm developing deep learning models for an anomaly detection project. This project aims to improve fraud detection for mobile games played by millions. During my Masters in Data Science at GalvanizeU, I worked with...


Sunday November 13, 2016 1:10pm - 1:30pm PST
Off by One

1:10pm PST

Automatic composition of fast data structures
Programmers often have trouble finding fast data structures to support a given API. I describe a system I'm writing to automatically compose data structures to efficiently support a given API. The system is built out of some modified graph algorithms and ideas from programming language theory.

Speakers

Buck Shlegeris

Software Engineer, Triplebyte
Buck Shlegeris works at Triplebyte by day. By night, he enjoys writing Scala, thinking about programming language theory and data structures, and writing music.


Sunday November 13, 2016 1:10pm - 1:30pm PST
Naming

1:10pm PST

A non-typical introduction to Akka remote actors with the Raspberry PI
Have some fun with some Raspberry Pis and Akka remote actors. In a pure software world it can be beneficial to take a step back and introduce a physical aspect to the work you do. This light- and sound-filled talk covers the fundamentals of remote actors with some exciting hardware interactions. Remote Akka actors will be our first stop. We will discuss the configuration and setup of this valuable tool. Once our configuration is complete, we continue by setting up a small cluster of Raspberry Pis to use for our experiment. With our small cluster, we will look at creating and receiving messages, then react to external stimuli and produce our own responses. With these tools, our creation will slowly come to life. Failure is always an option, so finally we look at handling failure and reacting to it. What happens when our machine breaks, and how can we best handle that event? This introduction should inspire you to think about the possibilities of Akka and give you the tools to get started on your own Akka projects.

Speakers

James Townley

Developer, YoppWorks
Scala, Microservices, Agile, Event Driven, CQRS, DDD, D4


Sunday November 13, 2016 1:10pm - 1:30pm PST
Caching

1:40pm PST

Telco Fraud Detection in Kamanja
A telco company that covers South America has been having two pain points: 1) a specific type of fraud which is annoying customers and is costly, and 2) long reporting latencies that are strangling daily business decisions. The solution, currently under development, involves Kamanja to clean and enrich 20 data sources, with 10-50 feeds / source and 50-1,000 fields / feed (including the implementation of the fraud detection logic). Kamanja is written in Scala and is an open-source real-time decisioning engine. See www.Kamanja.org. Other systems in the solution stack include: Kafka, Kerberos, Zookeeper, HDFS, Parquet and HBase.

Speakers

Greg Makowski

Director of Data Science, Ligadata
I have been deploying data mining models since 1992. My most recent vertical focus has been around financial services and security, using Kafka, Kamanja, R, MLlib and StanfordNLP. See also www.Ligadata.com



Sunday November 13, 2016 1:40pm - 2:00pm PST
Off by One

1:40pm PST

Using CoProducts to stitch together algebras for scalable complexity in functional program design
Specifying an algebra, in the form of sealed case class hierarchies, to describe your program data domain, operations, and result conditions is a natural and common design pattern. But as your program does more, its algebra can become enormous and cumbersome. You will want to begin splitting your algebra into smaller domains to focus on specific concerns. By doing so you can reduce the scope of knowledge required to work in a given algebra, while building smaller, less monolithic interpreters. This talk goes through an example of designing a model for a realistic microservice, and then breaking it down into more discrete, focused concerns. Then we will use the Cats Coproduct type (with a mention of the Shapeless Coproduct) to tie the separate algebras back together so the entire program can employ the capabilities of many different domains selectively. We will describe how this technique is also useful for stitching together different Free monad libraries when using the interpreter pattern. Finally we will talk about how this is one solution to the Expression Problem, but there are others out there.

Speakers

Scott Maher

Sr Tech Lead, Twilio
Scala! Distributed systems! Communications protocols! Punctuation expressing excitement!


Sunday November 13, 2016 1:40pm - 2:00pm PST
Naming

1:40pm PST

Scaling Reliability
We choose microservice architectures for many different reasons, including, often, improving reliability. However, there is a dark side. Modular systems have many benefits, but as you add more components, each of which can fail, it makes the overall system more fragile, because there are new ways in which it can fail that previously didn't exist. From its inception, Finagle was designed as a toolkit for reliable systems. We'll discuss understanding the reliability of your system (monitoring! load tests! service level objectives!), how to recover when your system fails (alerts! rollbacks! rolling restarts! failovers!), how to make a system reliable (retries! backpressure! circuit breakers!), and how to debug a reliability problem (tracing! logs! persistence!).

Speakers

Moses Nakamura

Software Engineer, Twitter


Sunday November 13, 2016 1:40pm - 2:00pm PST
Caching

2:10pm PST

New Metrics Engine to Help Drive UBER
Uber is very much a data-driven company. In this talk we will be looking under the hood of Uber’s data engine. Specifically, we will talk about the new in-house metrics platform being developed at Uber, the rationale behind it, its use cases, and how it helps us grow at Uber speed. We will also discuss our creative use of Scala and Spark.

Speakers

Sasha Ovsankin

Data & Analytics Engineer, Data Platform, UBER
Business Metrics, Data Pipelines, Scalable Systems, Domain-Specific Languages, Functional Programming for Big Data



Sunday November 13, 2016 2:10pm - 2:50pm PST
Off by One

2:10pm PST

Writing a dynamic x86_64 assembler in Scala
Is it possible to generate the native x86_64 code for a Scala function at runtime? Of course it is! In this live coding session I will create a dynamic x86_64 assembler in Scala. I will start by finding a good way to embed ASM instructions into plain Scala code, then generate the machine code at runtime, load it in memory, make it executable, and finally cast it to a proper Scala function type. Is it a good idea to do this? In any case, it is a very interesting journey through Scala, external DSL and the JVM FFI.

Speakers

Guillaume Bort

Lead developer, Criteo
Creator of @playframework - Previously @Inria, @zengularity, @lightbend, @prismicio - Now working on the petabytes of analytics data at @Criteo


Sunday November 13, 2016 2:10pm - 2:50pm PST
Naming

2:10pm PST

Pure Functional Database Programming with Fixpoint Types
Recursive structures (file systems, family trees, and so on) abound in programming and are especially easy to express and manipulate in functional languages. But some operations, like decorating a tree with arbitrary values or folding/unfolding in an effectful context, end up being tricky. As is often the case, it turns out that there are some very powerful abstractions that emerge from examining such problems in detail.
In this talk I will motivate the use of fixpoint types by showing how Fix and Cofree pop out naturally if you push on recursive data a little bit, and will relate these types to the better-known Free monad. The motivating example is serialization of a tree structure to and from Postgres using doobie, a pure functional database layer for Scala. Along the way we will review covariant and traversable functors, and will use equational reasoning and "follow the types" to manipulate database programs in the same way we work with everyday data.
The takeaway is a more general way of thinking about recursive types and recursive programs with effects, and renewed confidence in the power and ease of pure functional programming. This is an intermediate talk that assumes some familiarity with functional programming in Scala.

Speakers

Rob Norris

Programmer, Gemini Observatory
Software Engineer


Sunday November 13, 2016 2:10pm - 2:50pm PST
Caching

3:00pm PST

Well Dressed: Scala with Pants
Engineers at Twitter created and open sourced the Pants build tool (http://www.pantsbuild.org/), now a healthy OSS project, to build a growing, multi-language, multi-million line codebase. While Pants builds a growing handful of languages, its tight integration with the zinc incremental Scala/Java compiler means that one of its specialities is large Scala projects. This talk will work to convince you that using Pants with a Scala monorepo is the right choice for either your cluster of open source projects or your company codebase.

Speakers

Stu Hood

Build Team Tech Lead, Twitter


Sunday November 13, 2016 3:00pm - 3:40pm PST
Off by One

3:00pm PST

better-files: Towards safe and sane I/O in Scala
Doing I/O in Scala (and Java) involves either invoking some magic "FileUtil" or browsing through StackOverflow. In this talk, we will introduce better-files (https://github.com/pathikrit/better-files) - a thin wrapper around Java NIO to enable simple, safe and sane I/O in Scala. We will also discuss the problems with designing an I/O library that would make everyone happy, and the different schools of thought, e.g. monadic vs. non-blocking vs. effect-based APIs.

Speakers

Pathikrit Bhowmick

Head of Engineering, Coatue Management
Pathikrit writes Scala full-time at a hedge fund. He is also the author of many widely used Scala libraries (https://github.com/pathikrit), such as better-files and metarest, and is a committee member of the Scala Platform.


Sunday November 13, 2016 3:00pm - 3:40pm PST
Naming

3:00pm PST

Dualistic Quotations in Quill
This talk will present the mechanism used by Quill to support compile-time query generation. The quotations produced by the QDSL (Quoted Domain Specific Language) are both compile-time and runtime values. This is a powerful concept that supports even compile-time higher-order functions.

Speakers

Flavio W. Brasil

Software Engineer, Twitter
Flavio W. Brasil is a software engineer at Twitter, working on the team responsible for maintaining the high-performance tweet backend service. He is an experienced developer who has specialized in Scala development and performance analysis on the JVM over the last five years. He...


Sunday November 13, 2016 3:00pm - 3:40pm PST
Caching

4:00pm PST

Logical Signatures for Spark
Dealing with problems that arise when running a long process over a large dataset can be one of the most time-consuming parts of development. For this reason, many data engineers and scientists will save intermediate results and use them to quickly zero in on the sections which have issues and avoid rerunning sections that are working as intended. For data pipelines that have several sections, dealing with the saving and loading of intermediate results can become almost as complicated as the core problem that the developers are trying to solve. Changes that are made may require previously saved intermediate results to be invalidated and overwritten. This process is typically manual and it's very easy for a developer to mistakenly use outdated intermediate results. These problems can be even worse when multiple developers are sharing intermediate results. These issues can be addressed by the introduction of a logical signature for datasets. For each dataset, we'll compute a signature based on the identity of the input and on the logic applied. If the input and logic stay the same for some dataset between two executions, the signature will be consistent and we can safely load previously saved results. If either the input or the logic changes, then the signature will change and the dataset will be freshly computed. With these signatures, we can implement automatic checkpointing that works even among several concurrent users, as well as other useful features.
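The signature idea in this abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: the names `Source`, `Step`, `CheckpointStore`, and the string identifiers for inputs and logic are hypothetical, and an in-memory map stands in for shared storage. It assumes each piece of logic carries an identifier that changes whenever the logic changes.

```scala
import java.security.MessageDigest
import scala.collection.mutable

object LogicalSignatures {
  // Hex-encoded SHA-256 of a string.
  def sha256(s: String): String =
    MessageDigest.getInstance("SHA-256")
      .digest(s.getBytes("UTF-8"))
      .map("%02x".format(_))
      .mkString

  // A dataset is either a named input or some logic applied to a parent dataset.
  sealed trait Node { def signature: String }

  // Hypothetical: inputId identifies the input data (e.g. a path plus a version).
  final case class Source(inputId: String) extends Node {
    val signature: String = sha256("source:" + inputId)
  }

  // Hypothetical: logicId must change whenever the applied logic changes.
  final case class Step(parent: Node, logicId: String) extends Node {
    val signature: String = sha256(parent.signature + "|step:" + logicId)
  }

  // In-memory stand-in for a checkpoint store keyed by signature.
  final class CheckpointStore[A] {
    private val saved = mutable.Map.empty[String, A]
    var computeCount = 0

    // Reuse the saved result when the signature matches; otherwise compute and save.
    def getOrCompute(node: Node)(compute: => A): A =
      saved.getOrElseUpdate(node.signature, { computeCount += 1; compute })
  }
}
```

Because a step's signature includes its parent's signature, a change to any upstream input or logic changes every downstream signature, so stale results are never looked up by mistake.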

Speakers

Nimbus Goehausen

Software Engineer, Bloomberg LP


Sunday November 13, 2016 4:00pm - 4:40pm PST
Off by One

4:00pm PST

Building DSLs with Scala
DSLs are everywhere. Have you ever used SQL, Ant or maybe HTML? If so, you were using a DSL, maybe without realizing it. Domain-Specific Languages, or DSLs, provide convenient syntactical means of expressing goals in a given problem domain. A well-crafted DSL communicates the essence and means of the domain it represents in a natural way, so that you don’t even think about its underlying technology. Scala’s rich, flexible syntax combined with its OO and functional features makes writing DSLs a breeze. In this talk I'll introduce the concept of DSLs, where to best apply them, their pros and cons, and how to integrate DSLs into your core application. We will see a practical example of how to leverage the tools Scala gives us and build our very own tax calculation DSL.

Speakers

Alon Muchnick

backend team lead, WIX.COM
Alon Muchnick is a software engineer with a background in networking security and Unix systems. For the last two years he has been working for Wix.com, developing Wix Stores, a robust microservices-based eCommerce platform, using a Scala stack and CQRS with event sourcing.


Sunday November 13, 2016 4:00pm - 5:00pm PST
Naming

4:00pm PST

Doubt Truth to be a Liar: Non Triviality of Type Safety for Machine Learning
Feature vectors – sequences of heterogeneous types – are the basic unit of any machine learning algorithm. Further, feature engineering involves manipulations of these feature vectors and is a fundamental step in optimizing the accuracy of machine learning models. These manipulations may take the form of regular Scala sequence operations that can also be distributed using frameworks such as Spark or Flink. When building a general purpose machine learning framework, the types of engineered features are not known in advance, which is a problem for statically typed languages. In this talk, I will walk through possible solutions for designing type-safe feature vectors in Scala that provide compile-time type safety for feature engineering and other machine learning use cases. The solutions will demonstrate applications of Shapeless, Scala Macros, and Quasiquotes.

Speakers

Matthew Tovbin

Principal Engineer, Salesforce Einstein, Salesforce
Matthew Tovbin is a Principal Member of Technical Staff at Salesforce, engineering the Salesforce Einstein AI platform, which powers the world’s smartest CRM. Before joining Salesforce, he acted as a Director of Engineering at Badgeville, implementing scalable and highly available real-time...


Sunday November 13, 2016 4:00pm - 5:00pm PST
Caching

5:00pm PST

The Future of Functional Programming
This panel will discuss the best practices and key challenges for propagating FP in the industry. This is where the beauty meets the beast, and we'll see how they can have the most fun possible, while being at their nicest and most helpful to each other.

Speakers

Oscar Boykin

Machine Learning Infrastructure, Stripe
Oscar is the creator of Scalding, Summingbird, and Algebird, and is an overall professor and mathematician turned software magician.

Jillian Crossley

Software Engineer, Twitter
I work on Finagle @Twitter

John A. De Goes

Solution Architect, De Goes Consulting
John A. De Goes has been writing Scala software for more than eight years at multiple companies, and has assembled world-renowned Scala engineering teams, trained new developers in Scala, and developed several successful open source Scala projects. Known for his ability to take very...

Stu Hood

Build Team Tech Lead, Twitter

Paul Snively

Sr. Software Engineer, Formation
I've been a language nut my whole life. Common Lisp, Scheme, Oz, OCaml, Haskell, and Scala all have a home in my heart for different reasons. I've been fortunate enough to have worked with Apple, AOL, Virgin, VMware, Intel, Verizon, and Formation, among others. I've spoken at Strange...


Sunday November 13, 2016 5:00pm - 6:00pm PST
Caching
 