Scalæ By the Bay
Saturday, November 12 • 10:40am - 11:20am
Query Generation Across Multiple Data Stores

In this talk, we’ll discuss how we define and query cubes across multiple data stores for reporting purposes. With a single cube definition, we can decide at query time which table or data source is best suited to answer a given request. That decision must take into account time zone conversion, data availability, supported fact- and dimension-based operations, request granularity, defined constraints, and the time range of the request, among other factors. Ultimately, the request is answered using Hive, an RDBMS, or Druid. This lets us take advantage of the performance characteristics of each data store while still exposing a single interface for querying. Our goal isn’t to create a unified SQL layer for querying multiple data stores. Our goal is to define a single view of the data, in which we can define post-aggregates and other derived expressions that are later used to programmatically generate a query for the target data store.
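
To make the query-time selection idea concrete, here is a minimal sketch in Python. It is not the speaker's implementation: the `Candidate` and `Request` types, the `cost` field, the table names, and the rule that a table can serve its own grain or any coarser one are all assumptions made for illustration. It covers only three of the factors listed above (granularity, data availability, and supported dimensions).

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List, Optional

# Assumed ordering: a table can answer requests at its own grain or coarser.
GRAIN_ORDER = {"hourly": 0, "daily": 1, "monthly": 2}


@dataclass(frozen=True)
class Candidate:
    """One physical table that backs the logical cube (hypothetical type)."""
    engine: str                 # e.g. "druid", "rdbms", "hive"
    table: str                  # illustrative table name
    grain: str                  # finest granularity stored
    available_from: date        # first day with data
    available_to: date          # last day with data
    dimensions: FrozenSet[str]  # dimensions this table can filter/group by
    cost: int                   # assumed relative cost; lower is preferred


@dataclass(frozen=True)
class Request:
    """A reporting request against the logical cube (hypothetical type)."""
    grain: str
    start: date
    end: date
    dimensions: FrozenSet[str]


def can_answer(c: Candidate, r: Request) -> bool:
    """True if the table is fine-grained enough, covers the time range,
    and has every requested dimension."""
    return (
        GRAIN_ORDER[c.grain] <= GRAIN_ORDER[r.grain]
        and c.available_from <= r.start
        and r.end <= c.available_to
        and r.dimensions <= c.dimensions
    )


def choose_source(candidates: List[Candidate], r: Request) -> Optional[Candidate]:
    """Pick the cheapest eligible table, or None if nothing can answer."""
    eligible = [c for c in candidates if can_answer(c, r)]
    return min(eligible, key=lambda c: c.cost) if eligible else None


# Illustrative cube definition: three tables in three stores.
CANDIDATES = [
    Candidate("druid", "ad_stats_druid", "daily", date(2016, 10, 1),
              date(2016, 11, 12), frozenset({"campaign", "country"}), 1),
    Candidate("rdbms", "ad_stats_rdbms", "daily", date(2016, 1, 1),
              date(2016, 11, 12),
              frozenset({"campaign", "country", "device"}), 5),
    Candidate("hive", "ad_stats_hive", "hourly", date(2014, 1, 1),
              date(2016, 11, 12),
              frozenset({"campaign", "country", "device", "keyword"}), 50),
]

recent = Request("daily", date(2016, 11, 1), date(2016, 11, 7),
                 frozenset({"campaign"}))
older = Request("monthly", date(2016, 3, 1), date(2016, 3, 31),
                frozenset({"device"}))
deep = Request("hourly", date(2015, 6, 1), date(2015, 6, 2),
               frozenset({"keyword"}))
```

In this sketch, `choose_source(CANDIDATES, recent)` selects the Druid table, `older` falls through to the RDBMS table because Druid lacks both the time range and the `device` dimension, and `deep` can only be served from Hive. A real system would then generate the engine-specific query for the selected table.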

Hiral Patel

Technologist, Yahoo Inc
Hiral has been working with Scala for the past 6 years and with Big Data for the past 12 years. He has built data platforms, data-intensive applications, and real-time analytics frameworks. Hiral is currently a Senior Principal Architect/Engineer at Yahoo Inc.

Saturday November 12, 2016 10:40am - 11:20am PST
Off by One