Scalæ By the Bay has ended
Friday, November 11 • 11:10am - 11:30am
NLP in Action at SalesforceIQ

This talk will showcase an NLP pipeline we have built with Scala and Spark at SalesforceIQ to analyze and derive insights from large amounts of text data (several hundred million examples). Our stack uses EMR, Spark, S3, Avro, Azkaban, OpenNLP, and many elements of the functional programming paradigm (read: semigroups, monoids, and foldMaps, oh my!) to build a scalable and powerful pipeline for extracting rich information from email content. This pipeline currently powers our suggested follow-up feature, which informs customers when they need to follow up on an important email or conversation, and it provides foundational features for other use cases.

As data processing pipelines become ubiquitous, and more people turn to Spark and Scala to build them, this talk will answer some questions about how to go about the task effectively. Elements of functional programming, together with libraries such as Twitter's algebird, scalaz, or cats, allow for natural and efficient implementations of many core aspects of distributed data processing. We couple this with the OpenNLP library to create a data pipeline for the linguistic analysis of text, primarily in the form of email content. This has given us a solid foundation for engineering text-based features and for training text-based models for a variety of supervised learning tasks. The resulting pipeline outperforms a similar pipeline built on traditional Map/Reduce with Java, and it is more maintainable and scalable.
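The monoid and foldMap idea mentioned in the abstract can be sketched in a few lines of plain Scala. This is a minimal illustration and not the SalesforceIQ pipeline: it uses only the standard library rather than Spark, algebird, scalaz, or cats, it tokenizes on whitespace instead of using OpenNLP, and every name in it (`Monoid`, `FoldMapSketch`, `foldMap`, `tokenCounts`) is hypothetical. The point is that an associative combine with an identity element lets each partition be summarized independently and the partial results merged afterwards.

```scala
// Hand-rolled Monoid type class (illustrative only; cats and algebird
// ship production-grade equivalents).
trait Monoid[A] {
  def empty: A
  def combine(x: A, y: A): A
}

object Monoid {
  // Integers under addition.
  implicit val intMonoid: Monoid[Int] = new Monoid[Int] {
    val empty: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  // Maps merge key by key, combining values with the value monoid.
  implicit def mapMonoid[K, V](implicit V: Monoid[V]): Monoid[Map[K, V]] =
    new Monoid[Map[K, V]] {
      val empty: Map[K, V] = Map.empty[K, V]
      def combine(x: Map[K, V], y: Map[K, V]): Map[K, V] =
        y.foldLeft(x) { case (acc, (k, v)) =>
          acc.updated(k, V.combine(acc.getOrElse(k, V.empty), v))
        }
    }
}

object FoldMapSketch {
  // Map every element into a monoid, then combine the results.
  def foldMap[A, B](as: Seq[A])(f: A => B)(implicit B: Monoid[B]): B =
    as.foldLeft(B.empty)((acc, a) => B.combine(acc, f(a)))

  // Naive whitespace tokenizer standing in for a real NLP tokenizer.
  def tokenCounts(text: String): Map[String, Int] =
    foldMap(text.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq)(t => Map(t -> 1))

  def main(args: Array[String]): Unit = {
    // Two "partitions" of made-up email snippets.
    val partitions = Seq(
      Seq("Please follow up", "follow up soon"),
      Seq("please reply")
    )
    // Summarize each partition on its own, then merge the partial counts.
    val perPartition = partitions.map(p => foldMap(p)(s => tokenCounts(s)))
    val total = foldMap(perPartition)(m => m)
    println(total)
  }
}
```

Because `combine` is associative, merging per-partition counts gives the same totals as counting over all the text at once, which is the property that makes this style a natural fit for distributed aggregation.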

Speakers
Ascander Dost

Lead Software Engineer, Salesforce
Ascander is a lead engineer at SalesforceIQ, where he works on data processing infrastructure, extracting meaning from email messages, and creative cursing. He received a PhD in Linguistics from UC Santa Cruz a long time ago, but seems to have mostly recovered. He enjoys writing Scala...
Alexis Roos

Sr Engineering Manager, SalesforceIQ
Alexis has over 20 years of software engineering and management experience, with an emphasis on large-scale data science and engineering along with application infrastructure. As an engineering manager at Salesforce, Alexis manages all back-end engineering for SalesforceIQ CRM which...


Friday November 11, 2016 11:10am - 11:30am PST
Off by One