A telco that covers South America has two pain points: 1) a specific type of fraud that annoys customers and is costly, and 2) long reporting latencies that are strangling daily business decisions. The solution, currently under development, uses Kamanja to clean and enrich 20 data sources, with 10-50 feeds per source and 50-1,000 fields per feed, and to implement the fraud detection logic. Kamanja is an open-source, real-time decisioning engine written in Scala; see www.Kamanja.org.
Other systems in the solution stack include Kafka, Kerberos, ZooKeeper, HDFS, Parquet, and HBase.