When: June 12, 2018
Where: Berlin, Germany
Robert Rodger at Berlin Buzzwords - Session abstract:
Indexes are what make efficient access to our data storage systems possible. Though they are traditionally implemented with highly optimized tree-based data structures, in December 2017 a group from Google proposed a novel idea: replace certain types of index structures with trained machine learning models. After all, an index is nothing other than a model that maps a key to the position of a record; in this light, exchanging, say, your B-tree search for a deep neural network prediction seems at least possible, if not practical. Surprisingly, doing so can often lead to significant performance improvements in terms of both time and memory consumption.
In this talk we discuss how learned indexes accomplish this. We focus on neural networks, and in particular how recent trends in processor architecture design make them computationally competitive against tree search. We then have a look at how machine learning algorithms can be applied to the task of range indexation, how they can deliver error bound guarantees, and how their accuracy can be honed by layering them recursively. We finish with a review of the Google group's results on three realistic datasets and a brief mention of how machine learning can be applied to other indexation tasks.
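The idea of a range index with error bounds and recursive layering can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the Google group's implementation: it uses least-squares lines where the talk focuses on neural networks, it assumes unique, sorted, numeric keys, and the names TwoStageLearnedIndex, fit_line and num_leaves are hypothetical. A root model routes each key to one of several leaf models; each leaf records the smallest and largest prediction error seen while building, so a lookup only needs a binary search inside that guaranteed window.

```python
import bisect


def fit_line(xs, ys):
    """Least-squares fit y ~ a*x + b; constant model if xs has no spread."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


class TwoStageLearnedIndex:
    """Hypothetical two-stage learned index over unique, sorted numeric keys."""

    def __init__(self, sorted_keys, num_leaves=4):
        self.keys = sorted_keys
        self.num_leaves = num_leaves
        n = len(sorted_keys)
        # Stage 1: a root model predicts a rough position, used only for routing.
        self.root = fit_line(sorted_keys, list(range(n)))
        buckets = [[] for _ in range(num_leaves)]
        for pos, key in enumerate(sorted_keys):
            buckets[self._leaf_id(key)].append(pos)
        # Stage 2: one model per leaf, plus its observed min/max error.
        self.leaves = []
        for bucket in buckets:
            if not bucket:
                self.leaves.append(None)
                continue
            leaf_keys = [sorted_keys[p] for p in bucket]
            a, b = fit_line(leaf_keys, bucket)
            errors = [p - int(a * k + b) for k, p in zip(leaf_keys, bucket)]
            self.leaves.append((a, b, min(errors), max(errors)))

    def _leaf_id(self, key):
        a, b = self.root
        leaf = int((a * key + b) * self.num_leaves / len(self.keys))
        return min(self.num_leaves - 1, max(0, leaf))

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        leaf = self.leaves[self._leaf_id(key)]
        if leaf is None:
            return None
        a, b, lo_err, hi_err = leaf
        pred = int(a * key + b)
        n = len(self.keys)
        # Stored keys are guaranteed to lie in [pred + lo_err, pred + hi_err].
        lo = min(n, max(0, pred + lo_err))
        hi = max(lo, min(n, pred + hi_err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < hi and self.keys[i] == key:
            return i
        return None


# Usage: a non-uniform key distribution (quadratic growth).
keys = [x * x + 3 * x for x in range(200)]
index = TwoStageLearnedIndex(keys, num_leaves=8)
position = index.lookup(keys[42])
```

The error bounds hold only for keys present when the index was built, which is why this sketch treats the structure as read-only; a key that was never stored simply falls through to None after the bounded search.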
While building the Image Detection system, we used specific implementation techniques that may be insightful and helpful to the audience. For example, our models are not only trained in parallel; transfer learning also allows us to engineer a single first component for all models and then distribute the flow over each of the DNNs (~90% of the work is shared among the DNNs). Our models achieve above 95% accuracy, and the component-like architecture makes the system very flexible.
Robert Rodger - Data Scientist
Robert is a data miner, analyst, visualizer and scientist at GoDataDriven.
Academically, his background is in mathematics and theoretical physics, having earned a B.Sc. at Stanford University and an M.Sc. cum laude at Universiteit Utrecht. Professionally, though, his background is in risk measurement and management, first at a large American investment bank before he saw the light and moved to a machine learning-based FinTech shop in Amsterdam.
"I do data science consulting and machine learning training. Particularly interested in deep learning and all things Bayesian."
Robert’s interests aren’t limited to the financial world, though! He’s curious about all manner of data use cases, and in particular he’s eager to hear about what you are doing with big data.