Data lakes are becoming mainstream as organizations seek to get more value from their data and start to use more advanced analytics, but the technology can be daunting.
This presentation walks through some of the challenges and successful approaches for launching a data lake built on open-source projects like Apache Spark and cloud services like Amazon Web Services and Google Cloud Platform. It covers the technical challenges of getting a Spark cluster up and running and of organizing storage for a data lake, along with some of the "watch outs" from a performance and scaling perspective.
Speakers
Demetrios Kotsikopoulos
Silectis
Friday, September 13, 2019, 3:00pm - 3:45pm
Alley Powered by Verizon, 2055 L St NW, Suite 400, Washington, DC 20036, USA