A Git-like repository for your data lake

Presented by: Adi Polak

The first problem big data posed was whether data could be processed at such a high scale at all. In solving the scale problem, people developed the technologies that power big data operations today, such as Kafka, Spark, Presto, Snowflake, and many others. ...

Tags: Big Data, Machine Learning, DevOps
Format: General Session
Level: Intermediate

Data Observability and Monitoring

Presented by: Roma Nawani

"When will I receive my data?" "What is causing delays?". If you have been receiving these messages from the business stakeholders, you are not alone. Most of data lines spend more than 30% of time tackling data pipeline issues, troubleshooting, missed SLAs, erroneous and inaccurate data. This will...

Tags: Big Data, DevOps
Format: General Session
Level: Intermediate

Sculpting Data for Machine Learning

Presented by: Jigyasa Grover

In the contemporary world of machine learning algorithms, “data is the new oil”. For state-of-the-art ML algorithms to work their magic, it’s important to lay a strong foundation with access to relevant data. Volumes of crude data are available on the web nowadays, and all we need are the...

Tags: Big Data, Python, Machine Learning
Format: General Session
Level: Intermediate