Arts & Crafts with AWS Glue Workshop
Presented by: Lydia White, James Zhang
Constructing and administering a big-data-scale Extract, Transform & Load (ETL) and Business Intelligence (BI) pipeline can be expensive and complicated. Fortunately, Amazon Web Services (AWS) simplifies this with its Glue and QuickSight services. Glue, a fully managed, serverless and cloud-optimized...
Format: PreCompiler
Level: Introductory and overview

Hadoop Essentials
Presented by: Eric Richardson
Big Data and Cloud platforms have their origins in Hadoop. Learn the fundamentals of HDFS, MapReduce, and YARN, the three core components of Apache Hadoop. You will start a sandbox cluster, interact with HDFS, and learn how HDFS saves data and why it does it that way. MapReduce is an important...
Format: PreCompiler
Level: Intermediate

Kafka: Build, Deploy, and Monitor Your First Real-World Application
Presented by: Justin Pihony
Companies nowadays find it critical to gather and analyze massive amounts of disparately sourced data quickly and reliably. By diving into the fundamentals of Apache Kafka, you will learn how it can be not just a superior distributed messaging system, but also a platform to...
Format: PreCompiler
Level: Intermediate

Help Cure the #1 Leading Cause of Death in America
Presented by: Luther Hill
The leading cause of death in America is cardiovascular disease. Doctors often spend years researching cures for this disease without any results. ...
Format: PreCompiler
Level: Advanced

Making Faces: Image Reduction and Recognition
Presented by: Steve Crow
A 350px by 300px image contains 105,000 individual pixels. Comparing each pixel to tell whether or not two images are the same is not efficient. What if we could reduce the number of features, while still maintaining patterns and trends? What if we could perform this recognition by only comparing 25...
Format: General Session
Level: Introductory and overview

Intro to Machine Learning on the cheap and without a PhD in math: Overview
Presented by: Jim Wilson
Machine learning has traditionally required a command of advanced mathematics, years of university training, and expensive hardware to implement. Now, with better open source tools and online resources, it’s easier than ever to create your own robust neural networks. Be it image recognition,...
Format: General Session
Level: Introductory and overview

Support Emoji 💁‍♀️ and Go Worldwide 🌐 with Unicode
Presented by: Joe Sewell
As the world becomes more and more connected, the way people communicate becomes more and more important. A prominent example is emoji: those little icons that are popular in text messages 📱 and on social media 🤝. Proposals for new emoji will often make the news. But did you know that emoji...
Format: General Session
Level: Introductory and overview

[CANCELLED] Adding Search Capabilities to your Applications with Elastic Search
Presented by: Rod Paddock
This session has been cancelled. We apologize for any trouble this causes.
Format: General Session
Level: Intermediate

HOW TO HACK AN ELECTION
Presented by: Gary Short
Think your democracy is safe? Better think again. In this session I'll demonstrate the techniques that the 'black hat data scientists' use to get the result they want from any election or referendum. In this entertaining talk, I'll demonstrate techniques, both at the macro level - how to attack an...
Format: General Session
Level: Introductory and overview

Learning R through sports, or learning sports through R
Presented by: Mike Roznik
Big data has been a buzzword for years, but only recently has it reached the world of professional sports. How do sports teams use this big data? Why, with R! Together we’ll learn what kind of data they look for and how they present it. Along the way we’ll look at real-world examples of R...
Format: General Session
Level: Introductory and overview

Horizontal Scaling of Graph Databases
Presented by: Ryan Hendrickson
A very real problem is choosing what data to prioritize when storing it in a graph database. Most graph databases, including Neo4j, the number one graph database according to db-engines.com, do not scale horizontally. This leaves the size of the database constrained by how much disk space and RAM is...
Format: General Session
Level: Intermediate

Event processing without breaking production
Presented by: Nathan Zender
Transactional systems are built with the primary focus of processing many concurrent events, only keeping the current state of data. What if I told you there was a way to capture all the events that led to that current state? Capturing all events would allow for deeper analytics/machine learning,...
Format: General Session
Level: Introductory and overview

Well.. that's ONNX-pected
Presented by: Ron Dagdag
The world of machine learning frameworks is complex. What if we could use the lightest framework for inferencing on edge devices? That’s the idea behind the ONNX format. Attend this session and find out how to train models using the framework of your choice, save or convert models into ONNX, and deploy...
Format: General Session
Level: Intermediate

API Design for ETL - Lessons from Nearly 100 Data Integrations in the Wild
Presented by: Dan Mosora
In the API Economy, users increasingly want to analyze and connect more of their business data, from small startup shops to multinational corporations. Thinking about making data available through an API? There are innumerable ways to slice the problem. How do you implement pagination? What data...
Format: General Session
Level: Intermediate