Prerequisite: None
Does the data you need to analyze get dumped on you in a jumble of diverse, messy files?
If so, you’re not alone. Most analytics projects involve combining data from many small files that vary widely in size, structure, and format. When you don’t control the source of the data, you often can’t control the analytics either, because so much manual cleaning and standardization is needed before the data becomes useful. This work is data preparation, and by common estimates it takes up as much as 80% of the time in a data project.
In this talk, we’ll dig into how data preparation accelerates the analytics process and why it’s especially well suited to dealing with messy files. We’ll also unveil an entirely new free product that lets anyone connect to their files, clean them up, and publish the resulting output to a variety of cloud-based data warehouses or analytics tools.