Defeating the Enemies of Machine Learning Once and for All

Technology   |   Bertrand Cariou   |   Oct 29, 2020

What Is Machine Learning?

Machine learning and artificial intelligence are two of the big frontiers in data analytics. Now more than ever, organizations are setting serious goals around implementing machine learning models across all areas of business. But what does machine learning really mean? And what does it look like in practice?

First, the question: what is machine learning? Put simply, machine learning allows computers to “learn” patterns and rules from historical data in order to perform tasks without explicit instructions. Because machine learning models replace manual programming with intelligent automation, data scientists are able to arrive at conclusions in a fraction of the time it would otherwise have taken them, and that’s if it were possible to recreate the model manually at all. Machine learning has been particularly attractive for customer behavior and fraud detection initiatives, but its applications abound and show no sign of slowing. According to Deloitte, machine learning programs doubled from 2017 to 2018 and were expected to double again by 2020.

How It Works: In a Nutshell

Without getting into a serious course on ML, here are some of the basics behind this amazing field. Machine learning requires a model to begin with, a set of data for the algorithms to analyze or “learn” from, and a problem to solve. The data the program will go through needs to be carefully selected, with consideration for its limitations and for what data may be irrelevant. The data may need to be formatted, cleaned (removing or anonymizing sensitive information), and sampled (including only the relevant parts of the big data). Supervised models learn from labeled examples, with a data analyst typically supplying the correct answers as feedback. Unsupervised models review unlabeled data and surface patterns, such as clusters or outliers, without that feedback.
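To make the three preparation steps concrete, here is a minimal Python sketch of formatting, cleaning, and sampling. It rests on assumptions that are not in the article: the field names (`email`, `country`, `amount`), the helper names, and the salt value are all hypothetical. Note also that a salted hash is pseudonymization rather than full anonymization.

```python
import hashlib
import random


def format_record(raw):
    """Format: trim whitespace, normalize case, and parse the amount as a number."""
    return {
        "email": raw["email"].strip().lower(),
        "country": raw["country"].strip().upper(),
        "amount": float(raw["amount"]),
    }


def anonymize(record, salt="demo-salt"):
    """Clean: swap the sensitive email for a salted SHA-256 digest (hypothetical salt)."""
    digest = hashlib.sha256((salt + record["email"]).encode("utf-8")).hexdigest()
    cleaned = dict(record)
    del cleaned["email"]
    cleaned["customer_id"] = digest[:12]
    return cleaned


def sample(records, k, seed=0):
    """Sample: draw up to k records at random, reproducibly for a given seed."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def prepare(raw_records, k, seed=0):
    """Run the three steps in order: format, clean, sample."""
    formatted = [format_record(r) for r in raw_records]
    cleaned = [anonymize(r) for r in formatted]
    return sample(cleaned, k, seed)
```

Calling `prepare(raw_rows, k=1000)` on a list of raw dictionaries would return at most 1,000 formatted records with the email replaced by a `customer_id`. Formatting runs before hashing so that two spellings of the same address map to the same identifier.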

Enemy Number One to Machine Learning

Though machine learning has certainly generated a lot of buzz, the reality of machine learning projects in production is a different story. Many companies are still in the early stages of building machine learning models and have yet to see a return on their investment. The causes of delay vary: a shortage of technology or talent, or uncertainty about exactly how machine learning should be applied to particular business functions, are all known pain points. But one of the most common causes is dirty data.

Harvard Business Review calls dirty data “enemy number one to the widespread, profitable use of machine learning.” That clean data is essential to producing sound analytics isn’t news to anyone who works with data. But, HBR warns, it’s especially true for machine learning models, which depend on huge volumes of training data. Dirty data can cause a ripple effect when it appears both in the historical data used to build predictive models and in the new data those models go on to process. In order to generate the value they promise, machine learning models must be fed with data that has been appropriately sourced and rigorously cleansed and structured to comply with set standards.
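One way to picture “complying with set standards” is to run the same rule set over both the historical data and the new data, so dirty records are caught at both points. The sketch below is only an illustration: the rules, field names, and function names are hypothetical and do not come from HBR or the article.

```python
def validate(record, rules):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field, (expected_type, check) in rules.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not isinstance(value, expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
        elif not check(value):
            problems.append(f"{field}: fails check")
    return problems


# Hypothetical standards: each field maps to (required type, validity check).
RULES = {
    "amount": (float, lambda v: v >= 0),
    "country": (str, lambda v: len(v) == 2 and v.isupper()),
}


def split_clean_dirty(records, rules=RULES):
    """Separate records that meet every rule from those that break at least one."""
    clean, dirty = [], []
    for record in records:
        problems = validate(record, rules)
        if problems:
            dirty.append((record, problems))
        else:
            clean.append(record)
    return clean, dirty
```

Running `split_clean_dirty` on the training set before fitting, and again on each incoming batch before scoring, keeps one definition of “clean” for both stages.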

Designer Cloud: A Foundational Technology for Machine Learning

Data preparation, whether for machine learning or otherwise, is essential. But it’s also widely cited as the bottleneck of analytics, with up to 80% of total project time dedicated to the task. In the case of machine learning, where the amount of required data preparation doubles or triples in order to supply significant training data, data preparation is especially tedious.

To reduce the time spent in data preparation for machine learning, many machine learning project leads have turned to Designer Cloud, a data preparation platform. Designer Cloud accelerates the process of preparing data for machine learning by relying on built-in machine learning of its own, which learns from every user interaction and automatically suggests the most intelligent transformation at every instance. Designer Cloud also visually surfaces errors, outliers, and missing data so that nothing slips through to the eventual machine learning model.
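For readers curious what surfacing missing data and outliers can mean in practice, here is a minimal, generic Python sketch. It is not Designer Cloud's implementation, which the article does not describe. It swaps in a common textbook rule (Tukey's fences on the interquartile range), and the function name and the `k=1.5` default are assumptions made for illustration.

```python
import statistics


def profile_column(values, k=1.5):
    """Report missing entries and IQR-based outliers for one numeric column.

    Missing values are represented as None. Needs at least two present values.
    """
    missing = [i for i, v in enumerate(values) if v is None]
    present = [v for v in values if v is not None]

    # Quartiles of the present values; the fences sit k * IQR beyond Q1 and Q3.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    outliers = [
        i for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]
    return {"missing": missing, "outliers": outliers, "bounds": (low, high)}
```

The function returns row positions rather than dropping anything, so a person can review each flagged value before deciding whether it is an error or a genuine extreme.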

In short, if you’re one of the many organizations that have invested or will be investing in data preparation and machine learning, we’d love to chat with you about how Designer Cloud can help defeat “enemy number one.” Schedule a demo of Designer Cloud today.