AutoML: Automated Machine Learning


What Is AutoML?

Automated machine learning, or AutoML, makes ML accessible to non-experts by enabling them to build, validate, iterate on, and explore ML models through an automated experience. AutoML automatically prepares and cleans data, creates and selects features, chooses a suitable model family, optimizes hyperparameters, and analyzes results. It also helps with data visualization, insight generation, model explainability, and model deployment.

Why Is AutoML Important?

ML models provide businesses with valuable insights, yet the responsibility to create the models often falls to those without extensive ML expertise. AutoML doesn’t replace data scientists, but it makes them more productive by automating the code-intensive steps so that they (and others) can focus on model testing and insights. Less experienced users (aka citizen data scientists) often use AutoML to generate insights and as a quick way to learn about data science.

How AutoML Works

AutoML usually includes the following:

  • Data evaluation and pre-processing: Data is prepared, cleansed, and transformed to create a useful model-training dataset.

  • Feature engineering: New columns of data are created in the existing model-training data. These columns may better represent predictors of the phenomenon described by the data, or may simply work better with the ML algorithms.

  • Feature selection: After new features are built, AutoML keeps only those that are useful in generating a model.

  • Algorithm selection: Competing candidate models are compared to select the one that performs best on the desired metric (e.g., accuracy, recall, or balanced accuracy).

  • Hyperparameter tuning: A set of optimal hyperparameters is chosen for a learning algorithm.
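
To make the steps above concrete, here is a minimal sketch in plain Python of three of them: pre-processing, algorithm selection, and hyperparameter tuning. It is not how any particular AutoML product works. The toy dataset, the two candidate model families (a majority-class baseline and k-nearest neighbours), and every function name are assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def standardize(rows):
    """Pre-processing: rescale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def knn_factory(k):
    """Build a 'fit' function for a k-nearest-neighbours classifier."""
    def fit(X, y):
        def predict(x):
            def dist(pair):
                return sum((a - b) ** 2 for a, b in zip(pair[0], x))
            nearest = sorted(zip(X, y), key=dist)[:k]
            return Counter(label for _, label in nearest).most_common(1)[0][0]
        return predict
    return fit


def majority_fit(X, y):
    """Baseline candidate: always predict the most common training label."""
    top = Counter(y).most_common(1)[0][0]
    return lambda x: top


def cv_accuracy(fit, X, y, folds=5):
    """Mean accuracy across `folds` held-out splits (rows assumed shuffled)."""
    scores = []
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [t for i, t in enumerate(y) if i not in test_idx]
        predict = fit(train_X, train_y)
        hits = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / folds


def auto_select(X, y):
    """Score every candidate (model family x hyperparameter) and keep the best."""
    # Simplification: the scaler sees all rows; a stricter version would
    # fit it inside each cross-validation fold.
    X = standardize(X)
    candidates = {"majority": majority_fit}
    candidates.update({f"knn(k={k})": knn_factory(k) for k in (1, 3, 5, 7)})
    scores = {name: cv_accuracy(fit, X, y) for name, fit in candidates.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two Gaussian clusters, shuffled with a fixed seed.
rng = random.Random(0)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (0.0, 3.0) for _ in range(40)]
y = [label for label in (0, 1) for _ in range(40)]
order = list(range(len(X)))
rng.shuffle(order)
X, y = [X[i] for i in order], [y[i] for i in order]

best, scores = auto_select(X, y)
print(best, round(scores[best], 3))
```

Real AutoML systems search far larger spaces and usually replace this exhaustive loop with smarter search strategies, but the core idea is the same: score each candidate on held-out data and keep the winner.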

AutoML Examples

AutoML can help solve a wide range of business challenges, including:

Personalization

Talking to a consumer base as a whole is no longer enough. To succeed, a business needs to address each customer individually. AutoML makes personalization more scalable by learning individual preferences and behaviors, which allows companies to serve up personalized recommendations and content. The result is a more engaged consumer base and better sales.

Cleaning Customer Records

Spelling errors, updates, and inconsistent information can create duplicates in a company database. AutoML makes it easier to find and correct those duplicates so data is clean, accurate, and usable.
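
As a rough illustration of how near-duplicate records can be flagged, the sketch below compares normalized customer names with Python's standard `difflib.SequenceMatcher`. The sample names, the `normalize` and `likely_duplicates` helpers, and the 0.85 similarity threshold are all assumptions chosen for the example. A production system would typically compare several fields and learn the threshold from labeled data.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lower-case, drop basic punctuation, and collapse repeated spaces."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def likely_duplicates(records, threshold=0.85):
    """Return (i, j, similarity) for every pair at or above the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs


# Hypothetical customer names with a spelling variant and a formatting variant.
customers = ["Jon Smith", "John Smith", "Maria Garcia", "maria  garcia", "Wei Chen"]

for i, j, score in likely_duplicates(customers):
    print(f"{customers[i]!r} ~ {customers[j]!r} (similarity {score})")
```

Pairwise comparison grows quadratically with the number of records, so larger databases usually group candidates first (for example by postal code) and only compare within each group.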

Customer Churn

Attracting new customers is essential to any business, but so is keeping existing ones. AutoML can find patterns in customer activity to predict which customers are likely to switch to competitors. This information allows for targeted retention efforts that can grow profits and brand value.

Fraud Detection

Fraud costs the U.S. government about $80 billion a year. Nearly every federal agency is targeted, and there aren’t enough resources to investigate each claim. As criminals have gotten smarter, so have the solutions. AutoML integrates into existing systems and uses data from past fraud cases to help find red flags and address issues quickly.

Getting Started with AutoML

Alteryx offers an accessible AutoML experience using a guided, educational approach that maintains the powerful technical capabilities used by traditional data scientists. With Alteryx Machine Learning, AutoML is integrated into every step of the data analysis process including preparation, blending, and enrichment.

At its most basic level, Alteryx Machine Learning can:

  • Automate steps of the data science and ML process
  • Train a number of predictive models on your data
  • Provide metrics about the performance of those models (e.g., receiver operating characteristic (ROC) curves, precision, recall, accuracy, and balanced accuracy)
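
For readers unfamiliar with the metrics named above, the following sketch shows how several of them are computed from a binary confusion matrix. It uses the standard textbook definitions and is not a description of Alteryx's implementation. The `binary_metrics` helper and the sample labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute common classification metrics from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    def ratio(num, den):
        return num / den if den else 0.0  # report 0.0 when undefined

    recall = ratio(tp, tp + fn)        # share of actual positives found
    specificity = ratio(tn, tn + fp)   # share of actual negatives found
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": recall,
        # Balanced accuracy averages recall over both classes, so it is
        # less flattering than plain accuracy on imbalanced data.
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Hypothetical labels: 3 positives and 7 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On this imbalanced sample, plain accuracy is 0.8 while balanced accuracy is lower, which is why the choice of metric matters when one class is rare (as in fraud or churn data).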

Beyond those functions, Alteryx features:

  • Interactive visualizations
  • Clear reporting for business stakeholders
  • The ability to deploy models to an operationalization system
  • Integrated lessons and glossaries
  • Automated training data evaluation
  • Suggestions to improve training data or automatically adjust that data