Making analytics and data science accessible to more people is essential to us at Alteryx.
The Alteryx APA Platform is all about that, of course — but you might not know that we’ve also released open-source Python libraries through Alteryx Open Source so everyone can achieve new breakthroughs with machine learning.
(In case you’re not familiar with the term “open-source software,” it means that anyone can view, modify, and use the software’s source code, usually free of charge. Cool, right?)
We currently have four Python libraries available for machine learning aficionados to use. Let’s take a quick look at each of them.
- Featuretools automates the feature engineering process, making this key part of your machine learning project easier, faster, and less prone to human error. Check out this demo of using Featuretools to help develop a machine learning model to predict customers’ next purchases. (documentation and GitHub)
- EvalML automates model building, includes data checks, and even offers tools for model understanding. This AutoML library can be used for classification and regression models. Read more about EvalML or check out the demo that shows how EvalML can be used to predict housing prices. (documentation and GitHub)
- Compose is a library for automating prediction engineering. Using labeling functions, it can generate training labels automatically. It also plays nicely with Featuretools and EvalML for a streamlined machine learning workflow. We’ve got more details on Compose for your reading pleasure. (documentation and GitHub)
- Woodwork makes it easy to infer the types of the data contained in a dataframe. It also helps you manage typing information and manipulate data based on types. Woodwork complements the other tools here, helping EvalML build models intelligently. Check out how Woodwork can be used in conjunction with EvalML to easily construct a spam email classifier. (documentation and GitHub)
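If “labeling functions” sounds abstract, here is a minimal sketch of the idea in plain Python. To be clear, this is not the Compose API: the toy data and the `total_spent` and `make_labels` names are made up for illustration, and the only assumption is that a label is computed by applying one function to each fixed-size time window of a customer’s history.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical toy data: (customer_id, timestamp, amount).
transactions = [
    ("a", datetime(2021, 1, 1), 20.0),
    ("a", datetime(2021, 1, 5), 90.0),
    ("a", datetime(2021, 1, 9), 15.0),
    ("b", datetime(2021, 1, 2), 30.0),
    ("b", datetime(2021, 1, 10), 120.0),
]


def total_spent(window):
    """Labeling function: sum the amounts of the transactions in one window."""
    return sum(amount for _, amount in window)


def make_labels(rows, labeling_function, window_size):
    """Slide a fixed-size time window over each customer's history
    and apply the labeling function to every window."""
    by_customer = defaultdict(list)
    for customer_id, timestamp, amount in rows:
        by_customer[customer_id].append((timestamp, amount))

    labels = []
    for customer_id, history in sorted(by_customer.items()):
        history.sort()  # order each customer's transactions by time
        start, end = history[0][0], history[-1][0]
        while start <= end:
            # Keep only the transactions that fall inside the current window.
            window = [(t, a) for t, a in history if start <= t < start + window_size]
            labels.append((customer_id, start, labeling_function(window)))
            start += window_size
    return labels


labels = make_labels(transactions, total_spent, timedelta(days=7))
for customer_id, window_start, value in labels:
    print(customer_id, window_start.date(), value)
```

Swapping in a different labeling function (say, one that returns whether a customer spent more than a threshold) changes the prediction problem without touching the windowing code, which is the kind of separation a prediction engineering library is meant to automate.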
The Alteryx Open Source team also shares strategies for making the most of these libraries, along with insights gained along the way, on the Innovation Labs blog. Some recent posts include:
- How to Troubleshoot Memory Problems in Python
- Visualizing Automated Feature Engineering
- Encode Smarter: How to Easily Integrate Categorical Encoding into Your Machine Learning Pipeline
Want to keep up with all the great open-source work at Alteryx?
And if you’d like to work on awesome tools like these every day, check out our careers site and get in touch.