Working with data in Excel can be simple, but it can also be challenging and time-consuming, especially when working with multiple sources and formats of data. Data has to fit into rows and columns, has size limitations, and joining data from multiple spreadsheets requires complex formulas that leave your results vulnerable to errors. What's more, whenever your data changes, you have to repeat each step in the data prep and blending process all over again.
Alteryx takes a different approach with a workflow-based environment that allows you to prep, blend, and analyze data from multiple data sources and types, including unstructured data. And because you spend your time constructing a repeatable workflow that documents exactly how you work with data, you simply have to re-run the workflow any time your data changes — resulting in less time spent analyzing data, fewer errors, and greater consistency.
We've listed some of the most common data-related tasks that many analysts attempt to do in Excel. Alongside, we show how you would do it using Alteryx. These examples are meant to help Excel users understand how to translate their Excel knowledge into an Alteryx tool and workflow environment.
To learn more about how Alteryx can help empower Excel users, download the e-book A Radical Guide to Data Analytics Mastery for Spreadsheet Users. Then, follow along by downloading a free trial of Alteryx.
To access data in Alteryx, you drag-and-drop an Input Data Tool onto the canvas, locate the database, and simply select.
Then click on the Connect a File or Database drop down in the configuration section.
Opening Different File Types/Structures
Alteryx offers users a wide range of data access to multiple sources and structures of data. Using the Input Tool, users can access data locally from Excel, Access, XML, SAS, SPSS, MapInfo, as well as data stored in databases or HDFS.
Input Multiple Files from a Folder
To do this in Alteryx, drag an Input Tool onto the canvas, locate the folder and insert a wildcard (*). In this case we are bringing in two .CSV files.
Using the Input Tool locate the directory and file you are looking for. In this case it is POS_Data.
Then once you bring in the POS_Data file, you go to the configuration section and edit the directory path by adding a (_*.csv). This will bring in all of the .CSV files in the same directory and join together as long as they are the same table structure and format.
It’s also easy in Alteryx. Using the Select Tool, you simply rename the field to whatever you want.
Change Data Types
In Alteryx, the Select Tool allows you to change the data type to whatever format you
In addition to the Select Tool, you can use the Auto Field Tool. As its name suggests this tool can interpret the data and assign the type and size appropriate for the content. This is especially useful for fields that contain text strings.
Some conversions are a little more complex. Dates can be awkward, because the Select Tool may not be able to understand your preferred date format. To satisfy this, there is a specific DateTime Tool.
In Alteryx this is not a problem. Just use the Select Tool and uncheck the box next to anything you don’t need or want to hide, and it will be removed from your workflow without affecting formulas.
If you can list the criteria, you can use the Filter Tool. If it’s complicated, you can use a formula to build IF expressions or use the switch () function.
Create New Columns
In Alteryx, rather than inserting a column into the table, you just add a new output field to your workflow using the Formula Tool. Simply type in the new field (Value Targets in this case) and your new field is calculated on every row.
Tip 1: If you need to concatenate with first name and last name together, use a formula with the + symbol. Most other text‐based formulas use the same name and syntax, for instance Left () or Right (). The Alteryx manual (press F1) will help you identify those which don’t.
Tip 2: If you’re splitting a field with a regular format (for instance, maybe splitting a comma-separated address field), use the Text To Columns Tool.
Everybody uses IF statements in Excel. It is one the most popular
decision-making functions in the platform. To create a new Sales
Potential column that buckets scores as High, Medium, or Low, you
utilize this IF statement in Excel:
=IF(V2>2500, "High", IF(V2>1499, "Medium", "Low"))
The Formula Tool is a
powerful processor of data and formulas. Use it to add a field to an
input table, to create new data fields based on an expression or data
relationship, or to update an existing field based on these same
premises. To create the same IF statement in Alteryx, simply create a
new output field called Sales Potential, and using the Total Sales
field, input the following expression:
IF [Total Sales] > 2500 THEN "High" ELSEIF [Total Sales] > 1499 && [Total Sales] < 2501 THEN "Medium" ELSE "Low" ENDIF
Creating an Auto Sum in Excel requires that you select the related fields and rows and then click the Auto Sum symbol. Or you can write a =SUMformula.
The Summarize Tool can do a host of summary processes, including grouping, summing, count, spatial object processing, string concatenation, and much more. In this example the Summarize Tool is used to group transactions by customer and to total up the transactions to see the customer’s total spend.
Calculate Cumulative Sum of Rows
To create the cumulative sum of rows in Excel, you must create a new column, then set the cumulative range. In this case the starting point is =S2, and then you add =S2 + R3 for the second row. Once this is established, you can copy the formula for the rest of the related information.
The Multi-Row Formula Tool takes the concept of the Formula Tool a step further, allowing you to utilize row data as part of the formula creation. This tool is useful for parsing complex data and creating running totals, averages, percentages, and other mathematical calculations. In this case, you create a new field, and then create an expression.
Calculate Percentages of a Total
Calculating percentages on each field based on a total in
Excel may require you to first build a table of your data, then
build a Pivot Table based on that data. The other option is to write
an SUMIF() statement to calculate the percentages for each
month and amount. Below is a Pivot Table example.
The Multi Field Formula Tool makes it easy to execute a single function on multiple fields. This example builds new columns/fields in the data that display a percentage of each month.
Filtering in Excel is easy. You simply select the worksheet or columns
you want to filter, and then select the filter drop-down.
The Filter Tool in Alteryx lets you do a basic filter to split data into two streams: True (data that satisfies the expression) or False (data that does not).
Sorting by multiple columns in Excel is easy. Just highlight the
worksheet and do a sort or custom sort.
The Sort Tool in Alteryx makes it just as easy, allowing you to organize your data exactly the way you need it. Just select the name of the column and choose between ascending or descending.
Pivoting data in Excel (such as transposing data or creating a cross tab) will typically require that you build a Pivot Table to get the data in the shape you want.
The Transpose Tool allows you to pivot the orientation of the data table. It transforms the data so you may view horizontal data fields on a vertical axis. This tool is useful for extracting non-conforming data. There is no limit to the amount of records or fields that can be transposed.
The Cross Tab Tool pivots the orientation of the data table. It transforms the data so vertical data fields can be viewed on a horizontal axis summarizing data where specified.
To reorder columns or fields in Excel, you have to highlight the columns, then use the Shift key to drag and drop the columns to the space you want.
To reorder columns in Alteryx, you use the Select Tool or you can do the same in the Formula Tool.
With Excel, you must ensure that the columns of the two tables are exactly the same and in the same order. Then you copy the entire contents of the second table, and paste it into the first empty row of the first table.
Alteryx makes this easier. The Union Tool allows you to combine multiple worksheets based on the field names while maintaining the position of each column. You can easily change the order of the column headers in each worksheet so they match up.
Blend Data: VLOOKUPS
To join multiple worksheets in Excel, you can use the VLOOKUP formula or INDEX MATCH formula to bring in information to the data sets from a different source(s) or table(s). INDEX MATCH is used instead of a VLOOKUP formula if the lookup value is not in the left-most column, which means you can’t use the VLOOKUP formula.
If you need multiple matches to have their own rows, a
join does a better job. You can input two data sets and join them
using either the Join Tool. If you’re
connecting to a compatible data source, you can join them in the Input
With the Join Tool you join two tables with a common field (primary key) to create the following:
- Inner Join (J): new columns to the right of the source table (L) in which there are the values of the table to join (R) that satisfy the collation
- Left Join (L): the records of the original table (L) that have not found a match in the table to join (R)
- Right Join: (R): the records of the table to be joined (R) that have not found a match in the source table (L).
You can also utilize the In-DB Join Tool if you are working with data in a database. The Alteryx
In-DB Join Tool enables blending and analysis
against large sets of data without moving the data out of a database,
and can provide significant performance improvements over traditional
The In-DB Join Tool lets you do inner, left, right, and even full joins.
- Inner Join (J): creates new columns to the right of the source table (L) in which there are the values of the table to join (R) that satisfy the collation.
- Left Join (L): creates the records of the original table (L) that have not found a match in the table to join (R)
- Right Join: (R): creates the records of the table to be joined (R) that have not found a match in the source table (L).
- Full Outer Join: creates all records contained in both the (L) and (R) tables
Learn more about Alteryx in-database tools here.
Grouping Rows and Aggregating
If you want to summarize data in Excel you build a Pivot Table.
The Summarize Tool in Alteryx works just like a Pivot Table in Excel. You can group data, then sum, count, or perform other calculations on other fields.
If you haven’t already downloaded Alteryx, get a free trial.
To learn more or get help with other features in Alteryx from Alteryx users, visit the Alteryx Community.
We would like to extend a special thanks to our partner the Information Lab for their contribution. Learn more about the Information Lab.