Data Wrangling


What Is Data Wrangling?

Organizations deal with large amounts of raw data, and preparing it for analysis can be time-consuming and costly. Wrangling alleviates that burden by transforming, cleansing, and enriching data to make it more applicable, consumable, and useful. Unlike data pre-processing or preparation, wrangling happens throughout the analysis and model-building stages of the data analytics process.

Wrangling improves the quality of the data being analyzed. Instead of spending time and resources on the consequences of bad data, organizations can produce accurate, meaningful analyses that support better solutions, decisions, and outcomes.

How Data Wrangling Works

Data Wrangling Process

Data wrangling follows five major steps: explore, transform, cleanse, enrich, and store.

Explore: Data exploration or discovery is a way to identify patterns, trends, and missing or incomplete information in a dataset. The bulk of exploration happens before creating reports, data visualizations, or training models, but it’s common to uncover surprises and insights in a dataset during analysis too.
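
As a rough illustration of the exploration step, the sketch below profiles a tiny in-memory dataset by counting missing values per column and tallying the values of text columns. The records, the column names, and the `profile` helper are all hypothetical, and only the Python standard library is assumed.

```python
from collections import Counter

# Hypothetical sample records; None marks a missing value.
rows = [
    {"customer": "A", "region": "East", "sales": 120.0},
    {"customer": "B", "region": None, "sales": 95.5},
    {"customer": "C", "region": "West", "sales": None},
    {"customer": "D", "region": "East", "sales": 210.0},
]

def profile(records):
    """Return missing-value counts per column and value counts for text columns."""
    missing = Counter()
    frequencies = {}
    for record in records:
        for column, value in record.items():
            if value is None:
                missing[column] += 1
            elif isinstance(value, str):
                frequencies.setdefault(column, Counter())[value] += 1
    return dict(missing), frequencies

missing_counts, value_counts = profile(rows)
print("Missing values per column:", missing_counts)
print("Most common region:", value_counts["region"].most_common(1))
```

A profile like this is one way to spot incomplete columns before any reports or models are built on them.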

Transform: Transforming or structuring data is important; skipping it early on can compromise the rest of the wrangling process. Data transformation involves putting the data into the shape and format needed for a report, data visualization, or analytic or modeling process. It may involve creating new variables (also known as features) and performing mathematical functions on the data.
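
The transformation step can be sketched as follows, assuming raw records arrive as strings. The field names, the date format, and the derived `revenue` and `order_month` features are hypothetical choices made for illustration.

```python
from datetime import datetime

# Hypothetical raw records where every field is still a string.
raw = [
    {"order_date": "03/15/2024", "units": "4", "unit_price": "2.50"},
    {"order_date": "11/02/2024", "units": "10", "unit_price": "1.20"},
]

def transform(record):
    """Convert string fields to typed values and derive new features."""
    order_date = datetime.strptime(record["order_date"], "%m/%d/%Y").date()
    units = int(record["units"])
    unit_price = float(record["unit_price"])
    return {
        "order_date": order_date.isoformat(),      # normalized to ISO 8601
        "units": units,
        "unit_price": unit_price,
        "revenue": round(units * unit_price, 2),   # derived feature
        "order_month": order_date.month,           # derived feature
    }

transformed = [transform(record) for record in raw]
for row in transformed:
    print(row)
```

Typing the fields and normalizing the dates up front means the later steps can compare and aggregate values without re-parsing them.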

Cleanse: Data often contains errors introduced by manual entry, incomplete records, automated sensor collection, or even malfunctioning equipment. Data cleansing corrects those errors, removes duplicates and outliers (if appropriate), and either drops records with missing data or imputes the missing values based on statistical or conditional modeling to improve data quality.
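
A minimal sketch of two cleansing operations, removing exact duplicates and imputing missing values, is shown below. It assumes median imputation is appropriate for the data, which is only one of several possible strategies. The sensor readings and the `cleanse` helper are hypothetical.

```python
from statistics import median

# Hypothetical sensor readings; None marks a missing measurement.
readings = [
    {"sensor": "s1", "temp": 21.5},
    {"sensor": "s2", "temp": None},
    {"sensor": "s1", "temp": 21.5},  # exact duplicate of the first record
    {"sensor": "s3", "temp": 22.5},
    {"sensor": "s4", "temp": 20.0},
]

def cleanse(records):
    """Drop exact duplicates, then fill missing temperatures with the median."""
    seen = set()
    unique = []
    for record in records:
        key = (record["sensor"], record["temp"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(record))  # copy so the input is not mutated
    observed = [r["temp"] for r in unique if r["temp"] is not None]
    fill_value = median(observed)
    for r in unique:
        if r["temp"] is None:
            r["temp"] = fill_value
    return unique

cleaned = cleanse(readings)
for row in cleaned:
    print(row)
```

Deduplicating before computing the median keeps repeated records from skewing the imputed value.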

Enrich: Enrichment or blending makes a dataset more useful by integrating additional sources such as authoritative third-party census, firmographic, or demographic data. The enrichment process may also help uncover additional insights from the data within an organization or spark new ideas for capturing and storing additional customer information in the future. This is an opportunity to think strategically about what additional data might contribute to a report, model, or business process.
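
Enrichment often amounts to joining internal records to an external lookup table. The sketch below assumes a third-party table keyed by ZIP code. The table contents are placeholder values invented for illustration, and the `enrich` helper is hypothetical.

```python
# Hypothetical internal customer records.
customers = [
    {"customer": "A", "zip": "10001"},
    {"customer": "B", "zip": "94105"},
    {"customer": "C", "zip": "00000"},  # no entry in the lookup table
]

# Stand-in for a third-party demographic table keyed by ZIP code.
# The values are placeholders invented for illustration, not real statistics.
demographics = {
    "10001": {"segment": "urban", "households": 1000},
    "94105": {"segment": "urban", "households": 2000},
}

def enrich(records, lookup, key):
    """Left-join each record to the lookup table and flag whether it matched."""
    enriched_rows = []
    for record in records:
        extra = lookup.get(record[key], {})
        merged = {**record, **extra, "matched": record[key] in lookup}
        enriched_rows.append(merged)
    return enriched_rows

enriched = enrich(customers, demographics, "zip")
for row in enriched:
    print(row)
```

Keeping unmatched records and flagging them, rather than dropping them, makes gaps in the external source visible during later analysis.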

Store: The last part of the wrangling process is to store or preserve the final product, along with all the steps and transformations that took place so it can be audited, understood, and repeated in the future.
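
One simple way to keep the final data together with a record of how it was produced is to write both to the same file. The sketch below assumes a JSON file is an acceptable storage format. The step descriptions, the file name, and the `store` helper are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical final dataset and a plain-text log of the steps applied to it.
final_rows = [{"customer": "A", "revenue": 10.0}]
steps = [
    "explore: profiled missing values per column",
    "transform: parsed dates and derived revenue",
    "cleanse: removed duplicates and imputed medians",
    "enrich: joined demographic lookup by ZIP code",
]

def store(rows, applied_steps, path):
    """Write the data and its transformation log to a single JSON file."""
    payload = {"steps": applied_steps, "rows": rows}
    path = Path(path)
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path

# A temporary directory keeps this example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    out = store(final_rows, steps, Path(tmp) / "wrangled.json")
    reloaded = json.loads(out.read_text(encoding="utf-8"))

print(reloaded["steps"])
```

Storing the step log next to the data lets someone auditing or repeating the work see what was done without access to the original session.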

The Future of Data Wrangling

Data wrangling used to be handled by developers and IT experts with extensive knowledge of database administration and fluency in SQL, R, and Python. Analytic Process Automation (APA) has changed that, getting rid of cumbersome spreadsheets and making it easy for data scientists, data analysts, and IT experts alike to wrangle and analyze complex data.

Getting Started With Data Wrangling

The Alteryx APA Platform™ uses a graphical interface, so it’s easy to document, share, and scale critical data wrangling work in a way that’s auditable and repeatable. No-code and low-code modes let users either drag and drop tools or write one line of code at a time. Users can also save their work in formats similar to a spreadsheet file or as part of a larger data model to a shared platform.

Data wrangling tools are built into every step of the Alteryx APA Platform, including:
  • Transformation tools, including Arrange, Summarize, and Transpose
  • Preparation and cleansing tools, such as Formula, Filter, and Cleanse
  • Data enrichment tools, including Location Insights, Business Insights, and Behavior Analysis

Data Blending Starter Kit

Jumpstart your path to mastering data blending and automating repetitive workflow processes that blend data from diverse data sources.
