New: Google Cloud Dataprep by Trifacta

Technology | Bertrand Cariou | Oct 1, 2019

Today, we’re excited to announce the general availability of Cloud Dataprep by Trifacta, an embedded version of the Trifacta data preparation technology within Google Cloud Platform. This is a major milestone in our continued collaboration with Google. Now, all Google Cloud Platform (GCP) users can structure, cleanse and blend their data in a fully supported, production-level data preparation service.     

The GA of Google Cloud Dataprep has been nearly two years in the making, during which we’ve worked closely with Google to customize and enhance Trifacta’s capabilities for the GCP ecosystem. We have also incorporated feedback from tens of thousands of beta users who have collectively executed over 700,000 data preparation jobs since the September 2017 beta launch. As a result, this GA release includes a comprehensive design refresh, capabilities that support collaborative, team-based data preparation, and updates that specifically target end users who are more accustomed to preparing data in Excel or Google Sheets than in code. To learn more about the new features, read this recent blog post from Director of Product Management Sean Ma.

The seamless integration between Cloud Dataprep and the GCP ecosystem is unique, allowing users to combine the benefits of Google’s fully managed cloud platform with the intuitive experience of data preparation with Trifacta. Users can access and prepare data from multiple data sources and data sets within Cloud Storage and BigQuery, using Cloud Dataflow to run the preparation jobs. The published result can then be used by downstream GCP services like Google Data Studio and Cloud Machine Learning Engine for further analysis. These downstream services benefit tremendously from the improved data quality and operationalized data pipelines that Cloud Dataprep enables. For example, users can operationalize their data pipelines by setting workflows to run on a schedule in Cloud Dataprep, and produce more effective analytics in Google Data Studio. All of this is managed with single sign-on through Google IAM, which provides administrators with enterprise-grade security and centralized user management through Cloud Console. As adoption of Cloud Dataprep grows, we’re excited to offer organizations a technology that will enhance their overall experience on GCP and accelerate GCP data projects.

Already, we’re seeing organizations use Cloud Dataprep to build impressive GCP solutions for real-world data preparation use cases, from marketing analytics to machine learning to retail data onboarding. For example:

  • Marketing analytics
    Users are drawing on advertising and interaction data from Google Marketing Platform and BigQuery and preparing it in Cloud Dataprep. This type of data is complex and generated so quickly that marketers typically struggle to extract value from it; with Cloud Dataprep, users are now finding new insights into advertising performance and customer behavior.
  • Preparing data for machine learning
    Users are preparing log data from Cloud Storage in Cloud Dataprep for use in machine learning models managed in Cloud ML Engine.
  • Onboarding retail data
    Users are preparing diverse data from Excel files and Cloud Storage to publish to BigQuery in order to manage warehousing and shipments.

From a broader perspective, the general availability of Cloud Dataprep is a milestone for the fast-growing data preparation market. As data preparation becomes a critical component for thousands of organizations across many different environments, Cloud Dataprep gives them greater flexibility in where data preparation occurs and how quickly they can get started. Building Cloud Dataprep has also strengthened us as a data preparation company, helping us create the best architecture for cloud environments overall, from GCP to other leading cloud platforms. Combined with our ongoing support for on-premises environments, the launch of Cloud Dataprep has further validated Trifacta as the leader in the data preparation market, powering the data preparation experience for thousands of users and growing fast. To give these users the knowledge to power their data preparation efforts, we’ve dedicated a section of our community website where they can find a wide range of resources on data preparation. Learn more about our community by visiting the website.

If you’re interested in Google Cloud Dataprep, you can sign up with your own personal Google account for a free trial or log in using your company’s existing Google account. Visit to learn more. To read the announcement from Google, visit their blog here.