Schema Drift Detection

The flow view of this template

Transformations:
splitrows, header, $sourcerownumber, join

This template shows how to validate file data against an expected schema and detect when the schema of the data has drifted from what was expected. It relies on Trifacta’s ability to import data as is, without applying its inferred row-splitting step, and compares the input file’s headers to the expected schema’s headers through a join. The results are then split into two outputs: if the input file matches the expected schema, the Output – Valid Header output contains the input data; otherwise, the data from the invalid input appears in the Output – Invalid Header output.
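The header comparison and the two-way routing described above can be sketched outside Trifacta in plain Python. This is only an illustration of the idea, not the template's actual recipe: the function names `read_header` and `route_by_header` are hypothetical, and a direct list comparison stands in for the join on header values.

```python
import csv
import io


def read_header(raw_text):
    """Return the column names found in the first row of raw CSV text."""
    first_row = next(csv.reader(io.StringIO(raw_text)), [])
    return [name.strip() for name in first_row]


def route_by_header(expected_text, input_text):
    """Return ("valid" or "invalid", data_rows) for the input file.

    Assumption: the header is valid only if it has the same column names
    in the same order as the expected header.
    """
    expected = read_header(expected_text)
    rows = list(csv.reader(io.StringIO(input_text)))
    actual = [name.strip() for name in rows[0]] if rows else []
    label = "valid" if actual == expected else "invalid"
    # Everything after the header row is the data that gets routed.
    return label, rows[1:]


# Hypothetical sample files: one that matches and one with swapped columns.
expected_file = "id,name,amount\n1,a,2\n"
matching_file = "id,name,amount\n7,x,3.5\n"
drifted_file = "id,amount,name\n7,3.5,x\n"

matching_label, matching_rows = route_by_header(expected_file, matching_file)
drifted_label, drifted_rows = route_by_header(expected_file, drifted_file)
```

Here `matching_label` plays the role of the Output – Valid Header branch and `drifted_label` the Output – Invalid Header branch; in both cases the data rows are carried along unchanged.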

To customize this template for your use, create three distinct datasets to replace the existing datasets in the flow: one from the expected-schema file in step 1) and two from the input file in step 2).

1) A file that contains the expected schema, with the header metadata in the first row. The file can also contain some sample data. Import this file into Trifacta as an unstructured file (see the note below).

2) An input file to validate against the expected schema. This file should also have its header metadata in the first row. Import this file into Trifacta twice: once as an unstructured file and once as a structured file.

3) Replace InvalidHeader-Source-Unstructured.csv with the unstructured dataset from step 2), and replace InvalidHeader-Source-Structured.csv with the structured dataset from step 2). Replace Expected-Target-Unstructured.csv with the dataset from step 1).

A note on importing a file as unstructured:

When you import a file into Trifacta, it tries by default to infer how to split the data into records by automatically applying a splitrows transform. Normally you cannot see or modify this step, but you can disable it by unchecking the “Detect structure” option on the import dataset settings page.
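As a rough analogy for the difference between the two import modes, the Python sketch below loads the same text once as raw lines and once as parsed records. It does not reproduce Trifacta's import logic; `load_unstructured`, `load_structured`, and the sample data are hypothetical, and the 1-based row number only loosely mirrors what $sourcerownumber provides.

```python
import csv
import io

# Hypothetical sample file contents.
SAMPLE = "id,name\n1,alice\n2,bob\n"


def load_unstructured(text):
    """Keep each line as one raw string, paired with its 1-based row number."""
    return list(enumerate(text.splitlines(), start=1))


def load_structured(text):
    """Split each row into fields and treat the first row as the header."""
    return list(csv.DictReader(io.StringIO(text)))


raw_rows = load_unstructured(SAMPLE)
records = load_structured(SAMPLE)
```

In the unstructured form the header is still an ordinary first row (`raw_rows[0]`), so it can be compared against another file's header; in the structured form it has already been consumed as column names.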


Data Quality Template:

Validate File Data with Schema Drifts
