Data Transformation
Also known as: ETL Transformation, Data Mapping, Field Transformation
Converting data from one format, structure, or value system to another as it moves between systems: the T in ETL.
Definition
Data transformation is the process of converting data from one format, structure, or value system to another as it moves between systems. It is the 'T' in ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform). Common transformations include changing data types (string to number), normalizing values ('Yes/No' to true/false), enriching records with derived fields (calculating lifetime value, or LTV, from order history), and restructuring nested data into flat records.
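As a rough illustration of the four transformations just listed, here is a minimal Python sketch. The record shape, the field names (subscribed, orders, address), and the definition of LTV as a plain sum of order totals are all assumptions made for the example, not part of any particular tool.

```python
from typing import Any


def to_number(value: str) -> float:
    """Type change: parse a numeric string such as '1,250.50' into a float."""
    return float(value.replace(",", "").strip())


def to_bool(value: str) -> bool:
    """Value normalization: map 'Yes'/'No' to True/False, rejecting anything else."""
    normalized = value.strip().lower()
    if normalized == "yes":
        return True
    if normalized == "no":
        return False
    raise ValueError(f"unexpected boolean value: {value!r}")


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Restructuring: turn nested dicts into one flat record with dotted keys."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def transform(raw: dict[str, Any]) -> dict[str, Any]:
    """Apply all four transformations to one hypothetical customer record."""
    flat = flatten(raw)
    orders = [to_number(total) for total in flat.pop("orders")]
    flat["subscribed"] = to_bool(flat["subscribed"])
    # Enrichment: LTV is simplified here to the sum of order totals (assumption).
    flat["ltv"] = sum(orders)
    return flat


raw_record = {
    "id": "42",
    "subscribed": "Yes",
    "address": {"city": "Austin", "country": "US"},
    "orders": ["100.00", "1,250.50"],
}
print(transform(raw_record))
```

Note that to_bool raises on unexpected input instead of guessing, so a bad value stops the run rather than slipping through.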
Transformations operate at multiple levels: field-level (changing a single value), record-level (combining or splitting records), and dataset-level (aggregating, joining, or filtering across many records). Each level uses different tools and patterns.
Modern data-transformation tools include dbt (analytical SQL transformations), Hightouch and Census (reverse-ETL transformations for activating data in operational tools), Airbyte and Fivetran (ingestion-side transformations), and platform-specific tools (Zapier formatter steps, n8n function nodes).
Why It Matters
Bad transformation logic is a silent data-quality killer. A transformation meant to map 'United States' to 'US' that drops every value without an exact match (for example 'USA' or 'united states') loses data without raising an error. Every transformation step needs explicit validation: what comes in, what goes out, and which edge cases are handled.
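One way to keep a mapping like this from losing data quietly is to return the unmatched values alongside the mapped ones, so the caller has to decide what to do with them. The sketch below assumes a small hypothetical lookup table (COUNTRY_CODES) and a hypothetical function name; a real pipeline would use a fuller reference list.

```python
# Hypothetical lookup table; keys are lower-cased country names.
COUNTRY_CODES = {
    "united states": "US",
    "united kingdom": "GB",
    "germany": "DE",
}


def map_countries(values):
    """Map country names to codes and report every value that did not match.

    Returns (mapped, unmatched). `mapped` has one entry per input value,
    with None where no code was found, so no row is silently dropped.
    """
    mapped, unmatched = [], []
    for value in values:
        code = COUNTRY_CODES.get(value.strip().lower())
        if code is None:
            unmatched.append(value)
        mapped.append(code)
    # Validation: what goes out must line up with what came in.
    assert len(mapped) == len(values)
    return mapped, unmatched


codes, problems = map_countries(["United States", " united states ", "USA"])
print(codes)     # one entry per input row
print(problems)  # values that need a mapping rule or manual review
```

Whether unmatched values should fail the run, go to a review queue, or pass through unchanged is a policy decision; the point of the sketch is only that the decision is made explicitly.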
The biggest mistake is doing transformations in scattered custom code rather than in a dedicated transformation tool. Custom transformation code becomes invisible business logic: nobody knows it exists, nobody knows what it does, and changes break things in surprising ways. Use purpose-built tools where possible.
Examples in Practice
A CRM-to-data-warehouse pipeline transforms contacts in several ways: 'Job Title' (free text) is normalized to standard role buckets ('VP of X', 'V.P. X', and 'Vice President of X' all map to 'VP'), country names are normalized to ISO codes, dates are converted to UTC, and custom field values are unioned across multiple sources.
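Two of these steps, role bucketing and UTC conversion, might look roughly like this in Python. The regular expression, the 'Other' fallback bucket, and the requirement that timestamps carry a UTC offset are assumptions for the sketch; the example above only specifies the three VP spellings.

```python
import re
from datetime import datetime, timezone

# Matches 'VP', 'V.P.', or 'Vice President' at the start of a title,
# followed by whitespace or end of string (so 'VPN Engineer' does not match).
VP_PATTERN = re.compile(
    r"^\s*(?:vp|v\.p\.|vice\s+president)(?=\s|$)", re.IGNORECASE
)


def role_bucket(job_title: str) -> str:
    """Normalize a free-text job title to a role bucket.

    Only the 'VP' bucket comes from the example; 'Other' is a hypothetical
    catch-all so unrecognized titles are kept rather than dropped.
    """
    return "VP" if VP_PATTERN.search(job_title) else "Other"


def to_utc(timestamp: str) -> str:
    """Convert an ISO 8601 timestamp with a UTC offset to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Guessing a time zone for naive timestamps would corrupt data quietly.
        raise ValueError(f"timestamp has no UTC offset: {timestamp!r}")
    return parsed.astimezone(timezone.utc).isoformat()


for title in ["VP of Sales", "V.P. Sales", "Vice President of Sales"]:
    print(title, "->", role_bucket(title))
print(to_utc("2024-03-01T09:30:00-05:00"))
```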
A reverse-ETL pipeline syncs warehouse data back to the marketing platform, transforming it on the way: 'high-value customer' is calculated in the warehouse (lifetime spend > $1000 AND an order in the last 90 days) and synced to the marketing platform as a boolean flag. The transformation moves complex logic out of the marketing tool and into the warehouse, where it can be expressed cleanly in SQL.
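In this example the rule lives in warehouse SQL; the Python sketch below only mirrors the same predicate to make its edge cases concrete. The function name, the parameter names, and the choice to treat day 90 as still inside the window are assumptions.

```python
from datetime import date, timedelta
from typing import Optional


def is_high_value(
    lifetime_spend: float, last_order: Optional[date], today: date
) -> bool:
    """Return True when spend exceeds 1000 and the last order is recent.

    Assumptions: spend is in dollars, the comparison is strictly greater
    than 1000, and an order exactly 90 days old still counts as recent.
    """
    if last_order is None:
        # A customer with no orders cannot satisfy the recency condition.
        return False
    recent = (today - last_order) <= timedelta(days=90)
    return lifetime_spend > 1000 and recent


today = date(2024, 6, 1)
print(is_high_value(1500.0, date(2024, 5, 1), today))  # recent, above threshold
print(is_high_value(1000.0, date(2024, 5, 1), today))  # exactly at threshold
print(is_high_value(1500.0, date(2024, 1, 1), today))  # order too old
```

Writing out the boundary cases (exactly $1000, exactly 90 days, no orders at all) is the kind of explicit validation that a single boolean flag otherwise hides.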
A B2B agency uses dbt to transform raw CRM exports into analytics-ready tables: deduplicating contacts by email, calculating account-level aggregates from contact-level data, and joining marketing engagement with sales pipeline data. The transformations are version-controlled and documented in dbt models.
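In dbt these steps would be written as SQL models. To show the logic itself, here is a plain-Python sketch of the first two (deduplication and account-level aggregation). The field names (email, account_id, updated_at, emails_opened) and the rule of keeping the most recently updated record per email are assumptions, not something the example specifies.

```python
from collections import defaultdict


def dedupe_by_email(contacts):
    """Keep one record per email (case-insensitive): the most recently updated.

    Assumes updated_at is an ISO 8601 date string, which sorts correctly
    as plain text.
    """
    latest = {}
    for contact in contacts:
        key = contact["email"].strip().lower()
        if key not in latest or contact["updated_at"] > latest[key]["updated_at"]:
            latest[key] = contact
    return list(latest.values())


def account_aggregates(contacts):
    """Roll contact-level rows up to one summary per account."""
    totals = defaultdict(lambda: {"contacts": 0, "emails_opened": 0})
    for contact in contacts:
        account = totals[contact["account_id"]]
        account["contacts"] += 1
        account["emails_opened"] += contact["emails_opened"]
    return dict(totals)


raw_contacts = [
    {"email": "a@x.com", "account_id": "acme", "updated_at": "2024-01-01", "emails_opened": 2},
    {"email": "A@X.com ", "account_id": "acme", "updated_at": "2024-02-01", "emails_opened": 5},
    {"email": "b@x.com", "account_id": "acme", "updated_at": "2024-01-15", "emails_opened": 1},
]
unique_contacts = dedupe_by_email(raw_contacts)
print(account_aggregates(unique_contacts))
```

Order matters here: aggregating before deduplicating would count the duplicated contact twice.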