Tableau Prep: Tableau's ETL tool
Tableau Prep might be a bit confusing to get started. However, essentially, there are only 7 steps. Those steps should make it easier to break down in your mind what you want:
1 - Input
2 - Clean (probably most of our work)
3 - Union
4 - Aggregation
5 - Join
6 - Pivot
7 - Output
Tableau Prep uses visual indicators to represent steps, field types, and notifications within a flow. However, we don't have to remember them by heart! When we encounter an indicator, we can always hover over it to see what it means.
PRO TIP: Data Design
It's important to think about data design before we begin to clean or transform our data, as design needs will vary based on our audience, use case, and performance needs.
Row-heavy data is the most flexible structure for Tableau Desktop.
Ideal combo of good performance & dynamic aggregation.
Commonly used with transactional data.
Highly dimensional data with many columns.
Allows for deep analysis.
Most common with survey data and unique record data sets.
Highly-aggregated and curated views for best performance.
Ideal for executive-level visualizations and specific high-level use cases.
Joins can also be created at the input stage for certain database connections; if table relationships are present, Linked Keys will be available to specify which fields to use for the join.
One of the first steps in evaluating data is to examine data size, field types, and unique values; this can be done at several stages, but the simplest approach is to add a clean step.
The profile pane allows us to visualize the distribution of our data, by plotting the frequency of each distinct value as bins in a histogram; this is a great way to identify outliers and null values!
Highlighting is a quick way to trace fields back through flow steps, see related values across fields, and pinpoint identical values in our data.
There are several filtering methods in Tableau Prep, based on the field type and step chosen:
Automatically group text values using fuzzy matching algorithms based on pronunciation, common characters, or spelling.
Split text-based values on automatic or custom-defined delimiters.
Convert dates to modify formats without the need for calculated fields or parsing functions.
Calculated fields can be created via standard editor or visual editor, depending on the function.
Reusable flow steps can be created, saved and imported into other flows, and are commonly used for steps which are used frequently or leveraged by other users.
Data can be transformed and combined using several types of flow steps in Tableau Prep, including Union, Join, Aggregate, and Pivot. For example, the union step appends (or "stacks") records from multiple tables, based on matching columns.
Union Results & Common Issues
Review the union results in the profile pane to identify and resolve common union issues, including data type differences and mismatched fields.
Aggregate allows us to change the granularity of your data by summarizing values at higher levels.
In Tableau Prep, we can use joins to combine data between tables which share common or related fields.
We can review the join results in the profile pane to identify and resolve common issues, including mismatched values or incorrect join types or clauses.
PRO TIP: use the Preview in Tableau Desktop option to preview the output while developing a flow.
We can create local extracts in Tableau Prep to output as either .csv or .hyper file formats.
Prep can write to external databases as a new table or append/replace data in an existing table.
Publish data sources to Tableau Server to grant user access to data and enable automated refreshes. That's tedious manual work and can be replaced by Tableau Prep Conductor which is used to automate and optimize flows in Tableau Server/Online.