Drawing a data model for Tableau with draw.io
A well-designed star schema model works well with Tableau because they have well-defined granularity, measures, and dimensions.
A diagram of a star schema resembles a star, with the fact table (in this case "Hospital Visit") at the center.
Creating the following simple star schema is the goal of this blog using the free software draw.io with a center (orange), primary keys, foreign keys and data types.
My blog is based on the star schema used by Joshua N. Milligan in his excellent book "Learning Tableau 2019." I start Draw.io and pick the "Entity Relationship Diagram (ERD)." For those who have studied computer science probably know, that an ERD is a snapshot of a data structure. It shows entities (tables) in a database and relationships between tables within that database.
Draw.io gives me immediately an example layout I can work with. I personally don't like the split between PK and the rest, so I remove it later.
The SQL data types: SQL data types define the type of value that can be stored in a table column. Broadly speaking, SQL data types can be divided into the following categories: 1. Numeric data types such as int, tinyint, biging, float etc. 2. Data and time data types such as date, time, datetime etc.
3. Character and string data types such as char, varchar, text etc.
4. Unicode character string data types, for example nchar, nvarchar, ntext etc.
5. Binary data types such as binary, varbinary etc.
6. Miscellaneous data types - xml, cursor, clob, blob etc.
Not all data types are supported by every relational database vendors - they are just the most popular used data types. If you're like me and coming from a Data Science background, but want to learn more about data types, I recommend DataCamp's course "Introduction to Relational Databases in SQL." You will learn enough about data types to work effectively as a Data Scientist and Tableau Developer.
I'm not going to pretend that I have strong SQL skills, but this is what it looks like when adding a PRIMARY KEY when creating a table.
Finally, we should have a data model with tables, primary keys, foreign keys and data types.
The FOREIGN KEY is a key used to link two tables together and refers to the PRIMARY KEY in another table. The table that contains the foreign key is called a child table.
We can think of relational databases as a collection of tables. A table is just a set of rows and columns, like a spreadsheet, which represents exactly one type of entity. For example, a table might represent doctors in a hospital or patients, but not both.
Each row, or record, of a table, contains information about a single entity. For example, in a table representing patients, each row represents a single patient.
The foreign key should reinforce referential integrity. Meaning, that we need a matching row for each entry.
Joining in Tableau
A well-designed star schema allows us to create inner joins on every key (this can be foreign keys, surrogate keys or super/compound keys):
So, fact tables are joined to the related dimensions that reference a single dimension record. Conclusion: I believe that working through a data model helps a Data Scientist or Tableau Developer tremendously in understanding relational databases. And with draw.io, we have an excellent tool in formalizing (visualizing) the data model.