Regular expressions (RegEx), which originated in the 1950s, use special characters to define search patterns for finding and replacing text. They are now a standard feature in a wide range of programming languages (e.g., Python, R, Java, C#).
Example use cases:
- Execute complex queries against text data, such as addresses or birthdates.
- Extract patterns of text, much like wildcard notation but with additional functionality.
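To make the difference from wildcards concrete, here is a minimal sketch in Python (not Tableau syntax). The sample records and variable names are made up for illustration. It shows that a wildcard can only say "anything goes here", while a RegEx can require digits and capture the individual parts of a birthdate.

```python
import fnmatch
import re

# Hypothetical sample records, invented for this illustration.
records = ["DOB: 1987-04-23", "DOB: not-yet-known", "DOB: 2001-12-05"]

# Wildcard notation: '*' matches any run of characters, so it cannot
# insist that the pieces between the hyphens are digits.
wildcard_hits = [r for r in records if fnmatch.fnmatchcase(r, "DOB: *-*-*")]

# RegEx: require a 4-2-2 digit layout and capture year, month, and day.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
regex_hits = [m.groups() for r in records if (m := date_pattern.search(r))]

print(wildcard_hits)  # all three records pass the wildcard test
print(regex_hits)     # only the two real dates, split into parts
```

The wildcard accepts "not-yet-known" because it only checks for two hyphens, while the RegEx rejects it and also hands back the year, month, and day separately.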
For most Tableau developers and analysts, mastering RegEx is probably not required, and it is pretty hard anyway. However, basic knowledge of RegEx can get us pretty far.
Tableau offers four pre-defined RegEx functions, which probably cover the most common RegEx use cases. A RegEx function in Tableau has the following structure (source: Advanced Tableau by Dustin Cabral):
The four pre-defined RegEx functions in Tableau are:
1 - REGEXP_REPLACE
2 - REGEXP_MATCH
3 - REGEXP_EXTRACT
4 - REGEXP_EXTRACT_NTH
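To get a feel for what each of the four functions does, here is a rough sketch of their closest equivalents in Python's `re` module. This is an analogy, not Tableau syntax. The sample string and variable names are invented for illustration, and Python's RegEx flavor differs from Tableau's in some details.

```python
import re

# Hypothetical sample value, invented for this illustration.
text = "Order AB-1234 shipped on 2021-03-15"

# REGEXP_REPLACE analogue: replace every match of the pattern.
masked = re.sub(r"\d", "#", text)

# REGEXP_MATCH analogue: True if some part of the string matches.
has_order_id = re.search(r"[A-Z]{2}-\d{4}", text) is not None

# REGEXP_EXTRACT analogue: return the text captured by the group.
m = re.search(r"([A-Z]{2}-\d{4})", text)
order_id = m.group(1) if m else None

# REGEXP_EXTRACT_NTH analogue: return the n-th capture group
# (here the 2nd group, which is the month).
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
month = m.group(2) if m else None

print(masked, has_order_id, order_id, month)
```

In Tableau, the corresponding calls take the string (usually a field) first and the pattern second, with a replacement string or group index as the third argument where needed.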
Tableau's RegEx functions are based on the International Components for Unicode (ICU). Why is that important? The main reason is that we can test our RegEx pattern first on www.Regexr.com. Let's say we want to grab email addresses. The image below shows regexr.com: (1) I pass in a sample text, (2) I test the RegEx pattern, and (3) I see the highlighted matches (the correct ones).
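The same kind of test can also be run in a few lines of code. The sketch below uses Python's `re` module with a deliberately simplified email pattern. Both the pattern and the sample text are my own assumptions for illustration, not the ones from the screenshot, and the pattern does not cover every valid email address.

```python
import re

# Hypothetical sample text, invented for this illustration.
sample = (
    "Contact jane.doe@example.com or sales@example.org for help; "
    "user@localhost is not matched"
)

# Simplified email pattern (an assumption, not a full validator):
# local part, '@', domain, a dot, then a top-level domain of 2+ letters.
email_pattern = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

emails = email_pattern.findall(sample)
print(emails)
```

Because RegEx flavors differ slightly between tools, a pattern that works in a tester or in Python should still be checked once inside Tableau itself.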
For a deeper dive into RegEx functions in Tableau, I recommend Dustin Cabral's Udemy course "Advanced Tableau." Additionally, the Flerlage twins published a deep dive into Tableau's RegEx functions.
However, for our purposes, we have enough information to dive into a case study: BALTIMORE HOMICIDE ANALYTICS.
The dashboard can be found in my Tableau Public repository.