The Split By Field

The Split By field sits to the right of the X Axis and Y Axis inputs in the dataset window.

Split By

Example use cases for splitting your dataset include:

  1. Splitting the count of purchases for each store by payment type

  2. Splitting the sum of new account activations by campaign source

  3. Splitting average employee retention over time by department

Technically, this field lets you group the results of your query by an additional column, rather than building multiple datasets with filters to achieve the same effect. To illustrate, we'll use an example from the stackoverflow.com public dataset. First, let's look at a query that simply asks for the number of posts by week.

posts by week

Now, let's split posts by whether they are questions or answers. Here, we are querying for posts to Stack Overflow over time and grouping the results by an additional column called "Question or Answer," which specifies whether a given post is a question or an answer. The visual query looks like this:

Example Query

To examine the SQL generated, we'll switch over to SQL mode.

SQL Mode

Split By adds the column to both the SELECT and GROUP BY clauses. The effect of this grouping is to create multiple datasets, one for each distinct value in the Split By column. Let's take a look at what this split looks like visually.

split graph
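
The SELECT and GROUP BY behavior described above can be sketched with a small in-memory SQLite table. This is only an illustration under assumptions: the table name `posts`, the column names `week` and `post_type`, and the sample rows are all hypothetical and are not taken from the actual stackoverflow.com dataset or from the SQL the tool generates.

```python
import sqlite3
from collections import defaultdict

# Hypothetical table, columns, and rows; the real dataset's schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (week TEXT, post_type TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [
        ("2015-01-05", "Question"),
        ("2015-01-05", "Answer"),
        ("2015-01-05", "Answer"),
        ("2015-01-12", "Question"),
        ("2015-01-12", "Question"),
        ("2015-01-12", "Answer"),
    ],
)

# Without a split: one count per week.
unsplit = conn.execute(
    "SELECT week, COUNT(*) FROM posts GROUP BY week ORDER BY week"
).fetchall()

# With a split: the extra column appears in both SELECT and GROUP BY.
split_rows = conn.execute(
    """
    SELECT week, post_type, COUNT(*)
    FROM posts
    GROUP BY week, post_type
    ORDER BY week, post_type
    """
).fetchall()

# Regroup the rows into one series per distinct value of the split column.
series = defaultdict(list)
for week, post_type, n in split_rows:
    series[post_type].append((week, n))
series = dict(series)

for name, points in series.items():
    print(name, points)
```

Each key in `series` corresponds to one line on the chart, which is why a split column with more distinct values produces more lines.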

If the split column had contained additional values beyond questions and answers, the split would have produced a line on the graph for each value. For example, if the split column had contained questions, answers, and comments, the graph would show three lines.

3 columns

Notes

  1. This field works with columns that contain values of type int or text.

  2. When using the Split By field for line and bar charts, we recommend using it only for columns with fewer than 10 unique values to avoid cluttering the visualization.
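
One way to check the second note before choosing a split column is to count the column's distinct values. The sketch below does this against an in-memory SQLite table. The helper names `distinct_count` and `is_good_split_column`, the table `posts`, and its columns are hypothetical and only for illustration. The threshold of 10 comes from the note above.

```python
import sqlite3

MAX_SPLIT_VALUES = 10  # recommended upper bound from the note above


def distinct_count(conn, table, column):
    """Return the number of distinct non-NULL values in a column.

    Identifiers cannot be bound as SQL parameters, so only pass
    trusted table and column names here.
    """
    query = f'SELECT COUNT(DISTINCT "{column}") FROM "{table}"'
    return conn.execute(query).fetchone()[0]


def is_good_split_column(conn, table, column):
    """True when the column has fewer than MAX_SPLIT_VALUES distinct values."""
    return distinct_count(conn, table, column) < MAX_SPLIT_VALUES


# Hypothetical sample data: 2 distinct post types, 12 distinct user ids.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_type TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [("Question" if i % 2 else "Answer", i) for i in range(12)],
)

print(is_good_split_column(conn, "posts", "post_type"))
print(is_good_split_column(conn, "posts", "user_id"))
```

In this sample, `post_type` is a reasonable split column, while `user_id` would draw one line or bar group per user and clutter the chart.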