Using Multiple Data Sources

Because the tag data we want to use for our bar chart is stored in a different file, we need to know how to work with multiple data sources.

Learning Objectives

  • Connect to multiple data sources.
  • Join datasets.

Connecting to Data

There are three ways to connect to a new data source.

  1. Use the keyboard shortcut ⌘ + D (Mac) or Ctrl + D (Windows).
  2. Click Data in the top menu bar and choose New Data Source.
  3. Find the cylinder-and-plus-sign icon on the Tableau toolbar. Hovering over it shows a tooltip reading New Data Source along with the keyboard shortcut. (The image below shows ⌘D because I am using a Mac.)

    The New Data Source button

Whichever method we choose, we see the same connection options we did at the beginning of the tutorial. As we did then, choose Text file and navigate to the data folder downloaded from GitHub. This time, choose tags.csv. Tableau opens the data from tags.csv just as it did when we connected to mods.csv. Many of the columns are the same as before, but there is now a Tag column. The same mod can also appear many times: the mod with ID 11, for example, has multiple rows, one for each tag associated with it.

tags.csv data
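To make the one-row-per-tag layout concrete, here is a minimal Python sketch that counts how many rows each mod has. The sample rows are invented for illustration; only the Mod ID and Tag column names come from the lesson, and the real tags.csv has more columns than shown here.

```python
import csv
import io
from collections import Counter

# Invented sample in the shape the lesson describes: one row per (mod, tag) pair.
# Only the "Mod ID" and "Tag" column names come from the lesson; the values are made up.
SAMPLE_TAGS_CSV = """Mod ID,Tag
11,Gameplay
11,Graphics
11,Audio
12,Gameplay
"""


def rows_per_mod(csv_text):
    """Count how many rows each Mod ID has in the tag data."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Mod ID"] for row in reader)


counts = rows_per_mod(SAMPLE_TAGS_CSV)
print(counts["11"])  # 3: the sample mod 11 has one row per tag
```

To try the same count on the real file, the text could be read with open("tags.csv", newline="") instead of the inline sample, assuming the file sits in the working directory.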

Instead of moving on to a new worksheet, this time we need to take one more step. Using only the data from this file, we could make a bar chart of tags, but it would be disconnected from the rest of the data. Instead, we will join our two datasets, which will give us additional filter functionality later on.

  1. Find the Files section on the left of the application. To its right is a large space containing a box labeled tags.csv.

    Files section

  2. Double-click the tags.csv box to move from the default Relationships view to the Join view.
  3. Locate mods.csv under Files and drag it onto the empty space near the tags.csv box. This adds a mods.csv box, connected to the tags.csv box by two overlapping circles.

    mods.csv connected to tags.csv

    The overlapping circles that connect tags.csv and mods.csv indicate the type of join we are using. By default only the area where the circles overlap is shaded in, meaning we are doing an inner join. An inner join keeps only the records that have matching values in both datasets.

  4. Click on the overlapping circles to examine our join options.
  5. Choose Left. A left join keeps every record in the left dataset (tags) and adds the matching data from the right dataset (mods) to it.
  6. To make sure Tableau knows which row from the mods dataset holds the correct data for a given row from tags, we need to choose a field to join on. Because Mod ID uniquely identifies each mod, it is a good choice. Choose Mod ID as the join field on the left and Mod ID (mods.csv) as the join field on the right.

    Joining tags.csv and mods.csv using a left join on Mod ID
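
The difference between the inner and left joins chosen above can be sketched in a few lines of Python. This only illustrates the join logic and is not how Tableau implements it. The sample rows are invented, the Name column is a hypothetical stand-in for the other mods.csv columns, and the sketch assumes Mod ID is unique within the mods data.

```python
# Invented sample rows. "Mod ID" and "Tag" are the lesson's column names;
# "Name" is a hypothetical stand-in for the other columns in mods.csv.
tags = [
    {"Mod ID": 11, "Tag": "Gameplay"},
    {"Mod ID": 11, "Tag": "Graphics"},
    {"Mod ID": 99, "Tag": "Audio"},  # no matching row in mods
]
mods = [
    {"Mod ID": 11, "Name": "Example Mod"},
    {"Mod ID": 12, "Name": "Untagged Mod"},  # no matching row in tags
]


def join_on_mod_id(left, right, how="inner"):
    """Join two lists of dicts on "Mod ID"; how is "inner" or "left"."""
    if how not in ("inner", "left"):
        raise ValueError("how must be 'inner' or 'left'")
    # Assumes Mod ID is unique in the right table, so a dict lookup is enough.
    right_by_id = {row["Mod ID"]: row for row in right}
    extra_fields = [k for k in (right[0] if right else {}) if k != "Mod ID"]
    joined = []
    for row in left:
        match = right_by_id.get(row["Mod ID"])
        if match is not None:
            joined.append({**row, **match})
        elif how == "left":
            # Keep the unmatched left row and leave the right-hand fields empty.
            joined.append({**row, **{k: None for k in extra_fields}})
    return joined


inner = join_on_mod_id(tags, mods, how="inner")
left = join_on_mod_id(tags, mods, how="left")
print(len(inner), len(left))  # 2 3
```

In this sample the inner join drops the tag row for mod 99 because mods has no such ID, while the left join keeps it with an empty Name. Neither join includes mod 12, since it has no rows in the left (tags) table.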