In this tutorial we will be visualizing Skyrim mod data. For reference, Skyrim is a roleplaying video game set in a vast fantasy world and often heavily modded. Mods are special modifications that can be added to a game. They can do anything from totally overhauling the alchemy system (Complete Alchemy and Cooking Overhaul), redoing the user interface (SkyUI), replacing all dragons with Thomas the Tank Engine characters (Really Useful Dragons), or simply adding a single weapon (Longclaw) to the game. (If you only follow one of those links, I recommend Really Useful Dragons.)
This data has been scraped from Nexus Mods, the primary site for hosting and downloading Skyrim mods. The data has been cleaned/modified slightly to make it easier to user. For more information on how the data was collected, potential issues, and what cleanup was performed, view the dataset notes.
Before we start working with the data, it is important to look over it so we know what we are working with. In the folder you downloaded there should be three files: README.md, mods.csv, and tags.csv. Find and open mods.csv. CSV stands for comma-separated values, a common format for storing large quantities of data. Each new line in the file marks a new record, and each comma represents a new piece of data about that record. Most computers can recognize a CSV and will open it using spreadsheet software like Microsoft Excel or LibreOffice Calc. Note that in mods.csv, each record/row is a mod. The file contains the 5,000 highest endorsed mods at the time of collection.
Some columns of interest:
Once you are satisfied looking over mods.csv you can close it. There is no need to open up tags.csv unless you really want to. It has some of the same data, but each mod has a record in tags.csv for each tag it has been given. For the 5,000 mods, we have 24,364 records of tags.
As we go through this tutorial you can imagine that you work for a news/entertainment website. You have been given this data and instructed to create a visualization that will allow users to explore the dataset as part of an entertainment article about video games and/or mods.
Tableau has already created a blank worksheet for us. Click the Orange Sheet 1 tab at the bottom left of the screen to view it. Note the three buttons next to it for new worksheets, new dashboards, and new stories.
We will now look over some of the basic parts of the worksheet.
The Data tab on the left side of the screen provides a list labeled Tables. This is a list of columns from mods.csv. Note familiar names from the dataset, like Country or Endorsements.
There are actually two lists here, separated by a faint line. The top list shows columns Tableau recognizes as Dimensions. These generally contain more categorical or non-numerical types of data. The second list shows Measures, which are columns with numerical, continuous data - anything you could take a sum or an average of and have it mean something. Tableau uses the terms Dimension and Measure elsewhere, so it is useful to know what it is referring to.
Next to each column name, there is an icon indicating the data type. The globe icon indicates that it is geographic data, the # indicates that it is numeric data, Abc indicates that it is textual data, and the calendar icon indicates that it is a date.
Tableau uses color to help differentiate between Dimensions and Measures. Dimensions are associated with blue, and Measures with green. You can see this at work in the colors of the symbols.
A few of the names here are in italics. These are not columns from the dataset, but are automatically generated by Tableau.
These three sections, found in a column to the right of the Data tab, are typically either called shelves or cards.
We will not be working with the Pages shelf in this tutorial, but it is useful to make note of. You can place a dimension (or measure) here to create a new "page" for each category in that dimension. Users can then cycle through these pages. An example would be showing change over time and having users cycle through years.
We will work with the Filters shelf and the Marks shelf, so for now make a note of their existence.
These are found at the top of the worksheet and will help define how our data is displayed. Fortunately, we will usually be able to let Tableau handle the specifics.
Also called the canvas, this is where our data will be displayed. Since we currently have not told Tableau to display anything, the canvas should contain the text Drop field here.
This button at the top right can be toggled on and off to show a list of the various types of charts we can use and allow us to switch between them. It will highlight various options as they come available. It also shows what dimensions/measures are required to be able to use a given type of chart.
Sheet 1 is not a very informative name. This will become more of an issue as we add sheets - it is difficult to remember what Sheet 1 is as opposed to Sheet 2 or Sheet 3. Therefore, we should go ahead and rename our first worksheet.
Find the Sheet 1 tab at the bottom of the application window and right click on it. Then, type in whatever name you like. For this example we will use Map. Press Enter to save the new name.
It is useful to know that Tableau does support the use of Undo and Redo. There are arrows at the very top left that you can use, or you can use the keyboard shortcuts. Undo is ⌘ + Z (Mac) or ctrl + Z (Windows). Redo is ⌘ + Y (Mac) or ctrl + Y (Windows).