Creating an Approval Ratio Scatter Plot

As we can see, endorsements are indeed closely correlated with total downloads, with only a little variation. It could be more interesting to see how the percentage of people who endorsed the mod compares to total downloads. We do not have a field that provides this percentage, so we will have to create one. We can do so by dividing Endorsements (the number of people who have endorsed the mod) by Unique Downloads (the number of people who have downloaded the mod). This works because a user can only endorse a mod once, and only after they have downloaded it. Unlike total downloads, unique downloads are not affected by any subsequent downloads by the same user.
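
To make the reasoning about unique downloads concrete, here is a minimal Python sketch. The download log, the user names, and the mod name are all made up for illustration; the dataset used in this tutorial already provides Endorsements and Unique Downloads as ready-made fields.

```python
# Hypothetical download log: one (user, mod) pair per download event.
downloads = [
    ("ana", "better_maps"),
    ("ana", "better_maps"),  # the same user downloading again
    ("ana", "better_maps"),
    ("ben", "better_maps"),
    ("cho", "better_maps"),
    ("dee", "better_maps"),
]

# Hypothetical endorsements: a user can endorse a given mod only once.
endorsements = {("ana", "better_maps"), ("ben", "better_maps")}


def total_downloads(mod, downloads):
    """Count every download event for the mod, including repeats."""
    return sum(1 for _, m in downloads if m == mod)


def unique_downloads(mod, downloads):
    """Count distinct users who downloaded the mod at least once."""
    return len({user for user, m in downloads if m == mod})


def approval_ratio(mod, downloads, endorsements):
    """Endorsements divided by unique downloads, or None with no downloads."""
    unique = unique_downloads(mod, downloads)
    if unique == 0:
        return None
    endorsed = len({user for user, m in endorsements if m == mod})
    return endorsed / unique


print(total_downloads("better_maps", downloads))               # 6
print(unique_downloads("better_maps", downloads))              # 4
print(approval_ratio("better_maps", downloads, endorsements))  # 0.5
```

Dividing by total downloads instead would give 2/6 here, which understates the share of downloaders who endorsed the mod because one user is counted three times.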

Learning Objectives

  • Create a calculated field.
  • Duplicate a worksheet.
  • Format axis number display.

Calculated Field

  1. If the Format tab is still displayed on the left, click the "X" at the top right to remove it. X out of Format Lines
  2. Right click on the Dimensions and Measures area.
  3. Choose Create Calculated Field.... Creating a calculated field
  4. In the text box at the top of the dialog box, change the name to Approval Ratio. The larger text area below it is where the calculation for the field will go. Tableau has its own formula language for more elaborate calculations, but we will not need to look it up here. For simple calculations, Tableau's drag-and-drop interface comes to the rescue once more.
  5. Locate Endorsements under Measures and drag it onto the box. The text [Endorsements] will be shown. The Calculate Field dialog box

    Typing [Endorsements] with the brackets is not particularly complicated, but I would not have remembered the required format without dragging and dropping Endorsements. Most likely I would have defaulted to how Tableau references fields in the tooltip. If you are wondering why the tooltip format differs from the calculated field format, consider what is being referenced in each case. The tooltip does not reference the field from the dataset itself, but rather the field as it is used on a particular worksheet. For example, the tooltip might show the sum or the average of endorsements, and the worksheet determines what that calculation covers: the sum of all endorsements, or the sum for a given category or country? A calculated field, however, is not tied to the worksheet we are on when we create it, which is why we did not start by creating a new worksheet. It must therefore reference the field itself.

  6. Type the "/" symbol to indicate division.
  7. Locate Unique Downloads under Measures and drag it onto the box as well. The formula should be: [Endorsements]/[Unique Downloads]. Calculate field: [Endorsements]/[Unique Downloads].
  8. Click OK.
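
As a rough illustration of what this calculated field does, the Python sketch below applies the same division to each row of a small table. The mod names and numbers are invented, and the sketch assumes that the scatter plot draws one mark per mod, so that summing the field for a mark simply returns that mod's ratio.

```python
# Hypothetical rows standing in for the dataset; the values are invented.
rows = [
    {"mod": "Mod A", "endorsements": 50, "unique_downloads": 100},
    {"mod": "Mod B", "endorsements": 100, "unique_downloads": 400},
]


def with_approval_ratio(rows):
    """Add an approval_ratio value to every row, like a row-level calculated field."""
    return [
        {**row, "approval_ratio": row["endorsements"] / row["unique_downloads"]}
        for row in rows
    ]


calculated = with_approval_ratio(rows)

# One mark per mod: each mark shows the ratio of its single row.
per_mod = {row["mod"]: row["approval_ratio"] for row in calculated}
print(per_mod)  # {'Mod A': 0.5, 'Mod B': 0.25}

# If a single mark combined several rows, summing the ratios would not
# match dividing the summed endorsements by the summed unique downloads.
sum_of_ratios = sum(row["approval_ratio"] for row in calculated)
ratio_of_sums = (
    sum(row["endorsements"] for row in rows)
    / sum(row["unique_downloads"] for row in rows)
)
print(sum_of_ratios, ratio_of_sums)
```

The last two values differ, which is worth keeping in mind if the field is ever reused on a worksheet where a single mark covers more than one mod.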

Creating the Scatter Plot

Now that we have a field to plot, we need to create a new scatter plot. As there is no need to reinvent the wheel, so to speak, we will duplicate our current scatter plot instead of starting from a blank slate. It would be a shame to have to redo our formatting for the new plot.

  1. Right click on the Endorsements worksheet tab at the bottom of the application and select Duplicate. Duplicating the Endorsements worksheet
  2. Right click on the duplicated worksheet tab and rename it to Approval Ratio.

Our approval ratio scatter plot is still showing endorsements and total downloads. We want to replace endorsements with our new calculated field. One method of doing this could be to remove Endorsements from Rows and then add Approval Ratio. Alternatively, we could add Approval Ratio first and then remove Endorsements. We will go with an even simpler option.

  1. Locate Approval Ratio under Measures.
  2. Drag it on top of Endorsements in Rows. The orange arrow/triangle that normally appears when we add dimensions and measures should not be visible. Instead, Endorsements should have a black outline indicating that we are replacing it with Approval Ratio. Replacing SUM(Endorsements) with SUM(Approval Ratio)

The plot will have a very different distribution than the Endorsements scatter plot. Instead of a diagonal line indicating a strong correlation between total downloads and endorsements, we see that the approval ratio varies quite a bit at low download values, then levels off.

Scatter plot of approval ratio and downloads


Formatting

We have a few things that we need to clean up after replacing Endorsements with Approval Ratio as some of our formatting options have been undone. Fortunately, most of the formatting we applied in the first scatter plot remains, so we do not have too much work to finish this one.

  1. Right click the y-axis (Approval Ratio) and choose Format... this time instead of Edit Axis.... Formatting the y-axis
  2. Locate the Scale section on the Format tab that appears on the left.
  3. Click on the Numbers dropdown. The Numbers dropdown for formatting the y-axis
  4. Change from Automatic to Percentage, since our approval ratio is the share of users who endorsed the mod after downloading it and reads most naturally as a percentage.
  5. Change the number of decimal places from 2 to 0. Changing Numbers to a percentage with zero decimal places
  6. Right click on the axis again and this time choose Edit Axis....
  7. Set the Range to fixed. Editing approval ratio axis - general
  8. Switch to the Tick Marks tab and set Major Tick Marks to fixed as well. The tick marks can be placed at intervals of 0.2, which the axis will display as 20% now that it is formatted as a percentage. Editing approval ratio axis - tick marks
  9. Close the window.
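
The effect of these formatting choices can be sketched in Python. This is only an analogy for what Tableau displays, and the axis range of 0 to 1 is an assumption made for the example; the steps above do not state the exact range values.

```python
def percent_label(value, decimals=0):
    """Format a ratio such as 0.2 as a percentage label such as '20%'."""
    return f"{value:.{decimals}%}"


# Assumed fixed axis range of 0 to 1 with major tick marks every 0.2.
ticks = [i / 5 for i in range(6)]
tick_labels = [percent_label(tick) for tick in ticks]

print(tick_labels)               # ['0%', '20%', '40%', '60%', '80%', '100%']
print(percent_label(0.4567, 2))  # 45.67%
print(percent_label(0.4567))     # 46%
```

With two decimal places the labels would carry precision that the axis does not need, which is why zero decimal places reads more cleanly here.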

Finished Scatter Plot

The formatted scatter plot