Alex, a Democratic candidate, is deciding how to target her door-to-door canvassing effort. She surveyed 100 people in her district and is now using that information to evaluate three models:
- Climate Score
- Cat Favorability Score
- Party Score
Click on a score button to change which score voters are arranged by.
Alex's poll asked each person whether they supported her or her opponent. Blue dots represent Alex's supporters; red dots represent people who support her opponent.
Continue clicking on different score buttons to compare each model to the others.
The distribution chart shows how many individuals are at each point in the score range.
Validation charts show how well a model's score measures the probability that a person is a supporter. In a perfectly calibrated model, 50% of the people with a score of 50 are supporters, 25% of the people with a score of 25 are supporters, and so on.
We want the validation to show a stairstep pattern, where the % of individuals in the positive class rises in step with the score range.
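To make that check concrete, here is a minimal sketch that bins hypothetical (score, is_supporter) poll results into 25-point ranges and measures the share of supporters in each bin. The data and names are invented for illustration, not Alex's actual poll.

```python
from collections import defaultdict

# Hypothetical poll results: (score, is_supporter) pairs.
voters = [(82, True), (64, True), (58, False), (91, True),
          (22, False), (45, True), (35, False), (12, False)]

# Bucket voters into 25-point score ranges and measure the share of
# supporters in each bucket. A well-calibrated score produces the
# stairstep: each bucket's supporter share tracks its score range.
buckets = defaultdict(list)
for score, is_supporter in voters:
    buckets[score // 25 * 25].append(is_supporter)

for low in sorted(buckets):
    group = buckets[low]
    pct = 100 * sum(group) / len(group)
    print(f"scores {low}-{low + 24}: {pct:.0f}% supporters")
```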
For Alex, this model is primarily used for classification. Everyone to the left of the line is classified as a supporter; everyone to the right is classified as supporting her opponent.
Alex can make the classification cut at different points. This universe contains 0% of the population.
0% of this universe is classified incorrectly.
Move the cursor left to right across the chart to set the classification cut at different spots.
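As a rough sketch of what happens at a single cut, the snippet below classifies the same hypothetical voters at a threshold and reports the share misclassified. The direction is an assumption made for illustration: here scores at or above the cut are treated as predicted supporters.

```python
# Hypothetical poll results: (score, is_supporter) pairs.
voters = [(82, True), (64, True), (58, False), (91, True),
          (22, False), (45, True), (35, False), (12, False)]

def misclassification_rate(voters, cut):
    """Share of voters whose predicted label disagrees with the poll.
    Assumes score >= cut means "predicted supporter"."""
    errors = sum((score >= cut) != is_supporter
                 for score, is_supporter in voters)
    return errors / len(voters)

for cut in (25, 50, 75):
    rate = misclassification_rate(voters, cut)
    print(f"cut at {cut}: {rate:.0%} classified incorrectly")
```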
To get a complete picture of the model's performance, Alex can look at a confusion matrix, which evaluates how the model performs at different cuts.
Click around on the graph to view the confusion matrix at different cuts.
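The sketch below tallies the four cells of the confusion matrix at a given cut, using the same hypothetical data and threshold convention as above.

```python
# Hypothetical poll results: (score, is_supporter) pairs.
voters = [(82, True), (64, True), (58, False), (91, True),
          (22, False), (45, True), (35, False), (12, False)]

def confusion_matrix(voters, cut):
    tp = fp = tn = fn = 0
    for score, is_supporter in voters:
        predicted = score >= cut   # assumed convention: high score = supporter
        if predicted and is_supporter:
            tp += 1  # true positive: supporter classified as supporter
        elif predicted:
            fp += 1  # false positive: non-supporter classified as supporter
        elif is_supporter:
            fn += 1  # false negative: supporter classified as non-supporter
        else:
            tn += 1  # true negative: non-supporter classified as non-supporter
    return tp, fp, tn, fn

for cut in (25, 50, 75):
    tp, fp, tn, fn = confusion_matrix(voters, cut)
    print(f"cut={cut}: TP={tp} FP={fp} TN={tn} FN={fn}")
```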
Another useful way to look at this is the True Positive Rate and the False Positive Rate.
The True Positive Rate is defined as TP / (TP + FN): out of all the supporters, how many are currently classified as supporters.
The False Positive Rate measures, out of all the non-supporters, how many are classified as supporters, or FP / (FP + TN).
True Positive Rate = 0.10%
False Positive Rate = 0.01%
Click around on the chart to measure the TPR and the FPR at different cuts.
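Those two rates fall straight out of the confusion-matrix counts. A small sketch, with the same hypothetical data and threshold convention:

```python
# Hypothetical poll results: (score, is_supporter) pairs.
voters = [(82, True), (64, True), (58, False), (91, True),
          (22, False), (45, True), (35, False), (12, False)]

def rates(voters, cut):
    tp = sum(1 for s, y in voters if s >= cut and y)
    fn = sum(1 for s, y in voters if s < cut and y)
    fp = sum(1 for s, y in voters if s >= cut and not y)
    tn = sum(1 for s, y in voters if s < cut and not y)
    tpr = tp / (tp + fn)  # of all supporters, share classified as supporters
    fpr = fp / (fp + tn)  # of all non-supporters, share classified as supporters
    return tpr, fpr

tpr, fpr = rates(voters, cut=50)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```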
Plotting these numbers against each other creates a Receiver Operating Characteristic, or ROC, curve. At each cut, you plot the false positive rate on the X axis and the true positive rate on the Y axis. This measures how well a score rank-orders predictions.
True Positive Rate = 0.10%
False Positive Rate = 0.01%
Click around on the chart to plot the TPR and the FPR at different cuts.
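Sweeping the cut across the whole score range produces one (FPR, TPR) point per cut; plotted together, those points trace the ROC curve. A sketch, repeating the rates() helper from above with the same hypothetical data:

```python
# Hypothetical poll results: (score, is_supporter) pairs.
voters = [(82, True), (64, True), (58, False), (91, True),
          (22, False), (45, True), (35, False), (12, False)]

def rates(voters, cut):
    tp = sum(1 for s, y in voters if s >= cut and y)
    fn = sum(1 for s, y in voters if s < cut and y)
    fp = sum(1 for s, y in voters if s >= cut and not y)
    tn = sum(1 for s, y in voters if s < cut and not y)
    return tp / (tp + fn), fp / (fp + tn)

# High cuts sit near (0, 0): almost no one is classified as a supporter.
# Low cuts sit near (1, 1): everyone is classified as a supporter.
for cut in sorted({s for s, _ in voters}, reverse=True):
    tpr, fpr = rates(voters, cut)
    print(f"cut={cut}: FPR={fpr:.2f}, TPR={tpr:.2f}")
```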
To measure how well a model rank-orders voters, we look at the Area Under the Curve (AUC). For a very accurate model, the ROC curve fills nearly the entire graph, scoring an AUC close to 1. But a model that performs no better than chance covers only about half the chart, with an AUC around 0.5.
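One way to see why AUC captures rank ordering: it equals the probability that a randomly chosen supporter scores higher than a randomly chosen non-supporter, counting ties as half. A sketch with the same hypothetical data:

```python
# Hypothetical poll results: (score, is_supporter) pairs.
voters = [(82, True), (64, True), (58, False), (91, True),
          (22, False), (45, True), (35, False), (12, False)]

supporter_scores = [s for s, y in voters if y]
opponent_scores  = [s for s, y in voters if not y]

# Count supporter-vs-non-supporter pairs where the supporter scores
# higher; ties count as half a win.
wins = sum((a > b) + 0.5 * (a == b)
           for a in supporter_scores
           for b in opponent_scores)
auc = wins / (len(supporter_scores) * len(opponent_scores))
print(f"AUC = {auc:.2f}")  # 1.0 = perfect rank ordering, 0.5 = coin flip
```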