Compute the AUC of Precision-Recall Curve

2020-04-25

With the theory behind the precision-recall curve covered in the previous post, the next step is to compute the area under the curve (AUC) of the precision-recall curve for the models being developed. Thanks to the well-developed scikit-learn package, there are several ways to calculate the AUC of the precision-recall curve (PR AUC), and they can be easily integrated into an existing modeling pipeline.

Which function computes the PR AUC?

At first glance at the list of functions in the metrics module of scikit-learn, the only one that seems related to the precision-recall curve is metrics.precision_recall_curve. However, it computes the points of the curve rather than the area under the curve (AUC). To get the AUC, the curve has to be computed first and then passed to metrics.auc. Is there a way to get the PR AUC in a single step? After some searching, it turns out that metrics.average_precision_score is the function I was looking for. But why? Is average precision equal to PR AUC? Let’s explore it in the following section.

What’s average precision?

Average precision is the average value of precision over the interval from recall = 0 to recall = 1, where precision is treated as a function p(r) of recall r:

\[\text{Average Precision} = \int_{0}^{1} p(r)\,dr\]

Does this formula give clues about what average precision stands for? Let’s look into a precision-recall curve.

The integral is exactly the area under the precision-recall curve, which means that average precision is equal to PR AUC. In practice, however, the integral is replaced by a finite sum over every threshold in the precision-recall curve.

\[\text{Average Precision} = \sum_{k = 1}^{n} P(k)\,\Delta r(k)\] where k is the rank in the sorted sequence of predictions, n is the number of thresholds, \(P(k)\) is the precision at the k-th threshold, and \(\Delta r(k)\) is the change in recall from the (k-1)-th to the k-th threshold. This sum does not interpolate between points, so it differs slightly from the trapezoidal area under the precision-recall curve. If the auc function is chosen to compute the AUC instead, its linear interpolation reduces the impact of wiggles in the curve compared with average precision.

Note: the paragraph above is summarized from Wikipedia & the scikit-learn documentation.
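
To make the finite-sum formula concrete, here is a minimal sketch (with made-up labels and scores, purely for illustration) that computes the sum of \(P(k)\Delta r(k)\) from the output of precision_recall_curve and checks it against average_precision_score:

import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

# Hypothetical labels and scores, only to illustrate the finite sum
y_true = np.array([0, 1, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.10, 0.90, 0.80, 0.40, 0.65, 0.70, 0.30, 0.20])

precision, recall, _ = precision_recall_curve(y_true, y_score)
# recall is returned in decreasing order, so np.diff(recall) equals -delta r(k);
# pair each recall step with the precision at that threshold and flip the sign
ap_manual = -np.sum(np.diff(recall) * precision[:-1])

print(ap_manual)
print(average_precision_score(y_true, y_score))  # same value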

Compute PR AUC (average precision) in scikit-learn

Let’s look at a real example of computing average precision. A classification data set is generated with datasets.make_classification and split into training & testing sets with model_selection.train_test_split. A logistic regression model is fitted on the training set for demonstration.

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.metrics import auc, plot_precision_recall_curve
import matplotlib.pyplot as plt

random_state = 416
# Create dataset for binary classification with 5 predictors
X, y = datasets.make_classification(n_samples=1000,
                                    n_features=5,
                                    n_informative=3,
                                    n_redundant=2,
                                    random_state=random_state)

# Split into training and test
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=.4,
                                                    random_state=random_state)

# Create classifier using logistic regression
classifier = LogisticRegression(random_state=random_state)
classifier.fit(X_train, y_train)
# Get the predicted probabilities of the positive class for the test data
y_score = classifier.predict_proba(X_test)[:, 1]

Use average_precision_score

The average precision (PR AUC) is returned by passing the true labels & the predicted probabilities.

# Average precision score
average_precision = average_precision_score(y_test, y_score)
print(average_precision)
## 0.9057007040716777

Use precision_recall_curve & auc

As mentioned earlier, when the auc function is used to compute the area under the precision-recall curve, the result is not identical to the value from average_precision_score, but the difference is small here because there are enough data points to mitigate the effect of wiggles.

# Data to plot precision - recall curve
precision, recall, thresholds = precision_recall_curve(y_test, y_score)
# Use AUC function to calculate the area under the curve of precision recall curve
auc_precision_recall = auc(recall, precision)
print(auc_precision_recall)
## 0.9053487302244206
plt.plot(recall, precision)
plt.show()
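
Since metrics.auc is just the trapezoidal rule applied to the curve points, the linear interpolation can be made explicit. As a quick sanity check (reusing the precision and recall arrays from the snippet above), summing the trapezoids by hand gives the same number:

import numpy as np

# The trapezoidal rule averages neighbouring precision values over each
# recall step, i.e. it linearly interpolates between curve points
manual_trapezoid = np.sum((recall[:-1] - recall[1:])
                          * (precision[:-1] + precision[1:]) / 2)
# should match auc(recall, precision) above
print(manual_trapezoid)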

Use the built-in function to plot the precision-recall curve

In version 0.22.0 of scikit-learn, plot_precision_recall_curve was added to the metrics module. It makes it easy to plot an informative precision-recall curve directly from the classifier, without the extra step of generating predicted probabilities.

disp = plot_precision_recall_curve(classifier, X_test, y_test)
disp.ax_.set_title('Binary class Precision-Recall curve: '
                   'AP={0:0.2f}'.format(average_precision))
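
A side note: in later scikit-learn releases (1.0 and newer), plot_precision_recall_curve has been deprecated in favor of the PrecisionRecallDisplay class. Assuming one of those newer versions, a roughly equivalent plot could be produced like this:

from sklearn.metrics import PrecisionRecallDisplay

# Equivalent plot on newer scikit-learn versions (1.0+)
disp = PrecisionRecallDisplay.from_estimator(classifier, X_test, y_test)
disp.ax_.set_title('Binary class Precision-Recall curve: '
                   'AP={0:0.2f}'.format(average_precision))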

If you need to compute the area under the precision-recall curve, don’t forget that average_precision_score gets you a robust result in a single step.