Monday, 5 March 2012

Precision-at-1 and Reciprocal Rank

There are retrieval scenarios where the first relevant document is all that matters: for instance, when a user searches the Web for the home page of a particular company, seeks a document that answers a factual question, or tries to identify a semi-structured document that lists the features of a product. In these and many other cases an IR system needs to provide only one relevant result. There is no need to list more relevant documents, as the first relevant one completely satisfies the user's information need. Furthermore, you want that first relevant document to be ranked very high, preferably at rank one.

Coming across such a scenario, I had to consider how to evaluate and compare retrieval models in this setting. Browsing through established evaluation metrics, I found two solutions that approach the problem from slightly different angles: Precision-at-1 and Reciprocal Rank.

Precision-at-1: one aspect of the evaluation is whether the first relevant document is listed at rank one. Or, more generally and across several queries: how often is the highest-ranked result relevant? Precision-at-1 (P@1 for short) addresses exactly this question. Technically, it takes the first entry in the result list and checks whether this document is relevant. Hence, P@1 has a value of either 0 (first document irrelevant) or 1 (first document relevant). We can therefore relate the metric directly to the binary relevance value of the first entry in the result list: e[1].
If we have a set Q of several queries, we can average the P@1 values to get an idea of how often the highest-ranked result is relevant. Writing e_q[1] for the relevance value of the first result for query q and |Q| for the number of queries:

$$\mathrm{MeanP@1} = \frac{1}{|Q|} \sum_{q \in Q} e_q[1]$$
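
The averaging step can be sketched in a few lines of Python. This is only an illustration: the function names and the sample relevance labels are made up for this example, and scoring an empty result list as 0 is my own assumption rather than part of the metric's definition.

```python
from typing import Sequence


def precision_at_1(relevance: Sequence[int]) -> int:
    """Return 1 if the top-ranked result is relevant, else 0.

    `relevance` holds binary relevance labels in rank order, so
    relevance[0] plays the role of e[1] in the text. An empty result
    list is scored as 0 (an assumption made for this sketch).
    """
    return 1 if relevance and relevance[0] > 0 else 0


def mean_precision_at_1(runs: Sequence[Sequence[int]]) -> float:
    """Average P@1 over the result lists of all queries in Q."""
    if not runs:
        raise ValueError("need at least one query")
    return sum(precision_at_1(r) for r in runs) / len(runs)


# Hypothetical relevance labels for three queries (illustrative only).
runs = [
    [1, 0, 0],  # relevant document at rank 1
    [0, 1, 0],  # first relevant document at rank 2
    [0, 0, 0],  # nothing relevant retrieved
]
print(mean_precision_at_1(runs))  # one of three queries has a relevant top result
```

Only the first label of each list influences the score, which is exactly the limitation that motivates looking at a second metric.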

Reciprocal Rank: P@1 ignores all results that are not ranked at position one. Assuming that no system is perfect, you might also be interested in how close to the top the first relevant document is ranked. The Reciprocal Rank (RR for short) metric is designed for exactly this purpose. If r is the position of the first relevant result, then RR for this result list is 1/r.
Again, if we have an entire set of queries, we can average over these values to obtain the Mean Reciprocal Rank (MRR). Writing r_q for the position of the first relevant result for query q and |Q| for the number of queries:

$$\mathrm{MRR} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{r_q}$$
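
As a rough sketch of how this could be computed, here is a small Python example. The function names and the sample relevance labels are invented for illustration. One thing the definition above leaves open is what to do when a result list contains no relevant document at all; this sketch follows the common convention of counting RR as 0 in that case, which is an assumption on my part.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """Return 1/r, where r is the 1-based rank of the first relevant result.

    `relevance` holds binary relevance labels in rank order. If no
    result is relevant, 0.0 is returned (a common convention, assumed
    here because the text does not cover this case).
    """
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[int]]) -> float:
    """Average RR over the result lists of all queries in Q."""
    if not runs:
        raise ValueError("need at least one query")
    return sum(reciprocal_rank(r) for r in runs) / len(runs)


# Hypothetical relevance labels for four queries (illustrative only).
runs = [
    [1, 0, 0],     # first relevant at rank 1 -> RR = 1
    [0, 1, 0],     # first relevant at rank 2 -> RR = 1/2
    [0, 0, 0, 1],  # first relevant at rank 4 -> RR = 1/4
    [0, 0, 0],     # nothing relevant         -> RR = 0 (assumed convention)
]
print(mean_reciprocal_rank(runs))  # prints 0.4375
```

Note how quickly the contribution of a query drops as its first relevant document moves down the list: rank 2 already earns only half the credit of rank 1.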


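In combination, the two metrics P@1 and RR give you a good impression of how often to expect the first entry in the result list to be relevant and how close to the top to expect the first relevant document. Note that MRR averages reciprocal ranks rather than ranks, so its inverse, 1/MRR, is the harmonic mean of the positions of the first relevant documents, not their arithmetic mean.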