Patent Retrieval Evaluation Score
(Download PRESeval, a script to calculate PRES score for IR results in TREC format)
PRES is an evaluation score designed specifically for recall-oriented information retrieval tasks. It is derived from the normalized recall measure (Rnorm) and measures the ability of a system to retrieve all known relevant documents early in the ranked list. Unlike MAP and Recall, PRES depends on the relative effort users exert to find relevant documents. This effort is modeled by Nmax (see Equation), an adjustable parameter set by the user that indicates the maximum number of documents they are willing to check in the ranked list.
PRES measures the effectiveness of a ranking relative to the best and worst ranking cases. The best case retrieves all relevant documents at the top of the list; the worst case retrieves all relevant documents just after the maximum number of documents to be checked by the user (Nmax). The rationale is that any relevant document ranked after Nmax is missed by the user, and if all relevant documents are ranked after Nmax the Recall is zero, which is the theoretical worst-case scenario. The figure below shows how PRES is calculated: it is the area between the actual and worst cases (A2) divided by the area between the best and worst cases (A1+A2).
Nmax introduces a new definition of ranking quality for relevant results, because the rank of a result is judged relative to the value of Nmax. For example, a relevant document at rank 10 is very good when Nmax=1000, good when Nmax=100, bad when Nmax=15, and very bad when Nmax=10. A system with higher Recall can therefore achieve a lower PRES value than a system with lower Recall but better average ranking. For a given Recall R, the PRES value varies from R (relevant documents found at the very top of the list) down to nR²/Nmax (relevant documents found just before the cut-off Nmax), according to the average quality of ranking of relevant documents.
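As a worked illustration of the rank-10 example, assume a query with a single relevant document (n = 1) and take the PRES formula as given in (Magdy and Jones, SIGIR 2010), $PRES = 1 - \left(\frac{\sum r_i}{n} - \frac{n+1}{2}\right)/N_{max}$. The choice n = 1 is an assumption made only to keep the arithmetic short. With $r_1 = 10$ the score reduces to $1 - 9/N_{max}$:

$$
\begin{aligned}
N_{max} = 1000 &: \quad PRES = 1 - 9/1000 = 0.991 \\
N_{max} = 100 &: \quad PRES = 1 - 9/100 = 0.91 \\
N_{max} = 15 &: \quad PRES = 1 - 9/15 = 0.4 \\
N_{max} = 10 &: \quad PRES = 1 - 9/10 = 0.1
\end{aligned}
$$

The same rank therefore scores anywhere from near-perfect to near-zero, depending only on how far the user is willing to read.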
Figure: The PRES curve is bounded between the best case and the newly defined worst case.
$$
PRES = 1 - \frac{\dfrac{\sum_{i=1}^{n} r_i}{n} - \dfrac{n+1}{2}}{N_{max}}
$$
where $r_i$ is the rank at which the $i$-th relevant document is retrieved, $N_{max}$ is the maximum number of retrieved documents to be checked by the user (i.e. the cut-off number of retrieved documents), and $n$ is the total number of relevant documents.
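The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the PRESeval script: the function name `pres` and its arguments are hypothetical. It also assumes that relevant documents not retrieved within the first Nmax ranks are placed at the tail of the worst-case positions (ranks Nmax+n, Nmax+n-1, ...), an assumption chosen because it reproduces the bounds R and nR²/Nmax quoted earlier.

```python
def pres(relevant_ranks, n_relevant, n_max):
    """Sketch of the PRES score for a single query.

    relevant_ranks: 1-based ranks at which relevant documents were retrieved
    n_relevant:     total number of known relevant documents (n)
    n_max:          maximum number of documents the user will check (Nmax)
    """
    if n_relevant <= 0 or n_max <= 0:
        raise ValueError("n_relevant and n_max must be positive")

    # A relevant document ranked after Nmax counts as missed by the user.
    found = [r for r in relevant_ranks if 1 <= r <= n_max]
    if len(found) > n_relevant:
        raise ValueError("more relevant ranks than known relevant documents")

    # Assumption: missed documents occupy the last worst-case positions,
    # i.e. ranks Nmax + (number found) + 1 ... Nmax + n.
    missing_ranks = range(n_max + len(found) + 1, n_max + n_relevant + 1)

    rank_sum = sum(found) + sum(missing_ranks)
    return 1.0 - (rank_sum / n_relevant - (n_relevant + 1) / 2) / n_max


if __name__ == "__main__":
    # Four relevant documents, user checks at most 100 results.
    high_recall_late = pres([90, 95, 100], n_relevant=4, n_max=100)  # Recall 0.75
    low_recall_early = pres([1, 2], n_relevant=4, n_max=100)         # Recall 0.50
    print(high_recall_late, low_recall_early)
```

In this made-up example the run with Recall 0.75 scores lower than the run with Recall 0.5, because its relevant documents all sit near the cut-off. This illustrates how a system with higher Recall can still receive a lower PRES.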
For more details about PRES and its robustness, see (Magdy and Jones, SIGIR 2010) and (Magdy and Jones, CLEF 2010), listed under Publications.
Download Perl script for calculating PRES: PRESeval
Last Modified: July 2010