Quantification is a supervised learning task in which we must predict, for each class of interest, the percentage of data items that belong to that class. It is also known as “supervised prevalence estimation”, and has applications in market research, epidemiology, the social sciences, and political science, among other fields. Quantification differs from classification: in classification we are interested in predicting the class of each unlabelled item, whereas in quantification we are only interested in predicting the fraction of unlabelled items that belong to each class. Although quantification can be solved by classifying each unlabelled item and counting how many items have been assigned to each class, this method has been shown to be suboptimal. Research in quantification is concerned with devising new supervised algorithms for quantification and with devising appropriate measures and protocols for evaluating quantification accuracy, in each case for different types of quantification (binary, single-label multi-class, multi-label multi-class, ordinal).
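
To make the "classify and count" baseline concrete, the following is a minimal Python sketch for the binary case. It is an illustration only, not the method developed in the work described on this page. The function names, the toy predictions, and the classifier error rates (`tpr`, `fpr`) are all assumptions introduced here. The second function shows one well-known correction from the literature, usually called "adjusted classify and count", which rescales the raw count using the classifier's true and false positive rates as estimated on held-out labelled data.

```python
def classify_and_count(predictions, positive_label=1):
    """Estimate prevalence as the fraction of items predicted positive."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    positives = sum(1 for p in predictions if p == positive_label)
    return positives / len(predictions)


def adjusted_classify_and_count(predictions, tpr, fpr, positive_label=1):
    """Correct the raw count using the classifier's error rates.

    tpr and fpr are the true and false positive rates of the classifier,
    assumed to have been estimated on held-out labelled data.
    """
    cc = classify_and_count(predictions, positive_label)
    if tpr == fpr:
        # The classifier carries no signal; the correction is undefined,
        # so fall back to the raw count.
        return cc
    estimate = (cc - fpr) / (tpr - fpr)
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, estimate))


# Toy data (hypothetical): 100 unlabelled items, 20 of them truly positive.
# A classifier with tpr = 0.8 and fpr = 0.1 labels 16 of the 20 true
# positives and 8 of the 80 true negatives as positive.
predictions = [1] * 16 + [0] * 4 + [1] * 8 + [0] * 72

cc = classify_and_count(predictions)  # 24 of 100 predicted positive
acc = adjusted_classify_and_count(predictions, tpr=0.8, fpr=0.1)
print(f"classify and count: {cc:.2f}, adjusted: {acc:.2f}")
```

In this toy setting the raw count overestimates the true prevalence of 0.20 because false positives outnumber false negatives, while the adjusted estimate recovers it. In practice `tpr` and `fpr` are themselves estimates, so the correction is only as good as those estimates.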

This page describes the work done at QCRI on Quantification.

Related Publications

Further Material

  • Slides from a course on "Text Quantification" that Fabrizio Sebastiani gave at the 2015 Russian Summer School in Information Retrieval (RuSSIR 2015), St. Petersburg, Russia.