SIGLEX has currently identified 59 lexical resources as being of special interest to SIGLEX members. You can search this list by resource name, type, language, or keywords, and you can suggest a lexical resource to be added to it. See also the ACL List of Resources by Language. (A prior set of links to SIGLEX Online Resources is currently being integrated into the SIGLEX Lexical Resources database, but some older links may still be of interest.)



The Preposition Project (TPP) Corpora

Primary resource type: Lexicons:Research resources:Word-sense disambiguation; Other resource tags: Research resources; Resource language: English; Availability: Public; Sponsor: CL Research

Three preposition corpora are available from The Preposition Project: (1) the training and test sets (over 25,000 sentences) used in the SemEval-2007 task on preposition disambiguation, drawn from FrameNet (FN); (2) a set of 7,650 sentences from the Oxford English Corpus (OEC) that serve as examples for senses in the Oxford Dictionary of English (ODE); and (3) a set of 48,000 sentences from the written portion of the British National Corpus, drawn using the methodology of the Corpus Pattern Analysis (CPA) project. The first corpus covers 34 prepositions, while the latter two include all single-word prepositions and many phrasal prepositions. Each corpus consists of sentences in the SemEval format. In addition, each sentence has been lemmatized, part-of-speech tagged, and parsed with a dependency parser. (66)