Evaluation
Evaluation Measures
Structural Measures:
Comparison against gold standard:
Manual quality assessment of novel edges
Baseline
A basic string inclusion approach that covers relations between compound terms such as network science -> science.
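A minimal sketch of such a baseline follows. It assumes that "string inclusion" means a shorter term from the term list appears as a word-level suffix of a longer term; the function name and the toy term list are illustrative and are not taken from the task's code.

```python
def string_inclusion_edges(terms):
    """Return (term, hypernym) pairs where the hypernym is a word-level suffix of the term."""
    vocab = set(terms)
    edges = []
    for term in terms:
        words = term.split()
        # Try every proper suffix of the term, longest first.
        for start in range(1, len(words)):
            candidate = " ".join(words[start:])
            if candidate in vocab:
                edges.append((term, candidate))
    return edges


# Toy term list (hypothetical, for illustration only).
terms = ["science", "network science", "computer science", "social network science"]
edges = string_inclusion_edges(terms)
for term, hypernym in edges:
    print(f"{term} -> {hypernym}")
```

Such an approach can only link compound terms to terms they contain, so single-word terms never receive a hypernym.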
Gold Standard
Each gold standard taxonomy (.taxo) file lists one relation per line in three tab-separated fields:
relation_id <TAB> term <TAB> hypernym
where:
- relation_id: a relation identifier;
- term: a term of the taxonomy;
- hypernym: a hypernym of the term.
e.g.
0<TAB>cat<TAB>animal
1<TAB>dog<TAB>animal
2<TAB>car<TAB>animal
....
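A file in this format could be read with a few lines of Python. This is a sketch only: the function name is hypothetical, and it assumes exactly three tab-separated fields per non-empty line.

```python
def parse_taxo(text):
    """Parse .taxo content into a list of (relation_id, term, hypernym) tuples."""
    relations = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}: {line!r}")
        relation_id, term, hypernym = fields
        relations.append((relation_id, term.strip(), hypernym.strip()))
    return relations


sample = "0\tcat\tanimal\n1\tdog\tanimal\n"
relations = parse_taxo(sample)
print(relations)
```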
Gold Standard Structure
Language | Domain | \|V\| | \|E\| | #i.i. | #c.c. | Cycles |
---|---|---|---|---|---|---|
English | Environment (Eurovoc) | 261 | 261 | 60 | 1 | no |
English | Food | 1556 | 1587 | 70 | 1 | no |
English | Food (Wordnet) | 1486 | 1576 | 302 | 1 | no |
English | Science | 453 | 465 | 54 | 1 | no |
English | Science (Eurovoc) | 125 | 124 | 31 | 1 | no |
English | Science (Wordnet) | 429 | 452 | 117 | 1 | no |
Dutch | Environment (Eurovoc) | 267 | 267 | 59 | 1 | no |
Dutch | Food | 1429 | 1446 | 66 | 3 | no |
Dutch | Food (Wordnet) | 1299 | 1340 | 259 | 3 | no |
Dutch | Science | 445 | 449 | 54 | 1 | no |
Dutch | Science (Eurovoc) | 125 | 124 | 32 | 1 | no |
Dutch | Science (Wordnet) | 399 | 399 | 105 | 1 | no |
French | Environment (Eurovoc) | 267 | 266 | 61 | 1 | no |
French | Food | 1418 | 1441 | 64 | 1 | no |
French | Food (Wordnet) | 1329 | 1358 | 263 | 2 | no |
French | Science | 449 | 451 | 54 | 1 | no |
French | Science (Eurovoc) | 125 | 124 | 31 | 1 | no |
French | Science (Wordnet) | 390 | 389 | 101 | 1 | no |
Italian | Environment (Eurovoc) | 267 | 266 | 59 | 1 | no |
Italian | Food | 1274 | 1304 | 60 | 3 | no |
Italian | Food (Wordnet) | 1277 | 1332 | 254 | 1 | yes |
Italian | Science | 442 | 444 | 54 | 1 | no |
Italian | Science (Eurovoc) | 125 | 124 | 32 | 1 | no |
Italian | Science (Wordnet) | 396 | 396 | 105 | 1 | no |
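The columns of this table can be computed from a list of (term, hypernym) edges. The page does not define the abbreviations, so the sketch below makes two assumptions: #i.i. counts intermediate nodes (terms that have both a hypernym and a hyponym), and #c.c. counts weakly connected components. The function name and the toy edges are illustrative.

```python
from collections import defaultdict


def structural_measures(edges):
    """Compute |V|, |E|, #i.i., #c.c. and cyclicity for (term, hypernym) edges."""
    edges = set(edges)
    nodes = {n for edge in edges for n in edge}
    parents = defaultdict(set)   # term -> its hypernyms
    children = defaultdict(set)  # hypernym -> its hyponyms
    for term, hyper in edges:
        parents[term].add(hyper)
        children[hyper].add(term)

    # Assumed definition: an intermediate node has a hypernym and a hyponym.
    intermediate = sum(1 for n in nodes if parents[n] and children[n])

    # Weakly connected components: traverse edges in both directions.
    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            n = stack.pop()
            for m in parents[n] | children[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)

    # Cycle check (Kahn's algorithm): repeatedly remove nodes without hyponyms.
    indegree = {n: len(children[n]) for n in nodes}
    queue = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while queue:
        n = queue.pop()
        removed += 1
        for p in parents[n]:
            indegree[p] -= 1
            if indegree[p] == 0:
                queue.append(p)

    return {"V": len(nodes), "E": len(edges), "ii": intermediate,
            "cc": components, "cycles": removed < len(nodes)}


# Toy taxonomy (hypothetical): two components, one intermediate node, no cycle.
toy = [("cat", "mammal"), ("dog", "mammal"), ("mammal", "animal"), ("oak", "tree")]
measures = structural_measures(toy)
print(measures)
```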
Submissions
(with author-provided descriptions)
The following archive contains the submissions received from the 6 participating systems.
JUNLP
The system is based on two hypernym detection modules. The first one deals with the semantic relations that are available for a term. Instead of analysing the huge Wikipedia dump for pattern-based hypernym discovery, we opted for a significant reduction of execution time by extracting Wikipedia-based hypernym relations from BabelNet (a rich semantic network that connects concepts and named entities, called Babel synsets, in a very large network of semantic relations). The second module looks for subterms present in the term list (a subterm can be another term from the list or an overlap of multiple terms) that are possible hypernyms for the term.
TAXI - TAXonomy Induction
The methods used in the TAXonomy Induction system (TAXI) rely on two sources of evidence: substring matching and Hearst-like patterns. The Hearst patterns for all languages are extracted from Wikipedia and from focused crawls whose seed pages are Wikipedia pages. In addition, for English, we rely on several additional corpora: GigaWord, ukWaC, a news corpus and the CommonCrawl. For French, Italian and Dutch the method is completely unsupervised and relies on a KNN approach. For English, we train an SVM classifier on the trial data. For all languages the features are the same: substrings and ISA relations extracted with lexico-syntactic patterns. No databases or linguistic resources beyond the trial data and the raw text corpora mentioned above were used.
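To illustrate what a Hearst-like pattern looks like, the sketch below extracts ISA pairs with a single "X such as Y, Z and W" pattern. It is not TAXI's implementation: the regular expression, the function name and the restriction to single-word terms are simplifying assumptions.

```python
import re

# One hand-written pattern; real systems use many patterns and multi-word terms.
SUCH_AS = re.compile(
    r"(?P<hyper>\w+),? such as (?P<hypos>\w+(?:(?:, (?:and |or )?| and | or )\w+)*)"
)
SEPARATOR = re.compile(r", (?:and |or )?| and | or ")


def hearst_such_as(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        for hyponym in SEPARATOR.split(match.group("hypos")):
            pairs.append((hyponym, match.group("hyper")))
    return pairs


pairs = hearst_such_as("Researchers study sciences such as physics, chemistry and biology.")
print(pairs)
```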
NUIG-UNLP
The system implements a semi-supervised method that finds hypernym candidates for the provided noun phrases by representing them as distributional vectors. Roughly, this method assumes that hypernyms may be induced by adding a vector offset [1,2] to the corresponding hyponym representation generated by GloVe over a Wikipedia dump. The vector offset is obtained as the average offset between 200 pairs of hyponym-hypernym in the same vector space.
[1] Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic regularities in continuous space word representations. In HLT-NAACL, pages 746–751, 2013.
[2] Marek Rei and Ted Briscoe. 2014. Looking for Hyponyms in Vector Space. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning, pages 68–77.
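The vector offset idea can be sketched in plain Python. The toy three-dimensional vectors below are invented for illustration (the system itself uses GloVe vectors and 200 training pairs), and ranking candidates by cosine similarity is an assumption of this sketch.

```python
import math


def mean_offset(pairs, vectors):
    """Average of (hypernym - hyponym) over the training pairs."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for hypo, hyper in pairs:
        for i in range(dim):
            total[i] += vectors[hyper][i] - vectors[hypo][i]
    return [t / len(pairs) for t in total]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict_hypernym(term, offset, vectors):
    """Add the offset to the term vector and return the closest other word."""
    target = [x + o for x, o in zip(vectors[term], offset)]
    candidates = (w for w in vectors if w != term)
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hypothetical toy embedding space.
vectors = {
    "cat": [1.0, 0.0, 0.0], "dog": [0.9, 0.1, 0.0], "animal": [1.0, 0.0, 1.0],
    "rose": [0.0, 1.0, 0.0], "plant": [0.0, 1.0, 1.0], "horse": [0.8, 0.0, 0.1],
}
offset = mean_offset([("cat", "animal"), ("rose", "plant")], vectors)
prediction = predict_hypernym("horse", offset, vectors)
print(offset, prediction)
```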
USAAR
Multi-word hyponyms are often endocentric constructions, which contain a word that fulfills the same function as the whole expression. E.g. an "apple pie" is essentially a "pie". We explore the number of multi-word terms that are endocentric in English and whether we can use this endocentric property to generate entity links that connect terms in Wikipedia's list of lists.
QASSIT
We use a semi-supervised methodology for the acquisition of lexical taxonomies based on genetic algorithms. It is based on the theory of pretopology, which offers a powerful formalism to model semantic relations and transforms a list of terms into a structured term space by combining different discriminant criteria. In particular, rare but accurate pieces of knowledge are used to parameterize the different criteria defining the pretopological term space. Then, a structuring algorithm is used to transform the pretopological space into a lexical taxonomy.
Final Ranking
Overall Ranking
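Subtask | Measure | JUNLP | TAXI | NUIG-UNLP | USAAR | QASSIT |
---|---|---|---|---|---|---|
Monolingual (EN) Taxonomy Construction | Cyclicity | 3 | 1 | 4 | 1 | 2 |
Monolingual (EN) Taxonomy Construction | Structure (F&M) | 3 | 2 | 4 | 5 | 1 |
Monolingual (EN) Taxonomy Construction | Categorisation (i.i.) | 1 | 3 | 2 | 4 | 5 |
Monolingual (EN) Taxonomy Construction | Connectivity (c.c.) | 3 | 1 | 2 | 4 | 1 |
Monolingual (EN) Taxonomy Construction | Gold standard comparison (Fscore)* | 4 | 1 | 5 | 2 | 3 |
Monolingual (EN) Taxonomy Construction | Domains* | 1 | 1 | 2 | 1 | 2 |
Monolingual (EN) Taxonomy Construction | Manual Evaluation (Precision)* | 4 | 2 | 5 | 1 | 3 |
Monolingual (EN) Taxonomy Construction | Total | 19 | 11 | 24 | 18 | 17 |
Monolingual (EN) Taxonomy Construction | Ranking | 4 | 1 | 5 | 3 | 2 |
Monolingual (EN) Hypernym Identification* | Total* | 9 | 4 | 12 | 4 | 8 |
Monolingual (EN) Hypernym Identification* | Ranking* | 3 | 1 | 4 | 1 | 2 |
Multilingual (NL,FR,IT) Taxonomy Construction | Cyclicity | 1 | 1 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Taxonomy Construction | Structure (F&M) | 2 | 1 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Taxonomy Construction | Categorisation (i.i.) | 1 | 2 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Taxonomy Construction | Connectivity (c.c.) | 2 | 1 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Taxonomy Construction | Gold standard comparison (Fscore)^ | 2 | 1 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Taxonomy Construction | Manual Evaluation (Precision)^ | 2 | 1 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Taxonomy Construction | Total | 10 | 7 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Taxonomy Construction | Ranking | 2 | 1 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Hypernym Identification^ | Total^ | 4 | 2 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) Hypernym Identification^ | Ranking^ | 2 | 1 | n.a. | n.a. | n.a. |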
Only measures marked with * and ^ were used for ranking in the hypernym identification subtasks.
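The Total and Ranking rows of the monolingual (EN) part of the table are consistent with summing the per-measure ranks and then ranking the sums so that tied systems share a rank and no rank is skipped. The sketch below reproduces them under that assumption; the rank values are copied from the table, while the function names are illustrative.

```python
SYSTEMS = ["JUNLP", "TAXI", "NUIG-UNLP", "USAAR", "QASSIT"]

# Per-measure ranks for the monolingual (EN) subtask, in SYSTEMS order.
RANKS = {
    "Cyclicity": [3, 1, 4, 1, 2],
    "Structure (F&M)": [3, 2, 4, 5, 1],
    "Categorisation (i.i.)": [1, 3, 2, 4, 5],
    "Connectivity (c.c.)": [3, 1, 2, 4, 1],
    "Gold standard comparison (Fscore)": [4, 1, 5, 2, 3],
    "Domains": [1, 1, 2, 1, 2],
    "Manual Evaluation (Precision)": [4, 2, 5, 1, 3],
}
# Measures marked with * in the table.
STARRED = ["Gold standard comparison (Fscore)", "Domains", "Manual Evaluation (Precision)"]


def totals(measures):
    """Sum the ranks of each system over the given measures."""
    return [sum(RANKS[m][i] for m in measures) for i in range(len(SYSTEMS))]


def dense_rank(values):
    """Rank values ascending; ties share a rank and no rank is skipped."""
    order = sorted(set(values))
    return [order.index(v) + 1 for v in values]


construction_totals = totals(list(RANKS))
identification_totals = totals(STARRED)
print(construction_totals, dense_rank(construction_totals))
print(identification_totals, dense_rank(identification_totals))
```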
Overall scores
Subtask | Measure | Baseline | JUNLP | TAXI | NUIG-UNLP | USAAR | QASSIT |
---|---|---|---|---|---|---|---|
Monolingual (EN) | Cyclicity | 0 | 3 | 0 | 4 | 0 | 1 |
Monolingual (EN) | Structure (F&M) | 0.0046 | 0.1498 | 0.2908 | 0.0410 | 0.0013 | 0.4064 |
Monolingual (EN) | Categorisation (i.i.) | 77.67 | 377 | 104.5 | 213 | 96.33 | 59.5 |
Monolingual (EN) | Connectivity (c.c.) | 36.83 | 53.17 | 1 | 44.75 | 76.67 | 1 |
Monolingual (EN) | Gold standard comparison (Fscore) | 0.33 | 0.20 | 0.32 | 0.19 | 0.26 | 0.22 |
Monolingual (EN) | Domains | 6 | 6 | 6 | 4 | 6 | 4 |
Monolingual (EN) | Manual Evaluation (Precision) | n.a. | 0.09 | 0.20 | 0.07 | 0.49 | 0.10 |
Multilingual (NL,FR,IT) | Cyclicity | 0 | 0 | 0 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) | Structure (F&M) | 0.0087 | 0.0155 | 0.1885 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) | Categorisation (i.i.) | 64.28 | 178.22 | 64.94 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) | Connectivity (c.c.) | 40.5 | 34.89 | 1 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) | Gold standard comparison (Fscore) | 0.3133 | 0.1921 | 0.2815 | n.a. | n.a. | n.a. |
Multilingual (NL,FR,IT) | Manual Evaluation (Precision) | n.a. | 0.2983 | 0.6252 | n.a. | n.a. | n.a. |
These scores are obtained by averaging the results over domains (environment, science, food) and languages (NL, FR, IT) for the multilingual setting.
Detailed Evaluation Results
Structural Evaluation
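Language | Domain | Measure | Baseline | JUNLP | TAXI | NUIG-UNLP | USAAR | QASSIT |
---|---|---|---|---|---|---|---|---|
English | Environment (Eurovoc) | \|V\| | 123 | 321 | 148 | 312 | 57 | 261 |
English | Environment (Eurovoc) | \|E\| | 112 | 463 | 207 | 456 | 47 | 365 |
English | Environment (Eurovoc) | #i.i. | 27 | 123 | 50 | 176 | 10 | 88 |
English | Environment (Eurovoc) | #c.c. | 17 | 19 | 1 | 58 | 10 | 1 |
English | Environment (Eurovoc) | cycles | no | no | no | yes | no | no |
English | Food | \|V\| | 636 | 1802 | 781 | n.a. | 3716 | n.a. |
English | Food | \|E\| | 627 | 3015 | 1118 | n.a. | 4347 | n.a. |
English | Food | #i.i. | 130 | 581 | 132 | n.a. | 323 | n.a. |
English | Food | #c.c. | 40 | 48 | 1 | n.a. | 217 | n.a. |
English | Food | cycles | no | yes | no | n.a. | no | n.a. |
English | Food (Wordnet) | \|V\| | 826 | 1748 | 1122 | n.a. | 675 | n.a. |
English | Food (Wordnet) | \|E\| | 812 | 3607 | 2067 | n.a. | 540 | n.a. |
English | Food (Wordnet) | #i.i. | 205 | 866 | 259 | n.a. | 146 | n.a. |
English | Food (Wordnet) | #c.c. | 79 | 123 | 1 | n.a. | 135 | n.a. |
English | Food (Wordnet) | cycles | no | yes | no | n.a. | no | n.a. |
English | Science | \|V\| | 232 | 602 | 294 | 595 | 371 | 452 |
English | Science | \|E\| | 214 | 1046 | 418 | 1656 | 312 | 708 |
English | Science | #i.i. | 41 | 255 | 73 | 409 | 60 | 58 |
English | Science | #c.c. | 28 | 24 | 1 | 99 | 59 | 1 |
English | Science | cycles | no | no | no | yes | no | yes |
English | Science (Eurovoc) | \|V\| | 50 | 186 | 100 | 97 | 37 | 125 |
English | Science (Eurovoc) | \|E\| | 42 | 342 | 139 | 218 | 30 | 164 |
English | Science (Eurovoc) | #i.i. | 11 | 133 | 25 | 72 | 7 | 25 |
English | Science (Eurovoc) | #c.c. | 9 | 15 | 1 | 13 | 7 | 1 |
English | Science (Eurovoc) | cycles | no | yes | no | yes | no | no |
English | Science (Wordnet) | \|V\| | 217 | 424 | 290 | 251 | 136 | 370 |
English | Science (Wordnet) | \|E\| | 174 | 690 | 459 | 929 | 104 | 647 |
English | Science (Wordnet) | #i.i. | 52 | 304 | 88 | 195 | 32 | 67 |
English | Science (Wordnet) | #c.c. | 48 | 90 | 1 | 9 | 32 | 1 |
English | Science (Wordnet) | cycles | no | no | no | yes | no | no |
Dutch | Environment (Eurovoc) | \|V\| | 116 | 317 | 85 | n.a. | n.a. | n.a. |
Dutch | Environment (Eurovoc) | \|E\| | 100 | 379 | 92 | n.a. | n.a. | n.a. |
Dutch | Environment (Eurovoc) | #i.i. | 24 | 73 | 24 | n.a. | n.a. | n.a. |
Dutch | Environment (Eurovoc) | #c.c. | 20 | 8 | 1 | n.a. | n.a. | n.a. |
Dutch | Environment (Eurovoc) | cycles | no | no | no | n.a. | n.a. | n.a. |
Dutch | Food | \|V\| | 459 | 1616 | 350 | n.a. | n.a. | n.a. |
Dutch | Food | \|E\| | 399 | 1974 | 395 | n.a. | n.a. | n.a. |
Dutch | Food | #i.i. | 84 | 329 | 85 | n.a. | n.a. | n.a. |
Dutch | Food | #c.c. | 68 | 71 | 1 | n.a. | n.a. | n.a. |
Dutch | Food | cycles | no | no | no | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | \|V\| | 610 | 1433 | 515 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | \|E\| | 500 | 1868 | 573 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | #i.i. | 147 | 320 | 145 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | #c.c. | 122 | 63 | 1 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | cycles | no | no | no | n.a. | n.a. | n.a. |
Dutch | Science | \|V\| | 192 | 143 | 46 | n.a. | n.a. | n.a. |
Dutch | Science | \|E\| | 166 | 145 | 49 | n.a. | n.a. | n.a. |
Dutch | Science | #i.i. | 40 | 40 | 15 | n.a. | n.a. | n.a. |
Dutch | Science | #c.c. | 30 | 26 | 1 | n.a. | n.a. | n.a. |
Dutch | Science | cycles | no | no | no | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | \|V\| | 40 | 561 | 193 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | \|E\| | 31 | 774 | 203 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | #i.i. | 11 | 163 | 40 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | #c.c. | 10 | 21 | 1 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | cycles | no | no | no | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | \|V\| | 213 | 427 | 221 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | \|E\| | 169 | 452 | 230 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | #i.i. | 55 | 89 | 59 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | #c.c. | 50 | 52 | 1 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | cycles | no | no | no | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | \|V\| | 130 | 327 | 128 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | \|E\| | 117 | 415 | 181 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | #i.i. | 23 | 79 | 37 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | #c.c. | 17 | 7 | 1 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | cycles | no | no | no | n.a. | n.a. | n.a. |
French | Food | \|V\| | 531 | 1649 | 522 | n.a. | n.a. | n.a. |
French | Food | \|E\| | 500 | 2224 | 699 | n.a. | n.a. | n.a. |
French | Food | #i.i. | 109 | 352 | 125 | n.a. | n.a. | n.a. |
French | Food | #c.c. | 49 | 53 | 1 | n.a. | n.a. | n.a. |
French | Food | cycles | no | no | no | n.a. | n.a. | n.a. |
French | Food (Wordnet) | \|V\| | 712 | 1533 | 707 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | \|E\| | 679 | 2157 | 964 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | #i.i. | 172 | 373 | 193 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | #c.c. | 68 | 45 | 1 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | cycles | no | no | no | n.a. | n.a. | n.a. |
French | Science | \|V\| | 201 | 587 | 249 | n.a. | n.a. | n.a. |
French | Science | \|E\| | 181 | 885 | 298 | n.a. | n.a. | n.a. |
French | Science | #i.i. | 43 | 186 | 58 | n.a. | n.a. | n.a. |
French | Science | #c.c. | 24 | 18 | 1 | n.a. | n.a. | n.a. |
French | Science | cycles | no | no | no | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | \|V\| | 48 | 145 | 83 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | \|E\| | 38 | 158 | 113 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | #i.i. | 11 | 45 | 24 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | #c.c. | 10 | 21 | 1 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | cycles | no | no | no | n.a. | n.a. | n.a. |
French | Science (Wordnet) | \|V\| | 208 | 419 | 265 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | \|E\| | 169 | 458 | 336 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | #i.i. | 54 | 89 | 76 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | #c.c. | 43 | 40 | 1 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | cycles | no | no | no | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | \|V\| | 102 | 341 | 70 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | \|E\| | 91 | 474 | 69 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | #i.i. | 18 | 87 | 14 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | #c.c. | 13 | 4 | 1 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | cycles | no | no | no | n.a. | n.a. | n.a. |
Italian | Food | \|V\| | 459 | 1490 | 328 | n.a. | n.a. | n.a. |
Italian | Food | \|E\| | 417 | 1864 | 332 | n.a. | n.a. | n.a. |
Italian | Food | #i.i. | 103 | 314 | 66 | n.a. | n.a. | n.a. |
Italian | Food | #c.c. | 49 | 58 | 1 | n.a. | n.a. | n.a. |
Italian | Food | cycles | no | no | no | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | \|V\| | 644 | 1505 | 471 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | \|E\| | 589 | 2053 | 486 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | #i.i. | 153 | 354 | 107 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | #c.c. | 73 | 53 | 1 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | cycles | no | no | no | n.a. | n.a. | n.a. |
Italian | Science | \|V\| | 211 | 564 | 197 | n.a. | n.a. | n.a. |
Italian | Science | \|E\| | 194 | 773 | 199 | n.a. | n.a. | n.a. |
Italian | Science | #i.i. | 42 | 172 | 35 | n.a. | n.a. | n.a. |
Italian | Science | #c.c. | 23 | 22 | 1 | n.a. | n.a. | n.a. |
Italian | Science | cycles | no | no | no | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | \|V\| | 56 | 144 | 54 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | \|E\| | 45 | 149 | 55 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | #i.i. | 13 | 46 | 12 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | #c.c. | 12 | 24 | 1 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | cycles | no | no | no | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | \|V\| | 211 | 430 | 208 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | \|E\| | 165 | 463 | 209 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | #i.i. | 55 | 97 | 54 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | #c.c. | 48 | 42 | 1 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | cycles | no | no | no | n.a. | n.a. | n.a. |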
Gold Standard Evaluation
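Language | Domain | Measure | Baseline | JUNLP | TAXI | NUIG-UNLP | USAAR | QASSIT |
---|---|---|---|---|---|---|---|---|
English | Environment (Eurovoc) | Precision | 0.5 | 0.1296 | 0.3382 | 0.1579 | 0.8085 | 0.1479 |
English | Environment (Eurovoc) | Recall | 0.2146 | 0.2299 | 0.2682 | 0.2759 | 0.1456 | 0.2069 |
English | Environment (Eurovoc) | Fscore | 0.3003 | 0.1658 | 0.2992 | 0.2008 | 0.2468 | 0.1725 |
English | Environment (Eurovoc) | F&M | 0.0 | 0.0814 | 0.2384 | 0.0007 | 0.0007 | 0.4349 |
English | Food | Precision | 0.4705 | 0.1320 | 0.3372 | n.a. | 0.0603 | n.a. |
English | Food | Recall | 0.1859 | 0.2508 | 0.2376 | n.a. | 0.1651 | n.a. |
English | Food | Fscore | 0.2665 | 0.1730 | 0.2787 | n.a. | 0.0883 | n.a. |
English | Food | F&M | 0.0019 | 0.2608 | 0.2021 | n.a. | 0.0 | n.a. |
English | Food (Wordnet) | Precision | 0.5 | 0.1475 | 0.2583 | n.a. | 0.7056 | n.a. |
English | Food (Wordnet) | Recall | 0.2576 | 0.3376 | 0.3388 | n.a. | 0.2418 | n.a. |
English | Food (Wordnet) | Fscore | 0.34 | 0.2053 | 0.2932 | n.a. | 0.3601 | n.a. |
English | Food (Wordnet) | F&M | 0.0022 | 0.1925 | 0.3260 | n.a. | 0.0021 | n.a. |
English | Science | Precision | 0.6262 | 0.1377 | 0.3876 | 0.0984 | 0.3814 | 0.1794 |
English | Science | Recall | 0.2882 | 0.3097 | 0.3484 | 0.3505 | 0.2559 | 0.2731 |
English | Science | Fscore | 0.3947 | 0.1906 | 0.3669 | 0.1537 | 0.3063 | 0.2165 |
English | Science | F&M | 0.0163 | 0.1774 | 0.3634 | 0.0090 | 0.0020 | 0.5757 |
English | Science (Eurovoc) | Precision | 0.6190 | 0.1316 | 0.2950 | 0.1330 | 0.6333 | 0.2134 |
English | Science (Eurovoc) | Recall | 0.2097 | 0.3629 | 0.3306 | 0.2339 | 0.1532 | 0.2823 |
English | Science (Eurovoc) | Fscore | 0.3133 | 0.1931 | 0.3118 | 0.1696 | 0.2468 | 0.2431 |
English | Science (Eurovoc) | F&M | 0.0056 | 0.1373 | 0.3893 | 0.1517 | 0.0023 | 0.3893 |
English | Science (Wordnet) | Precision | 0.6897 | 0.2058 | 0.3747 | 0.1755 | 0.8173 | 0.2025 |
English | Science (Wordnet) | Recall | 0.2655 | 0.3142 | 0.3805 | 0.3606 | 0.1881 | 0.2898 |
English | Science (Wordnet) | Fscore | 0.3834 | 0.2487 | 0.3776 | 0.2361 | 0.3058 | 0.2384 |
English | Science (Wordnet) | F&M | 0.0016 | 0.0494 | 0.2255 | 0.0027 | 0.0008 | 0.2255 |
Dutch | Environment (Eurovoc) | Precision | 0.53 | 0.1425 | 0.5543 | n.a. | n.a. | n.a. |
Dutch | Environment (Eurovoc) | Recall | 0.1985 | 0.2022 | 0.1910 | n.a. | n.a. | n.a. |
Dutch | Environment (Eurovoc) | Fscore | 0.2888 | 0.1672 | 0.284 | n.a. | n.a. | n.a. |
Dutch | Environment (Eurovoc) | F&M | 0.0007 | 0.0097 | 0.1910 | n.a. | n.a. | n.a. |
Dutch | Food | Precision | 0.5363 | 0.1292 | 0.4608 | n.a. | n.a. | n.a. |
Dutch | Food | Recall | 0.1470 | 0.1763 | 0.1259 | n.a. | n.a. | n.a. |
Dutch | Food | Fscore | 0.2320 | 0.1491 | 0.1977 | n.a. | n.a. | n.a. |
Dutch | Food | F&M | - | 0.0 | 0.1226 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | Precision | 0.526 | 0.1601 | 0.3857 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | Recall | 0.1963 | 0.2231 | 0.1649 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | Fscore | 0.2859 | 0.1864 | 0.2311 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | F&M | 0.0008 | 0.0009 | 0.1165 | n.a. | n.a. | n.a. |
Dutch | Science | Precision | 0.6024 | 0.1655 | 0.5306 | n.a. | n.a. | n.a. |
Dutch | Science | Recall | 0.2227 | 0.1935 | 0.2097 | n.a. | n.a. | n.a. |
Dutch | Science | Fscore | 0.3252 | 0.1784 | 0.3006 | n.a. | n.a. | n.a. |
Dutch | Science | F&M | 0.0057 | 0.0206 | 0.2215 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | Precision | 0.7742 | 0.1486 | 0.4778 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | Recall | 0.1935 | 0.2561 | 0.2160 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | Fscore | 0.3097 | 0.1881 | 0.2975 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | F&M | 0.0 | 0.0 | 0.1987 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | Precision | 0.5976 | 0.2257 | 0.4739 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | Recall | 0.2531 | 0.2556 | 0.2732 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | Fscore | 0.3556 | 0.2397 | 0.3466 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | F&M | 0.0026 | 0.0020 | 0.1699 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | Precision | 0.5043 | 0.1373 | 0.2928 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | Recall | 0.2218 | 0.2143 | 0.1992 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | Fscore | 0.3081 | 0.1674 | 0.2371 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | F&M | 0.0051 | 0.0110 | 0.1836 | n.a. | n.a. | n.a. |
French | Food | Precision | 0.466 | 0.1210 | 0.3433 | n.a. | n.a. | n.a. |
French | Food | Recall | 0.1619 | 0.1867 | 0.1666 | n.a. | n.a. | n.a. |
French | Food | Fscore | 0.2401 | 0.1468 | 0.2243 | n.a. | n.a. | n.a. |
French | Food | F&M | - | 0.0 | 0.1398 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | Precision | 0.4153 | 0.1312 | 0.2562 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | Recall | 0.2077 | 0.2084 | 0.1819 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | Fscore | 0.2769 | 0.1610 | 0.2127 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | F&M | 0.0006 | 0.0006 | 0.2068 | n.a. | n.a. | n.a. |
French | Science | Precision | 0.5856 | 0.1435 | 0.3993 | n.a. | n.a. | n.a. |
French | Science | Recall | 0.2350 | 0.2816 | 0.2639 | n.a. | n.a. | n.a. |
French | Science | Fscore | 0.3354 | 0.1901 | 0.3178 | n.a. | n.a. | n.a. |
French | Science | F&M | 0.0114 | 0.0748 | 0.3042 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | Precision | 0.8684 | 0.2089 | 0.3363 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | Recall | 0.2661 | 0.2661 | 0.3065 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | Fscore | 0.4074 | 0.2340 | 0.3207 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | F&M | 0.0 | 0.0 | 0.3192 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | Precision | 0.6568 | 0.2664 | 0.3780 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | Recall | 0.2853 | 0.3136 | 0.3265 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | Fscore | 0.3979 | 0.2881 | 0.3503 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | F&M | 0.0022 | 0.1462 | 0.2597 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | Precision | 0.5604 | 0.0970 | 0.7536 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | Recall | 0.1917 | 0.1729 | 0.1955 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | Fscore | 0.2857 | 0.1243 | 0.3104 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | F&M | 0.0011 | 0.0011 | 0.1776 | n.a. | n.a. | n.a. |
Italian | Food | Precision | 0.4365 | 0.1019 | 0.4277 | n.a. | n.a. | n.a. |
Italian | Food | Recall | 0.1396 | 0.1457 | 0.1089 | n.a. | n.a. | n.a. |
Italian | Food | Fscore | 0.2115 | 0.1199 | 0.1736 | n.a. | n.a. | n.a. |
Italian | Food | F&M | - | 0.0 | 0.1311 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | Precision | 0.4363 | 0.1305 | 0.4218 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | Recall | 0.1929 | 0.2012 | 0.1539 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | Fscore | 0.2676 | 0.1583 | 0.2255 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | F&M | 0.0868 | 0.0 | 0.0868 | n.a. | n.a. | n.a. |
Italian | Science | Precision | 0.5670 | 0.0095 | 0.5176 | n.a. | n.a. | n.a. |
Italian | Science | Recall | 0.2477 | 0.1552 | 0.2320 | n.a. | n.a. | n.a. |
Italian | Science | Fscore | 0.3448 | 0.2703 | 0.3204 | n.a. | n.a. | n.a. |
Italian | Science | F&M | 0.0095 | 0.0095 | 0.1933 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | Precision | 0.7556 | 0.2282 | 0.6 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | Recall | 0.2742 | 0.2742 | 0.2661 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | Fscore | 0.4024 | 0.2491 | 0.3687 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | F&M | 0.0034 | 0.0033 | 0.1976 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | Precision | 0.6182 | 0.2224 | 0.5024 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | Recall | 0.2576 | 0.2601 | 0.2652 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | Fscore | 0.3636 | 0.2398 | 0.3471 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | F&M | 0.0 | 0.0 | 0.1735 | n.a. | n.a. | n.a. |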
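The Precision, Recall and Fscore values above are consistent with comparing the set of system edges against the set of gold standard edges, with Fscore as the harmonic mean of the other two. The sketch below assumes exactly that; the function name is illustrative. The synthetic edge sets mimic the English Environment (Eurovoc) baseline, which has 112 edges against 261 gold edges; the figure of 56 shared edges is inferred from the reported precision of 0.5, not stated on the page.

```python
def precision_recall_fscore(system_edges, gold_edges):
    """Edge-level precision, recall and F-score of a system taxonomy."""
    system_edges, gold_edges = set(system_edges), set(gold_edges)
    correct = len(system_edges & gold_edges)
    precision = correct / len(system_edges) if system_edges else 0.0
    recall = correct / len(gold_edges) if gold_edges else 0.0
    total = precision + recall
    fscore = 2 * precision * recall / total if total else 0.0
    return precision, recall, fscore


# Synthetic edges: 261 gold edges, 112 system edges, 56 of them shared (inferred).
gold = {(f"t{i}", "root") for i in range(261)}
system = {(f"t{i}", "root") for i in range(56)} | {(f"x{i}", "root") for i in range(56)}
p, r, f = precision_recall_fscore(system, gold)
print(round(p, 4), round(r, 4), round(f, 4))  # 0.5 0.2146 0.3003
```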
Gold Standard Evaluation (Average results for each language across domains)
Language | Measure | Baseline | JUNLP | TAXI | NUIG-UNLP | USAAR | QASSIT |
---|---|---|---|---|---|---|---|
English | Average Precision | 0.57 | 0.15 | 0.33 | 0.14 | 0.57 | 0.19 |
English | Average Recall | 0.24 | 0.30 | 0.32 | 0.30 | 0.19 | 0.26 |
English | Average Fscore | 0.33 | 0.20 | 0.32 | 0.19 | 0.26 | 0.22 |
Dutch | Average Precision | 0.59 | 0.16 | 0.48 | n.a. | n.a. | n.a. |
Dutch | Average Recall | 0.20 | 0.22 | 0.20 | n.a. | n.a. | n.a. |
Dutch | Average Fscore | 0.30 | 0.19 | 0.28 | n.a. | n.a. | n.a. |
French | Average Precision | 0.58 | 0.17 | 0.33 | n.a. | n.a. | n.a. |
French | Average Recall | 0.23 | 0.25 | 0.24 | n.a. | n.a. | n.a. |
French | Average Fscore | 0.33 | 0.20 | 0.28 | n.a. | n.a. | n.a. |
Italian | Average Precision | 0.56 | 0.13 | 0.54 | n.a. | n.a. | n.a. |
Italian | Average Recall | 0.22 | 0.20 | 0.20 | n.a. | n.a. | n.a. |
Italian | Average Fscore | 0.31 | 0.19 | 0.29 | n.a. | n.a. | n.a. |
Overall Multilingual (Other than English) | Average Precision | 0.58 | 0.15 | 0.45 | n.a. | n.a. | n.a. |
Overall Multilingual (Other than English) | Average Recall | 0.22 | 0.22 | 0.21 | n.a. | n.a. | n.a. |
Overall Multilingual (Other than English) | Average Fscore | 0.31 | 0.19 | 0.28 | n.a. | n.a. | n.a. |
Manual Evaluation
(Precision for maximum 100 random novel relations)
Language | Domain | JUNLP | TAXI | NUIG-UNLP | USAAR | QASSIT |
---|---|---|---|---|---|---|
English | Environment (Eurovoc) | 0.02 | 0.11 | 0.08 | 0.22 | 0.07 |
English | Food | 0.2 | 0.36 | n.a. | 0.73 | n.a. |
English | Food (Wordnet) | 0.18 | 0.32 | n.a. | 0.81 | n.a. |
English | Science | 0.06 | 0.14 | 0.09 | 0.71 | 0.07 |
English | Science (Eurovoc) | 0.02 | 0.02 | 0.04 | 0.0 | 0.05 |
English | Science (Wordnet) | 0.06 | 0.22 | 0.05 | 0.47 | 0.22 |
Dutch | Environment (Eurovoc) | 0.27 | 0.24 | n.a. | n.a. | n.a. |
Dutch | Food | 0.22 | 0.69 | n.a. | n.a. | n.a. |
Dutch | Food (Wordnet) | 0.28 | 0.71 | n.a. | n.a. | n.a. |
Dutch | Science | 0.41 | 0.8 | n.a. | n.a. | n.a. |
Dutch | Science (Eurovoc) | 0.26 | 0.43 | n.a. | n.a. | n.a. |
Dutch | Science (Wordnet) | 0.18 | 0.88 | n.a. | n.a. | n.a. |
French | Environment (Eurovoc) | 0.24 | 0.23 | n.a. | n.a. | n.a. |
French | Food | 0.21 | 0.46 | n.a. | n.a. | n.a. |
French | Food (Wordnet) | 0.23 | 0.58 | n.a. | n.a. | n.a. |
French | Science | 0.32 | 0.58 | n.a. | n.a. | n.a. |
French | Science (Eurovoc) | 0.53 | 0.6 | n.a. | n.a. | n.a. |
French | Science (Wordnet) | 0.66 | 0.56 | n.a. | n.a. | n.a. |
Italian | Environment (Eurovoc) | 0.18 | 0.41 | n.a. | n.a. | n.a. |
Italian | Food | 0.18 | 0.83 | n.a. | n.a. | n.a. |
Italian | Food (Wordnet) | 0.32 | 0.69 | n.a. | n.a. | n.a. |
Italian | Science | 0.39 | 0.90 | n.a. | n.a. | n.a. |
Italian | Science (Eurovoc) | 0.25 | 0.82 | n.a. | n.a. | n.a. |
Italian | Science (Wordnet) | 0.24 | 0.84 | n.a. | n.a. | n.a. |
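The sampling step behind this table can be sketched as follows. It assumes that a "novel" relation is a system edge absent from the gold standard and that annotator judgements are recorded as booleans; the function names, the fixed seed and the toy data are hypothetical.

```python
import random


def sample_novel_edges(system_edges, gold_edges, k=100, seed=0):
    """Draw at most k random system edges that are not in the gold standard."""
    novel = sorted(set(system_edges) - set(gold_edges))
    if len(novel) <= k:
        return novel
    return random.Random(seed).sample(novel, k)


def manual_precision(judgements):
    """Fraction of sampled edges that annotators judged correct."""
    return sum(judgements) / len(judgements) if judgements else 0.0


# Toy data: 10 gold edges, and a system that outputs them plus 150 novel edges.
gold = {(f"g{i}", "root") for i in range(10)}
system = gold | {(f"n{i}", "root") for i in range(150)}
sample = sample_novel_edges(system, gold)
print(len(sample), manual_precision([True, False, True, True]))
```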