Official Results
- English STS (posted May 20, 2016)
- Cross-lingual STS (posted Feb 23, 2016, updated Feb 24, 2016 3:45 CST)
English STS
Results from the 2016 English Semantic Textual Similarity (STS) shared task.
The baseline run by the organizers is marked with a † symbol (at rank 100). Late or corrected
submissions are marked with a * symbol; they appear immediately after the run whose rank they
share and do not affect the official numbering. A team's rank is shown only on its best run.
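
The "ALL" column below is the pair-count-weighted mean of the five per-dataset Pearson correlations. Here is a minimal sketch of that aggregation, assuming the 2016 evaluation-set sizes (254 answer-answer, 249 headlines, 230 plagiarism, 244 postediting, and 209 question-question pairs); these counts are not restated on this page, so treat them as assumptions, although they do reproduce the published "ALL" values:

```python
# Hedged sketch: recompute the ALL column as a pair-count-weighted mean of
# the per-dataset Pearson correlations. The pair counts are assumptions
# (not restated on this page) chosen to match the 2016 evaluation sets.
PAIR_COUNTS = {
    "answer-answer": 254,
    "headlines": 249,
    "plagiarism": 230,
    "postediting": 244,
    "question-question": 209,
}

def weighted_all(per_dataset_r):
    """Weight each dataset's Pearson r by its number of sentence pairs."""
    total = sum(PAIR_COUNTS.values())
    return sum(PAIR_COUNTS[name] * r for name, r in per_dataset_r.items()) / total

# Samsung Poland NLP Team, run EN1 (rank 1 in the table below):
en1 = {
    "answer-answer": 0.69235,
    "headlines": 0.82749,
    "plagiarism": 0.84138,
    "postediting": 0.83516,
    "question-question": 0.68705,
}
print(round(weighted_all(en1), 5))  # 0.77807, matching the ALL column
```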
Team | Run | ALL | Answer-Answer | Headlines | Plagiarism | Postediting | Question-Question | Run Rank | Team Rank |
---|---|---|---|---|---|---|---|---|---|
Samsung Poland NLP Team | EN1 | 0.77807 | 0.69235 | 0.82749 | 0.84138 | 0.83516 | 0.68705 | 1 | 1 |
UWB | sup-general | 0.75731 | 0.62148 | 0.81886 | 0.82355 | 0.82085 | 0.70199 | 2 | 2 |
MayoNLPTeam | Run3 | 0.75607 | 0.61426 | 0.77263 | 0.80500 | 0.84840 | 0.74705 | 3 | 3 |
Samsung Poland NLP Team | EN2 | 0.75468 | 0.69235 | 0.82749 | 0.81288 | 0.83516 | 0.58567 | 4 | |
NaCTeM | micro+macro | 0.74865 | 0.60237 | 0.80460 | 0.81478 | 0.82858 | 0.69367 | 5 | 4 |
ECNU | S1-All | 0.75079 | 0.56979 | 0.81214 | 0.82503 | 0.82342 | 0.73116 | 5* | 4* |
UMD-TTIC-UW | Run1 | 0.74201 | 0.66074 | 0.79457 | 0.81541 | 0.80939 | 0.61872 | 6 | 5 |
SimiHawk | Ensemble | 0.73774 | 0.59237 | 0.81419 | 0.80566 | 0.82179 | 0.65048 | 7 | 6 |
MayoNLPTeam | Run2 | 0.73569 | 0.57739 | 0.75061 | 0.80068 | 0.82857 | 0.73035 | 8 | |
Samsung Poland NLP Team | AE | 0.73566 | 0.65769 | 0.81801 | 0.81288 | 0.78849 | 0.58567 | 9 | |
DLS@CU | Run1 | 0.73563 | 0.55230 | 0.80079 | 0.82293 | 0.84258 | 0.65986 | 10 | 7 |
DLS@CU | Run3 | 0.73550 | 0.54528 | 0.80334 | 0.81949 | 0.84418 | 0.66657 | 11 | |
DTSim | Run1 | 0.73493 | 0.57805 | 0.81527 | 0.83757 | 0.82286 | 0.61428 | 12 | 8 |
NaCTeM | macro | 0.73391 | 0.58484 | 0.79756 | 0.78949 | 0.82614 | 0.67039 | 13 | |
DLS@CU | Run2 | 0.73297 | 0.55992 | 0.80334 | 0.81227 | 0.84418 | 0.64234 | 14 | |
Stasis | xgboost | 0.73050 | 0.50628 | 0.77824 | 0.82501 | 0.84861 | 0.70424 | 15 | 9 |
IHS-RD-Belarus | Run1 | 0.72966 | 0.55322 | 0.82419 | 0.82634 | 0.83761 | 0.59904 | 16 | 10 |
USFD | COMB-Features | 0.72869 | 0.50850 | 0.82024 | 0.83828 | 0.79496 | 0.68926 | 17 | 11 |
USFD | CNN | 0.72705 | 0.51096 | 0.81899 | 0.83427 | 0.79268 | 0.68551 | 18 | |
saarsheff | MT-Metrics-xgboost | 0.72693 | 0.47716 | 0.78848 | 0.83212 | 0.84960 | 0.69815 | 19 | 12 |
MayoNLPTeam | Run1 | 0.72646 | 0.58873 | 0.73458 | 0.76887 | 0.85020 | 0.69306 | 20 | |
UWB | unsup | 0.72622 | 0.64442 | 0.79352 | 0.82742 | 0.81209 | 0.53383 | 21 | |
UMD-TTIC-UW | Run2 | 0.72619 | 0.64427 | 0.78708 | 0.79894 | 0.79338 | 0.59468 | 22 | |
SERGIOJIMENEZ | Run2 | 0.72617 | 0.55257 | 0.78304 | 0.81505 | 0.81634 | 0.66630 | 23 | 13 |
IHS-RD-Belarus | Run2 | 0.72465 | 0.53722 | 0.82539 | 0.82558 | 0.83654 | 0.59072 | 24 | |
DTSim | Run3 | 0.72414 | 0.56189 | 0.81237 | 0.83239 | 0.81498 | 0.59103 | 25 | |
ECNU | U-SEVEN | 0.72427 | 0.47748 | 0.76681 | 0.83013 | 0.84239 | 0.71914 | 25* | |
SERGIOJIMENEZ | Run1 | 0.72411 | 0.50182 | 0.78646 | 0.83654 | 0.83638 | 0.66519 | 26 | |
NaCTeM | Micro | 0.72361 | 0.55214 | 0.79143 | 0.83134 | 0.82660 | 0.61241 | 27 | |
SERGIOJIMENEZ | Run3 | 0.72215 | 0.49068 | 0.77725 | 0.82926 | 0.84807 | 0.67291 | 28 | |
DTSim | Run2 | 0.72016 | 0.55042 | 0.79499 | 0.82815 | 0.81508 | 0.60766 | 29 | |
DCU-SEManiacs | Fusion | 0.71701 | 0.58328 | 0.76392 | 0.81386 | 0.84662 | 0.56576 | 30 | 14 |
DCU-SEManiacs | Synthetic | 0.71334 | 0.68762 | 0.72227 | 0.81935 | 0.80900 | 0.50560 | 31 | |
RICOH | Run-b | 0.71165 | 0.50871 | 0.78691 | 0.82661 | 0.86554 | 0.56245 | 32 | 15 |
ECNU | S2 | 0.71175 | 0.57158 | 0.79036 | 0.77338 | 0.74968 | 0.67635 | 32* | |
HHU | Overlap | 0.71134 | 0.50435 | 0.77406 | 0.83049 | 0.83846 | 0.60867 | 33 | 16 |
UMD-TTIC-UW | Run3 | 0.71112 | 0.64316 | 0.77801 | 0.78158 | 0.77786 | 0.55855 | 34 | |
University of Birmingham | CombineFeatures | 0.70940 | 0.52460 | 0.81894 | 0.82066 | 0.81272 | 0.56040 | 35 | 17 |
University of Birmingham | MethodsFeatures | 0.70911 | 0.52028 | 0.81894 | 0.81958 | 0.81333 | 0.56451 | 36 | |
SimiHawk | F | 0.70647 | 0.44003 | 0.77109 | 0.81105 | 0.81600 | 0.71035 | 37 | |
UWB | sup-try | 0.70542 | 0.53333 | 0.77846 | 0.74673 | 0.78507 | 0.68909 | 38 | |
Stasis | boostedtrees | 0.70496 | 0.40791 | 0.77276 | 0.82903 | 0.84635 | 0.68359 | 39 | |
RICOH | Run-n | 0.70467 | 0.50746 | 0.77409 | 0.82248 | 0.86690 | 0.54261 | 40 | |
Stasis | linear | 0.70461 | 0.36929 | 0.76660 | 0.82730 | 0.83917 | 0.74615 | 41 | |
RICOH | Run-s | 0.70420 | 0.51293 | 0.78000 | 0.82991 | 0.86252 | 0.52319 | 42 | |
University of Birmingham | CombineNoFeatures | 0.70168 | 0.55217 | 0.82352 | 0.82406 | 0.80835 | 0.47904 | 43 | |
MathLingBudapest | Run1 | 0.70025 | 0.40540 | 0.81187 | 0.80752 | 0.83767 | 0.64712 | 44 | 18 |
ISCAS_NLP | S1 | 0.69996 | 0.49378 | 0.79763 | 0.81933 | 0.81185 | 0.57218 | 45 | 19 |
ISCAS_NLP | S3 | 0.69996 | 0.49378 | 0.79763 | 0.81933 | 0.81185 | 0.57218 | 46 | |
UNBNLP | Regression | 0.69940 | 0.55254 | 0.71353 | 0.79769 | 0.81291 | 0.62037 | 47 | 20 |
DCU-SEManiacs | task-internal | 0.69924 | 0.62702 | 0.71949 | 0.80783 | 0.80854 | 0.51580 | 48 | |
MathLingBudapest | Run2 | 0.69853 | 0.40540 | 0.80367 | 0.80752 | 0.83767 | 0.64712 | 49 | |
MathLingBudapest | Run3 | 0.69853 | 0.40540 | 0.80366 | 0.80752 | 0.83767 | 0.64712 | 50 | |
ISCAS_NLP | S2 | 0.69756 | 0.49651 | 0.79041 | 0.81214 | 0.81181 | 0.57181 | 51 | |
UNBNLP | Average | 0.69635 | 0.58520 | 0.69006 | 0.78923 | 0.82540 | 0.58605 | 52 | |
NUIG-UNLP | m5all3 | 0.69528 | 0.40165 | 0.75400 | 0.80332 | 0.81606 | 0.72228 | 53 | 21 |
wolvesaar | xgboost | 0.69471 | 0.49947 | 0.72410 | 0.79076 | 0.84093 | 0.62055 | 54 | 22 |
wolvesaar | lotsa-embeddings | 0.69453 | 0.49415 | 0.71439 | 0.79655 | 0.83758 | 0.63509 | 55 | |
Meiji-WSL-A | Run1 | 0.69435 | 0.58260 | 0.74394 | 0.79234 | 0.85962 | 0.47030 | 56 | 23 |
saarsheff | MT-Metrics-boostedtrees | 0.69259 | 0.37717 | 0.77183 | 0.81529 | 0.84528 | 0.66825 | 57 | |
wolvesaar | DLS-replica | 0.69244 | 0.48799 | 0.71043 | 0.80605 | 0.84601 | 0.61515 | 58 | |
saarsheff | MT-Metrics-linear | 0.68923 | 0.31539 | 0.76551 | 0.82063 | 0.83329 | 0.73987 | 59 | |
EECS | Run2 | 0.68430 | 0.48013 | 0.77891 | 0.76676 | 0.82965 | 0.55926 | 60 | 24 |
NUIG-UNLP | m5dom1 | 0.68368 | 0.41211 | 0.76778 | 0.75539 | 0.80086 | 0.69782 | 61 | |
EECS | Run3 | 0.67906 | 0.47818 | 0.77719 | 0.77266 | 0.83744 | 0.51840 | 62 | |
PKU | Run1 | 0.67852 | 0.47469 | 0.77881 | 0.77479 | 0.81472 | 0.54180 | 63 | 25 |
EECS | Run1 | 0.67711 | 0.48110 | 0.77739 | 0.76747 | 0.83270 | 0.51479 | 64 | |
PKU | Run2 | 0.67503 | 0.47444 | 0.77703 | 0.78119 | 0.83051 | 0.49892 | 65 | |
HHU | SameWordsNeuralNet | 0.67502 | 0.42673 | 0.75536 | 0.79964 | 0.84514 | 0.54533 | 66 | |
PKU | Run3 | 0.67209 | 0.47271 | 0.77367 | 0.77580 | 0.81185 | 0.51611 | 67 | |
NSH | Run1 | 0.66181 | 0.39962 | 0.74549 | 0.80176 | 0.79540 | 0.57080 | 68 | 26 |
RTM | SVR | 0.66847 | 0.44865 | 0.66338 | 0.80376 | 0.81327 | 0.62374 | 68* | 26* |
Meiji-WSL-A | Run2 | 0.65871 | 0.51675 | 0.58561 | 0.78700 | 0.81873 | 0.59035 | 69 | |
BIT | Align | 0.65318 | 0.54530 | 0.78140 | 0.80473 | 0.79456 | 0.29972 | 70 | 27 |
UNBNLP | tf-idf | 0.65271 | 0.45928 | 0.66593 | 0.75778 | 0.77204 | 0.61710 | 71 | |
UTA_MLNLP | 100-1 | 0.64965 | 0.46391 | 0.74499 | 0.74003 | 0.71947 | 0.58083 | 72 | 28 |
RTM | FS+PLS-SVR | 0.65237 | 0.35333 | 0.65294 | 0.80488 | 0.82304 | 0.64803 | 72* | |
RTM | PLS-SVR | 0.65182 | 0.34401 | 0.66051 | 0.80641 | 0.82314 | 0.64544 | 72* | |
SimiHawk | LSTM | 0.64840 | 0.44177 | 0.75703 | 0.71737 | 0.72317 | 0.60691 | 73 | |
BIT | VecSim | 0.64661 | 0.48863 | 0.62804 | 0.80106 | 0.79544 | 0.51702 | 74 | |
NUIG-UNLP | m5dom2 | 0.64520 | 0.38303 | 0.76485 | 0.74351 | 0.76549 | 0.57263 | 75 | |
UTA_MLNLP | 150-1 | 0.64500 | 0.43042 | 0.72133 | 0.71620 | 0.74471 | 0.62006 | 76 | |
SimiHawk | TreeLSTM | 0.64140 | 0.52277 | 0.74083 | 0.67628 | 0.70655 | 0.55265 | 77 | |
NORMAS | SV-2 | 0.64078 | 0.36583 | 0.68864 | 0.74647 | 0.80234 | 0.61300 | 78 | 29 |
UTA_MLNLP | 150-3 | 0.63698 | 0.41871 | 0.72485 | 0.70296 | 0.69652 | 0.65543 | 79 | |
LIPN-IIMAS | SOPA | 0.63087 | 0.44901 | 0.62411 | 0.69109 | 0.79864 | 0.59779 | 80 | 30 |
NORMAS | ECV-3 | 0.63072 | 0.27637 | 0.72245 | 0.72496 | 0.79797 | 0.65312 | 81 | |
JUNITMZ | Backpropagation-1 | 0.62708 | 0.48023 | 0.70749 | 0.72075 | 0.77196 | 0.43751 | 82 | 31 |
NSH | Run2 | 0.62941 | 0.34172 | 0.74977 | 0.75858 | 0.82471 | 0.46548 | 82* | |
LIPN-IIMAS | SOPA1000 | 0.62466 | 0.44893 | 0.59721 | 0.75936 | 0.76157 | 0.56285 | 83 | |
USFD | Word2Vec | 0.62254 | 0.27675 | 0.64217 | 0.78755 | 0.75057 | 0.68833 | 84 | |
HHU | DeepLDA | 0.62078 | 0.47211 | 0.58821 | 0.62503 | 0.84743 | 0.57099 | 85 | |
ASOBEK | T11 | 0.61782 | 0.52277 | 0.63741 | 0.78521 | 0.84245 | 0.26352 | 86 | 32 |
ASOBEK | M11 | 0.61430 | 0.47916 | 0.68652 | 0.77779 | 0.84089 | 0.24804 | 87 | |
LIPN-IIMAS | SOPA100 | 0.61321 | 0.43216 | 0.58499 | 0.74727 | 0.75560 | 0.55310 | 88 | |
Telkom University | WA | 0.60912 | 0.28859 | 0.69988 | 0.69090 | 0.74654 | 0.64009 | 89 | 33 |
BIT | WeightedVecSim | 0.59560 | 0.37565 | 0.55925 | 0.75594 | 0.77835 | 0.51643 | 90 | |
3CFEE | grumlp | 0.59603 | 0.36521 | 0.72092 | 0.74210 | 0.76327 | 0.37179 | 90* | 34* |
ASOBEK | F1 | 0.59556 | 0.42692 | 0.67898 | 0.75717 | 0.81950 | 0.26181 | 91 | |
JUNITMZ | Recurrent-1 | 0.59493 | 0.44218 | 0.66120 | 0.73708 | 0.69279 | 0.43092 | 92 | |
VRep | withLeven | 0.58292 | 0.30617 | 0.68745 | 0.69762 | 0.73033 | 0.49639 | 93 | 34 |
Meiji_WSL_teamC | Run1 | 0.58169 | 0.53250 | 0.64567 | 0.74233 | 0.54783 | 0.42797 | 94 | 35 |
JUNITMZ | FeedForward-1 | 0.58109 | 0.40859 | 0.66524 | 0.76752 | 0.66522 | 0.38711 | 95 | |
VRep | withStopRem | 0.57805 | 0.29487 | 0.68185 | 0.69730 | 0.72966 | 0.49029 | 96 | |
VRep | noStopRem | 0.55894 | 0.34684 | 0.67856 | 0.69768 | 0.74088 | 0.30908 | 97 | |
UNCC | Run-3 | 0.55789 | 0.31111 | 0.59592 | 0.64672 | 0.75544 | 0.48409 | 98 | 36 |
Telkom University | CS | 0.51602 | 0.06623 | 0.72668 | 0.50534 | 0.73999 | 0.56194 | 99 | |
NORMAS | RF-1 | 0.50895 | 0.16095 | 0.58800 | 0.62134 | 0.72016 | 0.46743 | 100 | |
STS Organizers | baseline | 0.51334 | 0.41133 | 0.54073 | 0.69601 | 0.82615 | 0.03844 | 100† | 37† |
meijiuniversity_teamb | 4features_^LCS_pos | 0.45748 | 0.26182 | 0.60223 | 0.55931 | 0.66989 | 0.16276 | 101 | 37 |
meijiuniversity_teamb | 5features | 0.45656 | 0.26498 | 0.60000 | 0.55231 | 0.66894 | 0.16518 | 102 | |
meijiuniversity_teamb | 4features_^LCS | 0.45621 | 0.26588 | 0.59868 | 0.55687 | 0.66690 | 0.16103 | 103 | |
Amrita_CEN | SEWE-2 | 0.40951 | 0.30309 | 0.43164 | 0.63336 | 0.66465 | -0.03174 | 104 | 38 |
VENSESEVAL | Run1 | 0.44258 | 0.41932 | 0.62738 | 0.64538 | 0.27274 | 0.22581 | 104* | 38* |
UNCC | Run-1 | 0.40359 | 0.19537 | 0.49344 | 0.38881 | 0.59235 | 0.34548 | 105 | |
UNCC | Run-2 | 0.37956 | 0.18100 | 0.50924 | 0.26190 | 0.56176 | 0.38317 | 106 | |
WHU_NLP | CNN20 | 0.11100 | -0.02488 | 0.16026 | 0.16854 | 0.24153 | 0.00176 | 107 | 39 |
3CFEE | bowkst | 0.33174 | 0.20076 | 0.42288 | 0.50150 | 0.19111 | 0.35970 | 107* | |
WHU_NLP | CNN10 | 0.09814 | 0.05402 | 0.19650 | 0.08949 | 0.16152 | -0.02989 | 108 | |
DalGTM | Run1 | 0.05354 | -0.00557 | 0.29896 | -0.07016 | -0.04933 | 0.08924 | 109 | 40 |
DalGTM | Run2 | 0.05136 | -0.00810 | 0.28962 | -0.07032 | -0.04818 | 0.08987 | 110 | |
DalGTM | Run3 | 0.05027 | -0.00780 | 0.28501 | -0.07134 | -0.05014 | 0.09223 | 111 | |
WHU_NLP | CNN5 | 0.04713 | 0.00041 | 0.13780 | 0.03264 | 0.12669 | -0.08105 | 112 | |
IHS-RD-Belarus | Run3 | | | | | 0.83761 | | 113 | |
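
Every score in the table above is a Pearson correlation between a system's similarity scores and the gold annotations (both on the task's 0-5 scale). Here is a minimal sketch of the per-dataset evaluation, assuming the gold and system scores are already aligned lists; the values shown are toy data, not task data:

```python
# Hedged sketch: per-dataset evaluation is the Pearson correlation between
# a system's similarity scores and the gold annotations.
from scipy.stats import pearsonr

def evaluate(gold, system):
    """Return Pearson r between gold and system scores for one dataset."""
    r, _p_value = pearsonr(gold, system)
    return r

# Toy values; the real evaluation reads one gold and one system score file.
gold = [5.0, 3.2, 0.0, 4.0, 1.5]
system = [4.8, 2.9, 0.4, 4.1, 1.0]
print(round(evaluate(gold, system), 5))
```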
Cross-lingual STS
Results from the 2016 Cross-lingual Semantic Textual Similarity (STS) shared task.
The main evaluation column is "Mean". The "Run Rank" column gives each submission's rank
ordered by its "Mean" score; a sketch of how "Mean" is computed appears below.
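
The "Mean" column appears to be a pair-count-weighted mean of the "News" and "Multi Source" correlations rather than a simple average. Here is a minimal sketch, assuming 301 news and 294 multi-source pairs; these counts are assumptions inferred from the published scores, not stated on this page:

```python
# Hedged sketch: the Mean column as a pair-count-weighted mean of the two
# per-dataset Pearson correlations. The pair counts (301 news, 294
# multi-source) are assumptions inferred from the published scores.
NEWS_PAIRS, MULTI_PAIRS = 301, 294

def weighted_mean(news_r, multi_r):
    """Weight each dataset's Pearson r by its number of sentence pairs."""
    return (NEWS_PAIRS * news_r + MULTI_PAIRS * multi_r) / (NEWS_PAIRS + MULTI_PAIRS)

# UWB-sup (rank 1 in the table below):
print(round(weighted_mean(0.90621, 0.81899), 5))  # 0.86311, matching Mean
```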
Run | News | Multi Source | Mean | Run Rank | Team Rank |
---|---|---|---|---|---|
UWB-sup | 0.90621 | 0.81899 | 0.86311 | 1 | 1 |
UWB-unsup | 0.91237 | 0.80818 | 0.86089 | 2 | 1 |
SERGIOJIMENEZ-run1 | 0.88724 | 0.81836 | 0.85321 | 3 | 2 |
SERGIOJIMENEZ-run3 | 0.89651 | 0.80737 | 0.85246 | 4 | 2 |
DCU-SEManiacs-run2 | 0.89739 | 0.79262 | 0.84562 | 5 | 3 |
DCU-SEManiacs-run1 | 0.89408 | 0.76927 | 0.83241 | 6 | 3 |
SERGIOJIMENEZ-run2 | 0.82911 | 0.81266 | 0.82098 | 7 | 2 |
GWU_NLP-run2_xWMF | 0.86768 | 0.73189 | 0.80058 | 8 | 4 |
GWU_NLP-run1_xWMF | 0.86626 | 0.69850 | 0.78337 | 9 | 4 |
GWU_NLP-run3_bWMF | 0.83474 | 0.72427 | 0.78015 | 10 | 4 |
CNRC-MT1 | 0.87562 | 0.64583 | 0.76208 | 11 | 5 |
CNRC-MT2 | 0.87754 | 0.63141 | 0.75592 | 12 | 5 |
CNRC-EMAP | 0.71943 | 0.41054 | 0.56680 | 13 | 5 |
RTM-FS+PLS-SVR | 0.59154 | 0.52044 | 0.55641 | 14 | 6 |
RTM-FS-SVR | 0.53602 | 0.52839 | 0.53225 | 15 | 6 |
RTM-SVR | 0.49849 | 0.52935 | 0.51374 | 16 | 6 |
FBK_HLT-MT-run3 | 0.25507 | 0.53892 | 0.39533 | 17 | 7 |
FBK_HLT-MT-run1 | 0.24318 | 0.53465 | 0.38720 | 18 | 7 |
FBK_HLT-MT-run2 | 0.24372 | 0.51420 | 0.37737 | 19 | 7 |
LIPM-IIMAS-sopa | 0.08648 | 0.15931 | 0.12247 | 20 | 8 |
WHU_NLP-CNN10 | 0.03337 | 0.04083 | 0.03706 | 21 | 9 |
WHU_NLP-CNN5 | 0.00000 | 0.05512 | 0.02724 | 22 | 9 |
WHU_NLP-CNN20 | 0.02428 | -0.06355 | -0.01912 | 23 | 9 |
JUNITMZ-backpropagation | -0.05676 | -0.48389 | -0.26781 | 24 | 10 |
JUNITMZ-backpropagation | -0.34725 | -0.39867 | -0.37266 | 25 | 10 |
JUNITMZ-backpropagation | -0.52951 | -0.39891 | -0.46498 | 26 | 10 |
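
In both tables, "Team Rank" counts teams in order of their best run: a team's rank is set by its first-appearing run, and later runs from the same team leave the column blank (English table) or repeat it (cross-lingual table). Here is a minimal sketch deriving team ranks from a run list already sorted by the main score:

```python
# Hedged sketch: assign each team the next ordinal the first time one of its
# runs appears in the score-sorted run list, as in the tables above.
def team_ranks(sorted_run_teams):
    ranks = {}
    for team in sorted_run_teams:
        if team not in ranks:
            ranks[team] = len(ranks) + 1
    return ranks

# Teams of the first six cross-lingual runs above:
runs = ["UWB", "UWB", "SERGIOJIMENEZ", "SERGIOJIMENEZ",
        "DCU-SEManiacs", "DCU-SEManiacs"]
print(team_ranks(runs))  # {'UWB': 1, 'SERGIOJIMENEZ': 2, 'DCU-SEManiacs': 3}
```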