Official Results

English STS
Results from the 2016 English Semantic Textual Similarity (STS) shared task.
Scores are Pearson correlations with the gold standard on each evaluation set: answer-answer
(Ans.-Ans.), headlines (HDL), plagiarism, postediting, and question-question (Ques.-Ques.).
The baseline run by the organizers is marked with a † symbol (at rank 100). Late or corrected systems
are marked with a * symbol.
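For reference, the sketch below shows one way such a score line can be recomputed. It assumes that
each per-set figure is the Pearson correlation between system and gold scores and that the ALL column
is the size-weighted mean over the five sets; the correlations and set sizes in the example are
placeholders, not values taken from the table or the official pair counts.

    import numpy as np

    def pearson(gold, system):
        # Pearson correlation between gold similarity scores and system scores
        return float(np.corrcoef(gold, system)[0, 1])

    def all_score(per_set_r, per_set_sizes):
        # Size-weighted mean of per-set correlations (assumed weighting scheme)
        r = np.asarray(per_set_r, dtype=float)
        n = np.asarray(per_set_sizes, dtype=float)
        return float(np.sum(r * n) / np.sum(n))

    # Toy usage: Pearson on a handful of made-up sentence pairs
    gold = [4.0, 2.5, 0.0, 5.0]
    sys_scores = [3.8, 2.0, 1.0, 4.5]
    print(round(pearson(gold, sys_scores), 5))

    # Toy ALL score: five per-set correlations with placeholder set sizes
    r_values = [0.69, 0.83, 0.84, 0.84, 0.69]
    sizes = [250, 250, 230, 240, 210]  # placeholders, not the official counts
    print(round(all_score(r_values, sizes), 5))
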
Team Run ALL Ans.-Ans. HDL Plagiarism Postediting Ques.-Ques. Run Rank Team Rank
Samsung Poland NLP Team EN1 0.77807 0.69235 0.82749 0.84138 0.83516 0.68705 1 1
UWB sup-general 0.75731 0.62148 0.81886 0.82355 0.82085 0.70199 2 2
MayoNLPTeam Run3 0.75607 0.61426 0.77263 0.80500 0.84840 0.74705 3 3
Samsung Poland NLP Team EN2 0.75468 0.69235 0.82749 0.81288 0.83516 0.58567 4  
NaCTeM micro+macro 0.74865 0.60237 0.80460 0.81478 0.82858 0.69367 5 4
ECNU S1-All 0.75079 0.56979 0.81214 0.82503 0.82342 0.73116   5*   4*
UMD-TTIC-UW Run1 0.74201 0.66074 0.79457 0.81541 0.80939 0.61872 6 5
SimiHawk Ensemble 0.73774 0.59237 0.81419 0.80566 0.82179 0.65048 7 6
MayoNLPTeam Run2 0.73569 0.57739 0.75061 0.80068 0.82857 0.73035 8  
Samsung Poland NLP Team AE 0.73566 0.65769 0.81801 0.81288 0.78849 0.58567 9  
DLS@CU Run1 0.73563 0.55230 0.80079 0.82293 0.84258 0.65986 10 7
DLS@CU Run3 0.73550 0.54528 0.80334 0.81949 0.84418 0.66657 11  
DTSim Run1 0.73493 0.57805 0.81527 0.83757 0.82286 0.61428 12 8
NaCTeM macro 0.73391 0.58484 0.79756 0.78949 0.82614 0.67039 13  
DLS@CU Run2 0.73297 0.55992 0.80334 0.81227 0.84418 0.64234 14  
Stasis xgboost 0.73050 0.50628 0.77824 0.82501 0.84861 0.70424 15 9
IHS-RD-Belarus Run1 0.72966 0.55322 0.82419 0.82634 0.83761 0.59904 16 10
USFD COMB-Features 0.72869 0.50850 0.82024 0.83828 0.79496 0.68926 17 11
USFD CNN 0.72705 0.51096 0.81899 0.83427 0.79268 0.68551 18  
saarsheff MT-Metrics-xgboost 0.72693 0.47716 0.78848 0.83212 0.84960 0.69815 19 12
MayoNLPTeam Run1 0.72646 0.58873 0.73458 0.76887 0.85020 0.69306 20  
UWB unsup 0.72622 0.64442 0.79352 0.82742 0.81209 0.53383 21  
UMD-TTIC-UW Run2 0.72619 0.64427 0.78708 0.79894 0.79338 0.59468 22  
SERGIOJIMENEZ Run2 0.72617 0.55257 0.78304 0.81505 0.81634 0.66630 23 13
IHS-RD-Belarus Run2 0.72465 0.53722 0.82539 0.82558 0.83654 0.59072 24  
DTSim Run3 0.72414 0.56189 0.81237 0.83239 0.81498 0.59103 25  
ECNU U-SEVEN 0.72427 0.47748 0.76681 0.83013 0.84239 0.71914   25*  
SERGIOJIMENEZ Run1 0.72411 0.50182 0.78646 0.83654 0.83638 0.66519 26  
NaCTeM Micro 0.72361 0.55214 0.79143 0.83134 0.82660 0.61241 27  
SERGIOJIMENEZ Run3 0.72215 0.49068 0.77725 0.82926 0.84807 0.67291 28  
DTSim Run2 0.72016 0.55042 0.79499 0.82815 0.81508 0.60766 29  
DCU-SEManiacs Fusion 0.71701 0.58328 0.76392 0.81386 0.84662 0.56576 30 14
DCU-SEManiacs Synthetic 0.71334 0.68762 0.72227 0.81935 0.80900 0.50560 31  
RICOH Run-b 0.71165 0.50871 0.78691 0.82661 0.86554 0.56245 32 15
ECNU S2 0.71175 0.57158 0.79036 0.77338 0.74968 0.67635   32*  
HHU Overlap 0.71134 0.50435 0.77406 0.83049 0.83846 0.60867 33 16
UMD-TTIC-UW Run3 0.71112 0.64316 0.77801 0.78158 0.77786 0.55855 34  
University of Birmingham CombineFeatures 0.70940 0.52460 0.81894 0.82066 0.81272 0.56040 35 17
University of Birmingham MethodsFeatures 0.70911 0.52028 0.81894 0.81958 0.81333 0.56451 36  
SimiHawk F 0.70647 0.44003 0.77109 0.81105 0.81600 0.71035 37  
UWB sup-try 0.70542 0.53333 0.77846 0.74673 0.78507 0.68909 38  
Stasis boostedtrees 0.70496 0.40791 0.77276 0.82903 0.84635 0.68359 39  
RICOH Run-n 0.70467 0.50746 0.77409 0.82248 0.86690 0.54261 40  
Stasis linear 0.70461 0.36929 0.76660 0.82730 0.83917 0.74615 41  
RICOH Run-s 0.70420 0.51293 0.78000 0.82991 0.86252 0.52319 42  
University of Birmingham CombineNoFeatures 0.70168 0.55217 0.82352 0.82406 0.80835 0.47904 43  
MathLingBudapest Run1 0.70025 0.40540 0.81187 0.80752 0.83767 0.64712 44 18
ISCAS_NLP S1 0.69996 0.49378 0.79763 0.81933 0.81185 0.57218 45 19
ISCAS_NLP S3 0.69996 0.49378 0.79763 0.81933 0.81185 0.57218 46  
UNBNLP Regression 0.69940 0.55254 0.71353 0.79769 0.81291 0.62037 47 20
DCU-SEManiacs task-internal 0.69924 0.62702 0.71949 0.80783 0.80854 0.51580 48  
MathLingBudapest Run2 0.69853 0.40540 0.80367 0.80752 0.83767 0.64712 49  
MathLingBudapest Run3 0.69853 0.40540 0.80366 0.80752 0.83767 0.64712 50  
ISCAS_NLP S2 0.69756 0.49651 0.79041 0.81214 0.81181 0.57181 51  
UNBNLP Average 0.69635 0.58520 0.69006 0.78923 0.82540 0.58605 52  
NUIG-UNLP m5all3 0.69528 0.40165 0.75400 0.80332 0.81606 0.72228 53 21
wolvesaar xgboost 0.69471 0.49947 0.72410 0.79076 0.84093 0.62055 54 22
wolvesaar lotsa-embeddings 0.69453 0.49415 0.71439 0.79655 0.83758 0.63509 55  
Meiji-WSL-A Run1 0.69435 0.58260 0.74394 0.79234 0.85962 0.47030 56 23
saarsheff MT-Metrics-boostedtrees 0.69259 0.37717 0.77183 0.81529 0.84528 0.66825 57  
wolvesaar DLS-replica 0.69244 0.48799 0.71043 0.80605 0.84601 0.61515 58  
saarsheff MT-Metrics-linear 0.68923 0.31539 0.76551 0.82063 0.83329 0.73987 59  
EECS Run2 0.68430 0.48013 0.77891 0.76676 0.82965 0.55926 60 24
NUIG-UNLP m5dom1 0.68368 0.41211 0.76778 0.75539 0.80086 0.69782 61  
EECS Run3 0.67906 0.47818 0.77719 0.77266 0.83744 0.51840 62  
PKU Run1 0.67852 0.47469 0.77881 0.77479 0.81472 0.54180 63 25
EECS Run1 0.67711 0.48110 0.77739 0.76747 0.83270 0.51479 64  
PKU Run2 0.67503 0.47444 0.77703 0.78119 0.83051 0.49892 65  
HHU SameWordsNeuralNet 0.67502 0.42673 0.75536 0.79964 0.84514 0.54533 66  
PKU Run3 0.67209 0.47271 0.77367 0.77580 0.81185 0.51611 67  
NSH Run1 0.66181 0.39962 0.74549 0.80176 0.79540 0.57080 68 26
RTM SVR 0.66847 0.44865 0.66338 0.80376 0.81327 0.62374   68*   26*
Meiji-WSL-A Run2 0.65871 0.51675 0.58561 0.78700 0.81873 0.59035 69  
BIT Align 0.65318 0.54530 0.78140 0.80473 0.79456 0.29972 70 27
UNBNLP tf-idf 0.65271 0.45928 0.66593 0.75778 0.77204 0.61710 71  
UTA_MLNLP 100-1 0.64965 0.46391 0.74499 0.74003 0.71947 0.58083 72 28
RTM FS+PLS-SVR 0.65237 0.35333 0.65294 0.80488 0.82304 0.64803   72*  
RTM PLS-SVR 0.65182 0.34401 0.66051 0.80641 0.82314 0.64544   72*  
SimiHawk LSTM 0.64840 0.44177 0.75703 0.71737 0.72317 0.60691 73  
BIT VecSim 0.64661 0.48863 0.62804 0.80106 0.79544 0.51702 74  
NUIG-UNLP m5dom2 0.64520 0.38303 0.76485 0.74351 0.76549 0.57263 75  
UTA_MLNLP 150-1 0.64500 0.43042 0.72133 0.71620 0.74471 0.62006 76  
SimiHawk TreeLSTM 0.64140 0.52277 0.74083 0.67628 0.70655 0.55265 77  
NORMAS SV-2 0.64078 0.36583 0.68864 0.74647 0.80234 0.61300 78 29
UTA_MLNLP 150-3 0.63698 0.41871 0.72485 0.70296 0.69652 0.65543 79  
LIPN-IIMAS SOPA 0.63087 0.44901 0.62411 0.69109 0.79864 0.59779 80 30
NORMAS ECV-3 0.63072 0.27637 0.72245 0.72496 0.79797 0.65312 81  
JUNITMZ Backpropagation-1 0.62708 0.48023 0.70749 0.72075 0.77196 0.43751 82 31
NSH Run2 0.62941 0.34172 0.74977 0.75858 0.82471 0.46548   82*  
LIPN-IIMAS SOPA1000 0.62466 0.44893 0.59721 0.75936 0.76157 0.56285 83  
USFD Word2Vec 0.62254 0.27675 0.64217 0.78755 0.75057 0.68833 84  
HHU DeepLDA 0.62078 0.47211 0.58821 0.62503 0.84743 0.57099 85  
ASOBEK T11 0.61782 0.52277 0.63741 0.78521 0.84245 0.26352 86 32
ASOBEK M11 0.61430 0.47916 0.68652 0.77779 0.84089 0.24804 87  
LIPN-IIMAS SOPA100 0.61321 0.43216 0.58499 0.74727 0.75560 0.55310 88  
Telkom University WA 0.60912 0.28859 0.69988 0.69090 0.74654 0.64009 89 33
BIT WeightedVecSim 0.59560 0.37565 0.55925 0.75594 0.77835 0.51643 90  
3CFEE grumlp 0.59603 0.36521 0.72092 0.74210 0.76327 0.37179   90*   34*
ASOBEK F1 0.59556 0.42692 0.67898 0.75717 0.81950 0.26181 91  
JUNITMZ Recurrent-1 0.59493 0.44218 0.66120 0.73708 0.69279 0.43092 92  
VRep withLeven 0.58292 0.30617 0.68745 0.69762 0.73033 0.49639 93 34
Meiji_WSL_teamC Run1 0.58169 0.53250 0.64567 0.74233 0.54783 0.42797 94 35
JUNITMZ FeedForward-1 0.58109 0.40859 0.66524 0.76752 0.66522 0.38711 95  
VRep withStopRem 0.57805 0.29487 0.68185 0.69730 0.72966 0.49029 96  
VRep noStopRem 0.55894 0.34684 0.67856 0.69768 0.74088 0.30908 97  
UNCC Run-3 0.55789 0.31111 0.59592 0.64672 0.75544 0.48409 98 36
Telkom University CS 0.51602 0.06623 0.72668 0.50534 0.73999 0.56194 99  
NORMAS RF-1 0.50895 0.16095 0.58800 0.62134 0.72016 0.46743 100  
STS Organizers baseline 0.51334 0.41133 0.54073 0.69601 0.82615 0.03844   100† 37†
meijiuniversity_teamb 4features_^LCS_pos 0.45748 0.26182 0.60223 0.55931 0.66989 0.16276 101 37
meijiuniversity_teamb 5features 0.45656 0.26498 0.60000 0.55231 0.66894 0.16518 102  
meijiuniversity_teamb 4features_^LCS 0.45621 0.26588 0.59868 0.55687 0.66690 0.16103 103  
Amrita_CEN SEWE-2 0.40951 0.30309 0.43164 0.63336 0.66465 -0.03174 104 38
VENSESEVAL Run1 0.44258 0.41932 0.62738 0.64538 0.27274 0.22581   104*   38*
UNCC Run-1 0.40359 0.19537 0.49344 0.38881 0.59235 0.34548 105  
UNCC Run-2 0.37956 0.18100 0.50924 0.26190 0.56176 0.38317 106  
WHU_NLP CNN20 0.11100 -0.02488 0.16026 0.16854 0.24153 0.00176 107 39
3CFEE bowkst 0.33174 0.20076 0.42288 0.50150 0.19111 0.35970   107*  
WHU_NLP CNN10 0.09814 0.05402 0.19650 0.08949 0.16152 -0.02989 108  
DalGTM Run1 0.05354 -0.00557 0.29896 -0.07016 -0.04933 0.08924 109 40
DalGTM Run2 0.05136 -0.00810 0.28962 -0.07032 -0.04818 0.08987 110  
DalGTM Run3 0.05027 -0.00780 0.28501 -0.07134 -0.05014 0.09223 111  
WHU_NLP CNN5 0.04713 0.00041 0.13780 0.03264 0.12669 -0.08105 112  
IHS-RD-Belarus Run3         0.83761   113  

Cross-lingual STS
Results from the 2016 Cross-lingual Semantic Textual Similarity (STS) shared task.

The main evaluation column is "Mean". The "Run Ranking" column orders submissions by their
"Mean" score; the "Team Ranking" column gives each team's rank, with all of a team's runs
sharing the rank of the team's best submission.
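As an illustration of how the two rank columns can be derived, here is a minimal sketch. It assumes
runs are sorted by their Mean score in descending order and that a team's rank follows its
best-ranked run; the team and run names in the example are made up.

    def rank_runs(results):
        # results: list of dicts with keys "team", "run", "mean"
        ordered = sorted(results, key=lambda r: r["mean"], reverse=True)
        team_rank = {}
        for position, run in enumerate(ordered, start=1):
            run["run_ranking"] = position
            # a team's rank is set by its best (first-seen) run
            if run["team"] not in team_rank:
                team_rank[run["team"]] = len(team_rank) + 1
            run["team_ranking"] = team_rank[run["team"]]
        return ordered

    # Toy example with two teams and three runs
    runs = [
        {"team": "A", "run": "r1", "mean": 0.86},
        {"team": "B", "run": "r1", "mean": 0.85},
        {"team": "A", "run": "r2", "mean": 0.84},
    ]
    for r in rank_runs(runs):
        print(r["team"], r["run"], r["run_ranking"], r["team_ranking"])
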

Team/Run News Multi-source Mean Run Ranking Team Ranking
UWB-sup  0.90621 0.81899 0.86311 1 1
UWB-unsup  0.91237 0.80818 0.86089 2 1
SERGIOJIMENEZ-run1  0.88724 0.81836 0.85321 3 2
SERGIOJIMENEZ-run3  0.89651 0.80737 0.85246 4 2
DCU-SEManiacs-run2  0.89739 0.79262 0.84562 5 3
DCU-SEManiacs-run1  0.89408 0.76927 0.83241 6 3
SERGIOJIMENEZ-run2  0.82911 0.81266 0.82098 7 2
GWU_NLP-run2_xWMF  0.86768 0.73189 0.80058 8 4
GWU_NLP-run1_xWMF  0.86626 0.6985 0.78337 9 4
GWU_NLP-run3_bWMF  0.83474 0.72427 0.78015 10 4
CNRC-MT1  0.87562 0.64583 0.76208 11 5
CNRC-MT2  0.87754 0.63141 0.75592 12 5
CNRC-EMAP  0.71943 0.41054 0.5668 13 5
RTM-FS+PLS-SVR  0.59154 0.52044 0.55641 14 6
RTM-FS-SVR  0.53602 0.52839 0.53225 15 6
RTM-SVR  0.49849 0.52935 0.51374 16 6
FBK_HLT-MT-run3  0.25507 0.53892 0.39533 17 7
FBK_HLT-MT-run1  0.24318 0.53465 0.3872 18 7
FBK_HLT-MT-run2  0.24372 0.5142 0.37737 19 7
LIPN-IIMAS-sopa  0.08648 0.15931 0.12247 20 8
WHU_NLP-CNN10  0.03337 0.04083 0.03706 21 9
WHU_NLP-CNN5  0 0.05512 0.02724 22 9
WHU_NLP-CNN20  0.02428 -0.06355 -0.01912 23 9
JUNITMZ-backpropagation-1  -0.05676 -0.48389 -0.26781 24 10
JUNITMZ-backpropagation-3  -0.34725 -0.39867 -0.37266 25 10
JUNITMZ-backpropagation-2  -0.52951 -0.39891 -0.46498 26 10

Contact Info

STS Core

Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, and Aitor Gonzalez-Agirre

Cross-lingual STS

Carmen Banea, Daniel Cer, Rada Mihalcea, Janyce Wiebe

Wiki: STS Wiki
Discussion Group: STS-semeval

Other Info

Announcements

  • The official cross-lingual STS results have been posted!
  • The gold standard cross-lingual STS files have been released!