Results
Official results
The main evaluation column is "Mean". The "Rank" column gives each submission's rank when submissions are ordered by "Mean", highest first.
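
The ordering described above can be sketched as follows. This is a minimal illustration, not the organizers' scoring script: the function name `rank_by_mean` is hypothetical, the four Mean values are copied from the English STS table, and the tie handling (tied runs keep their input order) is an assumption, since the tables do not state how identical means are ordered.

```python
def rank_by_mean(means):
    """Return {run: rank}, where rank 1 is the highest Mean.

    Assumption: tied runs keep their input order (sorted() is stable).
    """
    ordered = sorted(means.items(), key=lambda item: item[1], reverse=True)
    return {run: position for position, (run, _) in enumerate(ordered, start=1)}


# Mean values copied from four rows of the English STS table.
means = {
    "Samsung-alpha": 0.7920,
    "DLS@CU-S2": 0.7921,
    "ExBThemis-themisexp": 0.7942,
    "DLS@CU-S1": 0.8015,
}

ranks = rank_by_mean(means)
for run, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(rank, run)
```

For these four runs the sketch reproduces the ranks 1 to 4 listed in the table.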
English STS
RUN | answers-forums | answers-students | belief | headlines | images | Mean | Rank |
---|---|---|---|---|---|---|---|
Baseline-tokencos | 0.4453 | 0.6647 | 0.6517 | 0.5312 | 0.6039 | 0.5871 | 61 |
A96T-RUN1 | 0.6686 | 0.7192 | 0.7117 | 0.7357 | 0.7896 | 0.7337 | 29 |
ASAP-FIRSTRUN | 0.2304 | 0.6503 | 0.3928 | 0.6614 | 0.6548 | 0.5695 | 63 |
ASAP-SECONDRUN | 0.2374 | 0.7095 | 0.3986 | 0.7039 | 0.7294 | 0.6152 | 56 |
**ASAP-THIRDRUN | 0.2303 | 0.6719 | 0.4342 | 0.7156 | 0.7250 | 0.6112 | 57 |
AZMAT-RUNABS | 0.3099 | 0.4282 | 0.3568 | 0.5280 | 0.5118 | 0.4503 | 70 |
AZMAT-RUNCAP | 0.2932 | 0.4282 | 0.3526 | 0.5350 | 0.5186 | 0.4512 | 69 |
AZMAT-RUNSCALE | 0.2933 | 0.4293 | 0.3587 | 0.5264 | 0.5145 | 0.4490 | 71 |
BLCUNLP-1stRUN | 0.4231 | 0.5152 | 0.5510 | 0.5651 | 0.7163 | 0.5709 | 62 |
BLCUNLP-2ndRUN | 0.5725 | 0.6586 | 0.5510 | 0.7238 | 0.8271 | 0.6928 | 44 |
BLCUNLP-3rdRUN | 0.5725 | 0.5753 | 0.4462 | 0.7309 | 0.8070 | 0.6556 | 49 |
BUAP-RUN1 | 0.5564 | 0.6901 | 0.6473 | 0.7167 | 0.7658 | 0.6936 | 43 |
DalGTM-run1 | 0.2902 | -0.0534 | 0.0625 | 0.0598 | 0.0663 | 0.0623 | 74 |
DalGTM-run2 | 0.3537 | 0.1189 | 0.0625 | 0.2354 | 0.2042 | 0.1917 | 72 |
DalGTM-run3 | 0.1533 | 0.1189 | -0.1319 | -0.0395 | 0.2021 | 0.0731 | 73 |
DCU-RUN1 | 0.5556 | 0.6582 | 0.5464 | 0.8284 | 0.8394 | 0.7192 | 34 |
DCU-RUN2 | 0.5628 | 0.6233 | 0.7549 | 0.8187 | 0.8350 | 0.7340 | 28 |
DCU-RUN3 | 0.6530 | 0.6108 | 0.6977 | 0.8181 | 0.8434 | 0.7369 | 26 |
DLS@CU-S1 | 0.7390 | 0.7725 | 0.7491 | 0.8250 | 0.8644 | 0.8015 | 1 |
DLS@CU-S2 | 0.7241 | 0.7569 | 0.7223 | 0.8250 | 0.8631 | 0.7921 | 3 |
DLS@CU-U | 0.6821 | 0.7879 | 0.7325 | 0.8238 | 0.8485 | 0.7919 | 5 |
ECNU-1stSVMALL | 0.7145 | 0.7122 | 0.7282 | 0.7980 | 0.8467 | 0.7696 | 19 |
ECNU-2ndSVMONE | 0.6865 | 0.7329 | 0.6977 | 0.8196 | 0.8358 | 0.7701 | 18 |
ECNU-3rdMTL | 0.6919 | 0.7515 | 0.6951 | 0.8049 | 0.8575 | 0.7769 | 16 |
ExBThemis-default | 0.6946 | 0.7505 | 0.7521 | 0.8245 | 0.8527 | 0.7878 | 8 |
ExBThemis-themis | 0.6946 | 0.7505 | 0.7482 | 0.8245 | 0.8527 | 0.7873 | 9 |
ExBThemis-themisexp | 0.6946 | 0.7784 | 0.7482 | 0.8245 | 0.8527 | 0.7942 | 2 |
FBK-HLT-RUN1 | 0.7131 | 0.7442 | 0.7327 | 0.8079 | 0.8574 | 0.7831 | 12 |
FBK-HLT-RUN2 | 0.7101 | 0.7410 | 0.7377 | 0.8008 | 0.8545 | 0.7801 | 13 |
FBK-HLT-RUN3 | 0.6555 | 0.7362 | 0.7460 | 0.7083 | 0.8389 | 0.7461 | 23 |
FCICU-Run1 | 0.6152 | 0.6686 | 0.6109 | 0.7418 | 0.7853 | 0.7022 | 41 |
FCICU-Run2 | 0.3659 | 0.6460 | 0.5896 | 0.6448 | 0.6194 | 0.5970 | 59 |
FCICU-Run3 | 0.7091 | 0.7096 | 0.7184 | 0.7922 | 0.8223 | 0.7595 | 20 |
IITNLP-FirstRun | 0.3728 | 0.6605 | 0.7717 | 0.5996 | 0.8523 | 0.6712 | 47 |
MathLingBudapest-embedding | 0.7039 | 0.7004 | 0.7325 | 0.7690 | 0.8038 | 0.7478 | 22 |
MathLingBudapest-hybrid | 0.7231 | 0.7513 | 0.7473 | 0.8037 | 0.8442 | 0.7836 | 11 |
MathLingBudapest-machines | 0.6977 | 0.7455 | 0.7363 | 0.8046 | 0.8414 | 0.7771 | 15 |
MiniExperts-Run1 | 0.6781 | 0.7304 | 0.6294 | 0.6912 | 0.8109 | 0.7216 | 33 |
MiniExperts-Run2 | 0.6454 | 0.7093 | 0.5165 | 0.6084 | 0.7999 | 0.6746 | 45 |
MiniExperts-Run3 | 0.6179 | 0.6977 | 0.3236 | 0.5775 | 0.7954 | 0.6353 | 55 |
NeRoSim-R1 | 0.5260 | 0.7251 | 0.6311 | 0.8131 | 0.8585 | 0.7438 | 24 |
NeRoSim-R2 | 0.6940 | 0.7446 | 0.7512 | 0.8077 | 0.8647 | 0.7849 | 10 |
NeRoSim-R3 | 0.6778 | 0.7357 | 0.7220 | 0.8123 | 0.8570 | 0.7762 | 17 |
RTM-DCU-1stPLS.svr | 0.5484 | 0.5549 | 0.6223 | 0.7281 | 0.7189 | 0.6468 | 50 |
RTM-DCU-2ndST.svr | 0.5484 | 0.5549 | 0.6223 | 0.7281 | 0.7189 | 0.6468 | 51 |
RTM-DCU-3rdST.rr | 0.5484 | 0.5549 | 0.6223 | 0.7281 | 0.7189 | 0.6468 | 52 |
Samsung-alpha | 0.6589 | 0.7827 | 0.7029 | 0.8342 | 0.8701 | 0.7920 | 4 |
Samsung-beta | 0.6586 | 0.7819 | 0.6995 | 0.8342 | 0.8713 | 0.7916 | 7 |
Samsung-delta | 0.6639 | 0.7825 | 0.6952 | 0.8417 | 0.8634 | 0.7918 | 6 |
SemantiKLUE-RUN1 | 0.4913 | 0.7005 | 0.5617 | 0.6681 | 0.7915 | 0.6717 | 46 |
SopaLipnIimas-MLP | 0.6178 | 0.5864 | 0.6886 | 0.8121 | 0.8184 | 0.7175 | 36 |
SopaLipnIimas-RF | 0.6709 | 0.5914 | 0.7238 | 0.8123 | 0.8414 | 0.7356 | 27 |
SopaLipnIimas-SVM | 0.5918 | 0.5718 | 0.7028 | 0.7985 | 0.8104 | 0.7070 | 39 |
T2a-TrWP-run1 | 0.6857 | 0.6618 | 0.6769 | 0.7709 | 0.7865 | 0.7251 | 31 |
T2a-TrWP-run2 | 0.6857 | 0.6618 | 0.7245 | 0.7709 | 0.7865 | 0.7311 | 30 |
T2a-TrWP-run3 | 0.6857 | 0.6612 | 0.6772 | 0.7710 | 0.7865 | 0.7250 | 32 |
TATO-1stWTW | 0.6796 | 0.6853 | 0.7206 | 0.7667 | 0.8167 | 0.7422 | 25 |
UBC-RUN1 | 0.4764 | 0.5459 | 0.6788 | 0.6368 | 0.7852 | 0.6364 | 53 |
UMDuluth-BlueTeam-Run1 | 0.6561 | 0.7816 | 0.7363 | 0.8085 | 0.8236 | 0.7775 | 14 |
UQeResearch-AllRuns-run1 | 0.5923 | 0.6876 | 0.5904 | 0.7521 | 0.7817 | 0.7032 | 40 |
UQeResearch-AllRuns-run2 | 0.6132 | 0.6882 | 0.6229 | 0.7602 | 0.7855 | 0.7130 | 37 |
UQeResearch-AllRuns-run3 | 0.6188 | 0.6757 | 0.7178 | 0.7549 | 0.7769 | 0.7189 | 35 |
USAAR_SHEFFIELD-modelx | 0.3706 | 0.3609 | 0.4767 | 0.5183 | 0.5436 | 0.4616 | 68 |
USAAR_SHEFFIELD-modely | 0.6264 | 0.7386 | 0.7050 | 0.7927 | 0.8162 | 0.7533 | 21 |
USAAR_SHEFFIELD-modelz | 0.4237 | 0.6757 | 0.6994 | 0.5239 | 0.6833 | 0.6111 | 58 |
WSL-run1 | 0.3759 | 0.5269 | 0.6387 | 0.5462 | 0.5710 | 0.5379 | 66 |
WSL-run2 | 0.4287 | 0.6028 | 0.5231 | 0.6029 | 0.4879 | 0.5424 | 65 |
WSL-run3 | 0.3709 | 0.5437 | 0.6478 | 0.5752 | 0.6407 | 0.5672 | 64 |
Yamraj-1stRUNNAME | 0.5634 | 0.6727 | 0.6387 | 0.6067 | 0.7425 | 0.6558 | 48 |
Yamraj-2ndRUNNAME | 0.4367 | 0.4716 | 0.4890 | 0.5533 | 0.4799 | 0.4919 | 67 |
Yamraj-3rdRUNNAME | 0.5168 | 0.5835 | 0.6540 | 0.5861 | 0.6097 | 0.5912 | 60 |
yiGou-midbaitu | 0.5797 | 0.6571 | 0.6473 | 0.7115 | 0.8036 | 0.6964 | 42 |
yiGou-xiaobaitu | 0.6102 | 0.6872 | 0.6065 | 0.7369 | 0.8133 | 0.7114 | 38 |
*UBC-RUN1 | 0.4764 | 0.5459 | 0.6788 | 0.6368 | 0.7852 | 0.6364 | 54 |
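
The Mean column is not a plain average of the five dataset columns: for the Baseline-tokencos row a plain average gives 0.5794, while the table lists 0.5871. The sketch below shows one way to reproduce the listed value, assuming that Mean is a mean weighted by the number of sentence pairs in each test set. The pair counts in `ASSUMED_PAIR_COUNTS` are an assumption (they are not stated in this section), and `weighted_mean` is a hypothetical helper name.

```python
# ASSUMPTION: number of sentence pairs per English test set.
# These counts are not given in this section.
ASSUMED_PAIR_COUNTS = {
    "answers-forums": 375,
    "answers-students": 750,
    "belief": 375,
    "headlines": 750,
    "images": 750,
}


def weighted_mean(scores, weights):
    """Average the per-dataset scores, weighting each by its pair count."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Per-dataset scores copied from the Baseline-tokencos row.
baseline = {
    "answers-forums": 0.4453,
    "answers-students": 0.6647,
    "belief": 0.6517,
    "headlines": 0.5312,
    "images": 0.6039,
}

baseline_weighted = round(weighted_mean(baseline, ASSUMED_PAIR_COUNTS), 4)
baseline_plain = round(sum(baseline.values()) / len(baseline), 4)
print(baseline_weighted, baseline_plain)
```

Under these assumed weights the weighted value matches the Mean listed for the baseline row, while the plain average does not.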
Spanish STS
RUN | Wikipedia | Newswire | Mean | Rank |
---|---|---|---|---|
Baseline-tokencos | 0.52869 | 0.49493 | 0.50621 | 12 |
BUAP-run1 | 0.48873 | 0.40451 | 0.43266 | 15 |
ExBThemis-trainEn | 0.67630 | 0.67054 | 0.67247 | 3 |
ExBThemis-trainEs | 0.70545 | 0.68295 | 0.69047 | 1 |
ExBThemis-trainMini | 0.70550 | 0.68113 | 0.68927 | 2 |
RTM-DCU-1stST.tree | 0.58233 | 0.52513 | 0.54425 | 8 |
RTM-DCU-2ndST.rr | 0.58233 | 0.52513 | 0.54425 | 7 |
RTM-DCU-3rdST.SVR | 0.58233 | 0.52513 | 0.54425 | 6 |
SopaLipnIimas-MLP | 0.25257 | 0.53416 | 0.44005 | 13 |
SopaLipnIimas-RF | 0.56371 | 0.56545 | 0.56487 | 5 |
SopaLipnIimas-SVM | 0.41941 | 0.40067 | 0.40693 | 16 |
UMDuluth-BlueTeam-run1 | 0.59364 | 0.65471 | 0.63430 | 4 |
MiniExperts-run1 | 0.52390 | 0.50760 | 0.51305 | 11 |
MiniExperts-run2 | 0.46707 | 0.54370 | 0.51809 | 9 |
MiniExperts-run3 | 0.44015 | 0.55243 | 0.51490 | 10 |
Yamraj-1stNoConfidence | 0.57681 | 0.36541 | 0.43606 | 14 |
Yamraj-1stWithConfidence | 0.53240 | 0.34154 | 0.40533 | 17 |
Pilot on Interpretable STS
GOLD CHUNKS
RUN | F1 ALI | F1 TYPE | F1 SCORE | F1 TYPE + SCORE | F1 ALI | F1 TYPE | F1 SCORE | F1 TYPE + SCORE |
---|---|---|---|---|---|---|---|---|
baseline | 0.8448 | 0.5556 | 0.7551 | 0.5556 | 0.8388 | 0.4328 | 0.721 | 0.4326 |
ExBThemis__avgScorer | 0.8146 | 0.4943 | 0.7171 | 0.4885 | 0.8057 | 0.4413 | 0.6992 | 0.4246 |
ExBThemis__mostFreqScorer | 0.8146 | 0.4943 | 0.714 | 0.4884 | 0.8057 | 0.4413 | 0.7007 | 0.4296 |
ExBThemis__regressionScorer | 0.8146 | 0.4943 | 0.7158 | 0.4883 | 0.8052 | 0.4406 | 0.6989 | 0.4288 |
FCICU__Run1 | 0.8455 | 0.448 | 0.716 | 0.4325 | 0.8457 | 0.474 | 0.7273 | 0.4482 |
NeRoSim__R1 | 0.8984 | 0.6543 | 0.8262 | 0.6389 | 0.887 | 0.6143 | 0.7877 | 0.5841 |
NeRoSim__R2 | 0.8972 | 0.6558 | 0.8263 | 0.6401 | 0.88 | 0.5854 | 0.7818 | 0.5619 |
NeRoSim__R3 | 0.8976 | 0.6666 | 0.8157 | 0.6426 | 0.8834 | 0.6035 | 0.7837 | 0.5759 |
**RTM-DCU__1stIBM2Alignment | 0.4914 | 0.3712 | 0.455 | 0.3712 | 0.354 | 0.2283 | 0.3187 | 0.2282 |
SimCompass__combined | 0.871 | 0.5813 | 0.7651 | 0.5239 | 0.849 | 0.4555 | 0.7294 | 0.3965 |
SimCompass__prefix | 0.836 | 0.5834 | 0.7474 | 0.5338 | 0.8361 | 0.4708 | 0.7269 | 0.4157 |
SimCompass__word2vec | 0.8716 | 0.5806 | 0.7654 | 0.5253 | 0.8624 | 0.4599 | 0.7405 | 0.4017 |
UMDuluth_BlueTeam__1 | 0.8861 | 0.5962 | 0.796 | 0.5887 | 0.8853 | 0.5842 | 0.7932 | 0.5729 |
UMDuluth_BlueTeam__2 | 0.8861 | 0.5962 | 0.7968 | 0.5883 | 0.8853 | 0.6095 | 0.7968 | 0.5964 |
UMDuluth_BlueTeam__3 | 0.8861 | 0.59 | 0.798 | 0.5834 | 0.8853 | 0.5964 | 0.7909 | 0.5822 |
*UBC__RUN1 | 0.8991 | 0.5882 | 0.8031 | 0.5882 | 0.8846 | 0.4749 | 0.7709 | 0.4746 |
*UBC__RUN2 | 0.8991 | 0.6402 | 0.8211 | 0.6185 | 0.8846 | 0.6557 | 0.8085 | 0.6159 |
SYSTEM CHUNKS
RUN | F1 ALI | F1 TYPE | F1 SCORE | F1 TYPE + SCORE | F1 ALI | F1 TYPE | F1 SCORE | F1 TYPE + SCORE |
---|---|---|---|---|---|---|---|---|
baseline | 0.6701 | 0.4571 | 0.6066 | 0.4571 | 0.706 | 0.3696 | 0.6092 | 0.3693 |
ExBThemis__avgScorer | 0.7032 | 0.4331 | 0.6224 | 0.429 | 0.6966 | 0.397 | 0.6068 | 0.3806 |
ExBThemis__mostFreqScorer | 0.7032 | 0.4331 | 0.62 | 0.4288 | 0.6966 | 0.397 | 0.6106 | 0.387 |
ExBThemis__regressionScorer | 0.7032 | 0.4331 | 0.6209 | 0.4284 | 0.6966 | 0.397 | 0.6092 | 0.3867 |
**RTM-DCU__1stIBM2Alignment | 0.4914 | 0.3712 | 0.455 | 0.3712 | 0.354 | 0.2283 | 0.3187 | 0.2282 |
SimCompass__combined | 0.6467 | 0.4333 | 0.5636 | 0.387 | 0.5433 | 0.2854 | 0.4545 | 0.2421 |
SimCompass__prefix | 0.631 | 0.4284 | 0.5526 | 0.3872 | 0 | 0 | 0 | 0 |
SimCompass__word2vec | 0.6461 | 0.4334 | 0.5619 | 0.3878 | 0.5428 | 0.2831 | 0.4561 | 0.2427 |
UMDuluth_BlueTeam__1 | 0.782 | 0.5058 | 0.6968 | 0.5004 | 0.8336 | 0.5529 | 0.7498 | 0.5431 |
UMDuluth_BlueTeam__2 | 0.782 | 0.5109 | 0.6986 | 0.5049 | 0.8336 | 0.5759 | 0.7511 | 0.5634 |
UMDuluth_BlueTeam__3 | 0.782 | 0.5154 | 0.7024 | 0.5098 | 0.8336 | 0.5605 | 0.7456 | 0.5473 |
*UBC__RUN1 | 0.7709 | 0.5019 | 0.6892 | 0.5019 | 0.8388 | 0.445 | 0.728 | 0.4447 |
*UBC__RUN2 | 0.7709 | 0.4865 | 0.7014 | 0.4705 | 0.8388 | 0.6019 | 0.7634 | 0.5643 |
* Marks submissions that involve organizers of the task.
** Marks post-deadline submissions or fixes.
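
When the tables are processed programmatically, the two markers above have to be separated from the run name. The following is a minimal sketch of that step. The helper name `parse_run_label` is hypothetical, and the sketch assumes that a RUN cell carries at most one marker and that the marker is always a prefix, as in the tables on this page.

```python
def parse_run_label(label):
    """Split a RUN cell into (name, involves_organizers, post_deadline).

    Assumption: a label starts with at most one marker, "*" or "**".
    """
    stripped = label.strip()
    # Check "**" first, because such labels also start with "*".
    if stripped.startswith("**"):
        return stripped[2:], False, True
    if stripped.startswith("*"):
        return stripped[1:], True, False
    return stripped, False, False


for cell in ["**ASAP-THIRDRUN", "*UBC__RUN2", "DLS@CU-S1"]:
    print(parse_run_label(cell))
```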