==================== PIT 2015 ====================

SemEval-2015 Task 1: Paraphrase and Semantic Similarity in Twitter

====================================================


ORGANIZERS

  Wei Xu, University of Pennsylvania
  Chris Callison-Burch, University of Pennsylvania
  Bill Dolan, Microsoft Research


TRAIN/DEV DATA

  The dataset contains the following files:

    ./data/readme.txt    (this file)
    ./data/train.data    (13063 sentence pairs)
    ./data/dev.data      (4727 sentence pairs)

  Note that the train and dev data were collected from the same time period
  and the same trending topics. In the later evaluation, systems will be
  tested on data collected from a different time period.

  Both data files are tab-separated. Each line contains 6 columns:

    | Trending_Topic_Name | Sent_1 | Sent_2 | Label | Sent_1_tag | Sent_2_tag |

  "Trending_Topic_Name" is the name of a trend provided by Twitter; these are
  not hashtags.

  "Sent_1" and "Sent_2" are the two sentences, which are not necessarily full
  tweets. Tweets were tokenized (thanks to Brendan O'Connor et al.) and split
  into sentences.

  The "Label" column is in a format such as "(1, 4)", which means that among
  the 5 votes from Amazon Mechanical Turkers, 1 was positive and 4 were
  negative. We suggest mapping them to binary labels as follows:

    paraphrases:     (3, 2) (4, 1) (5, 0)
    non-paraphrases: (1, 4) (0, 5)
    debatable:       (2, 3)   which you may discard if training a binary classifier

  "Sent_1_tag" and "Sent_2_tag" are the two sentences with part-of-speech and
  named entity tags (thanks to Alan Ritter).


BASELINE

  A logistic regression model using simple lexical overlap features:

    ./script/baseline_logisticregression.py

  Example output, when training on train.data and testing on dev.data:

    Read in 11513 training data ...   (after discarding the debatable cases)
    Read in 4139 test data ...        (see details in the TRAIN/DEV DATA section)
    PRECISION: 0.704069050555
    RECALL:    0.389229720518
    F1:        0.501316944688
    ACCURACY:  0.725537569461


REFERENCES

  (details about how this data was collected, and some analysis, are in Chapter 6)

  Wei Xu (2014). Data-Driven Approaches for Paraphrasing Across Language
  Variations. PhD thesis, Department of Computer Science, New York University.
  http://www.cis.upenn.edu/~xwe/files/thesis-wei.pdf
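

EXAMPLE: READING THE DATA

  Below is a minimal sketch (not part of the distributed scripts) of one way
  to load the tab-separated files and apply the suggested binary label mapping
  from the TRAIN/DEV DATA section. The file name read_pit.py and the function
  load_pit_data are hypothetical names used only for illustration.

    # read_pit.py -- load a PIT 2015 data file and map vote labels to binary
    import csv

    def load_pit_data(path, keep_debatable=False):
        """Return a list of (topic, sent_1, sent_2, label) tuples.

        label is True for paraphrases, False for non-paraphrases. Debatable
        (2, 3) pairs are dropped unless keep_debatable is set, in which case
        their label is None.
        """
        pairs = []
        with open(path, encoding='utf-8') as f:
            for row in csv.reader(f, delimiter='\t', quoting=csv.QUOTE_NONE):
                topic, sent_1, sent_2, label, sent_1_tag, sent_2_tag = row
                # label looks like "(1, 4)": (positive votes, negative votes)
                pos_votes = int(label.strip('()').split(',')[0])
                if pos_votes >= 3:        # (3, 2) (4, 1) (5, 0) -> paraphrase
                    binary = True
                elif pos_votes <= 1:      # (1, 4) (0, 5)        -> non-paraphrase
                    binary = False
                else:                     # (2, 3)               -> debatable
                    if not keep_debatable:
                        continue
                    binary = None
                pairs.append((topic, sent_1, sent_2, binary))
        return pairs

    if __name__ == '__main__':
        train = load_pit_data('./data/train.data')
        print('Read in %d training pairs (debatable cases discarded)' % len(train))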