The Sentiment-140 dataset contains 800,000 tweets with positive emoticons and 800,000 tweets with negative emoticons, for a total of 1,600,000 training tweets. Its test set contains 177 negative tweets and 182 positive tweets (359 in total), only some of which contain emoticons. The dataset is useful for consumers or companies that want to automatically classify the sentiment toward a brand, product, or topic on Twitter as either positive or negative with respect to a query term. All tweets in the dataset are in English.
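
As a quick sanity check on the class balance described above, the following minimal sketch tallies tweets per sentiment label in a CSV stream. It rests on assumptions not stated in this description: that the data is a header-less CSV with the polarity label in the first column, and that `0` encodes negative and `4` encodes positive. The function name `count_polarities`, the `LABEL_NAMES` mapping, and the sample rows are hypothetical and serve only as illustration.

```python
import csv
import io
from collections import Counter

# Assumed label encoding (not given in the description above):
# "0" = negative, "4" = positive. Anything else is counted as "other".
LABEL_NAMES = {"0": "negative", "4": "positive"}


def count_polarities(csv_file, label_column=0):
    """Count rows per sentiment label in a header-less CSV stream.

    csv_file is any iterable of text lines (an open file or StringIO).
    label_column is the assumed position of the polarity field.
    """
    counts = Counter()
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        label = row[label_column].strip()
        counts[LABEL_NAMES.get(label, "other")] += 1
    return counts


# Made-up sample rows in the assumed layout; these are not real dataset entries.
sample = io.StringIO(
    '"0","1","Mon Apr 06","NO_QUERY","user_a","so tired of waiting :("\n'
    '"4","2","Mon Apr 06","NO_QUERY","user_b","loving the new phone :)"\n'
    '"4","3","Mon Apr 06","NO_QUERY","user_c","great day :)"\n'
)
sample_counts = count_polarities(sample)
print(sample_counts["positive"], sample_counts["negative"])
```

Run against the full training file (opened with whatever path and text encoding apply to the local copy), the same function should report 800,000 tweets per class if the layout assumptions hold; a nonzero `other` count would suggest the assumed column or label encoding is wrong.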