10 Open-Source Datasets For Text Classification


Text classification, a popular field of research, is the process of analysing textual data and assigning it to categories in order to extract meaningful information. According to sources, the global text analytics market is expected to post a CAGR of more than 20% during the period 2020-2024. Text classification has a number of applications, including automating CRM tasks, improving web browsing, and e-commerce.

Check out 10 open-source datasets that can be used for text classification. The Amazon Review dataset, for instance, consists of a few million Amazon customer reviews (the input text) and their star ratings (the output labels), and is intended for learning how to train fastText for sentiment analysis. The dataset is 493 MB in size.
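
To give a feel for how reviews and labels are paired in such a dataset, here is a minimal Python sketch. It assumes the file follows fastText's usual supervised format, in which each line begins with one or more labels carrying the default `__label__` prefix, followed by the text. The helper `parse_fasttext_line`, the sample lines, and the label values are all hypothetical and are not taken from the dataset itself.

```python
from typing import List, Tuple

# Assumption: labels use fastText's default "__label__" prefix.
LABEL_PREFIX = "__label__"


def parse_fasttext_line(line: str) -> Tuple[List[str], str]:
    """Split one line into (labels, text).

    Leading tokens that start with the label prefix are treated as labels;
    everything after them is treated as the review text.
    """
    tokens = line.strip().split()
    labels: List[str] = []
    idx = 0
    while idx < len(tokens) and tokens[idx].startswith(LABEL_PREFIX):
        labels.append(tokens[idx][len(LABEL_PREFIX):])
        idx += 1
    text = " ".join(tokens[idx:])
    return labels, text


# Hypothetical sample lines, made up for illustration only.
sample_lines = [
    "__label__2 Great product, works exactly as described.",
    "__label__1 Stopped working after two days.",
]

parsed = [parse_fasttext_line(line) for line in sample_lines]
for labels, text in parsed:
    print(labels, "|", text)
```

Keeping the parsing step separate from training makes it easy to inspect the label distribution or clean the text before handing the file to a classifier.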

[Source: Analytics India Magazine]