10 Open-Source Datasets For Text Classification

March 13, 2020


A popular field of research, text classification is the process of analysing textual data and assigning it to categories in order to gain meaningful information. According to sources, the global text analytics market is expected to post a compound annual growth rate (CAGR) of more than 20% during the period 2020-2024. Text classification can be used in a number of applications, such as automating CRM tasks, improving web browsing, and e-commerce.

Check out 10 open-source datasets that can be used for text classification. The Amazon Review dataset, for instance, consists of a few million Amazon customer reviews (input text) paired with star ratings (output labels), and is intended for learning how to train fastText for sentiment analysis. The dataset is 493MB in size.
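To make the pairing of review text and star-rating labels concrete, here is a minimal sketch of how such examples might be written in the one-line-per-example format that fastText's supervised mode reads, where each label carries the `__label__` prefix. The helper name `to_fasttext_line` and the sample reviews are hypothetical, and using the raw star count as the label is an assumption made for illustration, not necessarily how the dataset itself encodes labels.

```python
def to_fasttext_line(stars: int, text: str) -> str:
    """Format one review as a fastText supervised-training line.

    Hypothetical helper: the raw star count is used as the label here,
    which is an assumption for illustration only.
    """
    # fastText reads one example per line, so collapse newlines and
    # repeated whitespace into single spaces.
    cleaned = " ".join(text.split())
    return f"__label__{stars} {cleaned}"


# Hypothetical sample reviews (not taken from the dataset).
reviews = [
    (5, "Great product,\nworks as described."),
    (1, "Stopped working   after two days."),
]

lines = [to_fasttext_line(stars, text) for stars, text in reviews]
print("\n".join(lines))
```

Lines in this shape could then be saved to a text file and, assuming the `fasttext` Python package is installed, passed to `fasttext.train_supervised(input=...)` to train a classifier.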

[Source: Analytics India Magazine]
