Dataset:

edarchimbaud/news-sp500

Dataset Card for "news-sp500"

Dataset Summary

The news-sp500 dataset provides news articles related to companies in the S&P 500 index.

Supported Tasks and Leaderboards

The dataset can be used for natural language processing tasks such as text classification, sentiment analysis, and information extraction. No leaderboard is associated with it.

Languages

The dataset contains news articles in multiple languages.

Dataset Structure

Data Instances

The dataset consists of 1563 data instances.

Data Fields

  • symbol (string): A string representing the ticker symbol or abbreviation used to identify the company.
  • body (string): The main content of the news article.
  • publisher (string): The name of the publisher or news agency.
  • publish_time (timestamp[ns, tz=GMT]): The publication time of the news article, stored as a nanosecond-precision timestamp in the GMT timezone.
  • title (string): The title or headline of the news article.
  • url (string): The URL or link to the original news article.
  • uuid (string): A unique identifier for the news article.
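
The field list above can be mirrored in a small typed record, which helps catch missing columns early. The following is a minimal sketch using only the Python standard library. The names `NewsRecord`, `FIELDS`, and `to_record` are hypothetical helpers, the sample row is made up for illustration and is not taken from the dataset, and treating a timezone-naive timestamp as GMT is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Field names as listed in the dataset card.
FIELDS = ("symbol", "body", "publisher", "publish_time", "title", "url", "uuid")


@dataclass(frozen=True)
class NewsRecord:
    """Hypothetical typed view of one row of news-sp500."""
    symbol: str
    body: str
    publisher: str
    publish_time: datetime
    title: str
    url: str
    uuid: str


def to_record(row: dict) -> NewsRecord:
    """Validate that all documented fields are present and build a record."""
    missing = [name for name in FIELDS if name not in row]
    if missing:
        raise KeyError(f"row is missing fields: {missing}")
    values = {name: row[name] for name in FIELDS}
    ts = values["publish_time"]
    if isinstance(ts, str):
        # Accept ISO 8601 strings as well as datetime objects.
        ts = datetime.fromisoformat(ts)
    if ts.tzinfo is None:
        # Assumption: a naive timestamp is interpreted as GMT.
        ts = ts.replace(tzinfo=timezone.utc)
    values["publish_time"] = ts
    return NewsRecord(**values)


# Made-up row for illustration only; not real dataset content.
example_row = {
    "symbol": "XYZ",
    "body": "Example article body.",
    "publisher": "Example Publisher",
    "publish_time": "2023-01-02T15:30:00+00:00",
    "title": "Example headline",
    "url": "https://example.com/article",
    "uuid": "00000000-0000-0000-0000-000000000000",
}
record = to_record(example_row)
```

A frozen dataclass is used here so that a validated record cannot be modified by accident later on.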

Data Splits

The dataset consists of a single split called train.
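
Since there is only the train split, loading it might look like the sketch below. It assumes the Hugging Face `datasets` library is installed and that the repository is reachable under the identifier `edarchimbaud/news-sp500`. The helper names `load_news` and `articles_per_symbol` are hypothetical and not part of the dataset.

```python
from collections import Counter

REPO_ID = "edarchimbaud/news-sp500"
SPLIT = "train"  # the only split named in this card


def load_news():
    """Download (or reuse the cached copy of) the train split."""
    # Imported lazily so the rest of this module works without `datasets`.
    from datasets import load_dataset
    return load_dataset(REPO_ID, split=SPLIT)


def articles_per_symbol(rows):
    """Count articles per ticker symbol for any iterable of row dicts."""
    return Counter(row["symbol"] for row in rows)


# Typical usage (requires network access):
#   news = load_news()
#   print(news)
#   print(articles_per_symbol(news).most_common(5))
```

Counting by `symbol` is a quick way to check how evenly the articles are spread across companies before training a model on them.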

Dataset Creation

Curation Rationale

The news-sp500 dataset was created to provide a collection of news articles related to companies in the S&P 500 index for research and analysis purposes.

Source Data

Initial Data Collection and Normalization

The data was collected from various online news sources and normalized for consistency.

Annotations

Annotation process

[N/A]

Who are the annotators?

[N/A]

Personal and Sensitive Information

[N/A]

Considerations for Using the Data

Social Impact of Dataset

[N/A]

Discussion of Biases

[N/A]

Other Known Limitations

[N/A]

Additional Information

Dataset Curators

The news-sp500 dataset was collected by https://edarchimbaud.substack.com.

Licensing Information

The news-sp500 dataset is licensed under the MIT License.

Citation Information

https://edarchimbaud.substack.com, news-sp500 dataset, GitHub repository, https://github.com/edarchimbaud

Contributions

Thanks to @edarchimbaud for adding this dataset.