Dataset:

Anthropic/llm_global_opinions

ArXiv:

arxiv:2306.16388

Size:

1K<n<10K

Language:

en

Dataset Card for GlobalOpinionQA

Dataset Summary

The data contains a subset of survey questions about global issues and opinions, adapted from the World Values Survey and the Pew Global Attitudes Survey.

The data is further described in the paper: Towards Measuring the Representation of Subjective Global Opinions in Language Models.

Purpose

In our paper, we use this dataset to analyze the opinions that large language models (LLMs) reflect on complex global issues. Our goal is to gain insights into potential biases in AI systems by evaluating their performance on subjective topics.

Data Format

The data is in a CSV file with the following columns:

  • question: The text of the survey question.
  • selections: A dictionary where the key is the country name and the value is a list of percentages of respondents who selected each answer option for that country.
  • options: A list of the answer options for the given question.
  • source: GAS or WVS, depending on whether the question comes from the Global Attitudes Survey or the World Values Survey.
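The column layout above can be illustrated with a minimal sketch that pairs each answer option with the share of respondents who chose it in one country. The row below is hypothetical, not taken from the dataset, and the helper names `parse_cell` and `country_distribution` are invented for illustration. The sketch assumes that, because the data ships as CSV, the `options` and `selections` cells may arrive as text holding a Python literal, possibly wrapped in a `defaultdict(...)` repr; that storage detail is an assumption and should be checked against the actual file.

```python
import ast


def parse_cell(cell):
    """Turn a CSV cell into a Python object.

    Assumption: a cell is either already a list/dict, or a string holding a
    Python literal, optionally wrapped (e.g. "defaultdict(<class 'list'>, {...})").
    """
    if not isinstance(cell, str):
        return cell
    text = cell.strip()
    if not text.startswith(("[", "{")):
        # Keep only the outermost {...} part of a wrapped repr.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end == -1:
            raise ValueError(f"cannot find a literal in cell: {cell!r}")
        text = text[start:end + 1]
    return ast.literal_eval(text)


def country_distribution(row, country):
    """Map each answer option to the value recorded for `country`."""
    options = parse_cell(row["options"])
    selections = parse_cell(row["selections"])
    shares = selections[country]
    if len(shares) != len(options):
        raise ValueError("number of options and number of values differ")
    return dict(zip(options, shares))


# Hypothetical row, for illustration only.
example_row = {
    "question": "Hypothetical question text?",
    "selections": "defaultdict(<class 'list'>, {'Country A': [0.5, 0.3, 0.2], "
                  "'Country B': [0.1, 0.8, 0.1]})",
    "options": "['Agree', 'Disagree', 'Not sure']",
    "source": "GAS",
}

distribution = country_distribution(example_row, "Country A")
print(distribution)
```

Checking that the two lists have equal length is a cheap guard: a mismatch would silently misalign options and values if `zip` were used alone.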

Usage

from datasets import load_dataset
# Loading the data
dataset = load_dataset("Anthropic/llm_global_opinions")
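Once loaded, the rows can be grouped by the `source` column. The following is a small sketch that works on any iterable of dict-like rows; the helper names `count_by_source` and `filter_by_source` and the sample rows are hypothetical and only serve to show the shape of the data.

```python
from collections import Counter


def count_by_source(rows):
    """Count how many questions come from each survey (GAS or WVS)."""
    return Counter(row["source"] for row in rows)


def filter_by_source(rows, source):
    """Return only the rows whose `source` matches the given survey code."""
    return [row for row in rows if row["source"] == source]


# Hypothetical rows, for illustration only.
sample_rows = [
    {"question": "Hypothetical question 1?", "source": "GAS"},
    {"question": "Hypothetical question 2?", "source": "WVS"},
    {"question": "Hypothetical question 3?", "source": "GAS"},
]

counts = count_by_source(sample_rows)
wvs_rows = filter_by_source(sample_rows, "WVS")
print(counts["GAS"], len(wvs_rows))
```

With the real data, the same helpers could be called as `count_by_source(dataset["train"])`, assuming the loaded object exposes a split named `train`; the split name is an assumption and can be confirmed by printing `dataset`.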

Disclaimer

We recognize the limitations of using this dataset to evaluate LLMs: the underlying survey questions were not designed for this purpose. We therefore acknowledge that the construct validity of these surveys may be limited when they are applied to LLMs.

Contact

For questions, you can email esin at anthropic dot com

Citation

If you would like to cite our work or data, you may use the following bibtex citation:

@misc{durmus2023measuring,
      title={Towards Measuring the Representation of Subjective Global Opinions in Language Models}, 
      author={Esin Durmus and Karina Nguyen and Thomas I. Liao and Nicholas Schiefer and Amanda Askell and Anton Bakhtin and Carol Chen and Zac Hatfield-Dodds and Danny Hernandez and Nicholas Joseph and Liane Lovitt and Sam McCandlish and Orowa Sikder and Alex Tamkin and Janel Thamkul and Jared Kaplan and Jack Clark and Deep Ganguli},
      year={2023},
      eprint={2306.16388},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}