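Datasets:
biglam/hansard_speech
Languages:
en
Multilinguality:
monolingual
Size Categories:
1M<n<10M
Language Creators:
expert-generated
Annotations Creators:
no-annotation
Source Datasets:
original
License:
cc-by-4.0
A dataset containing every speech in the UK House of Commons from May 1979 to July 2020. Quoted from the dataset homepage.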
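Please contact me if you find any errors in the dataset. The integrity of the public Hansard record is questionable at times, and while I have improved it, the data is presented "as is".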
en:GB
{ 'id': 'uk.org.publicwhip/debate/1979-05-17a.390.0', 'speech': "Since the Minister for Consumer Affairs said earlier that the bread price rise would be allowed, in view of developing unemployment in the baking industry, and since the Mother's Pride bakery in my constituency is about to close, will the right hon. Gentleman give us a firm assurance that there will be an early debate on the future of the industry, so that the Government may announce that, thanks to the price rise, those workers will not now be put out of work?", 'display_as': 'Eric Heffer', 'party': 'Labour', 'constituency': 'Liverpool, Walton', 'mnis_id': '725', 'date': '1979-05-17', 'time': '', 'colnum': '390', 'speech_class': 'Speech', 'major_heading': 'BUSINESS OF THE HOUSE', 'minor_heading': '', 'oral_heading': '', 'year': '1979', 'hansard_membership_id': '5612', 'speakerid': 'uk.org.publicwhip/member/11615', 'person_id': '', 'speakername': 'Mr. Heffer', 'url': '', 'government_posts': [], 'opposition_posts': [], 'parliamentary_posts': ['Member, Labour Party National Executive Committee'] }
Variable | Description |
---|---|
id | The ID as assigned by mySociety |
speech | The text of the speech |
display_as | The standardised name of the MP. |
party | The party the MP is a member of at the time of the speech |
constituency | Constituency represented by MP at time of speech |
mnis_id | The MP's Members Name Information Service number |
date | Date of speech |
time | Time of speech |
colnum | Column number in the Hansard record |
speech_class | Type of speech |
major_heading | Major debate heading |
minor_heading | Minor debate heading |
oral_heading | Oral debate heading |
year | Year of speech |
hansard_membership_id | ID used by mySociety |
speakerid | ID used by mySociety |
person_id | ID used by mySociety |
speakername | MP name as appeared in Hansard record for speech |
url | Link to the speech |
government_posts | Government posts held by MP (list) |
opposition_posts | Opposition posts held by MP (list) |
parliamentary_posts | Parliamentary posts held by MP (list) |
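In the sample record above, every scalar field is a string (including `year` and `date`), and missing values appear as empty strings. The following is a minimal sketch of tidying one record under the assumption that this pattern holds for the rest of the dataset. The helper name `normalize_record` is hypothetical and not part of the dataset or any library, and `colnum` is deliberately left as a string in case some column references are not purely numeric.

```python
from datetime import date
from typing import Any, Dict


def normalize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a record with empty strings mapped to None,
    `date` parsed to datetime.date and `year` parsed to int.

    Hypothetical helper: assumes scalar fields are strings, as in the sample.
    """
    out: Dict[str, Any] = {}
    for key, value in record.items():
        # Empty strings stand in for missing values in the sample record.
        out[key] = None if value == "" else value

    if out.get("date") is not None:
        out["date"] = date.fromisoformat(out["date"])  # expects YYYY-MM-DD
    if out.get("year") is not None and out["year"].isdigit():
        out["year"] = int(out["year"])
    return out


# A shortened copy of the sample record shown above.
sample = {
    "id": "uk.org.publicwhip/debate/1979-05-17a.390.0",
    "display_as": "Eric Heffer",
    "party": "Labour",
    "date": "1979-05-17",
    "time": "",
    "colnum": "390",
    "year": "1979",
    "url": "",
    "government_posts": [],
    "parliamentary_posts": ["Member, Labour Party National Executive Committee"],
}

clean = normalize_record(sample)
print(clean["date"], clean["year"], clean["time"])
```

List-valued fields such as `government_posts` pass through unchanged, so an empty list still means "no posts recorded" rather than a missing value.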
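Train: 2,694,375 examples
This dataset contains all the speeches made in the UK House of Commons and can be used for a number of deep learning tasks, such as detecting how language and societal views have changed over more than 40 years. The dataset also provides language that is closer to the spoken language used in an elite British institution.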
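One simple way to start on the language-change task is to measure how often a term occurs per year. The sketch below works on any iterable of records that carry the `speech` and `year` fields. The function name `term_rate_by_year`, the regular-expression tokenizer, and the two toy records are illustrative assumptions, not part of the dataset or its tooling.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Mapping

# Very rough tokenizer: lowercase runs of letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def term_rate_by_year(
    records: Iterable[Mapping[str, str]], term: str
) -> Dict[str, float]:
    """Occurrences of `term` per 1,000 tokens, keyed by the `year` field."""
    term = term.lower()
    hits: Counter = Counter()
    totals: Counter = Counter()
    for record in records:
        tokens = TOKEN_RE.findall(record["speech"].lower())
        totals[record["year"]] += len(tokens)
        hits[record["year"]] += tokens.count(term)
    # Skip years with no tokens to avoid dividing by zero.
    return {
        year: 1000 * hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


# Toy records for illustration only; they are not rows from the dataset.
toy = [
    {"year": "1979", "speech": "The bread price rise would be allowed."},
    {"year": "2019", "speech": "The price of bread and the price of fuel."},
]
rates = term_rate_by_year(toy, "price")
print(rates)
```

Normalizing by token count rather than reporting raw counts matters here because the number of speeches per year is unlikely to be constant across four decades.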
The dataset was created by retrieving the data from data.parliament.uk. No normalization was applied.
Who are the source language producers? [N/A]
None
Who are the annotators? [N/A]
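This is public information, so it should not contain any personal or sensitive information.
The purpose of this dataset is to understand how language use and societal views have changed over time.
Because the dataset spans a long time period, it is likely to contain language and viewpoints that would be unacceptable in modern society.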
[More Information Needed]
This dataset was created by Evan Odell, building on parlparse.
Creative Commons Attribution 4.0 International License
@misc{odell, evan_2021, title={Hansard Speeches 1979-2021: Version 3.1.0}, DOI={10.5281/zenodo.4843485}, abstractNote={<p>Full details are available at <a href="https://evanodell.com/projects/datasets/hansard-data">https://evanodell.com/projects/datasets/hansard-data</a></p> <p><strong>Version 3.1.0 contains the following changes:</strong></p> <p>- Coverage up to the end of April 2021</p>}, note={This release is an update of previously released datasets. See full documentation for details.}, publisher={Zenodo}, author={Odell, Evan}, year={2021}, month={May} }
Thanks to @shamikbose for adding this dataset.