Model: google/tapas-large-finetuned-sqa
This model has two versions which can be used. The default version corresponds to the tapas_sqa_inter_masklm_large_reset checkpoint of the original Github repository. The model was pre-trained on MLM and an additional step which the authors call intermediate pre-training, and then fine-tuned on SQA. It uses relative position embeddings (i.e. resetting the position index at every cell of the table).
The other (non-default) version which can be used is the noreset one, which corresponds to tapas_sqa_inter_masklm_large (intermediate pre-training, absolute position embeddings).
Disclaimer: the team releasing TAPAS did not write a model card for this model, so this model card has been written by the Hugging Face team and contributors.
| Size | Reset | Dev Accuracy | Link |
|---|---|---|---|
| LARGE | noreset | 0.7223 | 1236321 |
| LARGE | reset | 0.7289 | 1237321 |
| BASE | noreset | 0.6737 | 1238321 |
| BASE | reset | 0.6874 | 1239321 |
| MEDIUM | noreset | 0.6464 | 12310321 |
| MEDIUM | reset | 0.6561 | 12311321 |
| SMALL | noreset | 0.5876 | 12312321 |
| SMALL | reset | 0.6155 | 12313321 |
| MINI | noreset | 0.4574 | 12314321 |
| MINI | reset | 0.5148 | 12315321 |
| TINY | noreset | 0.2004 | 12316321 |
| TINY | reset | 0.2375 | 12317321 |
TAPAS is a BERT-like transformers model pretrained on a large corpus of English data from Wikipedia in a self-supervised fashion. This means it was pretrained on the raw tables and associated texts only, with no humans labelling them in any way (which is why it can use lots of publicly available data), with an automatic process generating inputs and labels from those texts. More precisely, it was pretrained with two objectives:

- Masked language modeling (MLM): taking a (flattened) table and associated context, the model randomly masks 15% of the words in the input, then runs the entire (partially masked) sequence through the model and has to predict the masked words. This lets the model learn a bidirectional representation of a table and its associated text.
- Intermediate pre-training: to encourage numerical reasoning on tables, the authors additionally pre-trained the model on a balanced dataset of millions of syntactically created training examples, where the model must predict (classify) whether a sentence is supported or refuted by the contents of a table.
This way, the model learns an inner representation of the English language used in tables and associated texts, which can then be used to extract features useful for downstream tasks such as answering questions about a table, or determining whether a sentence is supported or refuted by the contents of a table. Fine-tuning is done by adding a cell selection head on top of the pre-trained model, and then jointly training this randomly initialized classification head with the base model on SQA.
You can use this model for answering questions related to a table, for example in a conversational set-up.
For code examples, refer to the TAPAS documentation on the HuggingFace website.
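As a quick illustration, below is a minimal sketch of table question answering with this checkpoint, assuming the transformers and pandas libraries are installed. The table contents and questions are made up for the example, and for simplicity each question is encoded independently rather than feeding previous answers back in (see the TAPAS documentation for the full conversational protocol).

```python
# A minimal sketch, assuming transformers and pandas are installed.
from transformers import TapasForQuestionAnswering, TapasTokenizer
import pandas as pd

model_name = "google/tapas-large-finetuned-sqa"
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# TAPAS expects the table as a pandas DataFrame with string values.
table = pd.DataFrame(
    {
        "Actors": ["Brad Pitt", "Leonardo Di Caprio", "George Clooney"],
        "Number of movies": ["87", "53", "69"],
    }
)
queries = [
    "How many movies has George Clooney played in?",
    "Which actor played in 53 movies?",
]

# Encode the questions together with the flattened table.
inputs = tokenizer(table=table, queries=queries, padding="max_length", return_tensors="pt")
outputs = model(**inputs)

# SQA checkpoints have a cell selection head only (no aggregation head),
# so convert_logits_to_predictions returns just the cell coordinates.
(predicted_coordinates,) = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach()
)
for query, coordinates in zip(queries, predicted_coordinates):
    answer = ", ".join(table.iat[coordinate] for coordinate in coordinates)
    print(f"{query} -> {answer}")
```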
The texts are lowercased and tokenized using WordPiece with a vocabulary size of 30,000. The inputs of the model are then of the form:
[CLS] Question [SEP] Flattened table [SEP]
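To see this layout concretely, one can decode an encoded example back to text; this is a sketch reusing the tokenizer and table from the snippet above, and the printed output is only indicative:

```python
# Decode one encoded example to inspect the flattened layout.
encoding = tokenizer(
    table=table,
    queries=["How many movies has George Clooney played in?"],
    return_tensors="pt",
)
print(tokenizer.decode(encoding["input_ids"][0]))
# e.g. "[CLS] how many movies has george clooney played in? [SEP] actors number of movies brad pitt 87 ..."
```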
The model was fine-tuned on 32 Cloud TPU v3 cores for 200,000 steps with a maximum sequence length of 512 and a batch size of 128. In this setup, fine-tuning takes around 20 hours. The optimizer used is Adam with a learning rate of 1.25e-5 and a warmup ratio of 0.2. An inductive bias is added such that the model only selects cells of the same column; this is reflected by the select_one_column parameter of TapasConfig. See also table 12 of the original paper.
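These modeling choices can be inspected on the checkpoint's configuration. A minimal sketch, assuming the select_one_column and reset_position_index_per_cell attributes of the transformers TapasConfig:

```python
from transformers import TapasConfig

config = TapasConfig.from_pretrained("google/tapas-large-finetuned-sqa")
# Inductive bias: restrict cell selection to a single column.
print(config.select_one_column)
# Relative position embeddings: reset the position index at every cell.
print(config.reset_position_index_per_cell)
```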
    @misc{herzig2020tapas,
      title={TAPAS: Weakly Supervised Table Parsing via Pre-training},
      author={Jonathan Herzig and Paweł Krzysztof Nowak and Thomas Müller and Francesco Piccinno and Julian Martin Eisenschlos},
      year={2020},
      eprint={2004.02349},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
    }

    @misc{eisenschlos2020understanding,
      title={Understanding tables with intermediate pre-training},
      author={Julian Martin Eisenschlos and Syrine Krichene and Thomas Müller},
      year={2020},
      eprint={2010.00571},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
    }

    @InProceedings{iyyer2017search-based,
      author    = {Iyyer, Mohit and Yih, Scott Wen-tau and Chang, Ming-Wei},
      title     = {Search-based Neural Structured Learning for Sequential Question Answering},
      booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics},
      year      = {2017},
      month     = {July},
      publisher = {Association for Computational Linguistics},
      url       = {https://www.microsoft.com/en-us/research/publication/search-based-neural-structured-learning-sequential-question-answering/},
    }