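This model has 4 versions that can be used. The latest version is the default and corresponds to the tapas_sqa_inter_masklm_base_reset checkpoint of the original GitHub repository. The model was pre-trained with masked language modeling (MLM) plus an additional step that the authors call intermediate pre-training, and was then fine-tuned on SQA. By default it uses relative position embeddings, meaning that the position index is reset at every cell of the table.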
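To make the idea of resetting the position index at every cell concrete, here is a minimal sketch in plain Python. It is an illustration only, not the library's implementation: the function name `position_ids`, the `reset_per_cell` flag, and the treatment of the question as a single segment are assumptions made for this example.

```python
def position_ids(segments, reset_per_cell=True):
    """Return one position index per token.

    `segments` is a list of token lists: the first segment holds the
    question tokens, and each later segment holds the tokens of one cell.
    (Hypothetical helper, written for illustration.)
    """
    ids = []
    offset = 0  # number of tokens seen before the current segment
    for segment in segments:
        for i in range(len(segment)):
            # Relative scheme: restart counting inside every segment.
            # Absolute scheme: keep counting across the whole sequence.
            ids.append(i if reset_per_cell else offset + i)
        offset += len(segment)
    return ids


segments = [
    ["[CLS]", "how", "many", "[SEP]"],  # question
    ["Actor"],                          # header cell
    ["Brad", "Pitt"],                   # data cell with two tokens
    ["87"],                             # data cell
]
relative = position_ids(segments, reset_per_cell=True)
absolute = position_ids(segments, reset_per_cell=False)
```

With the relative scheme, the second token of a cell always gets index 1 regardless of where the cell sits in the table, whereas the absolute scheme gives it an index that depends on everything before it.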
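Disclaimer: The team releasing TAPAS did not write a model card for this model, so this model card has been written by the Hugging Face team and contributors.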
[CLS] Question [SEP] Flattened table [SEP]
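The input layout above can be sketched as follows. This is a simplified illustration under stated assumptions: it splits on whitespace instead of using the model's real subword tokenizer, it assumes the table is flattened header first and then row by row, and `flatten_example` is a hypothetical helper name.

```python
def flatten_example(question, header, rows):
    """Build a `[CLS] question [SEP] flattened table [SEP]` token list.

    Whitespace splitting stands in for the real tokenizer (assumption).
    """
    tokens = ["[CLS]"] + question.split() + ["[SEP]"]
    # Header cells first, then data cells in row-major order (assumption).
    for cell in header:
        tokens.extend(str(cell).split())
    for row in rows:
        for cell in row:
            tokens.extend(str(cell).split())
    tokens.append("[SEP]")
    return tokens


tokens = flatten_example(
    "how many movies",
    ["Actor", "Movies"],
    [["Brad Pitt", "87"]],
)
```

The flattened token list loses the row and column structure of the table, so a real model has to carry that structure separately alongside the tokens.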
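The model was fine-tuned on 32 Cloud TPU v3 cores for 200,000 steps with a maximum sequence length of 512 and a batch size of 128. In this setup, fine-tuning takes around 20 hours. The optimizer used is Adam with a learning rate of 1.25e-5 and a warmup ratio of 0.2. An inductive bias is also added so that the model only selects cells from the same column. This is reflected in the select_one_column parameter of TapasConfig. See Table 12 of the original paper.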
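The numbers above can be turned into a small worked sketch. Two things here are assumptions rather than facts from this card: the schedule shape (linear warmup followed by linear decay to zero) and the way the single-column constraint is applied (pick the column with the highest mean cell probability, then keep its cells above a 0.5 threshold). Both function names are hypothetical.

```python
from collections import defaultdict

TOTAL_STEPS = 200_000   # from the fine-tuning setup
WARMUP_RATIO = 0.2      # from the fine-tuning setup
PEAK_LR = 1.25e-5       # from the fine-tuning setup
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)  # 40,000 steps


def learning_rate(step):
    """Linear warmup, then linear decay (the decay shape is an assumption)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


def select_cells_in_one_column(cell_probs, threshold=0.5):
    """Keep only cells from the column with the highest mean probability.

    `cell_probs` maps (row, column) to a selection probability.
    Simplified illustration, not the library's implementation.
    """
    by_column = defaultdict(list)
    for (_, col), prob in cell_probs.items():
        by_column[col].append(prob)
    best_col = max(by_column, key=lambda c: sum(by_column[c]) / len(by_column[c]))
    return sorted(
        cell for cell, prob in cell_probs.items()
        if cell[1] == best_col and prob > threshold
    )


cell_probs = {(0, 0): 0.9, (1, 0): 0.8, (0, 1): 0.95, (1, 1): 0.1}
selected = select_cells_in_one_column(cell_probs)
```

In this example the cell at row 0, column 1 has the highest individual probability but is dropped, because column 0 has the higher mean. That is the effect of restricting an answer to a single column.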
@misc{herzig2020tapas, title={TAPAS: Weakly Supervised Table Parsing via Pre-training}, author={Jonathan Herzig and Paweł Krzysztof Nowak and Thomas Müller and Francesco Piccinno and Julian Martin Eisenschlos}, year={2020}, eprint={2004.02349}, archivePrefix={arXiv}, primaryClass={cs.IR} }
@misc{eisenschlos2020understanding, title={Understanding tables with intermediate pre-training}, author={Julian Martin Eisenschlos and Syrine Krichene and Thomas Müller}, year={2020}, eprint={2010.00571}, archivePrefix={arXiv}, primaryClass={cs.CL} }
@InProceedings{iyyer2017search-based, author = {Iyyer, Mohit and Yih, Scott Wen-tau and Chang, Ming-Wei}, title = {Search-based Neural Structured Learning for Sequential Question Answering}, booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics}, year = {2017}, month = {July}, abstract = {Recent work in semantic parsing for question answering has focused on long and complicated questions, many of which would seem unnatural if asked in a normal conversation between two humans. In an effort to explore a conversational QA setting, we present a more realistic task: answering sequences of simple but inter-related questions. We collect a dataset of 6,066 question sequences that inquire about semi-structured tables from Wikipedia, with 17,553 question-answer pairs in total. To solve this sequential question answering task, we propose a novel dynamic neural semantic parsing framework trained using a weakly supervised reward-guided search. Our model effectively leverages the sequential context to outperform state-of-the-art QA systems that are designed to answer highly complex questions.}, publisher = {Association for Computational Linguistics}, url = {https://www.microsoft.com/en-us/research/publication/search-based-neural-structured-learning-sequential-question-answering/}, }