Dataset:
NTU-NLP-sg/xCodeEval
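We introduce xCodeEval, the largest executable multilingual multitask benchmark to date. It consists of 25M document-level coding examples drawn from about 7.5K unique problems, covers up to 17 programming languages, and offers execution-level parallelism. It features seven tasks spanning code understanding, generation, translation, and retrieval, and it uses execution-based evaluation. We developed ExecEval, a test-case-based multilingual code execution engine that supports every programming language in xCodeEval. We also propose a data splitting and data selection scheme, based on the geometric mean and graph-theoretic principles, for balancing data distributions over multiple attributes.
This repository contains the sample code and data links for the xCodeEval paper.
The repository currently supports the huggingface load_dataset() API. Follow the example below to load the dataset for an individual task.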
import datasets

prog_synthesis_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "program_synthesis")
code_translation_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "code_translation")
tag_classification_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "tag_classification")
apr_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "apr")
code_compilation_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "code_compilation")
retrieval_code_code_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "retrieval_code_code")
retrieval_nl_code_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "retrieval_nl_code")
retrieval_corpus_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "retrieval_corpus")
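If you experience long delays during data processing, add ignore_verifications=True. Note that recent releases of the datasets library deprecate this parameter in favor of verification_mode.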
prog_synthesis_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "program_synthesis", ignore_verifications=True)
If you experience long delays during data download, use the huggingface streaming mode.
prog_synthesis_dataset = datasets.load_dataset("NTU-NLP-sg/xCodeEval", "program_synthesis", streaming=True)
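In streaming mode, records are produced lazily as you iterate, so a quick look at the data should stop after a few items rather than consume the whole stream. The helper below is a minimal sketch of that idea; `peek` is a hypothetical name introduced for illustration and is not part of the datasets library.

```python
from itertools import islice


def peek(stream, n=3):
    """Return the first n items of a lazily evaluated iterable as a list.

    Works with any iterable, so only the first n records are ever pulled.
    """
    return list(islice(stream, n))
```

With a streamed dataset, a call such as `peek(prog_synthesis_dataset[split_name])` would be the intended use, where `split_name` is a placeholder: the available split names depend on the configuration you loaded.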
The data can also be downloaded from the huggingface git LFS repository.
Use the following commands to download the full data.
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/datasets/NTU-NLP-sg/xCodeEval
cd xCodeEval
git lfs pull
To download only a specific part of the dataset:
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/datasets/NTU-NLP-sg/xCodeEval
cd xCodeEval
git lfs pull --include "apr/test/*"
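We propose 7 tasks: tag classification, code compilation, program synthesis, code translation, automatic program repair (APR), code-code retrieval, and NL-code retrieval.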
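If you are not using the huggingface load_dataset() API, you may need to link some of the data to the individual tasks yourself.
Two data files are shared by multiple tasks: one holds the problem descriptions and the other holds the unit tests.
You can find both files in the root directory of the main branch of the huggingface dataset repository. To avoid data redundancy, we do not bundle this data with each task that needs it; instead, every sample carries a unique ID, src_uid, that can be used to retrieve it.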
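The linking step described above amounts to a join on `src_uid`. The following is a minimal sketch, assuming the problem-description file stores one JSON object per line and that each task sample carries a `src_uid` field; the function names and the `"problem"` key are hypothetical and chosen only for this illustration.

```python
import json
from pathlib import Path


def load_problem_index(jsonl_path):
    """Map src_uid -> problem description record from a JSON Lines file."""
    index = {}
    with Path(jsonl_path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            index[record["src_uid"]] = record
    return index


def attach_description(sample, problem_index):
    """Return a copy of a task sample with its problem description attached.

    The "problem" key is a hypothetical name used by this sketch; it is
    None when the src_uid is not present in the index.
    """
    merged = dict(sample)
    merged["problem"] = problem_index.get(sample["src_uid"])
    return merged
```

Building the index once and looking samples up by key keeps the join linear in the number of samples, which matters when a task has millions of examples.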
An example:
{
    "description": "There are $$$n$$$ positive integers $$$a_1, a_2, \\dots, a_n$$$. For the one move you can choose any even value $$$c$$$ and divide by two all elements that equal $$$c$$$.For example, if $$$a=[6,8,12,6,3,12]$$$ and you choose $$$c=6$$$, and $$$a$$$ is transformed into $$$a=[3,8,12,3,3,12]$$$ after the move.You need to find the minimal number of moves for transforming $$$a$$$ to an array of only odd integers (each element shouldn't be divisible by $$$2$$$).",
    "input_from": "standard input",
    "output_to": "standard output",
    "time_limit": "3 seconds",
    "memory_limit": "256 megabytes",
    "input_spec": "The first line of the input contains one integer $$$t$$$ ($$$1 \\le t \\le 10^4$$$) \u2014 the number of test cases in the input. Then $$$t$$$ test cases follow. The first line of a test case contains $$$n$$$ ($$$1 \\le n \\le 2\\cdot10^5$$$) \u2014 the number of integers in the sequence $$$a$$$. The second line contains positive integers $$$a_1, a_2, \\dots, a_n$$$ ($$$1 \\le a_i \\le 10^9$$$). The sum of $$$n$$$ for all test cases in the input doesn't exceed $$$2\\cdot10^5$$$.",
    "output_spec": "For $$$t$$$ test cases print the answers in the order of test cases in the input. The answer for the test case is the minimal number of moves needed to make all numbers in the test case odd (i.e. not divisible by $$$2$$$).",
    "notes": "NoteIn the first test case of the example, the optimal sequence of moves can be as follows: before making moves $$$a=[40, 6, 40, 3, 20, 1]$$$; choose $$$c=6$$$; now $$$a=[40, 3, 40, 3, 20, 1]$$$; choose $$$c=40$$$; now $$$a=[20, 3, 20, 3, 20, 1]$$$; choose $$$c=20$$$; now $$$a=[10, 3, 10, 3, 10, 1]$$$; choose $$$c=10$$$; now $$$a=[5, 3, 5, 3, 5, 1]$$$ \u2014 all numbers are odd. Thus, all numbers became odd after $$$4$$$ moves. In $$$3$$$ or fewer moves, you cannot make them all odd.",
    "sample_inputs": [
        "4\n6\n40 6 40 3 20 1\n1\n1024\n4\n2 4 8 16\n3\n3 1 7"
    ],
    "sample_outputs": [
        "4\n10\n4\n0"
    ],
    "tags": [
        "number theory",
        "greedy"
    ],
    "src_uid": "afcd41492158e68095b01ff1e88c3dd4",
    "difficulty": 1200,
    "created_at": 1576321500
}
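Several fields in this record are stored as strings or raw numbers and usually need light parsing before use. The sketch below works on a trimmed copy of the record above. The helper names are hypothetical, and treating `created_at` as a Unix timestamp in seconds is an assumption based on its magnitude, not something the record itself states.

```python
from datetime import datetime, timezone


def parse_limit(text):
    """Split a limit such as "3 seconds" into (3.0, "seconds")."""
    value, unit = text.split(maxsplit=1)
    return float(value), unit


def sample_pairs(problem):
    """Pair each sample input with its expected output, in order."""
    return list(zip(problem["sample_inputs"], problem["sample_outputs"]))


def created_at_utc(problem):
    """Interpret created_at as a Unix timestamp in seconds (an assumption)."""
    return datetime.fromtimestamp(problem["created_at"], tz=timezone.utc)


# A trimmed copy of the record shown above.
problem = {
    "time_limit": "3 seconds",
    "memory_limit": "256 megabytes",
    "sample_inputs": ["4\n6\n40 6 40 3 20 1\n1\n1024\n4\n2 4 8 16\n3\n3 1 7"],
    "sample_outputs": ["4\n10\n4\n0"],
    "src_uid": "afcd41492158e68095b01ff1e88c3dd4",
    "created_at": 1576321500,
}

time_limit = parse_limit(problem["time_limit"])
memory_limit = parse_limit(problem["memory_limit"])
pairs = sample_pairs(problem)
created = created_at_utc(problem)
```

Under the seconds assumption, `created` falls on 14 December 2019 (UTC), which is a plausible date for a programming contest problem.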
Structure of the JSON file:
unittest_db = {
    "db884d679d9cfb1dc4bc511f83beedda": [
        {
            "input": "4\r\n3 2 3 2\r\n",
            "output": [
                "1"
            ],
        },
        {
            ...
        },
        ...
    ],
    "3bc096d8cd3418948d5be6bf297aa9b5": [
        ...
    ],
    ...
}
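The structure above suggests how a program's output could be checked against a problem's unit tests. The following is a rough sketch only and is not how ExecEval performs its comparison. It assumes that the `output` list holds alternative accepted answers and that differences in line endings and trailing whitespace should be ignored; both are assumptions. All function names, and the `run_program` callable, are hypothetical.

```python
def normalize(text):
    """Unify line endings and strip trailing whitespace on every line."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "\n".join(line.rstrip() for line in lines).strip()


def passes(unit_test, actual_output):
    """True if actual_output matches any accepted output of the unit test."""
    accepted = {normalize(expected) for expected in unit_test["output"]}
    return normalize(actual_output) in accepted


def pass_rate(unittest_db, src_uid, run_program):
    """Fraction of a problem's unit tests passed.

    run_program is a hypothetical callable mapping an input string
    to the program's output string.
    """
    tests = unittest_db[src_uid]
    if not tests:
        return 0.0
    passed = sum(passes(test, run_program(test["input"])) for test in tests)
    return passed / len(tests)
```

For real evaluation, sandboxing, time limits, and memory limits matter as much as output comparison, which is the role ExecEval plays for xCodeEval.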
@misc{khan2023xcodeeval,
      title={xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval},
      author={Mohammad Abdullah Matin Khan and M Saiful Bari and Xuan Long Do and Weishi Wang and Md Rizwan Parvez and Shafiq Joty},
      year={2023},
      eprint={2303.03004},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}