Dataset:
AhmedSSoliman/DJANGO
The Django dataset used in the paper "Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation", Oda et al., ASE, 2015.
The Django dataset is a code generation dataset comprising 16,000 training, 1,000 development, and 1,805 test annotations. Each data point consists of a line of Python code paired with a manually written natural language description.
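To make the shape of the data concrete, here is a minimal sketch of one data point and of the split sizes stated above. The `Annotation` class, its field names `code` and `description`, and the `split_annotations` helper are hypothetical names chosen for illustration; they are not the column names or loading API of the published dataset. The sketch also assumes the splits are contiguous slices taken in order, which the description does not confirm.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Split sizes as stated in the dataset description.
TRAIN_SIZE, DEV_SIZE, TEST_SIZE = 16000, 1000, 1805


@dataclass(frozen=True)
class Annotation:
    """One data point: a line of Python code and its description.

    Field names are hypothetical, not the dataset's actual column names.
    """
    code: str
    description: str


def split_annotations(
    items: Sequence[Annotation],
) -> Tuple[List[Annotation], List[Annotation], List[Annotation]]:
    """Slice the annotations into train/dev/test by position.

    Assumption: the splits are contiguous and taken in order.
    """
    expected = TRAIN_SIZE + DEV_SIZE + TEST_SIZE
    if len(items) != expected:
        raise ValueError(f"expected {expected} annotations, got {len(items)}")
    dev_end = TRAIN_SIZE + DEV_SIZE
    train = list(items[:TRAIN_SIZE])
    dev = list(items[TRAIN_SIZE:dev_end])
    test = list(items[dev_end:])
    return train, dev, test
```

The helper can be exercised with placeholder annotations, for instance `Annotation(code="x = 1", description="assign 1 to x")` repeated 18,805 times. That example pair is invented for illustration and is not taken from the dataset.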
@inproceedings{oda2015ase:pseudogen1,
  author    = {Oda, Yusuke and Fudaba, Hiroyuki and Neubig, Graham and Hata, Hideaki and Sakti, Sakriani and Toda, Tomoki and Nakamura, Satoshi},
  title     = {Learning to Generate Pseudo-code from Source Code Using Statistical Machine Translation},
  booktitle = {Proceedings of the 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE)},
  series    = {ASE '15},
  month     = {November},
  year      = {2015},
  isbn      = {978-1-5090-0025-8},
  pages     = {574--584},
  numpages  = {11},
  url       = {https://doi.org/10.1109/ASE.2015.36},
  doi       = {10.1109/ASE.2015.36},
  acmid     = {2916173},
  publisher = {IEEE Computer Society},
  address   = {Lincoln, Nebraska, USA}
}