Dataset:

theblackcat102/codex-math-qa

Codex Math QA

Solve math_qa using code-davinci-002 via Python programming.

Since OpenAI decided to move code-davinci-002 behind Azure, this dataset aims to share the generation results of code-davinci-002, OpenAI's 176B code generation model.

Data Splits

name       train   validation   test
main       25065   4133         2985
rational   -       4151         2985

  • Basic prompts: the question only

    from datasets import load_dataset

    dataset = load_dataset("theblackcat102/codex-math-qa", "main")
    

    Example:

    the average weight of 8 person ' s increases by 1.5 kg when a new person comes in place of one of them weighing 75 kg. what might be the weight of the new person ?
    Write a short snippet of python function which solve this problem. No need to explain the answer, let the code output the answer.
    

    Output:

    def average_weight(weights):
        total_weight = sum(weights)
        return total_weight / len(weights)
    
    def new_person_weight(weights):
        average_weight = average_weight(weights)
        new_person_weight = average_weight + 1.5
        return new_person_weight
    
    weights = [75, 80, 85, 90, 95]
    print(new_person_weight(weights))
    
  • Prompts with rationale: include the Rationale text, which should help with planning the program

    dataset = load_dataset("theblackcat102/codex-math-qa", "rational")
    

    Example:

    one copy machine can make 20 copies a minute, and a second copy machine makes 10 copies a minute. if the two copiers work together, how long would it take them to make 900 copies ?
    Rationale: "total work done by both machines in a minute = 20 + 10 = 30 copies total number of copies required = 900 time = 900 / 30 = 30 mins answer d"
    Write a short snippet of python function which solve this problem within 30 lines of code. Add comments in the function block for explaination.
    

    Output:

    def copy_machine(copies, copy_speed):
        """
        :param copies: number of copies required
        :param copy_speed: speed of the copy machine
        :return: time in minutes
        """
        total_work = sum(copy_speed)
        time = copies / total_work
        return time
    
    
    if __name__ == "__main__":
        print(copy_machine(900, [20, 10]))
    

    Notes:

    The generated results are unvalidated and reproduced as-is from the code-davinci-002 outputs, so a majority of the answers are incorrect and some of the code contains syntax errors. Even so, this dataset is intended as groundwork for future study: it provides a source of reference for code-based math answering by code-davinci-002.
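Because the outputs are unvalidated, a filtering pass is a natural first step before using them. The following is a minimal sketch, not part of the dataset tooling: `check_snippet` is a hypothetical helper that uses the standard library to separate snippets that fail to parse from those that parse but crash when run. The sample string is the basic-prompt output shown earlier in this card; it is syntactically valid but fails at runtime because a local variable shadows the function it calls.

```python
import ast
import contextlib
import io
from typing import Tuple


def check_snippet(code: str) -> Tuple[str, str]:
    """Classify a generated snippet as 'syntax_error', 'runtime_error', or 'ok'.

    Returns (status, captured_stdout). exec() runs arbitrary code, so only
    use this on trusted input or inside a sandbox.
    """
    try:
        ast.parse(code)
    except SyntaxError:
        return "syntax_error", ""

    buffer = io.StringIO()
    # Setting __name__ lets `if __name__ == "__main__":` blocks execute.
    namespace = {"__name__": "__main__"}
    try:
        with contextlib.redirect_stdout(buffer):
            exec(compile(code, "<generated>", "exec"), namespace)
    except Exception:
        return "runtime_error", buffer.getvalue()
    return "ok", buffer.getvalue()


# The basic-prompt output from this card, copied verbatim.
SAMPLE = '''
def average_weight(weights):
    total_weight = sum(weights)
    return total_weight / len(weights)

def new_person_weight(weights):
    average_weight = average_weight(weights)
    new_person_weight = average_weight + 1.5
    return new_person_weight

weights = [75, 80, 85, 90, 95]
print(new_person_weight(weights))
'''

print(check_snippet(SAMPLE)[0])
```

A snippet that reaches `ok` has only been shown to run; its printed value still has to be compared against the math_qa answer separately.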

    Dataset Creation

    The dataset was sourced from math_qa, with a prompt appended to the end of each question to request a Python solution. The aim is to provide data for the work offloading seen in Galactica.
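The prompt construction described above can be sketched roughly as follows. The two instruction strings are copied verbatim (typos included) from the examples in this card; the function names and the newline-joined layout are assumptions made for illustration, not the author's actual script.

```python
# Instruction text copied verbatim from the examples in this card.
BASIC_INSTRUCTION = (
    "Write a short snippet of python function which solve this problem. "
    "No need to explain the answer, let the code output the answer."
)
RATIONALE_INSTRUCTION = (
    "Write a short snippet of python function which solve this problem "
    "within 30 lines of code. Add comments in the function block for explaination."
)


def build_basic_prompt(question: str) -> str:
    """Append the basic instruction after the question (assumed newline-separated)."""
    return f"{question}\n{BASIC_INSTRUCTION}"


def build_rationale_prompt(question: str, rationale: str) -> str:
    """Insert the rationale between the question and the instruction."""
    return f"{question}\nRationale: {rationale}\n{RATIONALE_INSTRUCTION}"


question = (
    "one copy machine can make 20 copies a minute, and a second copy machine "
    "makes 10 copies a minute. if the two copiers work together, how long "
    "would it take them to make 900 copies ?"
)
rationale = '"time = 900 / 30 = 30 mins answer d"'  # shortened for the example
print(build_rationale_prompt(question, rationale))
```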

    The generation config for code-davinci-002 is as follows:

    name          value
    max_tokens    2048
    temperature   0.5
    top_p         0.7
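For reference, the settings above could be collected into a request payload along these lines. This is only a sketch: `build_request` is a hypothetical helper, the dictionary keys follow the parameter names in the table, and no API call is made here, since how the original generations were actually submitted is not documented in this card.

```python
# Sampling settings taken from the table in this card.
GENERATION_CONFIG = {
    "max_tokens": 2048,
    "temperature": 0.5,
    "top_p": 0.7,
}


def build_request(prompt: str, model: str = "code-davinci-002") -> dict:
    """Combine a prompt with the generation settings into one payload dict.

    Hypothetical helper: it only assembles the parameters and sends nothing.
    """
    return {"model": model, "prompt": prompt, **GENERATION_CONFIG}


payload = build_request("what is 2 + 2 ?\nWrite a short snippet of python ...")
print(payload["temperature"], payload["top_p"])
```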

    Citation Information

    @inproceedings{amini-etal-2019-mathqa,
        title = "{M}ath{QA}: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms",
        author = "Amini, Aida  and
          Gabriel, Saadia  and
          Lin, Shanchuan  and
          Koncel-Kedziorski, Rik  and
          Choi, Yejin  and
          Hajishirzi, Hannaneh",
        booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
        month = jun,
        year = "2019",
        address = "Minneapolis, Minnesota",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/N19-1245",
        doi = "10.18653/v1/N19-1245",
        pages = "2357--2367",
    }