Dataset Restricted Access
Völschow, Marcel; Buczek, P.; Carreno-Mosquera, P.; Mousavias, C.; Reganova, S.; Roldan-Rodriguez, E.; Steinbach, Peter; Strube, A.
{ "DOI": "10.14278/rodare.3137", "type": "dataset", "issued": { "date-parts": [ [ 2024, 9, 9 ] ] }, "abstract": "<p>Large-Language Models such as ChatGPT have the potential to revolutionize academic teaching in physics, much as the electronic calculator, the home computer, or the internet did. AI models are patient, produce answers tailored to a student\u2019s needs, and are accessible whenever needed. Those involved in academic teaching face a number of questions: How reliable are publicly accessible models in answering questions, how does the question\u2019s language affect the models\u2019 performance, and how well do the models perform on more difficult tasks beyond retrieval? To address these questions, we benchmark a number of publicly available models on the mlphys101 dataset, a new set of 823 university-level MC5 questions and answers released alongside this work. While the original questions are in English, we employ GPT-4 to translate them into various other languages, followed by revision and refinement by native speakers. Our findings indicate that state-of-the-art models perform well on questions involving the replication of facts, definitions, and basic concepts, but struggle with multi-step quantitative reasoning. This aligns with existing literature that highlights the challenges LLMs face in mathematical and logical reasoning tasks. We conclude that the most advanced current LLMs are a valuable addition to the academic curriculum and that LLM-powered translations are a viable method to increase the accessibility of materials, but their utility for more difficult quantitative tasks remains limited.</p>\n\n<p>The dataset is available here in English only and will be removed once the mlphys101 publication has been accepted and released to the public.</p>", "note": "The dataset is available here in English only and will be removed once the mlphys101 publication has been accepted and released to the public.", "author": [ { "family": "V\u00f6lschow, Marcel" }, { "family": "Buczek, P." }, { "family": "Carreno-Mosquera, P." }, { "family": "Mousavias, C." }, { "family": "Reganova, S." }, { "family": "Roldan-Rodriguez, E." }, { "family": "Steinbach, Peter" }, { "family": "Strube, A." } ], "publisher": "Rodare", "language": "eng", "title": "mlphys101 - Exploring the performance of Large-Language Models in multilingual undergraduate physics education", "id": "3137" }
Metric | All versions | This version |
---|---|---|
Views | 110 | 110 |
Downloads | 0 | 0 |
Data volume | 0 Bytes | 0 Bytes |
Unique views | 94 | 94 |
Unique downloads | 0 | 0 |