Dataset Restricted Access

mlphys101 - Exploring the performance of Large-Language Models in multilingual undergraduate physics education

Völschow, Marcel; Buczek, P.; Carreno-Mosquera, P.; Mousavias, C.; Reganova, S.; Roldan-Rodriguez, E.; Steinbach, Peter; Strube, A.

Large language models (LLMs) such as ChatGPT have the potential to revolutionize academic teaching in physics in much the same way as the electronic calculator, the home computer, or the internet did. AI models are patient, produce answers tailored to a student’s needs, and are accessible whenever needed. Those involved in academic teaching face a number of questions: How reliable are publicly accessible models in their answers, how does the language of a question affect the models’ performance, and how well do the models perform on more difficult tasks beyond retrieval? To address these questions, we benchmark a number of publicly available models on the mlphys101 dataset, a new set of 823 university-level MC5 questions and answers released alongside this work. While the original questions are in English, we employ GPT-4 to translate them into various other languages, followed by revision and refinement by native speakers. Our findings indicate that state-of-the-art models perform well on questions involving the replication of facts, definitions, and basic concepts, but struggle with multi-step quantitative reasoning. This aligns with existing literature that highlights the challenges LLMs face in mathematical and logical reasoning tasks. We conclude that the most advanced current LLMs are a valuable addition to the academic curriculum and that LLM-powered translations are a viable method to increase the accessibility of materials, but their utility for more difficult quantitative tasks remains limited.

The dataset is available here in English only and will be removed once the mlphys101 publication has been accepted and released to the public.

Restricted Access

You may request access to the files in this upload, provided that you fulfil the conditions below. The decision whether to grant/deny access is solely under the responsibility of the record owner.


Access will be granted upon reasonable request while we await publication of the dataset by its original author on another platform.

