MAWPS: A Math Word Problem Repository. Rik Koncel-Kedziorski**, Subhro Roy**, Aida Amini, Nate Kushman and Hannaneh Hajishirzi. NAACL 2016 (Short). The system is live ...
The Long Multiplication Benchmark evaluates Large Language Models (LLMs) on their ability to handle and utilize long contexts to solve multiplication problems. Despite long multiplication requiring ...
What began with a focus on weather forecasting has evolved toward addressing errors in scientific modeling. In the collaborative environment of the Penn State Institute for Computational and Data ...