MAWPS (Math Word Problem Repository)

MAWPS (Math Word Problem Repository) is a unified testbed for evaluating arithmetic word problem solvers. It aggregates multiple elementary math datasets (AddSub, MultiArith, SingleOp, SingleEq) into a standardized repository, which enabled systematic comparison of semantic parsing, neural seq2seq, and symbolic AI approaches to math reasoning.

What Is MAWPS?

The Semantic Parsing Tradition

MAWPS was created in an era when the dominant approach to math word problems was semantic parsing, that is, converting the problem text into a formal representation that can then be solved.

The repository unified these approaches by providing standardized train/test splits across all sub-datasets, enabling direct comparison.
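To make the idea of a formal representation concrete, here is a minimal sketch of an expression tree that a parser might produce for a word problem, together with an evaluator. The `Num` and `BinOp` classes, the helper names, and the example problem are all hypothetical illustrations. They are not the MAWPS file format, nor any particular solver's output.

```python
from dataclasses import dataclass
from typing import Union
import operator


# Hypothetical expression-tree nodes (an assumption for illustration,
# not the representation used by MAWPS or by any specific system).
@dataclass
class Num:
    value: float


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]

# The four arithmetic operators that elementary word problems rely on.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(expr: Expr) -> float:
    """Recursively compute the numeric answer of an expression tree."""
    if isinstance(expr, Num):
        return expr.value
    return OPS[expr.op](evaluate(expr.left), evaluate(expr.right))


def to_infix(expr: Expr) -> str:
    """Render the tree as a fully parenthesized equation string."""
    if isinstance(expr, Num):
        return f"{expr.value:g}"
    return f"({to_infix(expr.left)} {expr.op} {to_infix(expr.right)})"


# Made-up problem: "Tom has 5 apples. He buys 3 bags with 4 apples each.
# How many apples does he have now?"
tree = BinOp("+", Num(5), BinOp("*", Num(3), Num(4)))

print(to_infix(tree))   # (5 + (3 * 4))
print(evaluate(tree))   # 17
```

A solver in this style only has to predict the tree. The arithmetic itself is delegated to the evaluator, which is why different systems can be compared on whether their final answer is correct.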

Why MAWPS Was Strategically Important

Performance by Model Generation

Model                             MAWPS Accuracy
SVM expression classifier (2015)  ~73%
Seq2Tree LSTM (2016)              ~88%
BERT fine-tuned (2020)            ~93%
GPT-3 few-shot (2022)             ~94%
GPT-4 (2023)                      ~98%+
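The source does not state how these accuracy figures were computed. A common convention, assumed here, is answer accuracy: the fraction of problems whose predicted numeric answer matches the gold answer within a small tolerance. The sketch below uses hypothetical function names and made-up values purely to illustrate that metric.

```python
import math


def is_correct(predicted, gold, rel_tol=1e-4, abs_tol=1e-6):
    """Return True if a predicted answer matches the gold answer.

    A prediction of None (the solver produced no answer) counts as wrong.
    The tolerances are assumptions, not values prescribed by MAWPS.
    """
    if predicted is None:
        return False
    return math.isclose(predicted, gold, rel_tol=rel_tol, abs_tol=abs_tol)


def answer_accuracy(predictions, golds):
    """Fraction of problems answered correctly."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        return 0.0
    correct = sum(is_correct(p, g) for p, g in zip(predictions, golds))
    return correct / len(golds)


# Made-up gold answers and predictions for four problems.
golds = [17.0, 2.5, 8.0, 12.0]
preds = [17.0, 2.5000001, 9.0, None]

print(answer_accuracy(preds, golds))  # 0.5
```

Comparing with a tolerance rather than exact equality avoids penalizing floating-point rounding in answers that involve division.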

MAWPS in the Current Context

As a near-solved benchmark, MAWPS serves specific purposes:

Common Failure Patterns

Relationship to Other Benchmarks

Benchmark  Difficulty                Focus
MAWPS      Elementary                Arithmetic
GSM8K      Middle school             Multi-step arithmetic
SVAMP      Elementary + adversarial  Robustness
MATH       Competition level         Creative reasoning
AQuA-RAT   GRE/GMAT                  Algebraic reasoning

MAWPS is the elementary-math benchmark: historically essential for establishing arithmetic NLP baselines, it now serves primarily as a sanity check confirming that modern LLMs have thoroughly mastered grade-school arithmetic word problems.
