CFQ (Compositional Freebase Questions)

CFQ (Compositional Freebase Questions) is a large-scale semantic parsing benchmark for measuring compositional generalization in translating natural language questions into SPARQL queries over the Freebase knowledge graph. It introduced the Maximum Compound Divergence (MCD) split methodology, which maximizes the structural difference between the compounds seen in training and those seen at test time. The resulting test of compositional generalization exposed the limitations of standard seq2seq and pretrained language models.
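To make the task concrete, here is a minimal sketch of what a question/query pair might look like and how the predicates inside the query can be pulled out. The pair is a hypothetical illustration written in the style of CFQ, not an entry copied from the dataset, and the `predicates` helper is an assumption introduced only for this example.

```python
import re

# Hypothetical question/query pair in the style of CFQ (not taken from the dataset).
# Entities are shown as placeholders (M0, M1), as is common in anonymized KB queries.
example = {
    "question": "Did M0 direct and produce M1?",
    "query": (
        "SELECT count(*) WHERE { "
        "M0 ns:film.director.film M1 . "
        "M0 ns:film.producer.film M1 }"
    ),
}


def predicates(query):
    """Return the ns:-prefixed predicates in a query, in order of appearance."""
    return re.findall(r"ns:[\w.]+", query)


print(example["question"])
print(predicates(example["query"]))
```

The two predicates are the kind of primitives a parser has to recombine. A compositional test asks whether a model that saw each predicate separately can still produce a query that joins them.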

What Is CFQ?

The MCD Split Innovation

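Standard random train/test splits for semantic parsing are misleading: they allow the test set to contain the same predicate combinations as training, which inflates accuracy estimates. MCD (Maximum Compound Divergence) instead creates splits that maximize structural novelty. The distribution of compounds (combinations of primitives) differs as much as possible between train and test, while the distribution of atoms (the individual primitives) stays nearly the same.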

This design means that a model which only memorizes training compounds will score near 0% on MCD splits; only models that learn reusable predicate-level rules will generalize.
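The divergence behind MCD can be sketched as one minus a Chernoff coefficient between the train and test frequency distributions, with exponent 0.5 for atoms and 0.1 for compounds. The code below is a simplified illustration under loud assumptions. It treats each example as a tuple of predicate names and treats unordered predicate pairs as "compounds", whereas the actual benchmark derives weighted compounds from rule-application structures. The helper names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def chernoff_divergence(train_counts, test_counts, alpha):
    """Return 1 - sum_k p_k**alpha * q_k**(1 - alpha).

    p and q are the normalized train and test frequency distributions.
    """
    p_total = sum(train_counts.values())
    q_total = sum(test_counts.values())
    coeff = 0.0
    # Only keys present on both sides contribute a non-zero term.
    for key in set(train_counts) & set(test_counts):
        p = train_counts[key] / p_total
        q = test_counts[key] / q_total
        coeff += (p ** alpha) * (q ** (1 - alpha))
    return 1.0 - coeff


def atom_and_compound_counts(examples):
    """Toy assumption: atoms are predicates, compounds are predicate pairs."""
    atoms, compounds = Counter(), Counter()
    for preds in examples:
        atoms.update(preds)
        compounds.update(combinations(sorted(preds), 2))
    return atoms, compounds


# Hypothetical toy split: every test atom is seen in training,
# but the test combination of atoms never is.
train = [("direct", "produce"), ("direct", "write")]
test = [("produce", "write")]

train_atoms, train_compounds = atom_and_compound_counts(train)
test_atoms, test_compounds = atom_and_compound_counts(test)

atom_div = chernoff_divergence(train_atoms, test_atoms, alpha=0.5)
compound_div = chernoff_divergence(train_compounds, test_compounds, alpha=0.1)
print(f"atom divergence: {atom_div:.3f}")
print(f"compound divergence: {compound_div:.3f}")
```

In this toy split the compound divergence reaches its maximum because train and test share no predicate pair, while the atom divergence stays well below it because every test predicate also occurs in training. An MCD-style split searches for this pattern across the whole dataset.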

CFQ Results and the Generalization Gap

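| Model | MCD1 | MCD2 | MCD3 | Average |
|---|---|---|---|---|
| Seq2Seq (LSTM) | 28.9% | 5.0% | 10.8% | 14.9% |
| Transformer | 34.9% | 8.2% | 10.6% | 17.9% |
| BERT fine-tuned | 42.0% | 9.6% | 14.3% | 22.0% |
| T5 large | 62.0% | 30.1% | 31.2% | 41.1% |
| Compositional Struct. (~2023) | 81.0% | 51.0% | 60.0% | 64.0% |
| Human equivalent | ~97%+ | ~97%+ | ~97%+ | ~97%+ |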

The dramatic drop from the random split (~97%) to the MCD splits (roughly 15-41% on average for the standard models above) demonstrates that standard models are "memorizing compounds, not learning rules."
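As a quick arithmetic check, the Average column and the size of the generalization gap can be recomputed from the per-split numbers. The sketch below uses the values from the results table as given; the 97.0 reference figure is the approximate random-split accuracy quoted in the text, and the variable names are illustrative.

```python
# Per-split accuracies (MCD1, MCD2, MCD3) copied from the results table.
results = {
    "Seq2Seq (LSTM)": (28.9, 5.0, 10.8),
    "Transformer": (34.9, 8.2, 10.6),
    "BERT fine-tuned": (42.0, 9.6, 14.3),
    "T5 large": (62.0, 30.1, 31.2),
    "Compositional Struct. (~2023)": (81.0, 51.0, 60.0),
}

# Approximate random-split accuracy quoted in the text (an approximation, not a measurement).
RANDOM_SPLIT_ACCURACY = 97.0

averages = {model: round(sum(s) / len(s), 1) for model, s in results.items()}
gaps = {model: round(RANDOM_SPLIT_ACCURACY - avg, 1) for model, avg in averages.items()}

for model in results:
    print(f"{model}: average {averages[model]}%, gap to random split {gaps[model]} points")
```

The recomputed averages agree with the table, and even the strongest entry remains tens of points below the random-split figure.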

Why CFQ Matters

Extensions

Comparison to COGS and SCAN

| Benchmark | Output | Graph/DB Coverage | Compound Type | Scale |
|---|---|---|---|---|
| SCAN | Action sequences | None | Verb+adverb | 20k |
| COGS | λ-calculus | None | Syntactic roles | 24k |
| CFQ | SPARQL | Freebase (large KB) | Multi-join query patterns | 239k |

CFQ tests SPARQL composition over a real-world knowledge graph. It measures whether a model can parse complex natural language questions into database queries by combining learned predicate primitives in novel ways, and its MCD split methodology provides one of the most rigorous frameworks available for evaluating compositional generalization in semantic parsing.
