NewsQA

NewsQA is a machine reading comprehension dataset of 119,633 question-answer pairs based on CNN news articles. It is distinguished by its information-seeking construction: crowdworkers wrote questions after seeing only the article headline and summary bullets, not the full article, so the questions reflect curiosity-driven information seeking rather than passage scanning.

Construction Methodology and Its Significance

Most reading comprehension datasets are constructed retrospectively: annotators read a passage and then write questions about what they just read. This produces questions whose answers are mentally available to the question writer, often leading to questions that can be answered by surface-level keyword matching rather than genuine comprehension.

NewsQA used a two-phase construction that separates question creation from answer annotation:

Phase 1 — Question Writing: Crowdworkers saw only the CNN article headline and the editorial highlight bullets (3–5 key facts). Without reading the full article, they wrote questions they would want answered — genuine information gaps relative to what the headline and bullets told them.

Phase 2 — Answer Annotation: A different set of crowdworkers received the full article and each question, then selected the answer span (or marked it as unanswerable). Multiple annotators provided answers; disagreements were adjudicated.

This separation produces questions that genuinely probe the article's informational content rather than surface features of the text — because question writers had no access to the surface form of the article.
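The two phases can be sketched as a small data model. This is an illustrative sketch only: the class names, the `article_id` value, and the use of strict majority voting for adjudication are assumptions made here, not the dataset's published schema or procedure.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# A character span (start, end) in the article; None means "unanswerable".
Span = Optional[Tuple[int, int]]


@dataclass
class Question:
    """Phase 1 output: written from the headline and highlights only."""
    article_id: str  # hypothetical identifier format
    text: str


@dataclass
class AnnotatedQuestion:
    """Phase 2 output: spans chosen by workers who read the full article."""
    question: Question
    spans: List[Span] = field(default_factory=list)


def adjudicate(item: AnnotatedQuestion) -> Tuple[Span, bool]:
    """Return the most frequent span and whether a strict majority chose it.

    Majority voting is an assumption of this sketch; the source only says
    that disagreements were adjudicated.
    """
    if not item.spans:
        return None, False
    span, votes = Counter(item.spans).most_common(1)[0]
    return span, votes * 2 > len(item.spans)


q = Question("cnn-001", "Why did the president veto the bill?")
item = AnnotatedQuestion(q, spans=[(120, 158), (120, 158), None])
print(adjudicate(item))  # ((120, 158), True)
```

Keeping `Question` free of any article text mirrors the key property of the design: the question writer's record never depends on the surface form of the article.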

Dataset Characteristics

Challenges and Characteristics

Inverted Pyramid Reading: CNN news articles use the inverted pyramid structure — most important information at the top, supporting details below. NewsQA questions frequently probe the supporting detail sections rather than the lead paragraph, requiring reading the full article.
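One way to check how far into an article the answers sit is to measure each answer's relative start position. The sketch below is a minimal illustration; the function names and the 20% cutoff for "lead section" are arbitrary assumptions, and the positions in the example are made up.

```python
from typing import Sequence


def relative_position(article: str, answer_start: int) -> float:
    """Fraction of the article (in characters) that precedes the answer."""
    if not article:
        raise ValueError("article must be non-empty")
    if not 0 <= answer_start < len(article):
        raise ValueError("answer_start must fall inside the article")
    return answer_start / len(article)


def share_beyond_lead(positions: Sequence[float], lead_fraction: float = 0.2) -> float:
    """Share of answers that start after the first `lead_fraction` of the text.

    The 0.2 default is an assumed cutoff, not a value from the dataset.
    """
    if not positions:
        return 0.0
    return sum(p > lead_fraction for p in positions) / len(positions)


# Toy positions, not real NewsQA statistics.
positions = [0.05, 0.5, 0.9, 0.1]
print(share_beyond_lead(positions))  # 0.5
```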

Multi-Sentence Evidence: Many NewsQA answers require integrating information across multiple non-adjacent sentences. "Why did the president veto the bill?" may require one sentence stating the veto and another giving the reason, separated by paragraphs of background.

Ambiguous and Null Answers: The information-seeking construction naturally produces questions that the article does not fully answer, reflecting the reality that news articles often raise more questions than they resolve. The roughly 9.5% null rate is far lower than in SQuAD 2.0, where about half of the development-set questions are unanswerable, but it reflects genuine information gaps.
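The null rate is simply the share of questions whose adjudicated answer is "no answer in the article". A minimal sketch, assuming each adjudicated answer is stored as a character span or `None` (a representation chosen here for illustration):

```python
from typing import Optional, Sequence, Tuple

Span = Tuple[int, int]


def null_rate(answers: Sequence[Optional[Span]]) -> float:
    """Share of questions whose adjudicated answer is None (unanswerable)."""
    if not answers:
        raise ValueError("no answers given")
    return sum(a is None for a in answers) / len(answers)


# Toy data: 2 unanswerable questions out of 20, not real NewsQA figures.
answers = [(10, 25)] * 18 + [None] * 2
print(null_rate(answers))  # 0.1
```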

Journalism-Specific Language: News writing uses specialized conventions: attributions ("according to officials"), hedging ("allegedly"), temporal markers ("last Tuesday"), and unnamed sources ("a senior official said"). Models must handle these conventions to extract accurate answers.

Comparison with SQuAD

Aspect | SQuAD v1.1 | NewsQA
--- | --- | ---
Source | Wikipedia (encyclopedia) | CNN news articles
Construction | Retrospective | Information-seeking
Article length | ~120 words/passage | ~600 words/article
Null answers | None | ~9.5%
Human F1 | ~91.2 | ~69.4
Answer distribution | Uniform | Front-heavy (inverted pyramid)

The lower human F1 on NewsQA (69.4 vs. 91.2) reflects genuine ambiguity in news writing: multiple valid interpretations, partial answers, and questions that touch on information only implied rather than stated in the article.
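The F1 figures here are token-overlap scores between a predicted span and a reference span. The sketch below follows the SQuAD-style convention (lowercasing, stripping punctuation and English articles, then computing precision and recall over shared tokens); that NewsQA results were scored with exactly this normalization is an assumption, and the example strings are made up.

```python
import re
import string
from collections import Counter
from typing import List, Sequence


def normalize(text: str) -> List[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two answer strings."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # An empty string stands for "no answer"; it only matches another empty string.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, references: Sequence[str]) -> float:
    """Score against several reference answers and keep the best match."""
    return max(token_f1(prediction, r) for r in references)


print(round(token_f1("the budget deficit", "budget deficit concerns"), 2))  # 0.8
```

Because partial overlap earns partial credit, two annotators who pick slightly different but equally valid spans still score below 1.0 against each other, which is one reason span-level disagreement lowers human F1.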

Model Performance

Model | NewsQA F1
--- | ---
LSTM baseline | 50.1
BERT-base | 65.9
RoBERTa-large | 74.2
Human | 69.4

RoBERTa-large surpasses the human baseline in F1, but human annotators give more consistent and semantically valid answers at the level of individual questions. The model's F1 advantage reflects answer span selection patterns rather than superior comprehension.

Information-Seeking QA and Downstream Applications

NewsQA's information-seeking design mirrors real-world applications:

News Search and Retrieval: Users searching for information about an event have seen headlines and want specific details — exactly the information gap that NewsQA questions model.

Automated Journalism: Systems that generate news summaries or answer questions about breaking events need the comprehension skills NewsQA tests.

Fact-Checking: Verifying claims against news articles requires reading journalism-style text and extracting specific factual claims.

Enterprise Knowledge Management: Internal news feeds and corporate communications require the same information-seeking QA pattern — employees who have seen an executive summary want details from the underlying report.

Legacy and Influence

NewsQA contributed to the understanding that:

NewsQA is the news reading comprehension benchmark built around genuine curiosity — constructed so that questions reflect what a reader actually wants to know after seeing a headline, producing a harder and more realistic reading comprehension challenge than passage-scanning exercises.
