Definite Description Resolution

Keywords: definite description resolution, nlp

Definite Description Resolution is the discourse interpretation task of determining what a definite noun phrase (a noun phrase beginning with "the") refers to — distinguishing whether "the X" points back to a previously mentioned X in the current discourse (anaphoric use), refers to a unique object assumed to exist in the shared world (unique existential use), or bridges to an entity related to a prior antecedent (associative use).

The Philosophical Foundation

Definite descriptions — noun phrases of the form "the N" — have been central to philosophy of language since Bertrand Russell's 1905 analysis "On Denoting." Russell proposed that "The king of France is bald" asserts: (1) there exists exactly one king of France, and (2) that entity is bald. When there is no king of France, the sentence is false rather than meaningless (contra Frege, who considered it a reference failure).

For computational linguistics, Russell's analysis highlights the key challenge: "the" signals that the referent is identifiable to the reader, but the mechanism of identification differs fundamentally across contexts.

Three Principal Uses of Definite Descriptions

Anaphoric Use (Discourse-Referential):
The referent was explicitly introduced earlier in the discourse.
"A woman entered the room. The woman was carrying a briefcase."
"The woman" refers back to the previously mentioned woman. Resolution is discourse-internal: search the current discourse model for a matching entity. Standard coreference resolvers handle this case.

Unique Existential Use (World-Referential):
The entity is unique in the world (or uniquely identifiable from shared background knowledge) without prior mention in discourse.
"The sun rose at 6:43 a.m." — No prior mention of the sun needed; it is unique in the shared world model.
"The President called an emergency session." — Identifiable from shared political knowledge of who holds the presidency.
These references invoke world knowledge rather than discourse memory.

Associative / Bridging Use:
The entity is identifiable through its relationship to a previously mentioned entity.
"We checked into the hotel. The elevator was broken." — No prior mention of an elevator; it bridges from "the hotel" via world knowledge that hotels contain elevators.
This case blurs the boundary between anaphoric and existential uses and requires commonsense inference.

Discourse Status Theory

Linguist Ellen Prince's Familiarity Scale (1981) provides a theoretical framework:
- Brand-New: New entity being introduced ("a man").
- Unused: Unique world entity not yet mentioned ("the sun," "the president").
- Inferrable (Bridging): Entity inferable from context ("the elevator" from "the hotel").
- Textually Evoked: Previously mentioned entity being resumed ("the man" after "a man").

Definite description resolution assigns each definite NP to a position on this scale, then applies the appropriate resolution strategy.

Why Computational Resolution Is Hard

Anaphoric vs. Existential Ambiguity: "The dog barked" — anaphoric (referring to a dog previously mentioned) or existential (referring to the speaker's known dog)? The distinction requires modeling the discourse state, world knowledge, and pragmatic context simultaneously.

Bridging Identification: Distinguishing bridging uses from new entity introductions ("the elevator" in a hotel context vs. "the elevator" in a context with no previously mentioned building) requires assessing whether a plausible bridging relation exists.

Definite Plural Complexity: "The doctors" — does this refer to all doctors in the world (generic), a specific previously mentioned group, or a group inferrable from context ("the doctors" present at a hospital mentioned earlier)?

Reference Failure and Presupposition: "The present king of France is bald" — the definite description presupposes the existence of a unique referent. When the presupposition fails, standard resolution strategies break down. Models must handle presupposition failure gracefully.

Resolution Approaches

Rule-Based Salience Hierarchy: Classic Centering Theory (Grosz, Joshi, and Weinstein, 1995) defines a salience hierarchy for discourse entities: subjects > objects > other arguments. "The X" preferentially resolves to the highest-salience entity matching type X in the current utterance's backward-looking center.

Neural Mention-Ranking: Modern coreference models (SpanBERT-based) score each candidate antecedent for compatibility with the definite description using learned representations. Fine-tuned on OntoNotes for anaphoric uses; extended with knowledge-enhanced models for bridging.

World Knowledge Integration: For unique existential uses, a model must recognize that certain definite descriptions invoke world knowledge rather than discourse search. Named entity recognition, knowledge graph lookup, and entity salience models jointly identify world-referential descriptions.

Presupposition Filtering: Pragmatic inference to detect when a definite description's presupposition fails — when no unique referent exists — enabling the model to flag reference failure rather than confabulate a referent.

Connection to NLP Downstream Tasks

- Coreference Resolution: Definite description resolution is a sub-problem of full coreference: resolving "the CEO" to "Satya Nadella" mentioned two paragraphs earlier.
- Reading Comprehension: Answering "What did the pilot do?" requires resolving "the pilot" to the specific individual from the passage.
- Summarization: Using definite descriptions in summaries without establishing their referents creates unresolved references for readers who have not read the source.
- Fact Extraction: "The agreement was signed in 2022" — which agreement? Only a resolved referent enables accurate fact storage.
- Dialogue Systems: "What about the price?" in a shopping dialogue requires resolving "the price" to the item currently under discussion.

Definite Description Resolution is interpreting "The" — determining whether a definite noun phrase looks backward into the discourse, outward into shared world knowledge, or bridges conceptually from a related antecedent, requiring the integration of discourse memory, world knowledge, and pragmatic inference that distinguishes robust language understanding from surface pattern matching.

Want to learn more?

Search 13,225+ semiconductor and AI topics or chat with our AI assistant.

Search Topics Chat with CFSGPT