Property inference

Property inference is a privacy attack against machine learning models in which an adversary determines aggregate statistical properties of the training dataset by analyzing model parameters, outputs, or behavior patterns. Such properties include the proportion of training examples with a particular attribute, the presence of a demographic subgroup, or the distribution of sensitive characteristics. It is a privacy threat distinct from membership inference, which targets individual records, because it can reveal population-level secrets even when individual privacy is protected.

Distinction from Other Privacy Attacks

| Attack Type | Target | What Is Recovered | Example |
| --- | --- | --- | --- |
| Membership inference | Individual records | Was this specific person in the training set? | Determining if patient X's record was used |
| Model inversion | Input reconstruction | What did the training inputs look like? | Reconstructing faces from a face recognition model |
| Property inference | Dataset statistics | What fraction of training data has property P? | Inferring the percentage of female patients in the training set |
| Training data extraction | Memorized content | Exact verbatim training examples | Extracting memorized text from language models |

Property inference is particularly insidious because it can succeed even when the model is trained with differential privacy (which protects individuals, not population statistics), individual membership cannot be determined, and the model appears to behave normally on all evaluation inputs.
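The point about differential privacy can be illustrated in the simpler setting of releasing a query answer, rather than training a model. The sketch below is a toy under stated assumptions: `laplace_noise`, `dp_count`, the synthetic `records`, and the choice of epsilon = 0.5 are all hypothetical and not taken from any library. It applies the Laplace mechanism to a counting query (sensitivity 1) and shows that the noisy count still gives a usable estimate of the population fraction.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with rate 1/scale
    # is Laplace-distributed with the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(records, predicate, epsilon, rng):
    # Laplace mechanism for a counting query: adding or removing one
    # record changes the count by at most 1, so the scale is 1 / epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)
n = 10_000
# Hypothetical dataset in which roughly 70% of records have the attribute.
records = [{"female": rng.random() < 0.7} for _ in range(n)]

noisy_count = dp_count(records, lambda r: r["female"], epsilon=0.5, rng=rng)
estimated_fraction = noisy_count / n
print(f"estimated fraction: {estimated_fraction:.3f}")
```

The noise is sized to mask one individual's contribution, so on a dataset of this size it barely moves the estimated fraction.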

Attack Methodology

Property inference attacks typically follow one of two approaches:

Meta-classifier attack (Ganju et al., 2018): The adversary trains a meta-model on shadow models to predict the property from model parameters or activations.

Step 1: Train a large number of "shadow" models on datasets with known property prevalence (50% female, 30% female, 70% female, etc.).
Step 2: Extract features from each shadow model (weight statistics, activation patterns, gradient signatures).
Step 3: Train a meta-classifier mapping model features → property value.
Step 4: Apply the meta-classifier to the target model to infer its training set property.
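The four steps can be sketched end to end on synthetic data. This is a deliberately small toy, not the architecture used by Ganju et al.: the "model" is just a pair of per-class feature means, the meta-classifier is a nearest-centroid rule, and every name (`sample_dataset`, `train_model`, `fit_meta_classifier`, `infer_property`) and every constant is an assumption made for illustration.

```python
import random
import statistics

def sample_dataset(frac_b, n, rng):
    """Synthetic (x, y) records; a fraction frac_b comes from subgroup B."""
    data = []
    for _ in range(n):
        in_b = rng.random() < frac_b
        x = rng.gauss(2.0 if in_b else 0.0, 1.0)        # subgroup B shifts the feature
        y = 1 if x + rng.gauss(0.0, 1.0) > 1.0 else 0   # noisy label
        data.append((x, y))
    return data

def train_model(data):
    """Toy model whose 'parameters' are the per-class feature means."""
    mean0 = statistics.fmean([x for x, y in data if y == 0])
    mean1 = statistics.fmean([x for x, y in data if y == 1])
    return (mean0, mean1)

def fit_meta_classifier(shadow_params_by_property):
    """Steps 2-3: one centroid of shadow-model parameters per property value."""
    return {
        prop: tuple(statistics.fmean(col) for col in zip(*params))
        for prop, params in shadow_params_by_property.items()
    }

def infer_property(target_params, centroids):
    """Step 4: pick the property value whose centroid is closest to the target."""
    def sq_dist(prop):
        return sum((a - b) ** 2 for a, b in zip(target_params, centroids[prop]))
    return min(centroids, key=sq_dist)

rng = random.Random(0)

# Step 1: shadow models trained on data with known subgroup prevalence.
shadow_params = {
    frac: [train_model(sample_dataset(frac, 2000, rng)) for _ in range(20)]
    for frac in (0.3, 0.7)
}
meta = fit_meta_classifier(shadow_params)

# The target model's training prevalence (0.7 here) is hidden from the attacker.
target_params = train_model(sample_dataset(0.7, 2000, rng))
guess = infer_property(target_params, meta)
print("inferred subgroup fraction:", guess)
```

In this toy the subgroup shifts the feature distribution, so the learned class means drift with prevalence, and that drift is the signal the meta-classifier picks up.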

Behavioral probing: The adversary designs probe inputs that elicit different model behaviors depending on training set composition.
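As a hedged illustration of probing in a black-box setting, the sketch below gives the attacker only a `predict` function and no access to parameters. The synthetic data, the probe grid `PROBES`, and the helper names (`train_black_box`, `probe_signature`) are assumptions for this example. The attacker compares the target's responses on the probes with the average responses of reference models trained at known prevalences.

```python
import random
import statistics

def sample_dataset(frac_b, n, rng):
    """Synthetic (x, y) records; a fraction frac_b comes from subgroup B."""
    data = []
    for _ in range(n):
        in_b = rng.random() < frac_b
        x = rng.gauss(2.0 if in_b else 0.0, 1.0)
        y = 1 if x + rng.gauss(0.0, 1.0) > 1.0 else 0
        data.append((x, y))
    return data

def train_black_box(data):
    """Train a nearest-centroid classifier and expose only its predict function."""
    mean0 = statistics.fmean([x for x, y in data if y == 0])
    mean1 = statistics.fmean([x for x, y in data if y == 1])

    def predict(x):
        return 1 if abs(x - mean1) < abs(x - mean0) else 0

    return predict

# Probe inputs concentrated where the decision boundary is expected to move.
PROBES = [i * 0.05 for i in range(41)]  # 0.0, 0.05, ..., 2.0

def probe_signature(predict):
    """Fraction of probe inputs that the model labels positive."""
    return statistics.fmean([predict(x) for x in PROBES])

rng = random.Random(0)

# Reference signatures from models trained at known prevalences.
reference = {
    frac: statistics.fmean(
        [probe_signature(train_black_box(sample_dataset(frac, 2000, rng)))
         for _ in range(10)]
    )
    for frac in (0.3, 0.7)
}

# The target's training prevalence (0.3 here) is hidden from the attacker.
target_predict = train_black_box(sample_dataset(0.3, 2000, rng))
signature = probe_signature(target_predict)
guess = min(reference, key=lambda frac: abs(reference[frac] - signature))
print("inferred subgroup fraction:", guess)
```

The probes sit near the decision boundary because that is where the toy model's answers change with training set composition; inputs far from the boundary would return the same label under both prevalences.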

Properties That Can Be Inferred

Research has demonstrated inference of:

Defenses

| Defense | Mechanism | Limitation |
| --- | --- | --- |
| Differential privacy | Add calibrated noise to gradients | Protects individuals but not aggregate properties by design |
| Representation scrubbing | Remove property-correlated features from representations | May degrade utility on legitimate tasks |
| Output perturbation | Add noise to API outputs | Reduces attack accuracy but degrades utility |
| Model weight encryption | Prevent direct weight access | Does not prevent behavioral probing |
| Access control and rate limiting | Limit query volume | Slows attack, does not prevent it |
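A minimal sketch of output perturbation, assuming the API returns a single confidence score in [0, 1]. The function name `perturb_output` and the noise level `noise_std` are hypothetical choices for illustration, not a recommended configuration.

```python
import random

def perturb_output(prob, noise_std, rng):
    """Add Gaussian noise to a confidence score and clip it back into [0, 1]."""
    noisy = prob + rng.gauss(0.0, noise_std)
    return min(1.0, max(0.0, noisy))

rng = random.Random(0)
true_prob = 0.9  # score the model would return without the defense

# Each query sees a different, slightly distorted score.
responses = [perturb_output(true_prob, 0.1, rng) for _ in range(5)]
print([round(p, 3) for p in responses])
```

A larger `noise_std` blurs the signal a probing adversary relies on, but it equally distorts the scores that legitimate clients receive, which is the utility trade-off listed in the table.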

Significance for Regulated Industries

In healthcare, financial services, and government:

Property inference represents a fundamental tension in ML privacy: differential privacy provides strong individual-level protection but by design allows aggregate statistics to be learned, and that is exactly what property inference exploits.
