
Apache Avro is a row-based binary serialization format that keeps the schema with the data: embedded in the header of Avro container files, or referenced through a schema registry for individual messages. It is widely used as a data exchange format for Apache Kafka and other streaming pipelines. Avro provides compact binary encoding, schema evolution (fields that carry default values can be added or removed without breaking existing consumers), and integration with a Schema Registry so that producers and consumers agree on data structure.

What Is Apache Avro?

Why Avro Matters for AI/ML

Core Avro Concepts

Schema Definition (JSON format):

    {
      "type": "record",
      "name": "UserEvent",
      "namespace": "com.company.events",
      "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "event_type", "type": "string"},
        {"name": "timestamp", "type": {"type": "long", "logicalType": "timestamp-millis"}},
        {"name": "session_id", "type": ["null", "string"], "default": null}
      ]
    }
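To make the compact binary encoding concrete, here is a minimal sketch that hand-encodes one UserEvent record using the binary rules from the Avro specification: zigzag variable-length integers for long, length-prefixed UTF-8 for string, a branch index for unions, and fields written in schema order with no field names. The helper names (encode_long, encode_string, encode_user_event) and the sample values are illustrative assumptions; in practice an Avro library does this work.

```python
import json


def encode_long(n: int) -> bytes:
    """Encode a 64-bit integer as a zigzag varint (Avro int/long encoding)."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small negative numbers become small positives
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as a long byte-length followed by UTF-8 data."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_user_event(event: dict) -> bytes:
    """Hypothetical encoder for the UserEvent schema; fields go in schema order."""
    out = encode_string(event["user_id"])
    out += encode_string(event["event_type"])
    out += encode_long(event["timestamp"])  # timestamp-millis is stored as a long
    session_id = event.get("session_id")
    if session_id is None:
        out += encode_long(0)  # union branch 0: "null", which has no body
    else:
        out += encode_long(1) + encode_string(session_id)  # union branch 1: "string"
    return out


# Sample values are made up for illustration.
event = {
    "user_id": "u1",
    "event_type": "click",
    "timestamp": 1700000000000,
    "session_id": None,
}
payload = encode_user_event(event)
json_size = len(json.dumps(event).encode("utf-8"))
print(f"Avro body: {len(payload)} bytes, JSON: {json_size} bytes")
```

Because field names and types live in the schema rather than in each record, the binary body is considerably smaller than the equivalent JSON text.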

Schema Evolution Rules:
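The core of Avro's schema resolution for records can be sketched in a few lines: a field that the reader schema declares but the written data lacks is filled from the reader's default, a field that the writer included but the reader does not declare is ignored, and a missing field with no default is an error. The function below is a simplified illustration that operates on already-decoded dictionaries; resolve_record is a hypothetical name, and type promotion, aliases, and nested records are deliberately left out.

```python
def resolve_record(writer_record: dict, reader_schema: dict) -> dict:
    """Project a decoded record onto the reader schema (simplified sketch)."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_record:
            # Field exists in the written data: take it as-is.
            resolved[name] = writer_record[name]
        elif "default" in field:
            # Field was added after the data was written: use the reader default.
            resolved[name] = field["default"]
        else:
            raise ValueError(
                f"reader field {name!r} is missing from the data and has no default"
            )
    # Writer fields not named in the reader schema are dropped by omission.
    return resolved


reader_schema = {
    "type": "record",
    "name": "UserEvent",
    "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "event_type", "type": "string"},
        {"name": "session_id", "type": ["null", "string"], "default": None},
    ],
}

# Written by an older producer (no session_id) that also had a since-removed field.
old_record = {"user_id": "u1", "event_type": "click", "legacy_flag": True}
print(resolve_record(old_record, reader_schema))
```

This is why new fields are normally added with a default: without one, a reader on the new schema cannot consume data written with the old one.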

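Avro with Confluent Schema Registry:

    # AvroConsumer is the legacy confluent-kafka API; newer releases
    # recommend DeserializingConsumer with AvroDeserializer instead.
    from confluent_kafka.avro import AvroConsumer

    consumer = AvroConsumer({
        "bootstrap.servers": "kafka:9092",
        "schema.registry.url": "http://schema-registry:8081",
        "group.id": "ml-feature-pipeline",
    })
    consumer.subscribe(["user-events"])

    msg = consumer.poll(1.0)  # Returns None if nothing arrives within 1 second
    if msg is not None and msg.error() is None:
        record = msg.value()  # Auto-deserialized using registered schema
    consumer.close()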
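Behind the automatic deserialization, Confluent's Avro serializers frame each Kafka message value as a magic byte (0), a 4-byte big-endian schema ID, and then the Avro binary body; the consumer reads the ID and fetches the matching writer schema from the registry. The sketch below shows only that framing step. The function names frame and unframe, the schema ID 42, and the payload bytes are assumptions made for illustration, not part of any library API.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # Confluent wire-format marker
HEADER = struct.Struct(">bI")  # 1-byte magic + 4-byte big-endian schema ID


def frame(schema_id: int, avro_body: bytes) -> bytes:
    """Prefix an Avro-encoded body with the Confluent wire-format header."""
    return HEADER.pack(MAGIC_BYTE, schema_id) + avro_body


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_body)."""
    if len(message) < HEADER.size:
        raise ValueError("message too short to contain a wire-format header")
    magic, schema_id = HEADER.unpack(message[:HEADER.size])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER.size:]


# Illustrative values only: schema ID 42 and a placeholder body.
message = frame(42, b"\x04u1")
schema_id, body = unframe(message)
print(schema_id, body)
```

Carrying a 5-byte header instead of the full schema is what keeps per-message overhead small while still letting every consumer find the exact schema the producer used.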

Avro vs Other Serialization Formats

Format    Orientation  Schema             Compactness  Streaming  Analytics
Avro      Row          Embedded/Registry  High         Excellent  Poor
Protobuf  Row          .proto files       Very High    Good       Poor
Parquet   Column       Embedded           Very High    Poor       Excellent
JSON      Row          None               Low          Good       Poor
CSV       Row          None               Low          Good       Poor

Apache Avro makes Kafka pipelines more reliable through schema evolution. By combining compact binary encoding with a Schema Registry that enforces compatibility rules as schemas change, Avro helps prevent the "producer updated the schema and broke all consumers" class of data pipeline incidents that is common in JSON-based streaming architectures.
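As a rough illustration of the kind of compatibility rule a registry enforces before accepting a new schema version, the sketch below flags fields that a new record schema adds without a default, since consumers on the new schema could not read older data for those fields. This is a deliberately simplified stand-in, not the registry's actual algorithm: the function name added_fields_without_default is hypothetical, and type changes, aliases, and nested types are not checked.

```python
from typing import List


def added_fields_without_default(old_schema: dict, new_schema: dict) -> List[str]:
    """Return names of fields added in new_schema that lack a default value."""
    old_names = {field["name"] for field in old_schema["fields"]}
    return [
        field["name"]
        for field in new_schema["fields"]
        if field["name"] not in old_names and "default" not in field
    ]


old_schema = {
    "type": "record",
    "name": "UserEvent",
    "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "event_type", "type": "string"},
    ],
}

# Safe change: the new optional field carries a default.
safe_schema = {
    "type": "record",
    "name": "UserEvent",
    "fields": old_schema["fields"]
    + [{"name": "session_id", "type": ["null", "string"], "default": None}],
}

# Risky change: a required field with no default ("country" is a made-up example).
risky_schema = {
    "type": "record",
    "name": "UserEvent",
    "fields": old_schema["fields"] + [{"name": "country", "type": "string"}],
}

print(added_fields_without_default(old_schema, safe_schema))
print(added_fields_without_default(old_schema, risky_schema))
```

A check like this, run at schema registration time rather than at consume time, is what turns a potential production incident into a rejected deployment.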
