WebSockets is a full-duplex communication protocol that runs over a single persistent TCP connection and lets servers push data to clients without polling. It provides the real-time, bidirectional foundation for live chat applications, multiplayer games, collaborative editors, and streaming AI interfaces, where low-latency server-to-client delivery is essential.
What Are WebSockets?
- Definition: A communication protocol (standardized in RFC 6455) that upgrades an HTTP/1.1 connection into a persistent, full-duplex channel over the same TCP connection. Once the WebSocket handshake completes, both client and server can send messages to each other at any time without the overhead of establishing new connections.
- Protocol Upgrade: WebSocket connections begin as HTTP GET requests with special headers (Upgrade: websocket, Connection: Upgrade, Sec-WebSocket-Key, Sec-WebSocket-Version: 13). The server responds with 101 Switching Protocols and a Sec-WebSocket-Accept header, after which the connection becomes a WebSocket channel.
- Full-Duplex: Unlike HTTP (client initiates every request), WebSocket allows simultaneous two-way communication — server can push data while client is also sending data, on the same connection.
- Persistent Connection: After establishment, the connection stays open until explicitly closed — eliminating the latency of TCP handshake and HTTP overhead for each message exchange.
- Framing: WebSocket messages are sent as frames. Data frames carry either UTF-8 text (often JSON) or binary payloads, while control frames (ping, pong, close) handle heartbeats and connection management.
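To make the handshake and framing bullets concrete, here is a minimal sketch using only the Python standard library. It computes the Sec-WebSocket-Accept value a server returns for a client's Sec-WebSocket-Key, and encodes a short unmasked text frame as a server would send it. The function names are illustrative, and the frame encoder assumes payloads under 126 bytes; real libraries also handle longer payloads, masking of client-to-server frames, and fragmentation.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def encode_text_frame(text: str) -> bytes:
    """Encode one unmasked text frame (server-to-client, short payloads only)."""
    payload = text.encode("utf-8")
    if len(payload) >= 126:
        raise ValueError("this sketch only handles payloads under 126 bytes")
    first_byte = 0x80 | 0x1  # FIN bit set, opcode 0x1 = text
    return bytes([first_byte, len(payload)]) + payload


if __name__ == "__main__":
    # Sample key taken from the RFC 6455 handshake example
    print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
    print(encode_text_frame("Hello").hex())
```

The frame is just a two-byte header followed by the payload, which is why per-message overhead is far smaller than a full HTTP request.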
Why WebSockets Matters for AI/ML
- Voice AI Applications: Real-time voice assistants (OpenAI Realtime API, ElevenLabs, AssemblyAI streaming) use WebSockets for bidirectional audio streaming — client streams microphone audio to the server, server streams back synthesized speech, simultaneously, for low-latency conversation.
- Live Training Dashboards: ML training dashboards showing live loss curves, GPU utilization, and gradient norms use WebSockets — server pushes metric updates as they occur rather than clients polling every second.
- Collaborative AI Tools: Multi-user AI annotation or code review tools use WebSockets — when one user adds a label, all collaborators see the update instantly via server-pushed messages.
- AI Agent Streaming: Complex AI agent workflows with multiple tool calls and reasoning steps stream progress via WebSocket — users see the agent's thinking and intermediate results as they happen.
- Multiplayer Game AI: Game AI opponents and NPCs in multiplayer games communicate state via WebSockets, where latency below roughly 100 ms is needed for responsive game feel.
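The dashboard and collaboration cases above share one pattern: the server keeps a set of open connections and fans each update out to all of them. The sketch below shows that pattern under stated assumptions. MetricsHub and FakeConnection are hypothetical names, and a connection is assumed to be any object with an async send_text method (as FastAPI/Starlette WebSocket objects have). The in-memory fake stands in for real sockets so the example runs without a server.

```python
import asyncio


class MetricsHub:
    """Fan out each message to every registered connection."""

    def __init__(self) -> None:
        self._connections = set()

    def register(self, connection) -> None:
        self._connections.add(connection)

    def unregister(self, connection) -> None:
        self._connections.discard(connection)

    async def broadcast(self, message: str) -> int:
        """Send message to all connections and return how many succeeded."""
        delivered = 0
        for connection in list(self._connections):
            try:
                await connection.send_text(message)
                delivered += 1
            except Exception:
                # Treat any send failure as a dead connection and drop it
                self.unregister(connection)
        return delivered


class FakeConnection:
    """In-memory stand-in for a WebSocket, used only for this demo."""

    def __init__(self) -> None:
        self.received = []

    async def send_text(self, message: str) -> None:
        self.received.append(message)


async def demo():
    hub = MetricsHub()
    a, b = FakeConnection(), FakeConnection()
    hub.register(a)
    hub.register(b)
    await hub.broadcast('{"step": 100, "loss": 0.42}')
    return a.received, b.received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real server, each WebSocket handler would register its connection after accepting it and unregister it on disconnect.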
WebSocket vs SSE vs Polling
| Pattern | Direction | Latency | Complexity | Best For |
|---------|-----------|---------|------------|---------|
| WebSocket | Bidirectional | Lowest | Medium | Voice AI, games, collaboration |
| SSE | Server→Client only | Low | Low | LLM token streaming, dashboards |
| Long Polling | Server→Client (client-initiated, request held open) | Medium | Low | Simple notifications |
| Polling | Server→Client (client-initiated, fixed interval) | Highest | Lowest | Non-realtime updates |
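One reason the table rates SSE as lower complexity than WebSocket is that its wire format is plain text sent over an ordinary HTTP response. As a rough illustration, the hypothetical helper below formats one event in the text/event-stream format: an optional event: line, one data: line per line of payload, and a blank line to end the event.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Format one Server-Sent Event as text/event-stream."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads become one data: line per line
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(repr(format_sse("Hello", event="token")))
```

There is no equivalent client-to-server channel in SSE, which is why the table reserves WebSocket for bidirectional workloads.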
Python WebSocket Server (FastAPI):
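from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

async def process_voice(audio_data: bytes):
    # Hypothetical placeholder: a real implementation would transcribe the
    # audio and stream model output token by token.
    yield f"[received {len(audio_data)} bytes]"

@app.websocket("/ws/voice-chat")
async def voice_chat(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            # Receive one audio chunk from the client
            audio_data = await websocket.receive_bytes()
            # Stream the transcription and response back as text frames
            async for token in process_voice(audio_data):
                await websocket.send_text(token)
    except WebSocketDisconnect:
        # Client closed the connection; nothing to clean up in this sketch
        pass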
Python WebSocket Client:
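import asyncio
import websockets

# Placeholder input: 100 ms of 16 kHz, 16-bit mono silence
audio_bytes = b"\x00" * 3200

async def stream_voice():
    async with websockets.connect("ws://server/ws/voice-chat") as ws:
        await ws.send(audio_bytes)  # bytes go out as a binary frame
        # Print tokens as they arrive; the loop ends when the server closes
        async for message in ws:
            print(message, end="", flush=True)

asyncio.run(stream_voice())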
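The client above sends once and then only listens. To use the connection in a truly full-duplex way, sending and receiving can run as two concurrent tasks on the same socket. The sketch below assumes a connection object with an async send method and async iteration (the interface the websockets library exposes); run_duplex and FakeEchoSocket are hypothetical names, and the fake socket exists only so the example runs without a server.

```python
import asyncio


async def run_duplex(ws, chunks):
    """Send chunks while concurrently collecting incoming messages."""

    async def sender():
        for chunk in chunks:
            await ws.send(chunk)

    async def receiver():
        messages = []
        async for message in ws:
            messages.append(message)
        return messages

    # Both coroutines share the one connection and run concurrently
    _, messages = await asyncio.gather(sender(), receiver())
    return messages


class FakeEchoSocket:
    """In-memory stand-in: acknowledges each chunk, closes after `limit` chunks."""

    def __init__(self, limit: int) -> None:
        self._limit = limit
        self._seen = 0
        self._inbox = asyncio.Queue()

    async def send(self, chunk: bytes) -> None:
        self._seen += 1
        await self._inbox.put(f"ack {len(chunk)}")
        if self._seen == self._limit:
            await self._inbox.put(None)  # sentinel: simulated server close

    def __aiter__(self):
        return self

    async def __anext__(self):
        item = await self._inbox.get()
        if item is None:
            raise StopAsyncIteration
        return item


async def demo():
    chunks = [b"\x00" * 320, b"\x00" * 160]
    return await run_duplex(FakeEchoSocket(limit=len(chunks)), chunks)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a real connection, the same run_duplex call could be made inside the async with websockets.connect(...) block, passing microphone chunks as they are captured.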
OpenAI Realtime API (WebSocket-based):
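import json
import websockets

async def realtime_session(api_key: str, b64_audio: str):
    # websockets >= 14 names this argument additional_headers;
    # older releases call it extra_headers
    async with websockets.connect(
        "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview",
        additional_headers={
            "Authorization": f"Bearer {api_key}",
            # Expected by the beta interface; verify against current docs
            "OpenAI-Beta": "realtime=v1",
        },
    ) as ws:
        # Send base64-encoded audio
        await ws.send(json.dumps({"type": "input_audio_buffer.append", "audio": b64_audio}))
        # Receive streaming response events
        async for msg in ws:
            event = json.loads(msg)
            if event["type"] == "response.audio.delta":
                # The delta is base64-encoded audio; play_audio is a
                # hypothetical playback helper that you must supply
                play_audio(event["delta"])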
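The snippet above takes b64_audio as given. As a rough sketch of where it could come from, the helper below splits raw audio bytes into chunks and wraps each in an input_audio_buffer.append event with a base64-encoded payload. The helper name and the default chunk size are assumptions for illustration, and the audio format the API expects should be checked against its documentation.

```python
import base64
import json


def audio_append_events(pcm: bytes, chunk_size: int = 4800):
    """Yield one JSON event string per chunk of raw audio bytes."""
    for start in range(0, len(pcm), chunk_size):
        chunk = pcm[start:start + chunk_size]
        yield json.dumps({
            "type": "input_audio_buffer.append",
            # JSON cannot carry raw bytes, so the audio is base64-encoded
            "audio": base64.b64encode(chunk).decode("ascii"),
        })


if __name__ == "__main__":
    for event in audio_append_events(b"\x01\x02\x03\x04\x05", chunk_size=2):
        print(event)
```

Each yielded string could be passed directly to ws.send inside an open session.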
WebSockets is the real-time communication protocol that lets AI applications move beyond request-response into continuous, low-latency interaction. By maintaining a persistent full-duplex connection, it supports the bidirectional streaming required for voice AI, live training monitoring, and collaborative AI tools, where sub-second latency and server-initiated communication are essential.