REST (Representational State Transfer) is an architectural style for distributed hypermedia systems that uses HTTP methods (GET, POST, PUT, DELETE) and resource URLs to define a uniform interface for client-server communication. It is the dominant API design pattern for public-facing web services, LLM APIs, and cloud services, where human readability, broad client compatibility, and ecosystem tooling matter more than raw performance.
What Is REST?
- Definition: An architectural style (not a protocol or standard) defined by Roy Fielding in his 2000 PhD dissertation — six constraints define REST: client-server separation, statelessness, cacheability, uniform interface, layered system, and optional code-on-demand.
- Resource-Oriented: Everything is a "resource" with a URL identity (/users/123, /models/gpt-4, /conversations/abc) — HTTP methods describe operations on resources (GET=read, POST=create, PUT=replace, PATCH=partial update, DELETE=remove).
- Stateless: Each request must contain all information needed to process it — the server holds no client session state between requests. Auth tokens, query parameters, and request body carry all context.
- JSON Standard: Modern REST APIs use JSON as the payload format — human-readable, widely supported by every programming language, and debuggable via curl or browser developer tools.
- HTTP Semantics: REST leverages HTTP status codes for result communication — 200 OK, 201 Created, 400 Bad Request, 401 Unauthorized, 404 Not Found, 422 Unprocessable Entity, 500 Internal Server Error.
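The constraints above can be illustrated with a toy, framework-free request handler, a sketch only: the resource store, token, and route are all hypothetical. Note how every call is self-contained (auth header, target resource, payload), the server keeps no session between calls, and the return value reuses HTTP status-code semantics.

```python
# Hypothetical in-memory resource store for this sketch.
USERS = {"123": {"id": "123", "name": "Ada"}}

def handle(method, path, headers, body=None):
    """Stateless handler: everything needed arrives with the request;
    nothing is remembered between calls."""
    # Auth travels with every request (stateless constraint).
    if headers.get("Authorization") != "Bearer secret-token":
        return 401, {"error": "missing or invalid token"}   # 401 Unauthorized
    parts = path.strip("/").split("/")
    if parts[0] != "users":
        return 404, {"error": "not found"}                  # 404 Not Found
    if method == "GET" and len(parts) == 2:                 # GET /users/{id}
        user = USERS.get(parts[1])
        return (200, user) if user else (404, {"error": "no such user"})
    if method == "POST" and len(parts) == 1:                # POST /users
        USERS[body["id"]] = body
        return 201, body                                    # 201 Created
    return 400, {"error": "unsupported operation"}          # 400 Bad Request
```

A real service would sit behind a framework such as FastAPI, but the shape of the contract — method plus resource URL in, status code plus JSON out — is the same.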
Why REST Matters for AI/ML
- LLM API Standard: OpenAI, Anthropic, Google, and essentially every other LLM provider expose REST APIs — POST /v1/chat/completions with a JSON body containing messages and parameters is the universal interface for LLM integration.
- Model Serving: FastAPI-based REST endpoints are the most common way to serve ML models — /predict endpoint accepts feature JSON, returns prediction JSON, accessible from any language or client.
- Webhook Callbacks: Async ML jobs (fine-tuning, batch inference) notify completion via REST webhooks — the job server POSTs a result payload to a client-specified callback URL when processing completes.
- Cloud Service Integration: AWS, GCP, and Azure all expose management APIs as REST — provisioning GPU instances, managing model deployments, and querying metrics all happen via HTTP/JSON.
- OpenAI-Compatible APIs: vLLM, Ollama, and LiteLLM implement OpenAI-compatible REST endpoints — any code written against the OpenAI REST API works against self-hosted models with a URL change.
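The portability claim can be sketched concretely: one client function works against any OpenAI-compatible server because only the base URL changes. The URLs, port numbers, and model names below are illustrative assumptions, not verified defaults for any particular deployment.

```python
import requests

def chat(base_url, api_key, model, prompt):
    """POST the same OpenAI-style payload to any compatible server."""
    resp = requests.post(
        f"{base_url}/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Hosted OpenAI:
#   chat("https://api.openai.com/v1", api_key, "gpt-4o", "hi")
# Self-hosted server exposing the same interface (URL/model are assumptions):
#   chat("http://localhost:8000/v1", "unused", "my-local-model", "hi")
```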
Core REST Concepts
Resource Operations:
```
GET    /v1/models                  → List available models
GET    /v1/models/{id}             → Get specific model metadata
POST   /v1/chat/completions        → Create a chat completion
POST   /v1/fine-tuning/jobs        → Create a fine-tuning job
GET    /v1/fine-tuning/jobs/{id}   → Check fine-tuning job status
DELETE /v1/fine-tuning/jobs/{id}   → Cancel a job
```
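A minimal client sketch showing how these operations map onto HTTP verbs; the class, base URL, and Bearer auth scheme are illustrative assumptions modeled on OpenAI-style APIs, not a specific SDK.

```python
import requests

class APIClient:
    """Resource-oriented client sketch: one method per operation,
    each mapping directly to an HTTP verb and a resource URL."""

    def __init__(self, base_url, api_key):
        self.base_url = base_url.rstrip("/")
        self.headers = {"Authorization": f"Bearer {api_key}"}

    def _request(self, method, path, **kwargs):
        resp = requests.request(method, f"{self.base_url}{path}",
                                headers=self.headers, timeout=30, **kwargs)
        resp.raise_for_status()
        return resp.json()

    def list_models(self):                       # GET /v1/models
        return self._request("GET", "/v1/models")

    def create_fine_tune(self, payload):         # POST /v1/fine-tuning/jobs
        return self._request("POST", "/v1/fine-tuning/jobs", json=payload)

    def get_job(self, job_id):                   # GET /v1/fine-tuning/jobs/{id}
        return self._request("GET", f"/v1/fine-tuning/jobs/{job_id}")

    def cancel_job(self, job_id):                # DELETE /v1/fine-tuning/jobs/{id}
        return self._request("DELETE", f"/v1/fine-tuning/jobs/{job_id}")
```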
HTTP Status Code Semantics:
- 200 OK — Request succeeded; response body contains the result
- 201 Created — POST succeeded; a new resource was created
- 400 Bad Request — Client error: invalid parameters or malformed JSON
- 401 Unauthorized — Missing or invalid API key
- 403 Forbidden — Valid key but insufficient permissions
- 404 Not Found — No resource exists at this URL
- 422 Unprocessable Entity — Request syntax valid but semantically incorrect
- 429 Too Many Requests — Rate limit exceeded; check the Retry-After header
- 500 Internal Server Error — Server-side failure, not the client's fault
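These semantics tell a client how to react: 429 and 5xx are retryable, other 4xx responses are client bugs and should fail fast. A hedged sketch of that branching (the backoff policy and retry counts are illustrative choices, not a standard):

```python
import time
import requests

def post_with_retry(url, headers, payload, max_retries=3):
    """Branch on status-code class: retry 429 (honoring Retry-After)
    and 5xx with exponential backoff; surface other 4xx immediately."""
    for attempt in range(max_retries + 1):
        resp = requests.post(url, headers=headers, json=payload, timeout=30)
        if resp.status_code == 429:
            # Server told us we're rate-limited; wait as instructed.
            wait = int(resp.headers.get("Retry-After", 2 ** attempt))
            time.sleep(wait)
            continue
        if 500 <= resp.status_code < 600 and attempt < max_retries:
            time.sleep(2 ** attempt)    # transient server fault: back off
            continue
        resp.raise_for_status()         # raises on any remaining 4xx/5xx
        return resp.json()
    raise RuntimeError("rate-limited: retries exhausted")
```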
Python REST Client (requests):
```python
import os
import requests

api_key = os.environ["OPENAI_API_KEY"]  # never hardcode credentials

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Explain REST APIs"}],
        "temperature": 0.7,
    },
    timeout=30,
)
response.raise_for_status()  # raise an exception on 4xx/5xx responses
result = response.json()
print(result["choices"][0]["message"]["content"])
```
REST vs Alternatives
| Aspect | REST | gRPC | GraphQL |
|--------|------|------|---------|
| Protocol | HTTP/1.1+ | HTTP/2 | HTTP/1.1+ |
| Format | JSON | Protobuf (binary) | JSON |
| Schema | Optional (OpenAPI) | Required (.proto) | Required (SDL) |
| Streaming | SSE/WebSocket | Native | Subscriptions |
| Browser support | Universal | Limited | Universal |
| Learning curve | Low | Medium | Medium |
| Best for | Public APIs, LLM APIs | Internal services | Complex data graphs |
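As the table notes, REST streams via Server-Sent Events rather than a native mechanism. A sketch of consuming an SSE stream from an OpenAI-style endpoint with requests; the `data: ` prefix, `[DONE]` sentinel, and delta layout follow the OpenAI streaming convention and may differ on other servers.

```python
import json
import requests

def stream_chat(url, api_key, payload):
    """Yield text deltas from an OpenAI-style SSE streaming response.
    Each event is a line beginning 'data: '; '[DONE]' ends the stream."""
    with requests.post(url,
                       headers={"Authorization": f"Bearer {api_key}"},
                       json=dict(payload, stream=True),
                       stream=True, timeout=60) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines(decode_unicode=True):
            if not line or not line.startswith("data: "):
                continue                      # skip keep-alives and blanks
            data = line[len("data: "):]
            if data == "[DONE]":
                break
            chunk = json.loads(data)
            delta = chunk["choices"][0]["delta"].get("content", "")
            if delta:
                yield delta
```

This keeps streaming on plain HTTP/1.1, which is why it works through ordinary proxies and browsers where gRPC's HTTP/2 streams often do not.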
REST is the universal interface pattern that makes distributed systems interoperable — by building on ubiquitous HTTP, human-readable JSON, and resource-oriented URLs with well-understood semantics, REST APIs achieve the broadest client compatibility and lowest integration barrier of any API style, which is why every major LLM provider, cloud service, and ML platform exposes REST as its primary interface.