Flask

Flask is a minimalist Python web framework that provides routing, request handling, and Jinja2 templating without imposing architectural decisions. Its simplicity and flexibility made it historically the dominant framework for serving ML models as HTTP APIs, though FastAPI has largely superseded it for new ML projects that need higher performance and automatic API documentation.
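As a rough illustration of the three pieces named above (routing, request handling, Jinja2 templating), here is a minimal sketch. The `/hello` route, the `name` query parameter, and `GREETING_TEMPLATE` are made up for this example; a real project would normally keep templates in a `templates/` directory rather than inline.

```python
from flask import Flask, request, render_template_string

app = Flask(__name__)

# Inline Jinja2 template (illustrative only; usually stored under templates/).
GREETING_TEMPLATE = "<h1>Hello, {{ name }}!</h1>"


@app.route("/hello")  # routing: map a URL path to a view function
def hello():
    # Request handling: read an optional query parameter with a default.
    name = request.args.get("name", "world")
    # Templating: the value is autoescaped when the string template renders.
    return render_template_string(GREETING_TEMPLATE, name=name)


if __name__ == "__main__":
    # Exercise the route in-process with Flask's built-in test client.
    client = app.test_client()
    print(client.get("/hello?name=Flask").get_data(as_text=True))
```

Nothing here dictates a project layout, database layer, or validation scheme, which is the "no architectural decisions" point in practice.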

What Is Flask?

Why Flask Matters for AI/ML

Core Flask Patterns

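Basic ML Serving Endpoint:

    from flask import Flask, request, jsonify
    import torch

    app = Flask(__name__)

    # Load the model once at import time so every request reuses it.
    # weights_only=False unpickles a full model object (the default became
    # True in PyTorch 2.6); only do this for files you trust.
    model = torch.load("model.pt", weights_only=False).eval()

    @app.route("/predict", methods=["POST"])
    def predict():
        # silent=True returns None for a missing or malformed JSON body
        data = request.get_json(silent=True)
        if not isinstance(data, dict) or "text" not in data:
            return jsonify({"error": "text field required"}), 400

        # Assumes the saved model accepts raw text and returns a one-element tensor
        with torch.no_grad():
            output = model(data["text"])

        return jsonify({"prediction": output.item(), "text": data["text"]})

    if __name__ == "__main__":
        # Development server only
        app.run(host="0.0.0.0", port=8000)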
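One way to check an endpoint like the one above without starting a server or loading a real model is Flask's test client. The sketch below swaps in a hypothetical `stub_model` function (it scores text by its length) in place of the PyTorch model; the route and error message mirror the example above, and everything else is an assumption made for illustration.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)


def stub_model(text: str) -> float:
    """Hypothetical stand-in for a real model: scores text by its length."""
    return float(len(text))


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True yields None instead of an automatic error response on bad JSON.
    data = request.get_json(silent=True)
    if not isinstance(data, dict) or "text" not in data:
        return jsonify({"error": "text field required"}), 400
    return jsonify({"prediction": stub_model(data["text"]), "text": data["text"]})


def demo():
    """Send one valid and one invalid request through the in-process test client."""
    client = app.test_client()
    ok = client.post("/predict", json={"text": "hello"})
    bad = client.post("/predict", json={"wrong_key": 1})
    return ok.status_code, ok.get_json(), bad.status_code, bad.get_json()


if __name__ == "__main__":
    print(demo())
```

Keeping the model behind a plain function makes it straightforward to replace the stub with real inference later while the request validation stays testable on its own.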

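Production Deployment (Gunicorn):

    gunicorn --workers 4 --bind 0.0.0.0:8000 app:app

With 4 workers, Gunicorn runs 4 parallel model inference processes. Each worker process loads its own copy of the model, so memory use grows with the worker count.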
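Because each worker is a separate process holding the model, the worker count is often limited by memory as well as by CPU cores. The sketch below expresses the same settings as a Gunicorn config file, which is ordinary Python. The `max_workers` helper and the 16 GB / 3 GB figures are assumptions for illustration, not recommendations; `workers`, `bind`, and `timeout` are standard Gunicorn settings.

```python
# gunicorn.conf.py: load with `gunicorn -c gunicorn.conf.py app:app`
import os


def max_workers(cpu_count: int, available_mem_gb: float, model_mem_gb: float) -> int:
    """Cap the worker count by CPU cores and by how many model copies fit in RAM."""
    by_memory = int(available_mem_gb // model_mem_gb)
    return max(1, min(cpu_count, by_memory))


# Illustrative numbers (assumptions): 16 GB usable RAM, 3 GB per loaded model.
workers = max_workers(os.cpu_count() or 1, available_mem_gb=16, model_mem_gb=3)
bind = "0.0.0.0:8000"
timeout = 120  # seconds; raised from Gunicorn's default of 30 for slow inference
```

Gunicorn only reads the names it recognizes as settings from this file, so the helper function can sit alongside them.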

Flask Extension Ecosystem:

When to Use Flask vs FastAPI

Use Flask when:

Use FastAPI when:

Flask is the foundational Python web framework that made ML model serving accessible. Although FastAPI has surpassed it for new development, Flask's simplicity, extensive documentation, and massive deployment footprint keep it relevant for ML practitioners who need a simple HTTP wrapper around a model with minimal infrastructure complexity.
