Python is the dominant language for LLM development: its ecosystem covers API access, model serving, vector databases, and application frameworks, and its readability and extensive ML tooling make it the natural foundation for building AI applications.
Why Python for LLMs?
- Ecosystem: Most LLM tools and libraries are Python-first.
- ML Heritage: Built on PyTorch, TensorFlow, scikit-learn.
- API Clients: Official SDKs from OpenAI, Anthropic, etc.
- Rapid Prototyping: Quick iteration from idea to working code.
- Community: Largest AI/ML developer community.
Essential Libraries
API Clients:
```
Library             | Purpose             | Install
--------------------|---------------------|---------------------------------
openai              | OpenAI API          | pip install openai
anthropic           | Claude API          | pip install anthropic
google-generativeai | Gemini API          | pip install google-generativeai
together            | Together.ai API     | pip install together
```
Model & Inference:
```
Library      | Purpose             | Install
-------------|---------------------|------------------------------
transformers | Hugging Face models | pip install transformers
vllm         | Fast LLM serving    | pip install vllm
llama-cpp    | Local inference     | pip install llama-cpp-python
optimum      | Optimized inference | pip install optimum
```
Frameworks & Tools:
```
Library     | Purpose           | Install
------------|-------------------|------------------------
langchain   | LLM orchestration | pip install langchain
llama-index | RAG framework     | pip install llama-index
chromadb    | Vector database   | pip install chromadb
pydantic    | Data validation   | pip install pydantic
```
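A common use of pydantic in LLM apps is validating model output that is supposed to be JSON. The idea can be sketched dependency-free with the standard library; the `title`/`steps` schema here is hypothetical, and pydantic automates exactly this kind of check:

```python
import json

def parse_recipe(raw: str) -> dict:
    """Parse and validate an LLM's JSON reply (hypothetical schema)."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data.get("title"), str):
        raise ValueError("'title' must be a string")
    if not isinstance(data.get("steps"), list):
        raise ValueError("'steps' must be a list")
    return data

recipe = parse_recipe('{"title": "Toast", "steps": ["Slice", "Toast"]}')
```

Validating at the boundary like this turns a model's malformed reply into a clear, catchable error instead of a crash deep in your application.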
Quick Start Examples
OpenAI API:
```python
from openai import OpenAI

client = OpenAI()  # Uses OPENAI_API_KEY env var

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)
```
Claude API:
```python
from anthropic import Anthropic

client = Anthropic()  # Uses ANTHROPIC_API_KEY env var

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
print(message.content[0].text)
```
Streaming Responses:
```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
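When you also need the complete text after streaming finishes, the usual pattern is to accumulate deltas while printing. A minimal sketch of that consume-and-accumulate step, using a plain list of deltas as a stand-in for a live stream:

```python
def consume_stream(deltas):
    """Collect streamed text deltas into the full response string."""
    parts = []
    for delta in deltas:
        if delta:  # skip None/empty deltas, as in the loop above
            parts.append(delta)
    return "".join(parts)

full_text = consume_stream(["Once", " upon", " a", " time"])
```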
Async for High Throughput:
```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def process_batch(prompts):
    tasks = [
        client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": p}]
        )
        for p in prompts
    ]
    return await asyncio.gather(*tasks)

# Run the whole batch concurrently
responses = asyncio.run(process_batch(prompts))
```
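Unbounded `asyncio.gather` over a large batch can trip provider rate limits; a common refinement is capping in-flight requests with a semaphore. A sketch with a stub coroutine standing in for the real API call:

```python
import asyncio

async def fake_llm_call(prompt):
    # Stand-in for client.chat.completions.create(...)
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"

async def process_batch(prompts, limit=5):
    sem = asyncio.Semaphore(limit)

    async def bounded(p):
        async with sem:  # at most `limit` calls in flight at once
            return await fake_llm_call(p)

    # gather preserves the order of the input prompts
    return await asyncio.gather(*(bounded(p) for p in prompts))

responses = asyncio.run(process_batch([f"q{i}" for i in range(10)]))
```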
Best Practices
Environment Variables:
```python
import os
from dotenv import load_dotenv

load_dotenv()  # Load variables from a .env file
api_key = os.environ["OPENAI_API_KEY"]
# Never hardcode keys!
```
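A bare `KeyError` on a missing variable can be cryptic; a small helper (the name `require_env` is my own) fails fast with an actionable message instead:

```python
import os

def require_env(name: str) -> str:
    """Return an environment variable's value or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing required environment variable: {name}. "
            "Set it in your shell or .env file."
        )
    return value

# Usage: api_key = require_env("OPENAI_API_KEY")
```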
Retry Logic:
```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=60)
)
def call_llm_with_retry(prompt):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
```
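If you prefer to avoid a dependency, the same exponential-backoff pattern can be sketched with the standard library; this is a simplified version of what tenacity automates, demonstrated with a stub function that fails twice before succeeding:

```python
import time

def retry_call(fn, attempts=3, base_delay=1.0, max_delay=60.0):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay)

# Demo: a flaky call that fails twice, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = retry_call(flaky, base_delay=0.01)
```

In production you would usually retry only on transient errors (rate limits, timeouts), not on every exception.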
Response Caching:
```python
from functools import lru_cache

@lru_cache(maxsize=1000)
def cached_llm_call(prompt):
    # lru_cache keys on the prompt string itself, so repeated
    # prompts are answered from memory instead of a new API call
    return call_llm(prompt)
```
Simple RAG Implementation:
```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter

# 1. Load and split documents
texts = CharacterTextSplitter().split_text(document)

# 2. Create vector store
vectorstore = Chroma.from_texts(texts, OpenAIEmbeddings())

# 3. Query
results = vectorstore.similarity_search("my question", k=3)

# 4. Generate an answer with the retrieved context
context = "\n".join([r.page_content for r in results])
answer = call_llm(f"Context: {context}\nQuestion: my question")
```
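The retrieval step at the heart of RAG can be illustrated without any dependencies by scoring chunks on word overlap with the query. Real systems use embedding similarity instead, but the shape (score every chunk, take the top k, stuff them into the prompt) is the same. A toy sketch:

```python
import re

def tokenize(text):
    """Lowercase and split into word tokens, dropping punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))

def top_k_chunks(chunks, query, k=3):
    """Rank chunks by word overlap with the query (toy retriever)."""
    q = tokenize(query)
    return sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)[:k]

chunks = [
    "Python is popular for machine learning.",
    "The Eiffel Tower is in Paris.",
    "LLM apps are often written in Python.",
]
hits = top_k_chunks(chunks, "Why is Python used for LLM apps?", k=2)
```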
Project Structure:
```
my_llm_app/
├── .env               # API keys (gitignored)
├── requirements.txt   # Dependencies
├── src/
│   ├── __init__.py
│   ├── llm.py         # LLM client wrapper
│   ├── embeddings.py  # Embedding functions
│   └── prompts.py     # Prompt templates
├── tests/
│   └── test_llm.py
└── main.py
```
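The `prompts.py` module in this layout typically holds named templates rather than strings scattered through the code. A minimal sketch (the template and `render` helper are illustrative, not a fixed convention):

```python
# prompts.py -- keep prompt templates in one place
SUMMARIZE = (
    "Summarize the following text in {max_words} words or fewer:\n\n"
    "{text}"
)

def render(template: str, **kwargs) -> str:
    """Fill a prompt template with keyword arguments."""
    return template.format(**kwargs)

prompt = render(SUMMARIZE, max_words=50, text="Python dominates LLM tooling.")
```

Centralizing templates makes prompts easy to review, version, and test independently of the code that sends them.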
Python is the gateway to building AI applications: its rich ecosystem of libraries, straightforward syntax, and extensive community resources make it the natural choice for developers entering the AI space.