Git Best Practices

Git best practices define version control workflows that support safe collaboration, a clean history, and reliable code management. Branching strategies, commit conventions, and review processes keep ML projects organized and let teams work together on complex AI codebases.

Why Git Best Practices Matter

Essential Commands

Daily Workflow:

# Start new feature
git checkout main
git pull origin main
git checkout -b feature/my-feature

# Make changes and commit
git add -p                    # Interactive staging
git commit -m "feat: add new embedding model"

# Keep up with main
git fetch origin
git rebase origin/main

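# Push and create PR
git push -u origin feature/my-feature
# If the branch was already pushed before the rebase, update it with:
# git push --force-with-lease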

Useful Commands:

# View history
git log --oneline -20
git log --graph --oneline --all

# Undo last commit (keep changes)
git reset --soft HEAD~1

# Discard local changes (either command works; the changes are lost for good)
git checkout -- file.py
git restore file.py           # Modern alternative (Git 2.23+)

# Stash work temporarily
git stash
git stash pop

# Interactive rebase (clean history)
git rebase -i HEAD~3

Branching Strategy

GitHub Flow (Recommended for most teams):

main (always deployable)
  │
  ├── feature/add-rag-pipeline
  │     └── [PR] → review → merge → delete
  │
  ├── feature/fix-embedding-bug
  │     └── [PR] → review → merge → delete
  │
  └── feature/upgrade-model
        └── [PR] → review → merge → delete

Branch Naming:

feature/add-vector-store    # New functionality
fix/memory-leak-inference   # Bug fixes
docs/update-readme          # Documentation
refactor/clean-prompts      # Code improvement
experiment/new-model-arch   # Exploratory work
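
A naming scheme like this is easy to check automatically, for example in a CI job or a local hook. The following is a minimal sketch, assuming lowercase, hyphen-separated names after the prefix; the function name `is_valid_branch_name` and the exact character rules are illustrative choices, not part of any Git tooling.

```python
import re

# Prefixes taken from the naming list above.
BRANCH_PREFIXES = ("feature", "fix", "docs", "refactor", "experiment")

# <prefix>/<lowercase words separated by single hyphens>
# The lowercase/hyphen rule is an assumption; adjust it to your team's convention.
BRANCH_RE = re.compile(
    r"(" + "|".join(BRANCH_PREFIXES) + r")/[a-z0-9]+(-[a-z0-9]+)*"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if the whole name follows the <prefix>/<slug> scheme."""
    return BRANCH_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("feature/add-vector-store", "my-branch"):
        print(candidate, is_valid_branch_name(candidate))
```

Long-lived branches such as `main` are deliberately not covered by this check; it only applies to short-lived work branches.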

Commit Message Convention

Conventional Commits:

<type>(<scope>): <description>

Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation only
- refactor: Code change (no feature/fix)
- test: Adding tests
- chore: Maintenance

Examples:
feat(rag): add hybrid search with BM25
fix(inference): resolve OOM on long contexts
docs: add API usage examples
refactor(prompts): consolidate system prompts

Good Commit Messages:

# ✅ Good
git commit -m "feat: add streaming response support"
git commit -m "fix: handle empty context in RAG pipeline"

# ❌ Bad
git commit -m "fixed stuff"
git commit -m "WIP"
git commit -m "changes"

Code Review Process

PR Best Practices:

## Description
Brief explanation of what this PR does.

## Changes
- Added new embedding model
- Updated vector store config
- Fixed chunking logic

## Testing
- [ ] Unit tests pass
- [ ] Manual testing completed
- [ ] Eval set shows no regression

## Screenshots
(if applicable)

Review Checklist:

□ Code is readable and follows style guide
□ Tests cover new functionality
□ No hardcoded secrets or credentials
□ ML-specific: eval results attached
□ Documentation updated if needed
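
The "no hardcoded secrets" item can be partly automated. Below is a heuristic sketch only: the two patterns and the name `find_suspect_lines` are assumptions made for illustration, and a check like this will miss many real secrets and may flag harmless lines, so it supplements review rather than replacing it.

```python
import re

# Heuristic patterns (assumptions); dedicated scanners use far larger rule sets.
SECRET_PATTERNS = [
    # Assignments such as API_KEY = "...", where the name suggests a credential
    # and the quoted value is at least 8 characters long.
    re.compile(
        r"""\b\w*(api_?key|secret|token|password)\w*\s*[:=]\s*["'][^"']{8,}["']""",
        re.IGNORECASE,
    ),
    # Header line of a PEM-encoded private key.
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def find_suspect_lines(text: str):
    """Return (line_number, stripped_line) pairs that match any pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = 'API_KEY = "sk-test-1234567890"\nkey = os.environ["API_KEY"]\n'
    for number, line in find_suspect_lines(sample):
        print(f"line {number}: {line}")
```

Reading the key from the environment, as in the second sample line, is not flagged because no quoted literal is assigned.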

Git for ML Projects

What to Track:

✅ Track in Git:
- Source code
- Config files
- Small test fixtures
- Documentation

❌ Don't track in Git:
- Model weights (too large; use Git LFS or DVC)
- Datasets (use DVC)
- Generated outputs
- API keys/secrets (never commit them; load them from environment variables or a secrets manager)

.gitignore for ML:

# Python
__pycache__/
*.pyc
.venv/
venv/

# ML artifacts
*.pt
*.onnx
*.safetensors
models/
checkpoints/

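# Data
data/raw/
data/processed/
*.parquet
*.csv
# Note: the two patterns above also match small test fixtures. To track those,
# re-include them with a negation pattern, e.g. !tests/fixtures/*.csv
# (the tests/fixtures/ path is illustrative)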

# Secrets
.env
*_key.json

# IDE
.vscode/
.idea/
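
An ignore file does not stop someone from force-adding a file, and it has no effect on files committed before a rule existed. A CI check can act as a second guard. The sketch below hard-codes the patterns above in simplified form and is an approximation, not a reimplementation of gitignore matching; `looks_ignored` is a hypothetical name, and `git check-ignore -v <path>` remains the authoritative way to ask Git itself.

```python
from pathlib import PurePosixPath

# Simplified mirror of the patterns above, assuming .gitignore sits at the repo root.
IGNORED_SUFFIXES = {".pyc", ".pt", ".onnx", ".safetensors", ".parquet", ".csv"}
# Directory patterns without an inner slash match at any depth.
IGNORED_DIRS = {"__pycache__", ".venv", "venv", "models", "checkpoints", ".vscode", ".idea"}
# Patterns with an inner slash are anchored at the repo root.
IGNORED_ROOT_DIRS = {("data", "raw"), ("data", "processed")}


def looks_ignored(path: str) -> bool:
    """Approximate whether a repo-relative file path falls under the rules above."""
    p = PurePosixPath(path)
    parts = p.parts
    if p.suffix in IGNORED_SUFFIXES:
        return True
    if p.name == ".env" or p.name.endswith("_key.json"):
        return True
    # Any parent directory with an ignored name hides everything below it.
    if any(part in IGNORED_DIRS for part in parts[:-1]):
        return True
    return len(parts) > 2 and parts[:2] in IGNORED_ROOT_DIRS


if __name__ == "__main__":
    for candidate in ("src/train.py", "checkpoints/step_100.pt", ".env"):
        print(candidate, looks_ignored(candidate))
```

Feeding this function the output of `git ls-files` would list tracked files that the rules say should not be in the repository.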

Advanced Techniques

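# Bisect to find breaking commit
git bisect start
git bisect bad HEAD
git bisect good v1.0.0
# Git checks out a commit in between; test it, then mark it:
git bisect good               # or: git bisect bad
# Repeat until Git names the first bad commit, then return to your branch:
git bisect reset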

# Cherry-pick specific commits
git cherry-pick abc1234

# Find who changed a line
git blame file.py
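
The bisect commands above work by binary search: each good/bad answer halves the range of candidate commits. The sketch below shows the idea on a plain list, assuming a linear history ordered oldest to newest in which every commit after the breaking one is also bad. Real `git bisect` operates on the commit graph, so this illustrates the principle rather than Git's actual algorithm; `first_bad_commit` and `is_bad` are hypothetical names.

```python
def first_bad_commit(commits, is_bad):
    """Find the first bad commit by binary search.

    Assumes commits[0] is known good, commits[-1] is known bad,
    and once a commit is bad every later commit is bad too.
    """
    low, high = 0, len(commits) - 1  # low: known good, high: known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid  # the break happened at mid or earlier
        else:
            low = mid   # the break happened after mid
    return commits[high]


if __name__ == "__main__":
    history = ["v1.0.0", "a", "b", "c", "d", "HEAD"]
    broken_from = history.index("c")  # pretend the bug arrived in commit "c"
    print(first_bad_commit(history, lambda c: history.index(c) >= broken_from))
```

With six commits, only two checks are needed here, which is why bisecting scales well to long histories.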

Git best practices are essential infrastructure for team productivity. Clean workflows, meaningful commits, and effective reviews let teams develop quickly while maintaining code quality on complex ML projects.
