Git best practices are version control workflows that support safe collaboration, a clean history, and reliable code management. Branching strategies, commit conventions, and review processes keep ML projects organized and let teams work together on complex AI codebases.
Why Git Best Practices Matter
- Collaboration: Multiple contributors work in parallel with fewer merge conflicts.
- History: Track what changed, when, and why.
- Rollback: Revert problematic changes quickly.
- Review: Code review catches issues before merge.
- Reproducibility: Tag releases for exact reproduction.
Essential Commands
Daily Workflow:
# Start new feature
git checkout main
git pull origin main
git checkout -b feature/my-feature
# Make changes and commit
git add -p # Interactive staging
git commit -m "feat: add new embedding model"
# Keep up with main
git fetch origin
git rebase origin/main
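# Push and create PR
git push -u origin feature/my-feature
# If the branch was already pushed before the rebase, update it with:
# git push --force-with-lease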
Useful Commands:
# View history
git log --oneline -20
git log --graph --oneline --all
# Undo last commit (keep changes)
git reset --soft HEAD~1
# Discard local changes to a file (uncommitted edits cannot be recovered)
git checkout -- file.py
git restore file.py # Equivalent modern alternative (Git 2.23+)
# Stash work temporarily
git stash
git stash pop
# Interactive rebase (clean history)
git rebase -i HEAD~3
Branching Strategy
GitHub Flow (Recommended for most teams):
main (always deployable)
│
├── feature/add-rag-pipeline
│ └── [PR] → review → merge → delete
│
├── feature/fix-embedding-bug
│ └── [PR] → review → merge → delete
│
└── feature/upgrade-model
  └── [PR] → review → merge → delete
Branch Naming:
feature/add-vector-store # New functionality
fix/memory-leak-inference # Bug fixes
docs/update-readme # Documentation
refactor/clean-prompts # Code improvement
experiment/new-model-arch # Exploratory work
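The naming scheme above can be checked automatically, for example in CI. The following is a minimal sketch, assuming the five prefixes listed above and a lowercase kebab-case description after the slash; the kebab-case rule and the function name `is_valid_branch_name` are illustrative assumptions, not something Git itself enforces.

```python
import re

# Prefixes taken from the naming list above. The kebab-case rule for the part
# after the slash is an assumption made for this sketch.
ALLOWED_PREFIXES = ("feature", "fix", "docs", "refactor", "experiment")
BRANCH_PATTERN = re.compile(
    r"(?:%s)/[a-z0-9]+(?:-[a-z0-9]+)*" % "|".join(ALLOWED_PREFIXES)
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if `name` looks like '<prefix>/<kebab-case-description>'."""
    return BRANCH_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("feature/add-vector-store", "my_branch", "fix/Memory Leak"):
        print(candidate, is_valid_branch_name(candidate))
```

Teams with different prefixes or ticket IDs in branch names would adjust `ALLOWED_PREFIXES` and the pattern accordingly.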
Commit Message Convention
Conventional Commits:
<type>(<scope>): <description>
The scope in parentheses is optional and names the part of the codebase affected.
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation only
- refactor: Code change that neither adds a feature nor fixes a bug
- test: Adding tests
- chore: Maintenance
Examples:
feat(rag): add hybrid search with BM25
fix(inference): resolve OOM on long contexts
docs: add API usage examples
refactor(prompts): consolidate system prompts
Good Commit Messages:
# ✅ Good
git commit -m "feat: add streaming response support"
git commit -m "fix: handle empty context in RAG pipeline"
# ❌ Bad
git commit -m "fixed stuff"
git commit -m "WIP"
git commit -m "changes"
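The difference between the good and bad messages above can be checked mechanically. Below is a rough sketch of a header validator, assuming only the six types listed in this section and an optional lowercase scope; the function name `check_commit_header` and the allowed scope characters are assumptions for illustration, and many projects permit additional types.

```python
import re

# Types listed in this section; real projects often allow more.
COMMIT_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

# <type>(<scope>): <description>, with the scope treated as optional.
HEADER_PATTERN = re.compile(
    r"(?P<type>%s)(?:\((?P<scope>[a-z0-9_-]+)\))?: (?P<description>\S.*)"
    % "|".join(COMMIT_TYPES)
)


def check_commit_header(header: str) -> bool:
    """Return True if the first line of a commit message follows the convention."""
    return HEADER_PATTERN.fullmatch(header) is not None


if __name__ == "__main__":
    for header in ("feat: add streaming response support", "fixed stuff", "WIP"):
        print(repr(header), check_commit_header(header))
```

A check like this could be called from a `commit-msg` hook, which Git runs with the path to the message file as its argument, so that nonconforming messages are rejected before the commit is created.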
Code Review Process
PR Best Practices:
## Description
Brief explanation of what this PR does.
## Changes
- Added new embedding model
- Updated vector store config
- Fixed chunking logic
## Testing
- [ ] Unit tests pass
- [ ] Manual testing completed
- [ ] Eval set shows no regression
## Screenshots
(if applicable)
Review Checklist:
□ Code is readable and follows style guide
□ Tests cover new functionality
□ No hardcoded secrets or credentials
□ ML-specific: eval results attached
□ Documentation updated if needed
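The "no hardcoded secrets" item is one of the easier checks to partially automate. The sketch below scans text (for example, the contents of a changed file) for a few secret-like patterns. The pattern set and the name `find_secrets` are assumptions chosen for illustration; it is far from exhaustive and is not a substitute for a dedicated secret scanner or for human review.

```python
import re

# Illustrative patterns only; a real scanner needs a much broader rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded_assignment": re.compile(
        r"\b(?:api_key|secret|password|token)\b\s*=\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each suspicious line in `text`."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


if __name__ == "__main__":
    sample = 'import os\napi_key = "sk-live-1234567890abcdef"\n'
    print(find_secrets(sample))
```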
Git for ML Projects
What to Track:
✅ Track in Git:
- Source code
- Config files
- Small test fixtures
- Documentation
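❌ Don't track in Git:
- Model weights (too large; use DVC or Git LFS)
- Datasets (use DVC)
- Generated outputs (regenerate them from code and config)
- API keys/secrets (keep them in environment variables or a secrets manager, never in DVC or LFS either)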
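One way to apply the split above is to look for large files before they are committed. The following is a minimal sketch that walks a working tree and lists files above a size threshold as candidates for DVC or Git LFS. The 10 MB default and the name `find_large_files` are arbitrary assumptions for illustration; pick a threshold that suits your repository.

```python
import os
from pathlib import Path


def find_large_files(root, max_bytes=10 * 1024 * 1024):
    """Return paths under `root` larger than `max_bytes`, skipping the .git directory."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            path = Path(dirpath) / filename
            if path.stat().st_size > max_bytes:
                large.append(path)
    return sorted(large)


if __name__ == "__main__":
    for path in find_large_files("."):
        print(path)
```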
.gitignore for ML:
# Python
__pycache__/
*.pyc
.venv/
venv/
# ML artifacts
*.pt
*.onnx
*.safetensors
models/
checkpoints/
# Data
data/raw/
data/processed/
*.parquet
*.csv
# Re-include small test fixtures (example path; adjust to your layout)
!tests/fixtures/*.csv
# Secrets
.env
*_key.json
# IDE
.vscode/
.idea/
Advanced Techniques
# Bisect to find breaking commit
git bisect start
git bisect bad HEAD
git bisect good v1.0.0
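# Test each commit Git checks out, then mark it with "git bisect good" or "git bisect bad"
# Once Git reports the first bad commit, return to your original branch
git bisect reset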
# Cherry-pick specific commits
git cherry-pick abc1234
# Find who changed a line
git blame file.py
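To make the bisect step above less opaque: `git bisect` narrows the range between a known-good and a known-bad commit by repeatedly testing a commit near the middle. The sketch below shows that idea on a simplified linear history. It is not Git's actual implementation (which handles branching histories), and `first_bad_commit` and `is_bad` are hypothetical names for this illustration.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit in a history ordered oldest to newest.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1  # indices of a known-good and a known-bad commit
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(10)]
    print(first_bad_commit(history, lambda c: int(c[1:]) >= 6))
```

Because the range halves on each test, the number of commits to check grows only logarithmically with the length of the history, which is why bisecting stays practical even across hundreds of commits.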
Git best practices are essential infrastructure for team productivity. Clean workflows, meaningful commits, and effective reviews enable rapid development while maintaining code quality on complex ML projects.