What Is Bayesian Optimization?

Bayesian Optimization is a sample-efficient hyperparameter tuning strategy that builds a probabilistic model of the objective function and uses it to decide which configuration to try next. Unlike Random Search (blind sampling) or Grid Search (exhaustive enumeration), it learns from past trials which regions of the hyperparameter space are promising. It balances exploration (trying regions where the model is still uncertain) against exploitation (refining regions already known to be good), and so typically finds good configurations in far fewer trials.

How Bayesian Optimization Works

| Step | Process | What Happens |
| --- | --- | --- |
| 1. Initial trials | Evaluate 5-10 random configurations | Build an initial understanding of the objective |
| 2. Fit surrogate model | Fit a Gaussian Process on (config → performance) pairs | The model predicts performance and uncertainty for any config |
| 3. Acquisition function | Find the config that maximizes Expected Improvement | Balance: try where performance is predicted to be good or where the model is very uncertain |
| 4. Evaluate | Train the model with the chosen config | Get the actual performance |
| 5. Update surrogate | Add the new result and refit the GP | The surrogate becomes more accurate |
| 6. Repeat | Go to step 3 | Converge toward the optimum until the trial budget is spent |
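
The loop in the table can be sketched in plain Python. This is a minimal illustration under stated assumptions, not how any particular library implements it: the search space is a single parameter on [0, 1] scanned over a fixed grid, the surrogate is a small Gaussian Process with an RBF kernel (length scale 0.2 and jitter 1e-4 are arbitrary choices), and `toy_objective` is a hypothetical stand-in for "train a model and return its validation score". All function names are invented for this example.

```python
import math
import random

def rbf(a, b, length=0.2):
    """Squared-exponential kernel; the length scale is an arbitrary choice."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a small positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward_sub(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(xs, ys, jitter=1e-4):
    """Steps 2 and 5: fit the surrogate; returns predict(x) -> (mean, std)."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (jitter if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    y_bar = sum(ys) / n  # constant prior mean
    alpha = backward_sub(L, forward_sub(L, [y - y_bar for y in ys]))

    def predict(x):
        k = [rbf(xi, x) for xi in xs]
        mean = y_bar + sum(ki * ai for ki, ai in zip(k, alpha))
        v = forward_sub(L, k)
        var = max(1.0 - sum(vi * vi for vi in v), 0.0)
        return mean, math.sqrt(var)

    return predict

def expected_improvement(mean, std, best):
    """Expected amount by which a N(mean, std^2) outcome exceeds best."""
    if std == 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def toy_objective(x):
    """Hypothetical stand-in for 'train with config x, return a score'."""
    return 1.0 - 4.0 * (x - 0.65) ** 2

def bayes_opt(objective, n_init=5, n_iter=15, seed=0):
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_init)]       # step 1: random trials
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]             # candidate configs
    for _ in range(n_iter):                          # step 6: repeat
        predict = fit_gp(xs, ys)                     # steps 2 and 5: (re)fit
        best = max(ys)
        x_next = max(grid,                           # step 3: maximize EI
                     key=lambda x: expected_improvement(*predict(x), best))
        xs.append(x_next)
        ys.append(objective(x_next))                 # step 4: evaluate
    i_best = max(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best], xs, ys
```

Calling `bayes_opt(toy_objective)` returns the best configuration and score found, plus the full trial history. The toy objective peaks at x = 0.65 by construction, so the history shows how quickly the search moves toward that region. Real libraries replace the grid scan with a proper optimizer over the acquisition function and tune the kernel hyperparameters instead of fixing them.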

Surrogate Models

| Model | How It Works | Pros | Cons |
| --- | --- | --- | --- |
| Gaussian Process (GP) | Non-parametric regression with uncertainty estimates | Gold standard, principled uncertainty | Exact inference costs O(n³) in the number of trials; scales poorly beyond ~1000 trials |
| TPE (Tree-structured Parzen Estimator) | Models P(x \| good) and P(x \| bad) separately | Handles categorical/conditional params well | Less principled than GP |
| Random Forest | Ensemble regression as surrogate | Scales well, handles mixed types | Less smooth uncertainty estimates |
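
To make the TPE row concrete, here is a rough one-dimensional sketch of the idea: split past trials into a "good" and a "bad" group, estimate a density for each, and propose the candidate where the ratio P(x | good) / P(x | bad) is highest. The fixed bandwidth, the `gamma` split fraction, the candidate count, and the function names are all assumptions made for illustration. Production implementations such as Optuna's sampler use adaptive bandwidths, priors, and support for categorical and conditional parameters.

```python
import math
import random

def kde(points, bandwidth):
    """Gaussian kernel density estimate over 1-D points."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
                   for p in points) / norm

    return density

def tpe_suggest(xs, ys, gamma=0.25, n_candidates=24, bandwidth=0.1, seed=0):
    """Suggest the next x to try (maximization) from past (x, y) trials."""
    if len(xs) < 2:
        raise ValueError("need at least two past trials")
    rng = random.Random(seed)
    # Rank trials from best to worst and split off the top gamma fraction.
    order = sorted(range(len(xs)), key=lambda i: ys[i], reverse=True)
    n_good = min(max(1, math.ceil(gamma * len(xs))), len(xs) - 1)
    good = [xs[i] for i in order[:n_good]]
    bad = [xs[i] for i in order[n_good:]]
    good_density = kde(good, bandwidth)   # estimate of P(x | good)
    bad_density = kde(bad, bandwidth)     # estimate of P(x | bad)
    # Draw candidates from the good density: pick a good point, perturb it.
    candidates = [rng.gauss(rng.choice(good), bandwidth)
                  for _ in range(n_candidates)]
    # Keep the candidate with the highest density ratio.
    return max(candidates,
               key=lambda x: good_density(x) / (bad_density(x) + 1e-12))
```

For example, with `xs = [0.1, 0.2, ..., 0.9]` and scores that peak at 0.7, the suggestion lands near the cluster of good trials around 0.7 rather than near the poorly scoring trials at the low end.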

Acquisition Functions

| Function | Strategy | Behavior |
| --- | --- | --- |
| Expected Improvement (EI) | Choose the point with the highest expected improvement over the current best | Good balance of exploration/exploitation |
| Upper Confidence Bound (UCB) | Choose the point with the highest (predicted mean + κ × uncertainty) | κ controls the explore/exploit trade-off |
| Probability of Improvement (PI) | Choose the point most likely to beat the current best | Greedy, can get stuck |
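
The three functions in the table have simple closed forms when the surrogate's prediction at a point is Gaussian with a given mean and standard deviation. The sketch below assumes maximization; the function names and the default κ = 2 are illustrative choices rather than any library's API.

```python
import math

def _norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mean, std, best):
    """EI: expected value of max(f - best, 0) for f ~ N(mean, std^2)."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * _norm_cdf(z) + std * _norm_pdf(z)

def upper_confidence_bound(mean, std, kappa=2.0):
    """UCB: predicted mean plus kappa times the predicted uncertainty."""
    return mean + kappa * std

def probability_of_improvement(mean, std, best):
    """PI: probability that f ~ N(mean, std^2) exceeds best."""
    if std <= 0.0:
        return 1.0 if mean > best else 0.0
    return _norm_cdf((mean - best) / std)

# Two hypothetical candidates, with a current best score of 0.89:
# A is a safe, marginal gain; B is predicted worse but highly uncertain.
best = 0.89
cand_a = (0.90, 0.01)  # (mean, std)
cand_b = (0.85, 0.10)
```

With these numbers, PI prefers candidate A (it is more likely to beat 0.89 by any margin), while EI and UCB prefer candidate B (its uncertainty leaves room for a larger gain). This is the sense in which PI is greedy and can get stuck refining an already explored region.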

Libraries

| Library | Surrogate | Strengths |
| --- | --- | --- |
| Optuna | TPE (default) | Modern, Python-native, pruning support, visualization |
| Hyperopt | TPE | Classic, widely tested |
| BoTorch / Ax | Gaussian Process | Developed at Meta (formerly Facebook), built on PyTorch and GPyTorch, most principled |
| Ray Tune | Wraps Optuna/Hyperopt | Distributed execution |
| Scikit-Optimize | GP, RF, ExtraTrees | sklearn-compatible interface |
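
A minimal Optuna example (train_model and evaluate are placeholders for your own training and evaluation code):

```python
import optuna

def objective(trial):
    # Sample the learning rate on a log scale and the depth as an integer.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    depth = trial.suggest_int("max_depth", 3, 12)
    # train_model and evaluate are user-supplied placeholders.
    model = train_model(lr=lr, max_depth=depth)
    return evaluate(model)  # score to maximize, e.g. validation accuracy

# TPE is Optuna's default sampler.
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```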

Bayesian Optimization is among the most sample-efficient hyperparameter tuning strategies. Because it builds a probabilistic model of the objective function and uses it to choose which configurations to evaluate, it is the preferred approach when each trial is computationally expensive and the budget is limited to tens rather than hundreds of evaluations.
