Getting Started with BONNI
BONNI (Bayesian Optimization with Neural Network surrogates and gradients) optimizes black-box functions that return both a value and a gradient. By incorporating gradient information into the MLP ensemble surrogate, BONNI achieves high sample efficiency — especially in high-dimensional spaces.
This notebook walks through:
- Defining an objective function compatible with BONNI
- Running BONNI Bayesian Optimization (optimize_bonni)
- Running gradient-based optimization with IPOPT (optimize_ipopt)
- Inspecting and plotting results
- Advanced options: custom configs and warm-starting from previous data
Installation
Install pixi, then clone the repository and run:
git clone https://github.com/ymahlau/bonni.git
cd bonni
pixi install
This resolves all dependencies — including the native IPOPT libraries — from conda-forge automatically.
For GPU-accelerated JAX, add the CUDA-enabled variant after installation:
pixi run pip install "jax[cuda]"
1. Defining an Objective Function
Every function passed to BONNI must accept a 1-D NumPy array x of shape (D,) and return a tuple (value, gradient):
- value: a scalar float
- gradient: a NumPy array of shape (D,)
Here we define a simple 2-D function \(f(x) = x_0^2 + x_1\) with analytical gradient:
import numpy as np

def fn(x: np.ndarray):
    # f(x) = x0^2 + x1, returned together with its analytical gradient
    value = x[0] ** 2 + x[1]
    grad = np.asarray([2 * x[0], 1.0])
    return value, grad

# Sanity check
x_test = np.array([1.0, 0.5])
val, grad = fn(x_test)
print(f"f({x_test}) = {val}, grad = {grad}")
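Because the surrogate is trained on the gradients you return, a wrong analytical gradient can quietly degrade the optimization. A quick way to catch this is to compare it against a central finite difference. The sketch below repeats the example function so that it runs on its own; the helper name finite_difference_grad and the step size eps are illustrative choices, not part of BONNI.

```python
import numpy as np


def fn(x: np.ndarray):
    """Example objective from above: f(x) = x0^2 + x1."""
    value = x[0] ** 2 + x[1]
    grad = np.asarray([2 * x[0], 1.0])
    return value, grad


def finite_difference_grad(f, x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    approx = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        # f returns (value, gradient); only the value is needed here
        approx[i] = (f(x + step)[0] - f(x - step)[0]) / (2 * eps)
    return approx


x_check = np.array([0.3, 0.8])
_, analytic = fn(x_check)
numeric = finite_difference_grad(fn, x_check)
max_abs_err = float(np.max(np.abs(analytic - numeric)))
print(f"max |analytic - numeric| = {max_abs_err:.2e}")
```

If the reported error is not close to zero, the gradient code is the first place to look.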
2. Bayesian Optimization with optimize_bonni
optimize_bonni runs the full BO loop:
- Draw num_random_samples points at random to bootstrap the surrogate.
- Fit the MLP ensemble surrogate on (xs, ys, gs).
- Maximize Expected Improvement (EI) via IPOPT to select the next query point.
- Evaluate the objective and repeat for num_bonni_iterations steps.
The function returns the full history (xs, ys, gs) collected across all evaluations (random samples + BO iterations).
from bonni import optimize_bonni

bounds = np.asarray([[-1.0, 1.0], [0.0, 1.0]])  # shape (D, 2)

xs, ys, gs = optimize_bonni(
    fn=fn,
    bounds=bounds,
    num_bonni_iterations=5,  # BO steps after initialization
    num_random_samples=3,    # random evaluations used for warm-up
    direction="minimize",
    seed=42,
)

print(f"Total evaluations: {len(xs)}")
best_idx = np.argmin(ys)
print(f"Best point: x = {xs[best_idx]}, f = {ys[best_idx]:.4f}")
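To make the Expected Improvement step of the loop more concrete, here is a minimal sketch of the standard closed-form EI for minimization. It assumes that the ensemble's predictions at a candidate point are summarized by a mean and a standard deviation and treated as Gaussian. This is not BONNI's implementation, which may differ; the function name expected_improvement and the sample predictions are made up for illustration.

```python
import math


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Closed-form EI for minimization under a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf


# Hypothetical ensemble predictions at one candidate point
preds = [0.10, 0.25, 0.05, 0.20]
mu = sum(preds) / len(preds)
sigma = math.sqrt(sum((p - mu) ** 2 for p in preds) / len(preds))
ei = expected_improvement(mu, sigma, best=0.12)
print(f"mu={mu:.3f}, sigma={sigma:.3f}, EI={ei:.4f}")
```

EI is large where the predicted mean is below the best value seen so far, or where the ensemble members disagree, which is how the acquisition trades off exploitation against exploration.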
Inspecting the optimization history
The returned arrays give you complete access to every evaluation:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(7, 3))
ax.plot(ys, marker="o")
ax.axhline(np.min(ys), color="red", linestyle="--", label=f"best = {np.min(ys):.4f}")
ax.set_xlabel("Evaluation index")
ax.set_ylabel("Objective value")
ax.set_title("Optimization history")
ax.legend()
plt.tight_layout()
plt.show()
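The raw history is noisy because BO deliberately explores. A best-so-far curve is often easier to read. The sketch below computes it with a cumulative minimum; the values in ys_demo are invented stand-ins for the ys array returned by the optimizer.

```python
import numpy as np

# Hypothetical objective values in evaluation order
ys_demo = np.array([0.9, 0.4, 0.7, 0.1, 0.3, 0.05])

# Cumulative minimum: the best value seen up to each evaluation
best_so_far = np.minimum.accumulate(ys_demo)
# Indices at which a new best value was found
improved_at = np.flatnonzero(np.diff(best_so_far, prepend=np.inf) < 0)

print(best_so_far)
print(improved_at)
```

For a maximization run, np.maximum.accumulate plays the same role.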
3. Using a Built-in Test Function
BONNI ships with the Styblinski-Tang benchmark function.
It is differentiable, multimodal, and has a known global minimum at \(x^* \approx (-2.903, \ldots, -2.903)\) with \(f(x^*) = 0\) (after the built-in shift).
from bonni.synthetic import StyblinskiTangFn

d = 2
fn_st = StyblinskiTangFn(d=d)

xs_st, ys_st, gs_st = optimize_bonni(
    fn=fn_st,
    bounds=fn_st.bounds,  # predefined [-5, 5]^d
    num_bonni_iterations=10,
    num_random_samples=4,
    direction="minimize",
    seed=0,
)

best_idx = np.argmin(ys_st)
print(f"Best value found: {ys_st[best_idx]:.4f} at x = {xs_st[best_idx]}")
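For reference, the textbook Styblinski-Tang function and its gradient fit in a few lines. The sketch below is a standalone version, not the code behind StyblinskiTangFn. It assumes that the "built-in shift" simply subtracts the known minimum of about -39.166166 per dimension, so that the optimum has value 0; the library class may handle the shift differently.

```python
import numpy as np

# Minimum of the unshifted function per dimension and its location
ST_MIN_PER_DIM = -39.166166
ST_ARGMIN = -2.903534


def styblinski_tang(x: np.ndarray):
    """Shifted Styblinski-Tang value and gradient; the minimum value is ~0."""
    x = np.asarray(x, dtype=float)
    value = 0.5 * np.sum(x**4 - 16.0 * x**2 + 5.0 * x) - ST_MIN_PER_DIM * x.size
    grad = 0.5 * (4.0 * x**3 - 32.0 * x + 5.0)
    return float(value), grad


x_star = np.full(2, ST_ARGMIN)
val_star, grad_star = styblinski_tang(x_star)
print(f"f(x*) = {val_star:.2e}, |grad| = {np.linalg.norm(grad_star):.2e}")
```

Both printed numbers should be close to zero, which confirms that the point is a stationary point with the shifted value near 0.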
4. Gradient-Based Optimization with optimize_ipopt
For functions where gradients are cheap (or the problem is convex), you may want to skip the BO surrogate entirely and run IPOPT directly on the true objective.
BONNI provides optimize_ipopt as a convenient wrapper around cyipopt:
- Requires an explicit starting point x0
- Uses max_fn_eval / max_iterations as stopping criteria
- Returns the same (xs, ys, gs) history format as optimize_bonni
from bonni import optimize_ipopt

x0 = np.asarray([0.5, 0.5])
bounds = np.asarray([[-1.0, 1.0], [0.0, 1.0]])

xs_ip, ys_ip, gs_ip = optimize_ipopt(
    fn=fn,
    x0=x0,
    bounds=bounds,
    max_fn_eval=20,
    max_iterations=10,
    direction="minimize",
)

best_idx = np.argmin(ys_ip)
print(f"Total evaluations: {len(xs_ip)}")
print(f"Best point: x = {xs_ip[best_idx]}, f = {ys_ip[best_idx]:.4f}")
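Since optimize_ipopt needs a starting point, it helps to have a default when you have no better guess. The sketch below shows two simple options, assuming the (D, 2) bounds layout used above: the center of the box, or an arbitrary guess projected onto the box. How optimize_ipopt treats an x0 outside the bounds is not covered here, so clipping beforehand is only a precaution.

```python
import numpy as np

bounds_demo = np.asarray([[-1.0, 1.0], [0.0, 1.0]])  # shape (D, 2)
lower, upper = bounds_demo[:, 0], bounds_demo[:, 1]

# Option 1: start from the center of the box
x0_center = 0.5 * (lower + upper)

# Option 2: project an arbitrary guess onto the box
x0_guess = np.array([1.7, -0.3])
x0_clipped = np.clip(x0_guess, lower, upper)

print(x0_center, x0_clipped)
```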
Maximization
Both optimize_bonni and optimize_ipopt support direction="maximize".
The sign flip is handled internally — you do not need to negate your function.
xs_max, ys_max, gs_max = optimize_ipopt(
    fn=fn,
    x0=x0,
    bounds=bounds,
    max_fn_eval=20,
    max_iterations=10,
    direction="maximize",
)

best_idx = np.argmax(ys_max)
print(f"Best point: x = {xs_max[best_idx]}, f = {ys_max[best_idx]:.4f}")
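Conceptually, maximizing f is the same as minimizing -f, and the gradient has to change sign along with the value. The wrapper below is only an illustration of that idea; the helper name negate is made up, and BONNI's internal handling may look different.

```python
import numpy as np


def negate(fn):
    """Wrap fn so that minimizing the wrapper maximizes fn."""
    def wrapped(x: np.ndarray):
        value, grad = fn(x)
        # Both the value and the gradient change sign
        return -value, -np.asarray(grad)
    return wrapped


def fn(x: np.ndarray):
    value = x[0] ** 2 + x[1]
    grad = np.asarray([2 * x[0], 1.0])
    return value, grad


neg_fn = negate(fn)
val_neg, grad_neg = neg_fn(np.array([1.0, 0.5]))
print(val_neg, grad_neg)
```

With direction="maximize" you never write this wrapper yourself, and the returned ys stay in the original scale of your function, which is why the example above uses np.argmax.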
5. Advanced Options
5.1 Saving results to disk
Pass a save_path directory to automatically write (xs, ys, gs) as an .npz file after each evaluation:
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    save_dir = Path(tmp)
    xs_s, ys_s, gs_s = optimize_bonni(
        fn=fn,
        bounds=bounds,
        num_bonni_iterations=3,
        num_random_samples=2,
        seed=1,
        save_path=save_dir,
    )
    # The temporary directory is deleted when the with-block exits,
    # so the files are listed and loaded inside it
    saved_files = list(save_dir.glob("*.npz"))
    print(f"Saved files: {[f.name for f in saved_files]}")
    # Load the results back
    data = np.load(saved_files[0])
    print(f"Loaded xs shape: {data['xs'].shape}")
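If you want to read such a file outside of BONNI, plain NumPy is enough. The round trip below writes and reloads a small history with np.savez and np.load. The key xs appears in the example above; the keys ys and gs and the file name history.npz are assumptions made for this sketch.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical history: 4 evaluations of a 2-D function
xs_demo = np.zeros((4, 2))
ys_demo = np.arange(4, dtype=float)
gs_demo = np.ones((4, 2))

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "history.npz"  # hypothetical file name
    np.savez(path, xs=xs_demo, ys=ys_demo, gs=gs_demo)
    # NpzFile loads lazily, so read the arrays while the file still exists
    with np.load(path) as data:
        loaded = {key: data[key] for key in data.files}

print({key: arr.shape for key, arr in loaded.items()})
```

Copying the arrays into a dictionary inside the with-block means they remain usable after the file is closed or deleted.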
5.2 Warm-starting from previous data
If you already have evaluations from a previous run (or from a different optimizer), pass them via xs, ys, gs instead of using num_random_samples:
# Simulate previously collected data
prev_xs = np.array([[-0.8, 0.2], [0.3, 0.7], [0.0, 0.5]])
prev_ys = np.array([fn(x)[0] for x in prev_xs])
prev_gs = np.array([fn(x)[1] for x in prev_xs])

xs_warm, ys_warm, gs_warm = optimize_bonni(
    fn=fn,
    bounds=bounds,
    num_bonni_iterations=5,
    xs=prev_xs,
    ys=prev_ys,
    gs=prev_gs,
    seed=7,
)

print(f"Total evaluations (prev + BO): {len(xs_warm)}")
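When the prior data comes from several runs, it has to be combined into one consistent (xs, ys, gs) triple first. The helper below is a small sketch of that step with basic shape checks. The name merge_histories is hypothetical, and the expected shapes (N, D), (N,) and (N, D) are inferred from the examples in this notebook.

```python
import numpy as np


def merge_histories(*runs):
    """Concatenate (xs, ys, gs) tuples and check that the shapes agree."""
    xs = np.concatenate([np.atleast_2d(r[0]) for r in runs], axis=0)
    ys = np.concatenate([np.atleast_1d(r[1]) for r in runs], axis=0)
    gs = np.concatenate([np.atleast_2d(r[2]) for r in runs], axis=0)
    if not (len(xs) == len(ys) == len(gs)):
        raise ValueError("xs, ys and gs must contain the same number of points")
    if xs.shape != gs.shape:
        raise ValueError("each gradient must have the same shape as its input")
    return xs, ys, gs


# Two hypothetical earlier runs on the 2-D example function
run_a = (
    np.array([[-0.8, 0.2], [0.3, 0.7]]),
    np.array([0.84, 0.79]),
    np.array([[-1.6, 1.0], [0.6, 1.0]]),
)
run_b = (np.array([[0.0, 0.5]]), np.array([0.5]), np.array([[0.0, 1.0]]))

xs_all, ys_all, gs_all = merge_histories(run_a, run_b)
print(xs_all.shape, ys_all.shape, gs_all.shape)
```

The merged arrays can then be passed as xs, ys and gs in the same way as prev_xs, prev_ys and prev_gs above.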
5.3 Custom model and optimizer configuration
BONNI exposes three configuration dataclasses for fine-tuning the surrogate and BO loop:
| Dataclass | Controls |
|---|---|
| MLPModelConfig | MLP architecture (layers, hidden size, normalization) |
| OptimConfig | Training optimizer (learning rate, steps, warm-up) |
| | Expected Improvement acquisition function |
Pass any of them via the custom_* keyword arguments:
from bonni import MLPModelConfig, OptimConfig

model_cfg = MLPModelConfig(
    num_layer=3,
    hidden_channels=128,
    out_channels=1,
    norm_groups=4,
)

optim_cfg = OptimConfig(
    total_steps=500,
    warmup_steps=20,
)

xs_cfg, ys_cfg, gs_cfg = optimize_bonni(
    fn=fn,
    bounds=bounds,
    num_bonni_iterations=5,
    num_random_samples=3,
    seed=99,
    custom_base_model_config=model_cfg,
    custom_optim_config=optim_cfg,
    ensemble_size=10,  # fewer models → faster training
)

best_idx = np.argmin(ys_cfg)
print(f"Best: f = {ys_cfg[best_idx]:.4f} at x = {xs_cfg[best_idx]}")
5.4 Non-differentiable parameters
If some input dimensions are non-differentiable (e.g. discrete parameters), mark them with a boolean mask via non_diff_params.
BONNI will still optimize over these dimensions but will ignore their gradient entries in training.
# x[1] is treated as non-differentiable; its gradient is ignored during surrogate training
non_diff = np.array([False, True])

xs_nd, ys_nd, gs_nd = optimize_bonni(
    fn=fn,
    bounds=bounds,
    num_bonni_iterations=5,
    num_random_samples=3,
    seed=5,
    non_diff_params=non_diff,
)

print(f"Best: f = {np.min(ys_nd):.4f}")
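One way to picture "ignoring gradient entries" is a gradient-matching loss that only averages over the differentiable dimensions. The sketch below shows that idea with a masked mean squared error. It is purely illustrative: BONNI's actual training loss is not described in this notebook, and the function name masked_gradient_loss is made up.

```python
import numpy as np


def masked_gradient_loss(pred_grads, true_grads, non_diff_params):
    """Mean squared gradient error over differentiable dimensions only."""
    pred_grads = np.asarray(pred_grads, dtype=float)
    true_grads = np.asarray(true_grads, dtype=float)
    keep = ~np.asarray(non_diff_params, dtype=bool)  # True where gradients count
    if not keep.any():
        return 0.0
    diff = (pred_grads - true_grads)[:, keep]
    return float(np.mean(diff**2))


true_g = np.array([[2.0, 1.0], [-1.0, 1.0]])
pred_g = np.array([[1.5, 9.0], [-1.0, -7.0]])  # second column is far off
loss = masked_gradient_loss(pred_g, true_g, non_diff_params=[False, True])
print(loss)
```

The large errors in the second column do not affect the result, because that dimension is masked out.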
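Summary
| Feature | Key argument |
|---|---|
| Number of BO iterations | num_bonni_iterations |
| Random warm-up samples | num_random_samples |
| Optimization direction | direction |
| Reproducibility | seed |
| Save history to disk | save_path |
| Warm-start from prior data | xs, ys, gs |
| Custom MLP architecture | custom_base_model_config |
| Custom training optimizer | custom_optim_config |
| Custom EI config | |
| Non-differentiable dims | non_diff_params |
| Ensemble size | ensemble_size |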
For the full API reference see the API docs.