Hi everyone,
I am curious about the potential differences in performance when using normalized versus unnormalized parameters in optimization algorithms. Say:
- Normalized parameters:
  - x1: [0.0, 1.0]
  - x2: [0.0, 1.0]
- Unnormalized parameters:
  - x1: [5.0, 12.0]
  - x2: [3.0, 5.0]
I am planning to test this out with honegumi, but I wanted to see if anyone has insights or experiences with how normalization affects convergence, performance, or any other aspect of the algorithm. Would love to hear your thoughts!
Off to test this out.
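Here's roughly what I have in mind, as a minimal sketch using Ax's Service API (the same API that honegumi templates target, as I understand it). The toy quadratic objective and the rescaling of the normalized suggestions back to the raw scale are my own placeholders, so swap in whatever you're actually optimizing:

```python
from ax.service.ax_client import AxClient, ObjectiveProperties


def toy_objective(x1, x2):
    # Placeholder quadratic with a minimum inside the raw bounds; swap in your own.
    return (x1 - 8.0) ** 2 + (x2 - 4.0) ** 2


def run(bounds, n_trials=20, seed=0):
    ax_client = AxClient(random_seed=seed)
    ax_client.create_experiment(
        name="normalization_test",
        parameters=[
            {"name": "x1", "type": "range", "bounds": list(bounds["x1"])},
            {"name": "x2", "type": "range", "bounds": list(bounds["x2"])},
        ],
        objectives={"loss": ObjectiveProperties(minimize=True)},
    )
    for _ in range(n_trials):
        params, trial_index = ax_client.get_next_trial()
        if bounds["x1"] == (0.0, 1.0):
            # Map the normalized suggestion back to the raw scale so both
            # runs evaluate the same underlying function.
            x1 = 5.0 + 7.0 * params["x1"]
            x2 = 3.0 + 2.0 * params["x2"]
        else:
            x1, x2 = params["x1"], params["x2"]
        ax_client.complete_trial(trial_index=trial_index, raw_data=toy_objective(x1, x2))
    return ax_client.get_best_parameters()


print(run({"x1": (0.0, 1.0), "x2": (0.0, 1.0)}))   # normalized parameterization
print(run({"x1": (5.0, 12.0), "x2": (3.0, 5.0)}))  # raw parameterization
```

Comparing the best values (or the full convergence traces) from the two runs should answer the question directly.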
Fly-by comment: Most algorithms/packages that I'm aware of perform normalization under the hood (including Ax). There's probably a lower-level way in BoTorch to skip that normalization, or to do something more customized.
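For what it's worth, in BoTorch the normalization is exposed as an explicit input transform on the model, so you can keep it, drop it, or replace it. A rough sketch with SingleTaskGP (the training data here is made up, and helper names like fit_gpytorch_mll can differ slightly between BoTorch versions):

```python
import torch
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from botorch.models.transforms.input import Normalize
from botorch.models.transforms.outcome import Standardize
from gpytorch.mlls import ExactMarginalLogLikelihood

# Raw-scale training data on the bounds from the original post:
# x1 in [5, 12], x2 in [3, 5].
bounds = torch.tensor([[5.0, 3.0], [12.0, 5.0]], dtype=torch.double)
train_X = bounds[0] + (bounds[1] - bounds[0]) * torch.rand(10, 2, dtype=torch.double)
train_Y = ((train_X - torch.tensor([8.0, 4.0], dtype=torch.double)) ** 2).sum(dim=-1, keepdim=True)

# Normalize maps inputs to the unit cube before the kernel sees them;
# leave input_transform out to fit the GP on the raw scale instead.
model = SingleTaskGP(
    train_X,
    train_Y,
    input_transform=Normalize(d=2, bounds=bounds),
    outcome_transform=Standardize(m=1),
)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)
```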
The essence of normalization is to bring values with different ranges onto a common scale.
Bayesian optimization typically uses a Gaussian process as the surrogate model, and the standard kernels and their default lengthscale priors work best when the inputs live on comparable, roughly unit-scale ranges.
If the ranges differ a lot, a shared lengthscale lets the widest dimension dominate the kernel distances, and poorly scaled inputs can also make the covariance matrix ill-conditioned.
Normalization therefore helps the algorithm explore the search space more uniformly, keeps any single feature from dominating the optimization, and usually speeds up convergence.
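To make the "dominating feature" point concrete, here is a small NumPy sketch (toy numbers of my own, using the ranges from the original post) showing how much of the pairwise squared distance, which is what an isotropic shared-lengthscale kernel sees, comes from each input before and after min-max normalization:

```python
import numpy as np

rng = np.random.default_rng(0)

# Points sampled on the raw scales: x1 in [5, 12], x2 in [3, 5]
raw = np.column_stack([rng.uniform(5.0, 12.0, 50), rng.uniform(3.0, 5.0, 50)])
lo, hi = np.array([5.0, 3.0]), np.array([12.0, 5.0])
normed = (raw - lo) / (hi - lo)  # min-max normalization to [0, 1]


def per_dim_sq_dists(X):
    # Average squared pairwise distance contributed by each dimension.
    diffs = X[:, None, :] - X[None, :, :]
    return (diffs ** 2).mean(axis=(0, 1))


for name, X in [("raw", raw), ("normalized", normed)]:
    d = per_dim_sq_dists(X)
    print(f"{name:>10}: x1 share = {d[0] / d.sum():.2f}, x2 share = {d[1] / d.sum():.2f}")
```

On the raw scale, x1 accounts for roughly 90% of the squared distance, so x2 barely influences the kernel; after normalization the two inputs contribute about equally.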
That makes perfect sense.
Thanks!