Feel free to give opinionated answers. What have you chosen to use and why? What has worked well for you, either from personal experience or from what you’ve seen?
For example, “vanilla BO” typically means a Gaussian process surrogate model paired with the expected improvement (EI) acquisition function. Other acquisition functions exist, though, such as probability of improvement (PI) and upper confidence bound (UCB).
Likewise, some may want to favor exploration over exploitation more explicitly.
So far, I have used a no-code platform (Atinary), where I used the open-source algorithm Gryffin. Currently, I’m using Ax (thanks to your tutorials!).
In the beginning, I tried really hard to get the algorithms from Prof. Alán Aspuru-Guzik’s group running on my system, but found it challenging to even install the package. It would have been great to have some tutorials for those as well! Since I couldn’t get that to work, I’ve stuck with Ax, which I’m still learning and using.
At the moment, I am focusing on fine-tuning the balance between exploration and exploitation within Ax. I am experimenting with different acquisition functions and trying to better understand how to control this aspect of the optimization process.
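For example, here’s roughly what I’ve been trying via the Service API with the modular BoTorch model. This is a minimal sketch with a placeholder experiment, and the exact kwargs (e.g., acquisition_options) may differ across Ax versions:

```python
# Sketch: choosing the acquisition function in Ax's Service API.
# Placeholder experiment; kwargs may shift between Ax versions.
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties
from botorch.acquisition import qUpperConfidenceBound

gs = GenerationStrategy(
    steps=[
        # Space-filling initialization before the surrogate takes over.
        GenerationStep(model=Models.SOBOL, num_trials=5),
        # GP surrogate + qUCB; a larger beta leans toward exploration.
        GenerationStep(
            model=Models.BOTORCH_MODULAR,
            num_trials=-1,
            model_kwargs={
                "botorch_acqf_class": qUpperConfidenceBound,
                "acquisition_options": {"beta": 0.5},
            },
        ),
    ]
)

ax_client = AxClient(generation_strategy=gs)
ax_client.create_experiment(
    name="acqf_experiment",  # placeholder name
    parameters=[
        {"name": "x1", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 1.0]},
    ],
    objectives={"y": ObjectiveProperties(minimize=True)},
)
params, trial_index = ax_client.get_next_trial()
```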
Thanks for sharing this! Glad to hear some of the tutorials have been helping! Have you looked into honegumi? It’s a minimal working example generator that currently exposes the Ax Service API, along with a set of tutorials and conceptual documents.
While I don’t do a ton of BO for materials design, I have used it for manufacturing optimization purposes for a number of years. Ultimately, I have defaulted to rolling my own system due to the need for complete control in defining the models, acquisition functions, optimizers, etc. Here’s a rough breakdown of my strategy:
Surrogate modeling: I use either a Gaussian process (GP) or a random forest (RF)-type surrogate model. My preference for RF surrogates is lolopy, a tool that came out of Citrine Informatics, because it has carefully calibrated uncertainty estimates and has performed well in my experience. For GP surrogate modeling, I have used GPflow, as I like the interface, though I had to implement a differential evolution method for the hyperparameter optimization. A postdoc who works with me has had luck with gpCAM - the developer is quite helpful when issues arise. I have not tried Ax or honegumi yet. In practice, using GPs is tricky because getting optimal values for the hyperparameters is key to their performance. I highly recommend doing some sort of cross-validation (LOOCV for small datasets) on a regular basis to validate the performance of the surrogate model.
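As a concrete example of that LOOCV check, here’s a minimal sketch on toy data, using scikit-learn’s GP as a stand-in for whichever surrogate you actually use:

```python
# Sketch: leave-one-out cross-validation of a surrogate model.
# scikit-learn's GP stands in for GPflow / lolopy / etc.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 2))                             # small dataset, as in early BO
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(20)   # toy objective

preds, truths = [], []
for train_idx, test_idx in LeaveOneOut().split(X):
    # Refit the surrogate with one point held out, then predict that point.
    model = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    model.fit(X[train_idx], y[train_idx])
    preds.append(model.predict(X[test_idx])[0])
    truths.append(y[test_idx][0])

preds, truths = np.array(preds), np.array(truths)
rmse = np.sqrt(np.mean((preds - truths) ** 2))
print(f"LOOCV RMSE: {rmse:.3f}")  # compare against the spread of y as a sanity check
```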
Acquisition functions: For BO I have had a lot of success with a custom lower confidence bound (LCB) implementation: LCB = (1 - kappa) * y_mean - kappa * y_std, where kappa lies between 0 and 1, with 0 being pure exploitation and 1 pure exploration. I typically schedule kappa’s evolution throughout the optimization run so that we start with exploration and move toward exploitation, but I randomize the values so that we don’t get stuck in local minima. Again, I use differential evolution to find the optimal next location to sample, as I found it more reliable than repeatedly instantiated Newton-Raphson-type methods.
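In rough Python, that loop looks something like the sketch below. The surrogate.predict interface returning (mean, std) and the jitter scale are placeholders, not my exact implementation:

```python
# Sketch: custom LCB with a scheduled, jittered kappa, minimized by
# differential evolution. `surrogate.predict` -> (mean, std) is a
# placeholder interface.
import numpy as np
from scipy.optimize import differential_evolution

def lcb(x, surrogate, kappa):
    # LCB = (1 - kappa) * y_mean - kappa * y_std; lower is better.
    y_mean, y_std = surrogate.predict(np.atleast_2d(x))
    return float((1.0 - kappa) * y_mean - kappa * y_std)

def next_sample(surrogate, bounds, iteration, n_iterations, rng):
    # Start exploratory (kappa near 1), end exploitative (kappa near 0),
    # with random jitter so consecutive picks don't fixate on one basin.
    base = 1.0 - iteration / max(n_iterations - 1, 1)
    kappa = float(np.clip(base + rng.normal(scale=0.1), 0.0, 1.0))
    result = differential_evolution(lcb, bounds, args=(surrogate, kappa))
    return result.x, kappa
```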
Batching: I use a “poisoning” strategy where we iteratively build a batch by picking the next experiment location using our surrogate model and acquisition function, and then including that location as a temporary “imaginary” point in the dataset. We then update the surrogate with the imaginary point, find the minimum of the updated surrogate, and add the next point both to our experiment queue and as an additional imaginary point.
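Roughly, and assuming each imaginary point takes the surrogate’s mean prediction as its value (similar in spirit to the “Kriging believer” heuristic), that looks like this. fit_surrogate and next_sample are placeholder helpers, with next_sample as in the LCB snippet above:

```python
# Sketch of the "poisoning" batch builder: each picked point is fed back
# as an imaginary observation so the next pick avoids crowding it.
import numpy as np

def build_batch(X, y, bounds, batch_size, fit_surrogate, next_sample, rng):
    X_aug, y_aug = np.asarray(X), np.asarray(y)
    batch = []
    for i in range(batch_size):
        surrogate = fit_surrogate(X_aug, y_aug)
        x_next, _ = next_sample(surrogate, bounds, i, batch_size, rng)
        # "Imaginary" observation: treat the surrogate's mean prediction
        # at the new point as if it were real data.
        y_mean, _ = surrogate.predict(np.atleast_2d(x_next))
        X_aug = np.vstack([X_aug, x_next])
        y_aug = np.append(y_aug, float(np.ravel(y_mean)[0]))
        batch.append(x_next)
    return batch
```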
I am probably forgetting something - I’ll check back on my answer later and update.