Hi,
I am looking for any available implementations of Deep Kernel Learning ([1511.02222] Deep Kernel Learning). I am aware that gpax (GitHub - ziatdinovmax/gpax: Gaussian Processes for Experimental Sciences, specifically this example: gpax/examples/gpax_viDKL_plasmons.ipynb at main · ziatdinovmax/gpax · GitHub) has one, but it is built on top of JAX.
Does anyone know of an existing implementation in BoTorch (the motivation being that BoTorch already has many optimizers to play with)?
While it’s not a direct answer/solution, this thread may be of help!
GitHub issue (opened Oct 7, 2022; closed Nov 30, 2022):
I probably lack the understanding and the language required to talk about this effectively, so here are a few follow-up questions.
- are you familiar with structured GPs?
- is this the right name to be using? (e.g. what's described in https://www.sciencedirect.com/science/article/pii/S0021999119300397)
From my basic understanding, it's functionally similar to [performing BO over a VAE latent space](https://botorch.org/tutorials/vae_mnist), except that the latent space embeddings aren't fixed, and the manifold itself is learned based on what a deep kernel learning (?) model decides is "useful" or not. On a higher level, I've been told it's useful for incorporating physical insight/domain knowledge (e.g. physical models) into active learning.
I'm asking based on some discussion with Sergei Kalinin on DKL models they've been applying in microscopy settings and how it applies to other domains. See e.g. https://arxiv.org/abs/2205.15458
Related:
- https://github.com/ziatdinovmax/gpax
- https://github.com/pycroscopy/atomai
- https://twitter.com/Sergei_Imaging/status/1467856622668103686
- [The Promises and Pitfalls of Deep Kernel Learning](https://arxiv.org/abs/2102.12108)
from [Twitter search of deep kernel learning](https://twitter.com/search?q=deep%20kernel%20learning)
From https://github.com/ziatdinovmax/gpax:
> The limitation of the standard GP is that it does not usually allow for the incorporation of prior domain knowledge and can be biased toward a trivial interpolative solution. Recently, we [introduced](https://arxiv.org/abs/2108.10280) a structured Gaussian Process (sGP), where a classical GP is augmented by a structured probabilistic model of the expected system’s behavior. This approach allows us to [balance](https://towardsdatascience.com/unknown-knowns-bayesian-inference-and-structured-gaussian-processes-why-domain-scientists-know-4659b7e924a4) the flexibility of the non-parametric GP approach with a rigid structure of prior (physical) knowledge encoded into the parametric model. Implementation-wise, we substitute a constant/zero prior mean function in GP with a probabilistic model of the expected system's behavior.
>
> ...
> For example, if we have prior knowledge that our objective function has a discontinuous 'phase transition', and a power law-like behavior before and after this transition, we may express it using a simple piecewise function
> ```python3
> from typing import Dict
>
> import jax.numpy as jnp
>
> def piecewise(x: jnp.ndarray, params: Dict[str, float]) -> jnp.ndarray:
>     """Power-law behavior before and after the transition"""
>     return jnp.piecewise(
>         x, [x < params["t"], x >= params["t"]],
>         [lambda x: x**params["beta1"], lambda x: x**params["beta2"]])
> ```
> where `jnp` corresponds to the jax.numpy module. This function is deterministic. To make it probabilistic, we put priors over its parameters with the help of [NumPyro](https://github.com/pyro-ppl/numpyro)
> ```python3
> import numpyro
> from numpyro import distributions
>
> def piecewise_priors():
>     # Sample model parameters
>     t = numpyro.sample("t", distributions.Uniform(0.5, 2.5))
>     beta1 = numpyro.sample("beta1", distributions.Normal(3, 1))
>     beta2 = numpyro.sample("beta2", distributions.Normal(3, 1))
>     # Return sampled parameters as a dictionary
>     return {"t": t, "beta1": beta1, "beta2": beta2}
> ```
Feel free to close as this is just a discussion post, and no worries if this doesn't fit well within the scope of Ax/BoTorch. Curious to hear your thoughts, if any!
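For completeness, the two quoted snippets would plug into a gpax model roughly as follows. This is a minimal, untested sketch assuming gpax's `ExactGP` API (the `mean_fn`/`mean_fn_prior` argument names follow the gpax README; `X`, `y`, and `X_new` are placeholder arrays):

```python3
import gpax
import jax.random as jra

# Structured GP: the deterministic piecewise model becomes the prior mean,
# and piecewise_priors supplies NumPyro priors over its parameters.
gp_model = gpax.ExactGP(
    1, kernel="Matern",
    mean_fn=piecewise, mean_fn_prior=piecewise_priors)

# Fit with HMC/NUTS and predict on new inputs.
key_fit, key_pred = jra.split(jra.PRNGKey(0))
gp_model.fit(key_fit, X, y)
y_pred, y_sampled = gp_model.predict(key_pred, X_new)
```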
AFalk
June 15, 2024, 11:09am
You might look into GPyTorch, which is the GP backend for BoTorch. Depending on your familiarity with GPs, it has a bit of a learning curve, but there are several tutorials on deep kernel learning:
https://docs.gpytorch.ai/en/stable/examples/06_PyTorch_NN_Integration_DKL/index.html
https://docs.gpytorch.ai/en/stable/examples/06_PyTorch_NN_Integration_DKL/KISSGP_Deep_Kernel_Regression_CUDA.html
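The core pattern in those tutorials is an ordinary GPyTorch `ExactGP` whose kernel operates on the output of a neural feature extractor, with the NN weights and GP hyperparameters trained jointly. A minimal sketch of that pattern (layer sizes and the training loop are illustrative, not prescriptive; `train_x`/`train_y` are placeholders):

```python3
import torch
import gpytorch

class FeatureExtractor(torch.nn.Sequential):
    """Small MLP mapping raw inputs to a low-dimensional feature space."""
    def __init__(self, data_dim: int, latent_dim: int = 2):
        super().__init__()
        self.add_module("linear1", torch.nn.Linear(data_dim, 100))
        self.add_module("relu1", torch.nn.ReLU())
        self.add_module("linear2", torch.nn.Linear(100, latent_dim))

class DKLRegressionModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood, latent_dim: int = 2):
        super().__init__(train_x, train_y, likelihood)
        self.feature_extractor = FeatureExtractor(train_x.size(-1), latent_dim)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(ard_num_dims=latent_dim))

    def forward(self, x):
        # The GP kernel acts on learned features, not on raw inputs.
        z = self.feature_extractor(x)
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(z), self.covar_module(z))

# Joint training of NN weights and GP hyperparameters.
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = DKLRegressionModel(train_x, train_y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
model.train()
likelihood.train()
for _ in range(100):
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)
    loss.backward()
    optimizer.step()
```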
@Utkarsh - there is a GPyTorch version of deep kernel learning, PyTorch NN Integration (Deep Kernel Learning) — GPyTorch 1.12.dev60+g25da2cc documentation , which should be compatible with BoTorch. However, one of the reasons I added DKL to GPax was that I didn’t find the GPyTorch implementation flexible enough or easily customizable. For example, getting exact DKL to run with convnets was quite a painful process, and the same goes for placing priors over the NN weights. Is the problem with JAX in particular, or is there anything I can add to the current GPax implementation that may help you?
@maxim.ziatdinov
I want to train the DKL with multiple objectives. I noticed that extending the JAX version might not allow me to use the multi-objective optimizers available in BoTorch. However, as you mentioned, the GPyTorch implementation is not particularly friendly for customization, especially for convolutional networks. I’m a bit confused about how to proceed. Could you share your thoughts?
If the goal is to use the advanced suite of multi-objective optimization tools in BoTorch, I’m afraid there’s no way around dealing with the pain of customizing GPyTorch’s DKL models. I won’t have the capacity to add such advanced multi-objective optimization tools to GPax for the foreseeable future.
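If it helps: one way to expose a custom GPyTorch DKL model to BoTorch is the `GPyTorchModel` mixin, after which the multi-objective tooling applies as usual. A rough, untested sketch assuming one single-output DKL model per objective (`DKLRegressionModel` refers to the sketch earlier in the thread; the reference point, bounds, and acquisition settings are placeholders):

```python3
import torch
import gpytorch
from botorch.models import ModelListGP
from botorch.models.gpytorch import GPyTorchModel
from botorch.acquisition.multi_objective import qExpectedHypervolumeImprovement
from botorch.utils.multi_objective.box_decompositions.non_dominated import (
    FastNondominatedPartitioning,
)
from botorch.optim import optimize_acqf

class BoTorchDKL(DKLRegressionModel, GPyTorchModel):
    """DKL model that also satisfies the BoTorch model interface."""
    _num_outputs = 1  # single-output model; use one per objective

# One DKL model per objective column of train_y (each trained as above).
models = [
    BoTorchDKL(train_x, train_y[:, i], gpytorch.likelihoods.GaussianLikelihood())
    for i in range(train_y.shape[-1])
]
model = ModelListGP(*models)

# qEHVI over the joint posterior; ref_point must be dominated by the Pareto front.
ref_point = torch.zeros(train_y.shape[-1])
partitioning = FastNondominatedPartitioning(ref_point=ref_point, Y=train_y)
acqf = qExpectedHypervolumeImprovement(
    model=model, ref_point=ref_point, partitioning=partitioning)

# Optimize the acquisition over a placeholder unit cube.
bounds = torch.stack([torch.zeros(train_x.shape[-1]), torch.ones(train_x.shape[-1])])
candidates, _ = optimize_acqf(
    acqf, bounds=bounds, q=1, num_restarts=10, raw_samples=128)
```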
Thank you! I will try customising GPyTorch’s DKL model.