How to use an anisotropic kernel for Gaussian process regression with a variable number of features



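My use case is that I want to automatically select features for Gaussian process regression. With an isotropic kernel this is easy to do, as the following example shows: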

import numpy as np
from mlxtend.feature_selection import SequentialFeatureSelector
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
X = np.random.rand(100, 10)
y = np.random.rand(100)
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=[1]))
selector = SequentialFeatureSelector(gpr, forward=False)
selector.fit(X, y)

To use an anisotropic kernel, the kernel definition has to be changed to RBF(length_scale=[1] * num_features)
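For a fixed number of features, the difference between the two kernel definitions can be checked directly. The following is a minimal sketch on made-up random data (the array sizes and the seed are arbitrary assumptions, not part of the question); it relies only on the `anisotropic` property of `RBF` and on the fitted `kernel_` attribute:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Arbitrary toy data, only used to have something to fit.
rng = np.random.default_rng(0)
X = rng.random((30, 4))
y = rng.random(30)
num_features = X.shape[1]

# A scalar length scale gives an isotropic kernel,
# a vector with one entry per feature gives an anisotropic one.
iso = RBF(length_scale=1.0)
aniso = RBF(length_scale=[1.0] * num_features)
print(iso.anisotropic, aniso.anisotropic)

# After fitting, the optimized kernel holds one length scale per feature.
gpr = GaussianProcessRegressor(kernel=aniso).fit(X, y)
fitted_scales = np.atleast_1d(gpr.kernel_.length_scale)
print(fitted_scales.shape)
```

The vector length is fixed when the kernel is constructed, which is why it has to match the number of columns that later reach fit.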

However, the number of features changes in every round of feature selection, which raises ValueError: Anisotropic kernel must have the same number of dimensions as data (10!=9)
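The mismatch can be reproduced without the feature selector. This is a small sketch on assumed toy data: the kernel is built for 10 features, and the regressor is then fitted on only 9 columns, which is roughly what one backward-selection step does:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Arbitrary toy data with 10 features.
rng = np.random.default_rng(0)
X = rng.random((20, 10))
y = rng.random(20)

# Kernel with one length scale per original feature.
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=[1.0] * 10))

error_message = None
try:
    # Dropping one column leaves 9 features, but the kernel still expects 10.
    gpr.fit(X[:, :9], y)
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```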

Is there a way to get an anisotropic kernel with a dynamic number of features?

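As a dirty hack, I subclassed GaussianProcessRegressor and added a function to fit that recursively scans all kernels and replaces the length_scale parameter with a vector for every kernel that can be anisotropic (currently only RBF and Matern).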

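import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    RBF, Matern, Product, Sum, Exponentiation, CompoundKernel)

class GaussianProcessRegressorAnisotropic(GaussianProcessRegressor):
    def fit(self, X, y):
        # Resize every length_scale to the number of columns in X before fitting.
        self._fix_kernel_length_scales(self.kernel, X.shape[1])
        return super().fit(X, y)  # fit should return the fitted estimator

    def _fix_kernel_length_scales(self, kernel, num_features):
        if isinstance(kernel, (RBF, Matern)):
            # Take the first value so that repeated calls to fit do not nest lists.
            scale = float(np.ravel(kernel.length_scale)[0])
            kernel.length_scale = [scale] * num_features
        elif isinstance(kernel, (Product, Sum)):
            self._fix_kernel_length_scales(kernel.k1, num_features)
            self._fix_kernel_length_scales(kernel.k2, num_features)
        elif isinstance(kernel, Exponentiation):
            self._fix_kernel_length_scales(kernel.kernel, num_features)
        elif isinstance(kernel, CompoundKernel):
            for sub_kernel in kernel.kernels:
                self._fix_kernel_length_scales(sub_kernel, num_features)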

But maybe someone has a better solution?
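One possible variation, offered only as a sketch and not as an established scikit-learn feature: resize a clone of the kernel inside fit and restore the original afterwards, so the kernel passed by the user is never modified. The names `ResizingGPR` and `expand_length_scales` are hypothetical, the toy data is arbitrary, and `CompoundKernel` is left out for brevity.

```python
import numpy as np
from sklearn.base import clone
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    RBF, Matern, Product, Sum, Exponentiation, WhiteKernel)


def expand_length_scales(kernel, num_features):
    """Hypothetical helper: give RBF/Matern parts one length scale per feature, in place."""
    if isinstance(kernel, (RBF, Matern)):
        # Use the first value as the common starting point for all features.
        scale = float(np.ravel(kernel.length_scale)[0])
        kernel.length_scale = [scale] * num_features
    elif isinstance(kernel, (Sum, Product)):
        expand_length_scales(kernel.k1, num_features)
        expand_length_scales(kernel.k2, num_features)
    elif isinstance(kernel, Exponentiation):
        expand_length_scales(kernel.kernel, num_features)


class ResizingGPR(GaussianProcessRegressor):
    """Hypothetical subclass: resizes a copy of the kernel for each call to fit."""

    def fit(self, X, y):
        template = self.kernel
        if template is not None:
            resized = clone(template)  # leave the user's kernel untouched
            expand_length_scales(resized, np.shape(X)[1])
            self.kernel = resized
        try:
            return super().fit(X, y)
        finally:
            # Restore the original so get_params and clone see what the user passed.
            self.kernel = template


# Arbitrary toy data; the same estimator is fitted on 5 and then on 3 features.
rng = np.random.default_rng(0)
X = rng.random((25, 5))
y = rng.random(25)

gpr = ResizingGPR(kernel=RBF(length_scale=1.0) + WhiteKernel())
gpr.fit(X, y)
shape_five = np.atleast_1d(gpr.kernel_.k1.length_scale).shape
gpr.fit(X[:, :3], y)
shape_three = np.atleast_1d(gpr.kernel_.k1.length_scale).shape
print(shape_five, shape_three, gpr.kernel.k1.length_scale)
```

Because the constructor argument stays a scalar template, the estimator can be cloned by a feature selector or by cross-validation and refitted on any number of columns.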
