Gaussian Process models and global optimization with categorical variables
Loading...
Date
2021
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
This thesis is concerned with Gaussian Process (GP) models for computer experiments with both numerical and categorical input variables. The Low-Rank Correlation kernel LRCr is introduced for the estimation of the cross-correlation matrix – i.e., the matrix that contains the correlations of the GP given different levels of a categorical variable. LRCr is a rank-r approximation of the real but unknown cross-correlation matrix and provides two advantages over existing parsimonious correlation kernels: First, it lets the practictioner adapt the number of parameters to be estimated according to the problem at hand by choosing an appropriate rank r. And second, the entries of the estimated cross-correlation matrix are not restricted to non-negative values.
Moreover, an approach is presented that can generate a test function with mixed inputs from a test function having only continuous variables. This is done by discretizing (or “slicing”) one of its dimensions. Depending on the function and the slice positions, the slices sometimes happen to be highly positively correlated. By turning some slices in a specific way, the position and value of the global optimum can be preserved while changing the sign of a number of cross-correlations.
With these methods, a simulation study is conducted that investigates the estimation accuracy of the cross-correlation matrices as well as the prediction accuracy of the response surface among different correlation kernels. Thereby, the number of points in the initial design of experiments and the amount of negative cross-correlations are varied in order to compare their impact on different kernels.
We then focus on GP models with mixed inputs in the context of the Efficient Global Optimization (EGO) algorithm. We conduct another simulation study in which the distances of the different kernels' best found solutions to the optimum are compared. Again, the number of points in the initial experimental design is varied. However, the total budget of function evaluations is fixed. The results show that a higher number of EGO iterations tends to be preferable over a larger initial experimental design.
Finally, three applications are considered: First, an optimization of hyperparameters of a computer vision algorithm. Second, an optimization of a logistics production process using a simulation model. And third, a bi-objective optimization of shift planning in a simulated high-bay warehouse, where constraints on the input variables must be met. These applications involve further challenges, which are successfully solved.
Description
Table of contents
Keywords
Metamodellbasierte Optimierung, Kriging, Computerexperiment