Gaussian processes are flexible non-parametric prior distributions over multivariate functions. A Gaussian process prior encodes assumptions about the underlying function that relates inputs to response variables. The main limitation of Gaussian processes in practical applications is their computational cost. One contribution of this thesis is a study, analysis and implementation of the recently developed Hilbert space method for approximating Gaussian processes. We analyze in detail how the performance and accuracy of the method depend on its key factors, and we provide recommendations and diagnostics for the approximation. We focus on its implementation in a probabilistic programming framework and on its use with sampling-based computational methods. We demonstrate the applicability and implementation of the methodology, the reduction in computation and the improvement in sampling efficiency.
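As a rough illustration of the Hilbert space approximation mentioned above, the following minimal sketch approximates a one-dimensional squared exponential covariance by a finite sum of sinusoidal basis functions weighted by the kernel's spectral density. The choice of kernel, the function names, and the parameters (boundary `L`, number of basis functions `m`, magnitude `alpha`, lengthscale `ell`) are assumptions made for this example and are not taken from the thesis implementation.

```python
import math


def se_kernel(x1, x2, alpha, ell):
    """Exact squared exponential covariance k(x1, x2)."""
    return alpha**2 * math.exp(-0.5 * ((x1 - x2) / ell) ** 2)


def se_spectral_density(omega, alpha, ell):
    """Spectral density of the 1-D squared exponential kernel."""
    return alpha**2 * math.sqrt(2.0 * math.pi) * ell * math.exp(-0.5 * (ell * omega) ** 2)


def hs_basis(x, L, m):
    """Laplacian eigenfunctions on [-L, L] with Dirichlet boundaries, j = 1..m."""
    return [
        math.sin(j * math.pi / (2.0 * L) * (x + L)) / math.sqrt(L)
        for j in range(1, m + 1)
    ]


def hs_kernel(x1, x2, L, m, alpha, ell):
    """Approximate k(x1, x2) as sum_j S(sqrt(lambda_j)) * phi_j(x1) * phi_j(x2)."""
    phi1 = hs_basis(x1, L, m)
    phi2 = hs_basis(x2, L, m)
    total = 0.0
    for j in range(1, m + 1):
        sqrt_lambda = j * math.pi / (2.0 * L)  # square root of the j-th eigenvalue
        total += se_spectral_density(sqrt_lambda, alpha, ell) * phi1[j - 1] * phi2[j - 1]
    return total


if __name__ == "__main__":
    # Inputs in [-1, 1], boundary at L = 3, m = 40 basis functions (illustrative values).
    grid = [-1.0 + 0.1 * i for i in range(21)]
    max_err = max(
        abs(hs_kernel(a, b, 3.0, 40, 1.0, 0.5) - se_kernel(a, b, 1.0, 0.5))
        for a in grid
        for b in grid
    )
    print("maximum absolute error:", max_err)
```

In this sketch the basis functions do not depend on the kernel hyperparameters, so in a sampling setting they would only need to be computed once. The accuracy depends on the interplay between `L`, `m` and the lengthscale, which are the kind of key factors the analysis above refers to.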
In addition, when learning stochastic functions from data, prior knowledge or additional information about the function to be learned is often available and can be used to improve the model. Thus, another contribution of this work is the use of additional (virtual) derivative observations to impose monotonicity and gradient information on functions, so that the function dynamics can be controlled or constrained. However, inference issues that lead to overly smoothed posterior distributions can arise, especially with Gaussian process priors, when the number of virtual observations of the sign of the derivative used to induce monotonicity is large. This is because the monotonicity information enters the model through the likelihood rather than the prior, which makes the posterior distribution of the function depend on the number and location of the virtual observations in the input space. We argue that, if the function is smooth, this problem can be avoided in practice by choosing only a few virtual points and placing them appropriately. We also argue that monotonic functions with a Gaussian process prior provide reliable model extrapolation, since they have a stronger inductive bias than a model without derivative information.
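To make the role of virtual derivative observations more concrete, the following minimal sketch shows the two ingredients such a model typically needs: the covariances between function values and derivatives under a squared exponential kernel, and a likelihood term that rewards positive derivatives at the virtual points. The probit form of the likelihood, the steepness parameter `nu`, and all function names are assumptions chosen for illustration, not the formulation used in the thesis.

```python
import math


def se_kernel(x1, x2, alpha, ell):
    """Squared exponential covariance Cov(f(x1), f(x2))."""
    return alpha**2 * math.exp(-0.5 * ((x1 - x2) / ell) ** 2)


def cov_f_df(x, z, alpha, ell):
    """Cov(f(x), f'(z)): derivative of k(x, z) with respect to z."""
    return se_kernel(x, z, alpha, ell) * (x - z) / ell**2


def cov_df_df(z1, z2, alpha, ell):
    """Cov(f'(z1), f'(z2)): mixed second derivative of k(z1, z2)."""
    r = z1 - z2
    return se_kernel(z1, z2, alpha, ell) * (1.0 / ell**2 - r**2 / ell**4)


def probit_monotonicity_loglik(derivs, nu=1.0):
    """Sum of log Phi(nu * f'(z_i)) over the virtual points (assumed probit form).

    Each virtual point adds one term, so the total contribution grows with
    the number of virtual observations.
    """
    total = 0.0
    for d in derivs:
        p = 0.5 * math.erfc(-nu * d / math.sqrt(2.0))  # standard normal CDF
        total += math.log(max(p, 1e-300))  # guard against log(0) for very negative values
    return total


if __name__ == "__main__":
    increasing = probit_monotonicity_loglik([0.5, 1.0])
    decreasing = probit_monotonicity_loglik([-0.5, -1.0])
    print("log-likelihood, positive derivatives:", increasing)
    print("log-likelihood, negative derivatives:", decreasing)
```

Because the constraint enters as a sum of likelihood terms, one per virtual point, this sketch also illustrates why the posterior depends on how many virtual points are used and where they are placed.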
This thesis also addresses three real-world applications. The first concerns prehistoric archeological rock art paintings, where prior knowledge is included in the form of virtual derivative observations that induce monotonicity and long-term stabilization in the predicted functions. We show that models with additional derivative information have a stronger inductive bias, yielding better predictive performance and confidence intervals. The second concerns image sensor noise, where the signal recorded by an image sensor is decomposed into its different noise sources and each source is characterized. We argue that the Bayesian framework, by defining conditional dependencies among parameters in a fully probabilistic model, allows full propagation of uncertainty among noise parameters and yields accurate and reliable estimates in flexible models. The third is spatio-temporal land use classification, a classical task of great interest in remote sensing and the environmental geosciences. We formulate a spatio-temporal Gaussian process model for classification using the approximate Gaussian process model described above, which can handle much larger datasets than regular Gaussian processes.
In the applications to rock art paintings and land use classification, we seek models that fully exploit the correlation structure of the data in order to generalize usefully and accurately, even when the set of sampled observations is small and/or very noisy. We argue that a Gaussian process prior with a multidimensional covariance function is one of the most natural ways to achieve this for data consisting of spatio-temporal stochastic observations.