WIP: Pre-final version

Radu C. Martin 2021-07-09 11:15:19 +02:00
parent 7def536787
commit 286e952ec3
26 changed files with 288 additions and 151 deletions

@@ -1,4 +1,4 @@
\section{Choice of Hyperparameters}\label{sec:hyperparameters}
This section will discuss and attempt to validate the choice of all the
hyperparameters necessary for training a \acrshort{gp} model to capture
@@ -9,16 +9,14 @@ behaviour directly from data. This comes in contrast to white-box and grey-box
modelling techniques, which require much more physical insight into the plant's
behaviour.
The advantage of black-box models lies in the lack of physical parameters to be
fitted. On the flip side, this versatility comes at the cost of having to
properly define the model hyperparameters: the number of regressors, the number
of autoregressive lags for each class of inputs, and the shape of the covariance
function all have to be taken into account when designing a \acrshort{gp} model.
These choices have a direct influence on the resulting model behaviour and on
how well it can generalize, as well as an indirect influence in the form of more
time-consuming computations for larger numbers of regressors and more complex
kernel functions.
As described in Section~\ref{sec:gp_dynamical_system}, for the purpose of this
project, the \acrlong{gp} model will be trained using the \acrshort{narx}
@@ -64,7 +62,12 @@ always benefit the \acrshort{gp} model, at least not comparably to the
additional computational complexity.
For the exogenous inputs the choice has therefore been made to take the
\textit{Global Solar Irradiance}\footnotemark and \textit{Outside Temperature
Measurement}.
\footnotetext{Using the \acrshort{ghi} makes sense in the case of the \pdome\
building which has windows facing all directions, as opposed to a room which
would have windows on only one side.}
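To make the \acrshort{narx} regressor structure concrete, the sketch below
assembles the regressor matrix from measured series. This is an illustration
under assumed names only, not the project's actual code: \texttt{w}, \texttt{u}
and \texttt{y} are one-dimensional arrays holding an exogenous input, the
controlled input and the output, while \texttt{l\_w}, \texttt{l\_u} and
\texttt{l\_y} are the corresponding numbers of autoregressive lags.
\begin{verbatim}
import numpy as np

def build_narx_regressors(w, u, y, l_w, l_u, l_y):
    """Stack lagged series into NARX regressor rows
    x_k = [w_{k-1..k-l_w}, u_{k-1..k-l_u}, y_{k-1..k-l_y}]
    with targets y_k. All series are 1-D here; with several
    exogenous signals, one block per signal would be stacked."""
    l_max = max(l_w, l_u, l_y)
    X, Y = [], []
    for k in range(l_max, len(y)):
        X.append(np.concatenate([
            w[k - l_w:k][::-1],  # most recent exogenous lag first
            u[k - l_u:k][::-1],  # controlled input lags
            y[k - l_y:k][::-1],  # autoregressive output lags
        ]))
        Y.append(y[k])
    return np.asarray(X), np.asarray(Y).reshape(-1, 1)
\end{verbatim}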
\subsection{The Kernel}
@@ -104,7 +107,7 @@ three lengthscales apart.
\begin{tabular}{||c c ||}
\hline
$\norm{\mathbf{x} - \mathbf{x}'}$ &
$\exp{(-\frac{1}{2}\times\frac{\norm{\mathbf{x} - \mathbf{x}'}^2}{l^2})}$ \\
\hline \hline
$1l$ & 0.606 \\
$2l$ & 0.135 \\
@@ -133,6 +136,8 @@ and the variance for different combinations of the exogenous input lags ($l_w$),
the controlled input lags ($l_u$) and the output lags ($l_y$) for a classical
\acrshort{gp} model.
% TODO: [Lengthscales] Explain lags and w1,1, w2, etc.
\begin{table}[ht]
%\vspace{-8pt}
\centering
@@ -165,15 +170,15 @@ the controlled input lags ($l_u$) and the output lags ($l_y$) for a classical
\label{tab:GP_hyperparameters}
\end{table}
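Since each regressor gets its own lengthscale, its relative importance can be
read off the optimized kernel: a shorter lengthscale means the output varies
faster along that regressor, i.e.\ the regressor matters more. A hedged sketch
of how such a table could be produced, assuming GPflow and the regressor matrix
of the earlier listing (all names are illustrative):
\begin{verbatim}
import numpy as np
import gpflow

# X, Y as returned by build_narx_regressors above; the labels
# are hypothetical and must match the regressor ordering.
regressor_names = ["w1", "w2", "u1", "y1", "y2", "y3"]

kernel = gpflow.kernels.SquaredExponential(
    lengthscales=np.ones(X.shape[1]))  # one lengthscale per regressor
model = gpflow.models.GPR((X, Y), kernel=kernel)
gpflow.optimizers.Scipy().minimize(model.training_loss,
                                   model.trainable_variables)

print("variance:", model.kernel.variance.numpy())
for name, ls in zip(regressor_names, model.kernel.lengthscales.numpy()):
    # shorter lengthscale -> higher relative importance
    print(f"{name}: lengthscale={ls:.3f}, relevance={1.0 / ls:.3f}")
\end{verbatim}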
In general, the results of Table~\ref{tab:GP_hyperparameters} show that the past
outputs are important when predicting future values. The past inputs are also of
importance, with the exception of the models with very high variance, where the
relative importances stay almost constant across all the inputs. For example, it
can be seen that for the exogenous inputs, the outside temperature ($w2$) is
generally more important than the solar irradiation ($w1$). In the case of more
autoregressive lags for the exogenous inputs, the more recent information is
usually more important for the solar irradiation, while the second-to-last
measurement is preferred for the outside temperature.
For the classical \acrshort{gp} model the appropriate choice of lags would be
$l_u = 1$ and $l_y = 3$, with $l_w$ taking a value of either 1, 2 or 3,
@@ -348,8 +353,6 @@ metrics, as well as being cheaper from a computational perspective. In order to
make a more informed choice for the best hyperparameters, the simulation
performance of all three combinations has been analysed.
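Here ``simulation performance'' refers to rolling the model out over the whole
horizon while feeding its own predicted mean back in place of the measured
output lags, as opposed to one-step-ahead prediction. A minimal sketch of such a
loop, reusing the assumed layout of the previous listings rather than the
project's actual implementation:
\begin{verbatim}
import numpy as np

def simulate(model, w, u, y0, l_w, l_u, l_y, n_steps):
    """Multi-step-ahead simulation: predictions replace the
    measured output lags. w and u (weather forecast, planned
    inputs) are assumed known over the horizon; y0 holds the
    measured outputs up to the simulation start."""
    l_max = max(l_w, l_u, l_y)
    y_sim = list(y0[-l_y:])  # seed with last measured outputs
    for k in range(l_max, l_max + n_steps):
        x = np.concatenate([
            w[k - l_w:k][::-1],
            u[k - l_u:k][::-1],
            np.asarray(y_sim[-l_y:])[::-1],
        ]).reshape(1, -1)
        mean, var = model.predict_y(x)  # GPflow predictive moments
        y_sim.append(float(mean.numpy()))
    return np.asarray(y_sim[l_y:])
\end{verbatim}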
\clearpage
\begin{table}[ht]
%\vspace{-8pt}
\centering
@@ -381,7 +384,6 @@ other combinations scoring much worse on the \acrshort{msll} and \acrshort{lpd}
loss functions. This has, therefore, been chosen as the model for the full year
simulations.
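For reference, the four loss functions compared in this section
(\acrshort{rmse}, \acrshort{smse}, \acrshort{lpd} and \acrshort{msll}) are
understood here in their usual forms; this is one common convention (cf.\
Rasmussen and Williams), and the exact signs used in this work may differ. With
$y_i$ the measured output, $\mu_i$ and $\sigma_i^2$ the predictive mean and
variance at test point $i$, and $N$ test points:
\begin{align*}
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \mu_i\right)^2}\\
\mathrm{SMSE} &= \frac{1}{N}\sum_{i=1}^{N}
    \frac{\left(y_i - \mu_i\right)^2}{\operatorname{Var}\left[\mathbf{y}\right]}\\
\mathrm{LPD} &= \frac{1}{N}\sum_{i=1}^{N}\left[\frac{1}{2}\log\left(2\pi\sigma_i^2\right)
    + \frac{\left(y_i - \mu_i\right)^2}{2\sigma_i^2}\right]
\end{align*}
with the \acrshort{msll} obtained by subtracting from each \acrshort{lpd} term
the loss of a trivial Gaussian model that predicts the training outputs' mean
and variance; under this convention, lower values are better for all four.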
\subsection{Validation of hyperparameters}\label{sec:validation_hyperparameters}
The validation step has the purpose of testing the viability of the trained
@@ -449,10 +451,13 @@ this proves to be the best simulation model.
Lastly, \model{3}{1}{3} has a much worse simulation performance than the other
two models. This could hint at the model overfitting the training data.
\clearpage
This is consistent with the results found in Table~\ref{tab:GP_loss_functions}
for the \acrshort{rmse} and \acrshort{smse}, and can also be seen in
Appendix~\ref{apx:hyperparams_gp}, Figure~\ref{fig:GP_313_test_validation},
where \model{3}{1}{3} has much worse performance on the testing dataset
predictions
than the other two models.
The performance of the three models in simulation mode is consistent with the
@@ -471,9 +476,7 @@ while still having good performance on all loss functions. In implementation,
however, this model turned out to be very unstable, and the more conservative
\model{1}{1}{3} model was used instead.
\clearpage
\subsubsection{Sparse and Variational Gaussian Process}
For the \acrshort{svgp} models, only the performance of \model{1}{2}{3} was
investigated, since it had the best performance according to all four loss