Fixed inconsistent use of acronyms

Radu C. Martin 2021-07-22 22:13:51 +02:00
parent 1e1cc5acd8
commit 721953642c
7 changed files with 49 additions and 47 deletions


@ -35,7 +35,7 @@ The idea of using Gaussian Processes as regression models for control of dynamic
systems is not new, and has already been explored a number of times. A general
description of their use, along with the necessary theory and some example
implementations is given in~\cite{kocijanModellingControlDynamic2016}.
In~\cite{pleweSupervisoryModelPredictive2020}, a \acrlong{gp} Model with a
In~\cite{pleweSupervisoryModelPredictive2020}, a \acrshort{gp} Model with a
\acrlong{rq} Kernel is used for temperature set point optimization.
Gaussian Processes for building control have also been studied before in the
@ -66,7 +66,7 @@ the original identified model goes further and further into the extrapolated
regions.
This project tries to combine the use of online learning control schemes with
\acrlong{gp} Models through implementing \acrlong{svgp} Models. \acrshort{svgp}s
\acrshort{gp} Models through implementing \acrfull{svgp} Models. \acrshort{svgp}s
provide means of extending the use of \acrshort{gp}s to larger datasets, thus
enabling the periodic re-training of the model to include all the historically
available data.
@ -81,7 +81,7 @@ multiple control schemes using both classical \acrshort{gp}s, as well as
Section~\ref{sec:gaussian_processes} provides the mathematical background for
understanding \acrshort{gp}s, as well as the definition in very broad strokes of
\acrshort{svgp}s and their differences from the classical implementation of
\acrlong{gp}es. This information is later used for comparing their performances
\acrshort{gp}s. This information is later used for comparing their performances
and outlining their respective pros and cons.
Section~\ref{sec:CARNOT} goes into the details of the implementation of the


@ -144,7 +144,7 @@ choices~\cite{kocijanModellingControlDynamic2016}:
\subsubsection*{Squared Exponential Kernel}
This kernel is used when the system to be modelled is assumed to be smooth and
continuous. The basic version of the \acrshort{se} kernel has the following form:
continuous. The basic version of the \acrfull{se} kernel has the following form:
\begin{equation}
k(\mathbf{x}, \mathbf{x'}) = \sigma^2 \exp{\left(- \frac{1}{2}\frac{\norm{\mathbf{x} -
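The \acrshort{se} kernel in its standard form can be sketched numerically as follows (a minimal illustration; the function and parameter names are chosen here and are not from the thesis code):

```python
import math

def se_kernel(x, x_prime, variance=1.0, lengthscale=1.0):
    # Squared Exponential kernel: sigma^2 * exp(-||x - x'||^2 / (2 * l^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return variance * math.exp(-0.5 * sq_dist / lengthscale ** 2)
```

Note how the value decays smoothly from `variance` at zero distance towards zero as the inputs move apart, with `lengthscale` controlling the decay rate.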
@ -182,7 +182,7 @@ value of the hyperparameters. This is the \acrfull{ard} property.
The \acrfull{rq} Kernel can be interpreted as an infinite sum of \acrshort{se}
kernels with different lengthscales. It has the same smooth behaviour as the
\acrlong{se} Kernel, but can take into account the difference in function
\acrshort{se} Kernel, but can take into account the difference in function
behaviour for large scale vs small scale variations.
\begin{equation}
@ -207,11 +207,11 @@ without incurring the penalty of inverting the covariance matrix. An overview
and comparison of multiple methods is given
at~\cite{liuUnderstandingComparingScalable2019}.
For the scope of this project, the choice of using the \acrfull{svgp} models has
been made, since it provides a very good balance of scalability, capability,
For the scope of this project, the choice of using the \acrshort{svgp} models
has been made, since they provide a very good balance of scalability, capability,
robustness and controllability~\cite{liuUnderstandingComparingScalable2019}.
The \acrlong{svgp} has been first introduced
The \acrshort{svgp} was first introduced
by~\textcite{hensmanGaussianProcessesBig2013} as a way to scale the use of
\acrshort{gp}s to large datasets. A detailed explanation of the mathematics of
\acrshort{svgp}s and the reasoning behind them is given
@ -264,7 +264,7 @@ In order to solve this problem, the log likelihood equation
classical \acrshort{gp} is replaced with an approximate value that is
computationally tractable on larger sets of data.
The following derivation of the \acrshort{elbo} is based on the one presented
The following derivation of the \acrfull{elbo} is based on the one presented
in~\cite{yangUnderstandingVariationalLower}.
Assume $X$ to be the observations, and $Z$ the set parameters of the
@ -300,7 +300,7 @@ divergence, which for variational inference takes the following form:
\end{equation}
\vspace{5pt}
where L is the \acrfull{elbo}. Rearranging this equation we get:
where L is the \acrshort{elbo}. Rearranging this equation we get:
\begin{equation}
L = \log{\left(p(X)\right)} - KL\left[q(Z)||p(Z|X)\right]
@ -312,13 +312,13 @@ lower bound of the log probability of observations.
\subsection{Gaussian Process Models for Dynamical
Systems}\label{sec:gp_dynamical_system}
In the context of Dynamical Systems Identification and Control, Gaussian
Processes are used to represent different model structures, ranging from state
space and \acrshort{nfir} structures, to the more complex \acrshort{narx},
\acrshort{noe} and \acrshort{narmax}.
In the context of Dynamical Systems Identification and Control, \acrshort{gp}s
are used to represent different model structures, ranging from state
space and \acrfull{nfir} structures, to the more complex \acrfull{narx},
\acrfull{noe} and \acrfull{narmax}.
The general form of an \acrfull{narx} model is as follows:
The general form of an \acrshort{narx} model is as follows:
\begin{equation}
\hat{y}(k) =
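The \acrshort{narx} regressor construction described above can be sketched as follows (a minimal illustration; the function name and the lag values are hypothetical, not the thesis implementation):

```python
def narx_regressors(y, u, w, k, l_y=3, l_u=1, l_w=1):
    # Regressor vector for predicting y(k) from past outputs y(k-1..k-l_y),
    # past inputs u(k-1..k-l_u) and past disturbances w(k-1..k-l_w)
    return ([y[k - i] for i in range(1, l_y + 1)]
            + [u[k - i] for i in range(1, l_u + 1)]
            + [w[k - i] for i in range(1, l_w + 1)])
```

The \acrshort{gp} then maps this regressor vector to the one-step-ahead prediction $\hat{y}(k)$.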


@ -378,7 +378,7 @@ The unit has a typical \acrlong{eer} (\acrshort{eer}, cooling efficiency) of 4.9
maximum cooling capacity of 64.2 kW.
One particularity of this \acrshort{hvac} unit is that during summer, only one
of the two compressors is running. This results in a higher \acrlong{eer}, in
of the two compressors is running. This results in a higher \acrshort{eer}, in
the cases where the full cooling capacity is not required.
\subsubsection*{Ventilation}
@ -504,7 +504,7 @@ it will oscillate between using one or two compressors. Lastly, it is possible
to notice that the \acrshort{hvac} is not turned on during the night, with the
exception of the external fan, which continues running.
\subsubsection{The CARNOT WDB weather data format}\label{sec:CARNOT_WDB}
\subsubsection{The CARNOT Weather Data Bus format}\label{sec:CARNOT_WDB}
For a correct simulation of the building behaviour, CARNOT requires not only the
detailed definition of the building blocks/nodes, but also a very detailed set
@ -514,7 +514,7 @@ sun's position throughout the simulation (zenith and azimuth angles), the
as well as information on the ambient temperature, humidity, precipitation,
pressure, wind speed and direction, etc. A detailed overview of each
measurement necessary for a simulation is given in the CARNOT user
manual~\cite{CARNOTManual}.
manual~\cite{CARNOTManual}. This data structure is known as the \acrfull{wdb}.
In order to compare the CARNOT model's performance to that of the real \pdome,
it is necessary to simulate the CARNOT model under the same set of conditions as
@ -532,17 +532,19 @@ are computed using the Python pvlib
library~\cite{f.holmgrenPvlibPythonPython2018}.
As opposed to the solar angles, which can be computed exactly from the available
information, the Solar Radiation Components (DHI and DNI) have to be estimated
from the available measurements of GHI, zenith angles (Z) and datetime
information. \textcite{erbsEstimationDiffuseRadiation1982} present an empirical
relationship between GHI and the diffuse fraction DF and the ratio of GHI to
extraterrestrial irradiance $K_t$, known as the Erbs model. The DF is then used
to compute DHI and DNI as follows:
information, the Solar Radiation Components (\acrshort{dhi} and \acrshort{dni})
have to be estimated from the available measurements of \acrfull{ghi}, zenith
angles (Z) and datetime information.
\textcite{erbsEstimationDiffuseRadiation1982} present an empirical relationship
between \acrshort{ghi} and the \acrfull{df} and the ratio of \acrshort{ghi} to
extraterrestrial irradiance $K_t$, known as the Erbs model. The \acrshort{df}
is then used to compute \acrshort{dhi} and \acrshort{dni} as follows:
\begin{equation}
\begin{aligned}
\text{DHI} &= \text{DF} \times \text{GHI} \\
\text{DNI} &= \frac{\text{GHI} - \text{DHI}}{\cos{\text{Z}}}
\text{\acrshort{dhi}} &= \text{\acrshort{df}} \times \text{\acrshort{ghi}} \\
\text{\acrshort{dni}} &= \frac{\text{\acrshort{ghi}} -
\text{\acrshort{dhi}}}{\cos{\text{Z}}}
\end{aligned}
\end{equation}
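The two relations above can be sketched numerically. The diffuse fraction is assumed to be already available (e.g. from the Erbs correlation as a function of $K_t$); the function name below is illustrative:

```python
import math

def solar_components(ghi, zenith_deg, diffuse_fraction):
    # DHI = DF * GHI ;  DNI = (GHI - DHI) / cos(Z)
    dhi = diffuse_fraction * ghi
    dni = (ghi - dhi) / math.cos(math.radians(zenith_deg))
    return dhi, dni
```

Note that the division by $\cos{Z}$ makes the estimate numerically fragile near sunrise and sunset, where the zenith angle approaches 90 degrees.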


@ -19,7 +19,7 @@ consuming computations in the case of larger number of regressors and more
complex kernel functions.
As described in Section~\ref{sec:gp_dynamical_system}, for the purpose of this
project, the \acrlong{gp} model will be trained using the \acrshort{narx}
project, the \acrshort{gp} model will be trained using the \acrshort{narx}
structure. This already presents an important choice in the selection of
regressors and their respective autoregressive lags.
@ -185,7 +185,7 @@ $l_u = 1$ and $l_y = 3$ with $l_w$ taking the values of either 1, 2 or 3,
depending on the results of further analysis.
As for the case of the \acrlong{svgp}, the results for the classical
As for the case of the \acrshort{svgp}, the results for the classical
\acrshort{gp} (cf. Table~\ref{tab:GP_hyperparameters}) are not necessarily
representative of the relationships between the regressors of the
\acrshort{svgp} model, due to the fact that the dataset used for training is
@ -259,8 +259,8 @@ This performance metric is very useful when training a model whose goal is
solely to minimize the difference between the measured values, and the ones
predicted by the model.
A variant of the \acrshort{mse} is the \acrfull{smse}, which normalizes the
\acrlong{mse} by the variance of the output values of the validation dataset.
A variant of the \acrfull{mse} is the \acrfull{smse}, which normalizes the
\acrshort{mse} by the variance of the output values of the validation dataset.
\begin{equation}\label{eq:smse}
\text{SMSE} = \frac{1}{N}\frac{\sum_{i=1}^N \left(y_i -
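The \acrshort{smse} described above, i.e. the \acrshort{mse} normalized by the variance of the validation outputs, can be sketched as (function name is illustrative):

```python
def smse(y_true, y_pred):
    # Standardized MSE: MSE divided by the variance of the validation outputs
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean = sum(y_true) / n
    var = sum((t - mean) ** 2 for t in y_true) / n
    return mse / var
```

A trivial predictor that always outputs the validation mean scores an \acrshort{smse} of 1, so values well below 1 indicate that the model captures real structure in the data.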
@ -403,7 +403,7 @@ the discrepancies.
\subsubsection{Conventional Gaussian Process}
The simulation performance of the three lag combinations chosen for the
classical \acrlong{gp} models has been analyzed, with the results presented in
classical \acrshort{gp} models has been analyzed, with the results presented in
Figures~\ref{fig:GP_113_multistep_validation},~\ref{fig:GP_213_multistep_validation}
and~\ref{fig:GP_313_multistep_validation}. For reference, the one-step ahead
predictions for the training and test datasets are presented in


@ -48,7 +48,7 @@ the correct amount of data for the weather predictions and to properly generate
the optimization problem, the discrete/continuous transition and vice-versa
happens on the Simulink side. This simplifies the adjustment of the sampling
time, at the cost of making it harder to include meta-data such as hour of the
day, day of the week, etc.\ in the \acrlong{gp} Model.
day, day of the week, etc.\ in the \acrshort{gp} Model.
The weather prediction is done using the information present in the CARNOT
\acrshort{wdb} object. Since the sampling time and control horizon of the
@ -66,13 +66,13 @@ evaluating a \acrshort{gp} has an algorithmic complexity of $\mathcal{O}(n^3)$.
This means that naive implementations can get too expensive in terms of
computation time very quickly.
In order to have as smallest of a bottleneck as possible when dealing with
\acrshort{gp}s, a very fast implementation of \acrlong{gp} Models was used, in
the form of GPflow~\cite{matthewsGPflowGaussianProcess2017}. It is based on
TensorFlow~\cite{tensorflow2015-whitepaper}, which has very efficient
implementation of all the necessary Linear Algebra operations. Another benefit
of this implementation is the very simple use of any additional computational
resources, such as a GPU, TPU, etc.
In order to have as small a bottleneck as possible when dealing with the
required algebraic operations, a very fast implementation of \acrshort{gp}
Models was used, in the form of GPflow~\cite{matthewsGPflowGaussianProcess2017}.
It is based on TensorFlow~\cite{tensorflow2015-whitepaper}, which has very
efficient implementations of all the necessary Linear Algebra operations.
Another benefit of this implementation is the very simple use of any additional
computational resources, such as a GPU, TPU, etc.
\subsubsection{Classical Gaussian Process training}
@ -158,7 +158,7 @@ Let $w_l$, $u_l$, and $y_l$ be the lengths of the state vector components
$\mathbf{w}$, $\mathbf{u}$, $\mathbf{y}$ (cf. Equation~\ref{eq:components}).
Also, let X be the matrix of all the system states over the optimization horizon
and W be the matrix of the predicted disturbances for all the future steps. The
original \acrlong{ocp} can be rewritten using index notation as:
original \acrshort{ocp} can be rewritten using index notation as:
\begin{subequations}\label{eq:sparse_optimal_control_problem}
\begin{align}


@ -7,7 +7,7 @@ analyzed in this Section have used a sampling time of 15 minutes and a control
horizon of 8 steps.
Section~\ref{sec:GP_results} analyzes the results of a conventional
\acrlong{gp} Model trained on the first five days of gathered data. The model
\acrshort{gp} Model trained on the first five days of gathered data. The model
is then used for the rest of the year, with the goal of tracking the defined
reference temperature.
@ -131,7 +131,7 @@ performance, but are more complex in implementation.
\subsection{Sparse and Variational Gaussian Process}\label{sec:SVGP_results}
The \acrlong{svgp} models are setup in a similar way as described before. The
The \acrshort{svgp} models are set up in a similar way as described before. The
model is first identified using 5 days' worth of experimental data collected
using a \acrshort{pi} controller and a random disturbance signal. The difference
lies in the fact that the \acrshort{svgp} model gets re-identified every night
@ -143,7 +143,7 @@ setup performs much better than the initial one. The only large deviations from
the reference temperature are due to cold weather, when the \acrshort{hvac}'s
limited heat capacity is unable to maintain the proper temperature.
Additionally, the \acrshort{svgp} controller takes around 250--300 ms of
computation time for each simulation time, decreasing the computational cost of
computation time for each simulation step, decreasing the computational cost of
the original \acrshort{gp} by a factor of six.
@ -293,7 +293,7 @@ As seen in Figure~\ref{fig:SVGP_evol_importance}, the variance of the
signifies the increase in confidence of the model. The hyperparameters
corresponding to the exogenous inputs --- $w_{1,1}$ and $w_{1,2}$ --- become generally
less important for future predictions over the course of the year, with the
importance of $w_{1,1}$, the \acrlong{ghi}, climbing back up over the last, colder
importance of $w_{1,1}$, the \acrshort{ghi}, climbing back up over the last, colder
months. This might be due to the fact that during the colder months, the
\acrshort{ghi} is the only way for the exogenous inputs to \textit{provide}
additional heat to the system.
@ -361,7 +361,7 @@ simulation data (cf. Figures~\ref{fig:SVGP_96pts_fullyear_simulation}
and~\ref{fig:SVGP_96pts_abserr}) it is very notable that the model performs
almost identically to the one identified in the previous sections. This
highlights one of the practical benefits of the \acrshort{svgp} implementations
compared to the classical \acrlong{gp} -- it is possible to start with a rougher
compared to the classical \acrshort{gp} -- it is possible to start with a rougher
controller trained on less data and refine it over time, reducing the need for
cumbersome and potentially costly initial experiments for gathering data.
@ -473,7 +473,7 @@ models can be deployed with less explicit identification data, but they will
continue to improve over the course of the year, as the building passes through
different regions of the state space and more data is collected.
However, these results do not discredit the use of \acrlong{gp} for employment
However, these results do not discredit the use of \acrshort{gp}s
in a multi-seasonal situation. As shown before, given the same amount of data
and ignoring the computational cost, they perform better than the alternative
\acrshort{svgp} models. The bad initial performance could be mitigated by


@ -62,7 +62,7 @@ throughout the year. The \acrshort{svgp} models also present a computational
cost advantage both in training and in evaluation, due to several approximations
shown in Section~\ref{sec:gaussian_processes}.
Focusing on the \acrlong{gp} models, there could be several ways of improving
Focusing on the \acrshort{gp} models, there could be several ways of improving
their performance, as noted previously: a more varied identification dataset and
smart update of a fixed-size data dictionary according to information gain,
could mitigate the present problems.