Fixed inconsistent use of acronyms
parent 1e1cc5acd8
commit 721953642c
7 changed files with 49 additions and 47 deletions
@@ -35,7 +35,7 @@ The idea of using Gaussian Processes as regression models for control of dynamic
 systems is not new, and has already been explored a number of times. A general
 description of their use, along with the necessary theory and some example
 implementations is given in~\cite{kocijanModellingControlDynamic2016}.
-In~\cite{pleweSupervisoryModelPredictive2020}, a \acrlong{gp} Model with a
+In~\cite{pleweSupervisoryModelPredictive2020}, a \acrshort{gp} Model with a
 \acrlong{rq} Kernel is used for temperature set point optimization.

 Gaussian Processes for building control have also been studied before in the
@@ -66,7 +66,7 @@ the original identified model goes further and further into the extrapolated
 regions.

 This project tries to combine the use of online learning control schemes with
-\acrlong{gp} Models through implementing \acrlong{svgp} Models. \acrshort{svgp}s
+\acrshort{gp} Models through implementing \acrfull{svgp} Models. \acrshort{svgp}s
 provide means of extending the use of \acrshort{gp}s to larger datasets, thus
 enabling the periodic re-training of the model to include all the historically
 available data.
@@ -81,7 +81,7 @@ multiple control schemes using both classical \acrshort{gp}s, as well as
 Section~\ref{sec:gaussian_processes} provides the mathematical background for
 understanding \acrshort{gp}s, as well as the definition in very broad strokes of
 \acrshort{svgp}s and their differences from the classical implementation of
-\acrlong{gp}es. This information is later used for comparing their performances
+\acrshort{gp}s. This information is later used for comparing their performances
 and outlining their respective pros and cons.

 Section~\ref{sec:CARNOT} goes into the details of the implementation of the
@@ -144,7 +144,7 @@ choices~\cite{kocijanModellingControlDynamic2016}:
 \subsubsection*{Squared Exponential Kernel}

 This kernel is used when the system to be modelled is assumed to be smooth and
-continuous. The basic version of the \acrshort{se} kernel has the following form:
+continuous. The basic version of the \acrfull{se} kernel has the following form:

 \begin{equation}
 k(\mathbf{x}, \mathbf{x'}) = \sigma^2 \exp{\left(- \frac{1}{2}\frac{\norm{\mathbf{x} -
@@ -182,7 +182,7 @@ value of the hyperparameters. This is the \acrfull{ard} property.

 The \acrfull{rq} Kernel can be interpreted as an infinite sum of \acrshort{se}
 kernels with different lengthscales. It has the same smooth behaviour as the
-\acrlong{se} Kernel, but can take into account the difference in function
+\acrshort{se} Kernel, but can take into account the difference in function
 behaviour for large scale vs small scale variations.

 \begin{equation}
@@ -207,11 +207,11 @@ without inquiring the penalty of inverting the covariance matrix. An overview
 and comparison of multiple methods is given
 at~\cite{liuUnderstandingComparingScalable2019}.

-For the scope of this project, the choice of using the \acrfull{svgp} models has
-been made, since it provides a very good balance of scalability, capability,
+For the scope of this project, the choice of using the \acrshort{svgp} models
+has been made, since it provides a very good balance of scalability, capability,
 robustness and controllability~\cite{liuUnderstandingComparingScalable2019}.

-The \acrlong{svgp} has been first introduced
+The \acrshort{svgp} has been first introduced
 by~\textcite{hensmanGaussianProcessesBig2013} as a way to scale the use of
 \acrshort{gp}s to large datasets. A detailed explanation on the mathematics of
 \acrshort{svgp}s and reasoning behind it is given
@@ -264,7 +264,7 @@ In order to solve this problem, the log likelihood equation
 classical \acrshort{gp} is replaced with an approximate value, that is
 computationally tractable on larger sets of data.

-The following derivation of the \acrshort{elbo} is based on the one presented
+The following derivation of the \acrfull{elbo} is based on the one presented
 in~\cite{yangUnderstandingVariationalLower}.

 Assume $X$ to be the observations, and $Z$ the set parameters of the
@@ -300,7 +300,7 @@ divergence, which for variational inference takes the following form:
 \end{equation}
 \vspace{5pt}

-where L is the \acrfull{elbo}. Rearranging this equation we get:
+where L is the \acrshort{elbo}. Rearranging this equation we get:

 \begin{equation}
 L = \log{\left(p(X)\right)} - KL\left[q(Z)||p(Z|X)\right]
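The rearranged identity in the hunk above, L = log p(X) - KL[q(Z)||p(Z|X)], can be checked numerically on a toy discrete model. The following is only a sketch under stated assumptions: the joint probabilities `p_joint` and the variational distribution `q` are arbitrary made-up numbers, and the expectation form of the ELBO, E_q[log p(X,Z) - log q(Z)], is the standard definition rather than something shown in this diff.

```python
import numpy as np

# Hypothetical toy model: Z takes 3 values, X is fixed to one observed value.
# p_joint[z] = p(X = x_obs, Z = z); the numbers are arbitrary assumptions.
p_joint = np.array([0.10, 0.25, 0.05])
q = np.array([0.2, 0.5, 0.3])  # arbitrary variational distribution q(Z)

log_evidence = np.log(p_joint.sum())  # log p(X)
posterior = p_joint / p_joint.sum()   # p(Z | X)

# ELBO written as an expectation under q: E_q[log p(X, Z) - log q(Z)]
elbo = np.sum(q * (np.log(p_joint) - np.log(q)))

# KL[q(Z) || p(Z | X)], always non-negative
kl = np.sum(q * (np.log(q) - np.log(posterior)))

# The rearranged identity: L = log p(X) - KL
elbo_from_identity = log_evidence - kl
```

Because the KL term is non-negative, `elbo` never exceeds `log_evidence`, which is why L acts as a lower bound on the log probability of the observations.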
@@ -312,13 +312,13 @@ lower bound of the log probability of observations.
 \subsection{Gaussian Process Models for Dynamical
 Systems}\label{sec:gp_dynamical_system}

-In the context of Dynamical Systems Identification and Control, Gaussian
-Processes are used to represent different model structures, ranging from state
-space and \acrshort{nfir} structures, to the more complex \acrshort{narx},
-\acrshort{noe} and \acrshort{narmax}.
+In the context of Dynamical Systems Identification and Control, \acrshort{gp}s
+are used to represent different model structures, ranging from state
+space and \acrfull{nfir} structures, to the more complex \acrfull{narx},
+\acrfull{noe} and \acrfull{narmax}.


-The general form of an \acrfull{narx} model is as follows:
+The general form of an \acrshort{narx} model is as follows:

 \begin{equation}
 \hat{y}(k) =
@@ -378,7 +378,7 @@ The unit has a typical \acrlong{eer} (\acrshort{eer}, cooling efficiency) of 4.9
 maximum cooling capacity of 64.2 kW.

 One particularity of this \acrshort{hvac} unit is that during summer, only one
-of the two compressors are running. This results in a higher \acrlong{eer}, in
+of the two compressors are running. This results in a higher \acrshort{eer}, in
 the cases where the full cooling capacity is not required.

 \subsubsection*{Ventilation}
@@ -504,7 +504,7 @@ it will oscillate between using one or two compressors. Lastly, it is possible
 to notice that the \acrshort{hvac} is not turned on during the night, with the
 exception of the external fan, which continues running.

-\subsubsection{The CARNOT WDB weather data format}\label{sec:CARNOT_WDB}
+\subsubsection{The CARNOT Weather Data Bus format}\label{sec:CARNOT_WDB}

 For a correct simulation of the building behaviour, CARNOT requires not only the
 detailed definition of the building blocks/nodes, but also a very detailed set
@@ -514,7 +514,7 @@ sun's position throughout the simulation (zenith and azimuth angles), the
 as well as information on the ambient temperature, humidity, precipitation,
 pressure, wind speed and direction, etc. A detailed overview of each
 measurement necessary for a simulation is given in the CARNOT user
-manual~\cite{CARNOTManual}.
+manual~\cite{CARNOTManual}. This data structure is known as the \acrfull{wdb}.

 In order to compare the CARNOT model's performance to that of the real \pdome,
 it is necessary to simulate the CARNOT model under the same set of conditions as
@@ -532,17 +532,19 @@ are computed using the Python pvlib
 library~\cite{f.holmgrenPvlibPythonPython2018}.

 As opposed to the solar angles, which can be computed exactly from the available
-information, the Solar Radiation Components (DHI and DNI) have to be estimated
-from the available measurements of GHI, zenith angles (Z) and datetime
-information. \textcite{erbsEstimationDiffuseRadiation1982} present an empirical
-relationship between GHI and the diffuse fraction DF and the ratio of GHI to
-extraterrestrial irradiance $K_t$, known as the Erbs model. The DF is then used
-to compute DHI and DNI as follows:
+information, the Solar Radiation Components (\acrshort{dhi} and \acrshort{dni})
+have to be estimated from the available measurements of \acrfull{ghi}, zenith
+angles (Z) and datetime information.
+\textcite{erbsEstimationDiffuseRadiation1982} present an empirical relationship
+between \acrshort{ghi} and the \acrfull{df} and the ratio of \acrshort{ghi} to
+extraterrestrial irradiance $K_t$, known as the Erbs model. The \acrshort{df}
+is then used to compute \acrshort{dhi} and \acrshort{dni} as follows:

 \begin{equation}
 \begin{aligned}
-\text{DHI} &= \text{DF} \times \text{GHI} \\
-\text{DNI} &= \frac{\text{GHI} - \text{DHI}}{\cos{\text{Z}}}
+\text{\acrshort{dhi}} &= \text{DF} \times \text{\acrshort{ghi}} \\
+\text{\acrshort{dni}} &= \frac{\text{\acrshort{ghi}} -
+\text{\acrshort{dhi}}}{\cos{\text{Z}}}
 \end{aligned}
 \end{equation}

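The two relations in the equation above (DHI = DF x GHI and DNI = (GHI - DHI) / cos Z) can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the project's code: the function name `split_ghi`, its argument names, the use of degrees for the zenith angle and the guard for a sun at or below the horizon are all hypothetical, and the diffuse fraction is taken as already estimated (for example by the Erbs model).

```python
import numpy as np

def split_ghi(ghi, df, zenith_deg):
    """Split GHI into DHI and DNI given a diffuse fraction and zenith angle.

    All names are illustrative. df is assumed to be already estimated
    and zenith_deg is assumed to be given in degrees.
    """
    ghi = np.asarray(ghi, dtype=float)
    df = np.asarray(df, dtype=float)
    cos_z = np.cos(np.radians(np.asarray(zenith_deg, dtype=float)))

    dhi = df * ghi  # DHI = DF x GHI

    # DNI = (GHI - DHI) / cos(Z). Setting DNI to zero when the sun is at or
    # below the horizon is an assumption of this sketch, not from the text.
    sun_up = cos_z > 1e-6
    safe_cos = np.where(sun_up, cos_z, np.nan)
    dni = np.where(sun_up, (ghi - dhi) / safe_cos, 0.0)
    return dhi, dni

# Two hypothetical samples: midday-like (Z = 60 deg) and sun below horizon.
dhi, dni = split_ghi(ghi=[500.0, 100.0], df=[0.3, 1.0], zenith_deg=[60.0, 95.0])
```

Near sunrise and sunset cos Z approaches zero and the division amplifies measurement noise, so some form of guard or clipping is usually needed in practice.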
@@ -19,7 +19,7 @@ consuming computations in the case of larger number of regressors and more
 complex kernel functions.

 As described in Section~\ref{sec:gp_dynamical_system}, for the purpose of this
-project, the \acrlong{gp} model will be trained using the \acrshort{narx}
+project, the \acrshort{gp} model will be trained using the \acrshort{narx}
 structure. This already presents an important choice in the selection of
 regressors and their respective autoregressive lags.

@@ -185,7 +185,7 @@ $l_u = 1$ and $l_y = 3$ with $l_w$ taking the values of either 1, 2 or 3,
 depending on the results of further analysis.


-As for the case of the \acrlong{svgp}, the results for the classical
+As for the case of the \acrshort{svgp}, the results for the classical
 \acrshort{gp} (cf. Table~\ref{tab:GP_hyperparameters}) are not necessarily
 representative of the relationships between the regressors of the
 \acrshort{svgp} model, due to the fact that the dataset used for training is
@@ -259,8 +259,8 @@ This performance metric is very useful when training a model whose goal is
 solely to minimize the difference between the measured values, and the ones
 predicted by the model.

-A variant of the \acrshort{mse} is the \acrfull{smse}, which normalizes the
-\acrlong{mse} by the variance of the output values of the validation dataset.
+A variant of the \acrfull{mse} is the \acrfull{smse}, which normalizes the
+\acrshort{mse} by the variance of the output values of the validation dataset.

 \begin{equation}\label{eq:smse}
 \text{SMSE} = \frac{1}{N}\frac{\sum_{i=1}^N \left(y_i -
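The normalization described in this hunk, the MSE divided by the variance of the validation outputs, can be sketched as follows. The function name `smse` and the sample values are hypothetical, and using the population variance (`ddof=0`) is an assumption, since the equation is cut off in the diff context.

```python
import numpy as np

def smse(y_true, y_pred):
    """Standardized MSE: the MSE divided by the variance of the true outputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    # Population variance (ddof=0) is an assumption of this sketch.
    return mse / np.var(y_true)

# Hypothetical validation outputs (e.g. temperatures in degrees Celsius).
y_val = np.array([20.0, 21.0, 22.0, 23.0])

# Predicting the mean of the validation outputs everywhere gives SMSE = 1.
score_mean = smse(y_val, np.full_like(y_val, y_val.mean()))
# A perfect prediction gives SMSE = 0.
score_perfect = smse(y_val, y_val)
```

Under this definition a score below 1 means the model beats the trivial predictor that always returns the mean of the validation outputs.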
@@ -403,7 +403,7 @@ the discrepancies.
 \subsubsection{Conventional Gaussian Process}

 The simulation performance of the three lag combinations chosen for the
-classical \acrlong{gp} models has been analyzed, with the results presented in
+classical \acrshort{gp} models has been analyzed, with the results presented in
 Figures~\ref{fig:GP_113_multistep_validation},~\ref{fig:GP_213_multistep_validation}
 and~\ref{fig:GP_313_multistep_validation}. For reference, the one-step ahead
 predictions for the training and test datasets are presented in
@@ -48,7 +48,7 @@ the correct amount of data for the weather predictions and to properly generate
 the optimization problem, the discrete/continuous transition and vice-versa
 happens on the Simulink side. This simplifies the adjustment of the sampling
 time, with the downside of harder inclusion of meta-data such as hour of the
-day, day of the week, etc.\ in the \acrlong{gp} Model.
+day, day of the week, etc.\ in the \acrshort{gp} Model.

 The weather prediction is done using the information present in the CARNOT
 \acrshort{wdb} object. Since the sampling time and control horizon of the
@@ -66,13 +66,13 @@ evaluating a \acrshort{gp} has an algorithmic complexity of $\mathcal{O}(n^3)$.
 This means that naive implementations can get too expensive in terms of
 computation time very quickly.

-In order to have as smallest of a bottleneck as possible when dealing with
-\acrshort{gp}s, a very fast implementation of \acrlong{gp} Models was used, in
-the form of GPflow~\cite{matthewsGPflowGaussianProcess2017}. It is based on
-TensorFlow~\cite{tensorflow2015-whitepaper}, which has very efficient
-implementation of all the necessary Linear Algebra operations. Another benefit
-of this implementation is the very simple use of any additional computational
-resources, such as a GPU, TPU, etc.
+In order to have as smallest of a bottleneck as possible when dealing with the
+required algebraic operations, a very fast implementation of \acrshort{gp}
+Models was used, in the form of GPflow~\cite{matthewsGPflowGaussianProcess2017}.
+It is based on TensorFlow~\cite{tensorflow2015-whitepaper}, which has very
+efficient implementation of all the necessary Linear Algebra operations. Another
+benefit of this implementation is the very simple use of any additional
+computational resources, such as a GPU, TPU, etc.

 \subsubsection{Classical Gaussian Process training}

@@ -158,7 +158,7 @@ Let $w_l$, $u_l$, and $y_l$ be the lengths of the state vector components
 $\mathbf{w}$, $\mathbf{u}$, $\mathbf{y}$ (cf. Equation~\ref{eq:components}).
 Also, let X be the matrix of all the system states over the optimization horizon
 and W be the matrix of the predicted disturbances for all the future steps. The
-original \acrlong{ocp} can be rewritten using index notation as:
+original \acrshort{ocp} can be rewritten using index notation as:

 \begin{subequations}\label{eq:sparse_optimal_control_problem}
 \begin{align}
@@ -7,7 +7,7 @@ analyzed in this Section have used a sampling time of 15 minutes and a control
 horizon of 8 steps.

 Section~\ref{sec:GP_results} analyzes the results of a conventional
-\acrlong{gp} Model trained on the first five days of gathered data. The model
+\acrshort{gp} Model trained on the first five days of gathered data. The model
 is then used for the rest of the year, with the goal of tracking the defined
 reference temperature.

@@ -131,7 +131,7 @@ performance, but are more complex in implementation.

 \subsection{Sparse and Variational Gaussian Process}\label{sec:SVGP_results}

-The \acrlong{svgp} models are setup in a similar way as described before. The
+The \acrshort{svgp} models are setup in a similar way as described before. The
 model is first identified using 5 days worth of experimental data collected
 using a \acrshort{pi} controller and a random disturbance signal. The difference
 lies in the fact than the \acrshort{svgp} model gets re-identified every night
@@ -143,7 +143,7 @@ setup performs much better than the initial one. The only large deviations from
 the reference temperature are due to cold weather, when the \acrshort{hvac}'s
 limited heat capacity is unable to maintain the proper temperature.
 Additionnaly, the \acrshort{svgp} controller takes around 250 - 300ms of
-computation time for each simulation time, decreasing the computational cost of
+computation time for each simulation step, decreasing the computational cost of
 the original \acrshort{gp} by a factor of six.


@@ -293,7 +293,7 @@ As seen in Figure~\ref{fig:SVGP_evol_importance}, the variance of the
 signifies the increase in confidence of the model. The hyperparameters
 corresponding to the exogenous inputs --- $w1,1$ and $w1,2$ --- become generally
 less important for future predictions over the course of the year, with the
-importance of $w1,1$, the \acrlong{ghi}, climbing back up over the last, colder
+importance of $w1,1$, the \acrshort{ghi}, climbing back up over the last, colder
 months. This might be due to the fact that during the colder months, the
 \acrshort{ghi} is the only way for the exogenous inputs to \textit{provide}
 additional heat to the system.
@@ -361,7 +361,7 @@ simulation data (cf. Figures~\ref{fig:SVGP_96pts_fullyear_simulation}
 and~\ref{fig:SVGP_96pts_abserr}) it is very notable that the model performs
 almost identically to the one identified in the previous sections. This
 highlights one of the practical benefits of the \acrshort{svgp} implementations
-compared to the classical \acrlong{gp} -- it is possible to start with a rougher
+compared to the classical \acrshort{gp} -- it is possible to start with a rougher
 controller trained on less data and refine it over time, reducing the need for
 cumbersome and potentially costly initial experiments for gathering data.

@@ -473,7 +473,7 @@ models can be deployed with less explicit identification data, but they will
 continue to improve over the course of the year, as the building passes through
 different regions of the state space and more data is collected.

-However, these results do not discredit the use of \acrlong{gp} for employment
+However, these results do not discredit the use of \acrshort{gp} for employment
 in a multi-seasonal situation. As shown before, given the same amount of data
 and ignoring the computational cost, they perform better than the alternative
 \acrshort{svgp} models. The bad initial performance could be mitigated by
@@ -62,7 +62,7 @@ throughout the year. The \acrshort{svgp} models also present a computational
 cost advantage both in training and in evaluation, due to several approximations
 shown in Section~\ref{sec:gaussian_processes}.

-Focusing on the \acrlong{gp} models, there could be several ways of improving
+Focusing on the \acrshort{gp} models, there could be several ways of improving
 its performance, as noted previously: a more varied identification dataset and
 smart update of a fixed-size data dictionary according to information gain,
 could mitigate the present problems.