Fixed inconsistent use of acronyms

Radu C. Martin 2021-07-22 22:13:51 +02:00
parent 1e1cc5acd8
commit 721953642c
7 changed files with 49 additions and 47 deletions


@ -35,7 +35,7 @@ The idea of using Gaussian Processes as regression models for control of dynamic
systems is not new, and has already been explored a number of times. A general
description of their use, along with the necessary theory and some example
implementations is given in~\cite{kocijanModellingControlDynamic2016}.
In~\cite{pleweSupervisoryModelPredictive2020}, a \acrlong{gp} Model with a
In~\cite{pleweSupervisoryModelPredictive2020}, a \acrshort{gp} Model with a
\acrlong{rq} Kernel is used for temperature set point optimization.
Gaussian Processes for building control have also been studied before in the
@ -66,7 +66,7 @@ the original identified model goes further and further into the extrapolated
regions.
This project tries to combine the use of online learning control schemes with
\acrlong{gp} Models through implementing \acrlong{svgp} Models. \acrshort{svgp}s
\acrshort{gp} Models through implementing \acrfull{svgp} Models. \acrshort{svgp}s
provide means of extending the use of \acrshort{gp}s to larger datasets, thus
enabling the periodic re-training of the model to include all the historically
available data.
@ -81,7 +81,7 @@ multiple control schemes using both classical \acrshort{gp}s, as well as
Section~\ref{sec:gaussian_processes} provides the mathematical background for
understanding \acrshort{gp}s, as well as the definition in very broad strokes of
\acrshort{svgp}s and their differences from the classical implementation of
\acrlong{gp}es. This information is later used for comparing their performances
\acrshort{gp}s. This information is later used for comparing their performances
and outlining their respective pros and cons.
Section~\ref{sec:CARNOT} goes into the details of the implementation of the


@ -144,7 +144,7 @@ choices~\cite{kocijanModellingControlDynamic2016}:
\subsubsection*{Squared Exponential Kernel}
This kernel is used when the system to be modelled is assumed to be smooth and
continuous. The basic version of the \acrshort{se} kernel has the following form:
continuous. The basic version of the \acrfull{se} kernel has the following form:
\begin{equation}
k(\mathbf{x}, \mathbf{x'}) = \sigma^2 \exp{\left(- \frac{1}{2}\frac{\norm{\mathbf{x} -
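The \acrshort{se} kernel in its standard form can be sketched numerically as follows (a minimal illustration; the function and parameter names are chosen here and are not from the thesis code):

```python
import math

def se_kernel(x, x_prime, variance=1.0, lengthscale=1.0):
    # Squared Exponential kernel: sigma^2 * exp(-||x - x'||^2 / (2 * l^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return variance * math.exp(-0.5 * sq_dist / lengthscale ** 2)
```

Note how the value decays smoothly from `variance` at zero distance towards zero as the inputs move apart, with `lengthscale` controlling the decay rate.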
@ -182,7 +182,7 @@ value of the hyperparameters. This is the \acrfull{ard} property.
The \acrfull{rq} Kernel can be interpreted as an infinite sum of \acrshort{se}
kernels with different lengthscales. It has the same smooth behaviour as the
\acrlong{se} Kernel, but can take into account the difference in function
\acrshort{se} Kernel, but can take into account the difference in function
behaviour for large scale vs small scale variations.
\begin{equation}
@ -207,11 +207,11 @@ without incurring the penalty of inverting the covariance matrix. An overview
and comparison of multiple methods is given
at~\cite{liuUnderstandingComparingScalable2019}.
For the scope of this project, the choice of using the \acrfull{svgp} models has
been made, since it provides a very good balance of scalability, capability,
For the scope of this project, the choice of using the \acrshort{svgp} models
has been made, since they provide a very good balance of scalability, capability,
robustness and controllability~\cite{liuUnderstandingComparingScalable2019}.
The \acrlong{svgp} has been first introduced
The \acrshort{svgp} was first introduced
by~\textcite{hensmanGaussianProcessesBig2013} as a way to scale the use of
\acrshort{gp}s to large datasets. A detailed explanation of the mathematics of
\acrshort{svgp}s and the reasoning behind them is given
@ -264,7 +264,7 @@ In order to solve this problem, the log likelihood equation
classical \acrshort{gp} is replaced with an approximate value that is
computationally tractable on larger sets of data.
The following derivation of the \acrshort{elbo} is based on the one presented
The following derivation of the \acrfull{elbo} is based on the one presented
in~\cite{yangUnderstandingVariationalLower}.
Assume $X$ to be the observations, and $Z$ the set parameters of the
@ -300,7 +300,7 @@ divergence, which for variational inference takes the following form:
\end{equation}
\vspace{5pt}
where L is the \acrfull{elbo}. Rearranging this equation we get:
where L is the \acrshort{elbo}. Rearranging this equation we get:
\begin{equation}
L = \log{\left(p(X)\right)} - KL\left[q(Z)||p(Z|X)\right]
@ -312,13 +312,13 @@ lower bound of the log probability of observations.
\subsection{Gaussian Process Models for Dynamical
Systems}\label{sec:gp_dynamical_system}
In the context of Dynamical Systems Identification and Control, Gaussian
Processes are used to represent different model structures, ranging from state
space and \acrshort{nfir} structures, to the more complex \acrshort{narx},
\acrshort{noe} and \acrshort{narmax}.
In the context of Dynamical Systems Identification and Control, \acrshort{gp}s
are used to represent different model structures, ranging from state
space and \acrfull{nfir} structures, to the more complex \acrfull{narx},
\acrfull{noe} and \acrfull{narmax}.
The general form of an \acrfull{narx} model is as follows:
The general form of an \acrshort{narx} model is as follows:
\begin{equation}
\hat{y}(k) =
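The \acrshort{narx} regressor construction described above can be sketched as follows (a minimal illustration; the function name and the lag values are hypothetical, not the thesis implementation):

```python
def narx_regressors(y, u, w, k, l_y=3, l_u=1, l_w=1):
    # Regressor vector for predicting y(k) from past outputs y(k-1..k-l_y),
    # past inputs u(k-1..k-l_u) and past disturbances w(k-1..k-l_w)
    return ([y[k - i] for i in range(1, l_y + 1)]
            + [u[k - i] for i in range(1, l_u + 1)]
            + [w[k - i] for i in range(1, l_w + 1)])
```

The \acrshort{gp} then maps this regressor vector to the one-step-ahead prediction $\hat{y}(k)$.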


@ -378,7 +378,7 @@ The unit has a typical \acrlong{eer} (\acrshort{eer}, cooling efficiency) of 4.9
maximum cooling capacity of 64.2 kW.
One particularity of this \acrshort{hvac} unit is that during summer, only one
of the two compressors is running. This results in a higher \acrlong{eer}, in
of the two compressors is running. This results in a higher \acrshort{eer}, in
the cases where the full cooling capacity is not required.
\subsubsection*{Ventilation}
@ -504,7 +504,7 @@ it will oscillate between using one or two compressors. Lastly, it is possible
to notice that the \acrshort{hvac} is not turned on during the night, with the
exception of the external fan, which continues running.
\subsubsection{The CARNOT WDB weather data format}\label{sec:CARNOT_WDB}
\subsubsection{The CARNOT Weather Data Bus format}\label{sec:CARNOT_WDB}
For a correct simulation of the building behaviour, CARNOT requires not only the
detailed definition of the building blocks/nodes, but also a very detailed set
@ -514,7 +514,7 @@ sun's position throughout the simulation (zenith and azimuth angles), the
as well as information on the ambient temperature, humidity, precipitation,
pressure, wind speed and direction, etc. A detailed overview of each
measurement necessary for a simulation is given in the CARNOT user
manual~\cite{CARNOTManual}.
manual~\cite{CARNOTManual}. This data structure is known as the \acrfull{wdb}.
In order to compare the CARNOT model's performance to that of the real \pdome,
it is necessary to simulate the CARNOT model under the same set of conditions as
@ -532,17 +532,19 @@ are computed using the Python pvlib
library~\cite{f.holmgrenPvlibPythonPython2018}.
As opposed to the solar angles, which can be computed exactly from the available
information, the Solar Radiation Components (DHI and DNI) have to be estimated
from the available measurements of GHI, zenith angles (Z) and datetime
information. \textcite{erbsEstimationDiffuseRadiation1982} present an empirical
relationship between GHI and the diffuse fraction DF and the ratio of GHI to
extraterrestrial irradiance $K_t$, known as the Erbs model. The DF is then used
to compute DHI and DNI as follows:
information, the Solar Radiation Components (\acrshort{dhi} and \acrshort{dni})
have to be estimated from the available measurements of \acrfull{ghi}, zenith
angles (Z) and datetime information.
\textcite{erbsEstimationDiffuseRadiation1982} present an empirical relationship
between \acrshort{ghi} and the \acrfull{df} and the ratio of \acrshort{ghi} to
extraterrestrial irradiance $K_t$, known as the Erbs model. The \acrshort{df}
is then used to compute \acrshort{dhi} and \acrshort{dni} as follows:
\begin{equation}
\begin{aligned}
\text{DHI} &= \text{DF} \times \text{GHI} \\
\text{DNI} &= \frac{\text{GHI} - \text{DHI}}{\cos{\text{Z}}}
\text{\acrshort{dhi}} &= \text{\acrshort{df}} \times \text{\acrshort{ghi}} \\
\text{\acrshort{dni}} &= \frac{\text{\acrshort{ghi}} -
\text{\acrshort{dhi}}}{\cos{\text{Z}}}
\end{aligned}
\end{equation}
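The two relations above can be sketched numerically. The diffuse fraction is assumed to be already available (e.g. from the Erbs correlation as a function of $K_t$); the function name below is illustrative:

```python
import math

def solar_components(ghi, zenith_deg, diffuse_fraction):
    # DHI = DF * GHI ;  DNI = (GHI - DHI) / cos(Z)
    dhi = diffuse_fraction * ghi
    dni = (ghi - dhi) / math.cos(math.radians(zenith_deg))
    return dhi, dni
```

Note that the division by $\cos{Z}$ makes the estimate numerically fragile near sunrise and sunset, where the zenith angle approaches 90 degrees.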


@ -19,7 +19,7 @@ consuming computations in the case of larger number of regressors and more
complex kernel functions.
As described in Section~\ref{sec:gp_dynamical_system}, for the purpose of this
project, the \acrlong{gp} model will be trained using the \acrshort{narx}
project, the \acrshort{gp} model will be trained using the \acrshort{narx}
structure. This already presents an important choice in the selection of
regressors and their respective autoregressive lags.
@ -185,7 +185,7 @@ $l_u = 1$ and $l_y = 3$ with $l_w$ taking the values of either 1, 2 or 3,
depending on the results of further analysis.
As for the case of the \acrlong{svgp}, the results for the classical
As for the case of the \acrshort{svgp}, the results for the classical
\acrshort{gp} (cf. Table~\ref{tab:GP_hyperparameters}) are not necessarily
representative of the relationships between the regressors of the
\acrshort{svgp} model, due to the fact that the dataset used for training is
@ -259,8 +259,8 @@ This performance metric is very useful when training a model whose goal is
solely to minimize the difference between the measured values, and the ones
predicted by the model.
A variant of the \acrshort{mse} is the \acrfull{smse}, which normalizes the
\acrlong{mse} by the variance of the output values of the validation dataset.
A variant of the \acrfull{mse} is the \acrfull{smse}, which normalizes the
\acrshort{mse} by the variance of the output values of the validation dataset.
\begin{equation}\label{eq:smse}
\text{SMSE} = \frac{1}{N}\frac{\sum_{i=1}^N \left(y_i -
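The \acrshort{smse} described above, i.e. the \acrshort{mse} normalized by the variance of the validation outputs, can be sketched as (function name is illustrative):

```python
def smse(y_true, y_pred):
    # Standardized MSE: MSE divided by the variance of the validation outputs
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean = sum(y_true) / n
    var = sum((t - mean) ** 2 for t in y_true) / n
    return mse / var
```

A trivial predictor that always outputs the validation mean scores an \acrshort{smse} of 1, so values well below 1 indicate that the model captures real structure in the data.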
@ -403,7 +403,7 @@ the discrepancies.
\subsubsection{Conventional Gaussian Process}
The simulation performance of the three lag combinations chosen for the
classical \acrlong{gp} models has been analyzed, with the results presented in
classical \acrshort{gp} models has been analyzed, with the results presented in
Figures~\ref{fig:GP_113_multistep_validation},~\ref{fig:GP_213_multistep_validation}
and~\ref{fig:GP_313_multistep_validation}. For reference, the one-step ahead
predictions for the training and test datasets are presented in


@ -48,7 +48,7 @@ the correct amount of data for the weather predictions and to properly generate
the optimization problem, the discrete/continuous transition and vice-versa
happens on the Simulink side. This simplifies the adjustment of the sampling
time, at the cost of making it harder to include meta-data such as hour of the
day, day of the week, etc.\ in the \acrlong{gp} Model.
day, day of the week, etc.\ in the \acrshort{gp} Model.
The weather prediction is done using the information present in the CARNOT
\acrshort{wdb} object. Since the sampling time and control horizon of the
@ -66,13 +66,13 @@ evaluating a \acrshort{gp} has an algorithmic complexity of $\mathcal{O}(n^3)$.
This means that naive implementations can get too expensive in terms of
computation time very quickly.
In order to have as smallest of a bottleneck as possible when dealing with
\acrshort{gp}s, a very fast implementation of \acrlong{gp} Models was used, in
the form of GPflow~\cite{matthewsGPflowGaussianProcess2017}. It is based on
TensorFlow~\cite{tensorflow2015-whitepaper}, which has very efficient
implementation of all the necessary Linear Algebra operations. Another benefit
of this implementation is the very simple use of any additional computational
resources, such as a GPU, TPU, etc.
In order to have as small a bottleneck as possible when dealing with the
required algebraic operations, a very fast implementation of \acrshort{gp}
Models was used, in the form of GPflow~\cite{matthewsGPflowGaussianProcess2017}.
It is based on TensorFlow~\cite{tensorflow2015-whitepaper}, which has very
efficient implementations of all the necessary Linear Algebra operations.
Another benefit of this implementation is the very simple use of any additional
computational resources, such as a GPU, TPU, etc.
\subsubsection{Classical Gaussian Process training}
@ -158,7 +158,7 @@ Let $w_l$, $u_l$, and $y_l$ be the lengths of the state vector components
$\mathbf{w}$, $\mathbf{u}$, $\mathbf{y}$ (cf. Equation~\ref{eq:components}).
Also, let X be the matrix of all the system states over the optimization horizon
and W be the matrix of the predicted disturbances for all the future steps. The
original \acrlong{ocp} can be rewritten using index notation as:
original \acrshort{ocp} can be rewritten using index notation as:
\begin{subequations}\label{eq:sparse_optimal_control_problem}
\begin{align}


@ -7,7 +7,7 @@ analyzed in this Section have used a sampling time of 15 minutes and a control
horizon of 8 steps.
Section~\ref{sec:GP_results} analyzes the results of a conventional
\acrlong{gp} Model trained on the first five days of gathered data. The model
\acrshort{gp} Model trained on the first five days of gathered data. The model
is then used for the rest of the year, with the goal of tracking the defined
reference temperature.
@ -131,7 +131,7 @@ performance, but are more complex in implementation.
\subsection{Sparse and Variational Gaussian Process}\label{sec:SVGP_results}
The \acrlong{svgp} models are setup in a similar way as described before. The
The \acrshort{svgp} models are set up in a similar way as described before. The
model is first identified using 5 days' worth of experimental data collected
using a \acrshort{pi} controller and a random disturbance signal. The difference
lies in the fact that the \acrshort{svgp} model gets re-identified every night
@ -143,7 +143,7 @@ setup performs much better than the initial one. The only large deviations from
the reference temperature are due to cold weather, when the \acrshort{hvac}'s
limited heat capacity is unable to maintain the proper temperature.
Additionally, the \acrshort{svgp} controller takes around 250--300 ms of
computation time for each simulation time, decreasing the computational cost of
computation time for each simulation step, decreasing the computational cost of
the original \acrshort{gp} by a factor of six.
@ -293,7 +293,7 @@ As seen in Figure~\ref{fig:SVGP_evol_importance}, the variance of the
signifies the increase in confidence of the model. The hyperparameters
corresponding to the exogenous inputs --- $w_{1,1}$ and $w_{1,2}$ --- become generally
less important for future predictions over the course of the year, with the
importance of $w_{1,1}$, the \acrlong{ghi}, climbing back up over the last, colder
importance of $w_{1,1}$, the \acrshort{ghi}, climbing back up over the last, colder
months. This might be due to the fact that during the colder months, the
\acrshort{ghi} is the only way for the exogenous inputs to \textit{provide}
additional heat to the system.
@ -361,7 +361,7 @@ simulation data (cf. Figures~\ref{fig:SVGP_96pts_fullyear_simulation}
and~\ref{fig:SVGP_96pts_abserr}) it is very notable that the model performs
almost identically to the one identified in the previous sections. This
highlights one of the practical benefits of the \acrshort{svgp} implementations
compared to the classical \acrlong{gp} -- it is possible to start with a rougher
compared to the classical \acrshort{gp} -- it is possible to start with a rougher
controller trained on less data and refine it over time, reducing the need for
cumbersome and potentially costly initial experiments for gathering data.
@ -473,7 +473,7 @@ models can be deployed with less explicit identification data, but they will
continue to improve over the course of the year, as the building passes through
different regions of the state space and more data is collected.
However, these results do not discredit the use of \acrlong{gp} for employment
However, these results do not discredit the use of \acrshort{gp}s
in a multi-seasonal situation. As shown before, given the same amount of data
and ignoring the computational cost, they perform better than the alternative
\acrshort{svgp} models. The bad initial performance could be mitigated by


@ -62,7 +62,7 @@ throughout the year. The \acrshort{svgp} models also present a computational
cost advantage both in training and in evaluation, due to several approximations
shown in Section~\ref{sec:gaussian_processes}.
Focusing on the \acrlong{gp} models, there could be several ways of improving
Focusing on the \acrshort{gp} models, there could be several ways of improving
their performance, as noted previously: a more varied identification dataset and
smart update of a fixed-size data dictionary according to information gain,
could mitigate the present problems.