16. Optimal Control
1. LQR infinity horizon control
Given linear system,
\[
\begin{aligned}
\dot x(t) &= Ax(t) + Bu(t), &x(0) = x_0
\end{aligned}
\]
The LQR cost function could be defined:
\[
J(x_0, u, 0) = \int_0^\infty(x^T(\tau)Qx(\tau) + u^T(\tau)Ru(\tau))d\tau
\]
- \(Q = Q^T = C_q^TC_q \succeq 0\)
- \(R = R^T \succ 0\)
It should satisfy the condition:
- \((A, C_q)\) is observable
- \((A, B)\) is reachable
The optimal \(LQ_\infty\) Control law is:
\[
\bar K = R^{-1}B^T\bar P
\]
And the Riccati equation is:
\[
\begin{aligned}
\bar PA + A^T\bar P + Q - \bar PBR^{-1}B^T\bar P &= 0 \\
\bar PA + A^T\bar P + Q - \bar K^TR\bar K &= 0
\end{aligned}
\]
The closed loop system have:
\[
\dot x = (A-B\bar K)x(t)
\]
is A.S.
Example
\[
\dot x(t) = ax(t) + b_1u_1(t) + b_2u_2(t)
\]
\[
\begin{aligned}
A &= a,
&B = \begin{bmatrix} b_1 & b_2 \end{bmatrix}
\end{aligned}
\]
We can define the weights:
\[
\begin{aligned}
Q = q,
&&R^{-1} = \begin{bmatrix} \rho_1&0 \\ 0&\rho_2 \end{bmatrix},
&&R = \begin{bmatrix} \frac{1}{\rho_1}&0 \\ 0&\frac{1}{\rho_2} \end{bmatrix}
\end{aligned}
\]
- \(\rho_1 > 0\), \(\rho_2 > 0\)
- \(C_q = \sqrt{q}\)
The Riccati Equation are:
\[
2\bar pa + q - \bar p^2(b_1^2\rho_1 + b_2^2\rho_2) = 0
\]
And given:
- \(a = 0\), \(b_1 = b_2 = 1\)
We can solve the Riccati Equation:
- \(\bar p = \sqrt{\frac{q}{\rho_1 + \rho_2}}\)
- \(\bar K =
\begin{bmatrix} \rho_1&0 \\ 0&\rho_2 \end{bmatrix}
\begin{bmatrix} 1&1 \end{bmatrix}^T
\sqrt{\frac{q}{\rho_1 + \rho_2}} =
\begin{bmatrix} k_1 \\ k_2 \end{bmatrix} =
\begin{bmatrix} \sqrt{\frac{\rho_1^2q}{\rho_1 + \rho_2}} \\ \sqrt{\frac{\rho_2^2q}{\rho_1 + \rho_2}} \end{bmatrix}\)
Example
\[
\dot x(t) = ax(t) + u(t)
\]
\[
\begin{aligned}
A &= a, &B = 1
\end{aligned}
\]
We can choose the weight:
The Riccati Equation:
\[
\begin{aligned}
2a\bar p + 1 - \frac{\bar p^2}{r} &= 0 \\
\bar p^2 - 2ar\bar p - r &= 0 \\ \hfill \\
\bar p = ar + \sqrt{(ar)^2 + r}
\end{aligned}
\]
- \(\bar k = \frac{ar + \sqrt{(ar)^2 + r}}{r} = a + \sqrt{a^2 + \frac{1}{r}}\)
The closed loop system is:
\[
\dot x(t) = -\sqrt{a^2 + \frac{1}{r}}x(t)
\]
When \(r \to \infty\),
\[
\dot x(t) = -|a|x(t)
\]
is A.S.
When \(r \to 0\),
\[
\dot x(t) = -\alpha x(t)
\]
- \(\alpha = \sqrt{a^2 + \frac{1}{r}} \to \infty\)
And the complementary sensitive function is:
\[
L(s) = \bar k G(s) = \frac{a + \sqrt{a^2 + \frac{1}{r}}}{s-a}
\]
At the crossing frequency: \(|L(j\omega_c)| = 1\)
\[
|\frac{a + \sqrt{a^2 + \frac{1}{r}}}{j\omega_c - a}| = \frac{a + \sqrt{a^2 + \frac{1}{r}}}{\sqrt{a^2 + \omega_c^2}} = 1
\]
Example
Constrain the speed of converging: \(\tilde A = A + \alpha I\)
Given the system
\[
\dot x(t) = x(t) + u(t)
\]
\[
A = B = 1
\]
- \(\tilde A = 1 + \alpha\)
The Riccati Equation:
\[
\bar P \tilde A + \tilde A^T P + Q - \bar PBR^{-1}B^T\bar P = 0
\]
- \(Q = q > 0\)
- \(R = r > 0\)
\[
2\bar p(1+\alpha) + q - \frac{\bar p^2}{r} = 0
\]
- \(\bar p = \frac{1+\alpha + \sqrt{(1+\alpha)^2 + q/r}}{1/r}\)
- \(\bar k_\alpha = 1+\alpha + \sqrt{(1+\alpha)^2 + q/r}\)
\[
A - B\bar k_\alpha = 1 - \bar k_\alpha = -\alpha - \sqrt{(1+\alpha)^2 + q/r} < -\alpha
\]
Example
\[
\left\{\begin{aligned}
&\dot x_1(t) = bu(t) \\
&x_2(t) = x_1(t) + u(t)
\end{aligned}\right.
\]
\[
\begin{aligned}
A = \begin{bmatrix} 0&0 \\ 1&0 \end{bmatrix}
&&B = \begin{bmatrix} b \\ 1 \end{bmatrix}
\end{aligned}
\]
- \(Q = \begin{bmatrix} q_1&0 \\ 0&q_2 \end{bmatrix}\), \(q_1 \geq 0\), \(q_2 \geq 0\)
- \(R = 1\)
We need \((A, B)\) is reachable and \((A, C_q)\) is observable, \(Q = C_q^TC_q\):
\[
M_R =
\begin{bmatrix} B&AB \end{bmatrix} =
\begin{bmatrix} b&0 \\ 1&b \end{bmatrix}
\]
The system is reachable,
\[
C_q = Q^\frac12 = \begin{bmatrix} \sqrt{q_1}&0 \\ 0&\sqrt{q_2} \end{bmatrix}
\]
\[
M_O =
\begin{bmatrix} C_q \\ C_qA \end{bmatrix} =
\begin{bmatrix} \sqrt{q_1}&0 \\ 0&\sqrt{q_2} \\ 0&0 \\ \sqrt{q_2}&0 \end{bmatrix}
\]
To make the system observable, \(q_2 \neq 0\),
\[
\tilde y(t) = C_qx(t) =
\begin{bmatrix} \sqrt{q_1}x_1(t) \\ \sqrt{q_2}x_2(t) \end{bmatrix}
\]
We can set \(Q = I\), and \(b = 1\), \(\Rightarrow\) \(\bar P = I\), check the stability of the closed loop system
\[
\dot x(t) = (A-B\bar K)x(t)
\]
- by computing the eigenvalue of \((A-B\bar K)\)
- by using a suitable Lyapunov Function
- \(\bar K = \begin{bmatrix} 1&1 \end{bmatrix}\)
\[
A-B\bar K = \begin{bmatrix} 0&0\\1&0 \end{bmatrix}
\begin{bmatrix} 1\\1 \end{bmatrix}
\begin{bmatrix} 1&1 \end{bmatrix} =
\begin{bmatrix} -1&-1\\0&-1 \end{bmatrix}
\]
- \(\lambda_1 = \lambda_2 = -1\)
\[
\begin{aligned}
V(x) &= J^\circ(x) = x^T\bar P x = x^Tx \\
\dot V(x) &= \dot x^T x + x^T\dot x \\
&= x^T(A-B\bar K)^Tx + x^T(A-B\bar K)x \\
&= -x^T
\begin{bmatrix} 2&1\\1&2 \end{bmatrix}
x
\end{aligned}
\]
\[
\det(\lambda I - \begin{bmatrix} 2&1\\1&2 \end{bmatrix}) =
\det\begin{bmatrix} \lambda-2&-1\\-1&\lambda-2 \end{bmatrix} =
\lambda^2 - 4\lambda + 3
\]
- \(\lambda_1 = 1\), \(\lambda_2 = 3\)
Example
\[
u(t) = -\rho \bar K x(t)
\]
We can get the new closed loop function:
\[
\tilde A_\rho = A-\rho B\bar K = \begin{bmatrix} -\rho&-\rho\\1-\rho&-\rho \end{bmatrix}
\]
\[
\det(\lambda I - \tilde A_\rho) = \lambda^2 + 2\rho\lambda + \rho
\]
2. Kalman Filter
The reference system gives:
\[
\begin{aligned}
\dot x(t) &= Ax(t) + Bu(t) + v_x(t) \\
y(t) &= Cx(t) + v_y(t)
\end{aligned}
\]
- \(v_x(t)\), \(v_y(t)\) is defined as gaussian noise, \(v(t) = \begin{bmatrix} v_x(t) \\ v_y(t) \end{bmatrix} \sim \mathcal{G}(0, V)\), \(E(v_x(t)) = E(v_y(t)) = 0\)
\[
\begin{aligned}
E(V(t_1)V(t_2)^T) = V\delta(t_2 - t_1)
\end{aligned}
\]
- \(V\) is the covariance matrix, \(V = \begin{bmatrix} \tilde Q & z \\ z^T & \tilde R \end{bmatrix}\), where \(\tilde Q \succeq 0\), \(\tilde R \succ 0\)
- \(\delta(\tau) = \left\{\begin{aligned} &0, &\tau \neq 0 \\ &1, &\tau = 0\end{aligned}\right.\)
For the initial condition, there have:
\[
\begin{aligned}
&x(0) \sim \mathcal{G}(\bar x, \tilde P) \\
&E((x(0) - \bar x_0)v(t)^T) = 0
\end{aligned}
\]
2.1 Covariance Matrix is independent
Assume \(z = 0\),
\[
\begin{aligned}
\dot{\hat x}(t) &= A\hat x + Bu(t) + L(t)(y(t) - C\hat x(t)) \\
e(t) &= x(t) - \hat x(t) \\ \hfill \\
\dot e(t) &= Ae(t) + v_x(t) + L(t)(Ce(t) + v_y(t)) \\
&= (A - L(t)C)e(t) + B_c(t)v(t) \\ \hfill \\
\dot{\bar e}(t) &= (A - L(t)C)e(t)
\end{aligned}
\]
- \(B_c(t) = \begin{bmatrix} I & -L(t) \end{bmatrix}\)
- \(\bar e(t) = E(e(t))\), at initial state, \(e(0) = x(0) - \hat x(0) = x(0) - \bar x(0)\), \(\bar e(0) = 0\), thus there have, \(\bar e(t) = 0\), \(\forall t \geq 0\)
For the variance \(e(t)\),
\[
\tilde P(t) = E(e(t) e(t)^T)
\]
- \(\tilde P(t)\) is symmetric and \(\tilde P(t) \succeq 0\), \(\tilde P(0) = \tilde P_0\)
Our optimization target is:
\[
\begin{aligned}
&\min_{L(t)} &\gamma^T \tilde P(t) \gamma
\end{aligned}
\]
- \(\gamma\) is a generic vector in \(\Re^n\)
\[
\begin{aligned}
\dot{\tilde P}(t) &= A\tilde P(t) + \tilde P(t)A^T + \tilde Q - \tilde P(t) C^T \tilde R^{-1}C \tilde P(t) \\
\tilde P(0) &= \tilde P_0 \\
L(t) &= \tilde P(t) C^T \tilde R^{-1} \rightarrow L(t)^T = \tilde R^{-1}C\tilde P(t)
\end{aligned}
\]
Info
We go back to look at the LQ controller design:
\[
\begin{aligned}
&\min_{K(t)} &x^T P(t) x
\end{aligned}
\]
\[
\begin{aligned}
-\dot P(t) &= AP(t) + P(t)A^T + Q - P(t)C^TR^{-1}CP(t) \\
P(T) &= S \\
K(t) &= R^{-1}B^TP(t) \\
P'(\tau) &= P(T-\tau) \leftarrow \dot P'(\tau) = -\dot P(T-\tau)
\end{aligned}
\]
The duality of LQ controller and Kalman Filter is:
KF |
LQR |
\(C\) |
\(B^T\) |
\(A\) |
\(A^T\) |
\(\tilde Q = B_qB_q^T\) |
\(Q = C_q^TC_q\) |
\(\tilde R\) |
\(R\) |
\(L\) |
\(K^T\) |
\(\tilde P(t)\) |
\(P(T-t)\) |
For the LQR design,
- \((A, B)\) should be reachable
- \((A, C_q)\) should be observable
Similarly, we can derive the condition for Kalman Filter:
- \((A, C)\) should be observable \(\Leftrightarrow\) \((A^T, C^T)\) should be reachable
- \((A, B_q)\) should be reachable \(\Leftrightarrow\) \((A^T, B_q^T)\) should be observable
If \((A, C)\) is observable and \((A, B_q)\) is reachable, \(\tilde Q = B_qB_q^T\), then, the optimal steady state observer is:
\[
\begin{aligned}
\dot{\hat x}(t) &= A\hat x(t) + Bu(t) + L(y(t) - C\hat x(t)) \\
&= (A-LC) \hat x(t) + Bu(t) + Ly(t)
\end{aligned}
\]
With \(L = \bar{\tilde P}C^T R^{-1}\),
Where \(\bar{\tilde P}\) is the unique (symmetric) positive definite solution to the stationary Riccati equation,
\[
\begin{aligned}
A\bar{\tilde P} + \bar{\tilde P}A^T + \tilde Q - \bar{\tilde P}C^TR^{-1}C\bar{\tilde P} = 0
\end{aligned}
\]
The observer is A.S. (\(A-LC\) has all eigenvalues with negative real part)
2.2 Covariance Matrix is non-independent
Assume \(z \neq 0\),
\[
V = \begin{bmatrix} \tilde Q & z \\ z^T & \tilde R \end{bmatrix}
\]
The state space equation becomes:
\[
\begin{aligned}
\dot x(t) &= Ax(t) + Bu(t) + v_x(t) + z\tilde R^{-1}(y(t) - (Cx(t) + v_y(t))) \\
&= (A-z\tilde R^{-1}C)x(t) + Bu(t) + z\tilde R^{-1}y(t) + v_x(t) - z\tilde R^{-1}v_y(t)
\end{aligned}
\]
The Kalman Filter equation becomes:
\[
\begin{aligned}
\dot{\hat x}(t) &= \tilde A\hat x(t) + Bu(t) + z\tilde R^{-1}y(t) + L(y(t) - C\hat x(t)) \\
\end{aligned}
\]
- \(E(v_x(t)) = 0\), \(E(v_x(t)v_x(t)^T) = \tilde Q\)
- \(v(t) = \begin{bmatrix} v_x(t) \\ v_y(t) \end{bmatrix} \sim \mathcal{G}(0, V)\)
- \(E(\tilde v_x(t)) = 0\),
\[
\begin{aligned}
E(\tilde v_x(t)\tilde v_x(t)^T) &= E((v_x(t)z\tilde R^{-1}v_y(t))(v_x(t)z\tilde R^{-1}v_y(t))^T) \\
&= E(v_x(t)v_x(t)^T) + z\tilde R^{-1} E(v_x(t)v_y(t)^T)\tilde R^{-1}z^T - z\tilde R^{-1} E(v_y(t)v_x(t)^T) - E(v_x(t)v_y(t)^T)\tilde R^{-1}z^T \\
&= \tilde Q - z\tilde R^{-1}z^T \succeq 0
\end{aligned}
\]
Now we have:
\[
\begin{aligned}
\tilde v &= \begin{bmatrix} \tilde v_x(t) \\ \tilde v_y(t) \end{bmatrix} &
V &= \begin{bmatrix} \tilde Q - z\tilde R^{-1}z^T & 0 \\ 0 & \tilde R \end{bmatrix}
\end{aligned}
\]
Thus we can define \(\tilde z\),
\[
\begin{aligned}
\tilde z &= E(\tilde v_x(t) \tilde v_y(t)^T) \\
&= E((v_x(t) - z\tilde R^{-1}v_y^T)v_t(t)^T) \\
&= \underbrace{E(v_x(t)v_y(t)^T)}_{z} - z\tilde R^{-1}\underbrace{E(v_y(t)v_y(t)^T)}_{\tilde R} = 0
\end{aligned}
\]
Example
Estimation of a constant
\[
\begin{aligned}
&\dot x(t) = 0 &&x(0) = \bar x \\
&y(t) = x(t) + v_y(t) &&v_y = \mathcal G(0,r)
\end{aligned}
\]
We can find the state space matrices: \(A = 0\), \(B = 0\), \(\dot Q = 0\), \(C = 1\), \(\tilde R = r\).
- \((A,C)\) is observable
- \((A,B_q)\) is not reachable
\[
\begin{aligned}
\dot{\hat x} &= L(t)(y(t) - \hat x(t)) \\
\dot{\tilde P}(t) &= -\frac{\tilde P(t)^2}{r} &\tilde P(0) = 0 \\
\tilde P(t) &= \frac{1}{r^{-1}t + \frac{1}{\tilde P}_0} &t\geq0 \\
L(t) &= \frac{r^{-1}}{r^{-1}t + \frac{1}{\tilde P}_0}
\end{aligned}
\]
Example
\[
\begin{aligned}
\dot x(t) &= Ax(t) + N_x(t) \\
y(t) &= N_y(t)
\end{aligned}
\]
Here, \(C = 0\), \(L(t) = 0\)
- \((A,C)\) is not observable
Example
Brownian Motion
\[
\begin{aligned}
\dot x(t) &= v_x(t) &&v_x = \mathcal G(0,\tilde Q) &&\tilde Q > 0\\
y(t) &= Cx(t) + v_y(t) &&v_y = \mathcal G(0,\tilde R) &&\tilde R > 0\\
\end{aligned}
\]
Here, \(C>0\), \(A=0\), \(B=0\), \(B_q = \sqrt{\tilde Q} > 0\)
- \((A,C)\) is observable
- \((A,B_q)\) is reachable
\[
\begin{aligned}
&\tilde Q - \frac{\tilde P^2 C^2}{\tilde R} = 0 \\
&\bar{\tilde P} = \sqrt{\frac{\tilde P^2 \tilde R}{C^2}} \\
&\bar L = \bar{\tilde P}C\tilde R^{-1} = \sqrt{\frac{\tilde Q}{\tilde R}} \\
&\dot{\hat x} = \bar L(y(t)-C\hat x(t)) \\
&\dot{\hat x} = -\sqrt{\frac{\tilde Q}{\tilde R}}C\hat x(t) + \sqrt{\frac{\tilde Q}{\tilde R}}y(t) \\
\end{aligned}
\]
We assume \(\tilde Q\) and \(\tilde R\) and \(C\) to be 1
\[
\dot{\hat x} = -\hat x(t) + y(t) \\
\]
Now assume \(\tilde R = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix} = I\), \(C = \begin{bmatrix} 1\\1 \end{bmatrix}\),
\[
\begin{aligned}
\bar{\tilde P} &= \sqrt{\frac12} \\
\bar L &= \sqrt{\frac12} \begin{bmatrix} 1&1 \end{bmatrix} \\
\dot{\hat x} &= \sqrt{\frac12} (y_1 + y_2 - 2\hat x)
\end{aligned}
\]
Example
\[
\begin{aligned}
&\dot x = -ax + bu + v && \dot v = \gamma v + \delta v_x && v_x = \mathcal G(0,\tilde q^2)\\
&y = x + v_y && v_y = \mathcal G(0,\tilde R) && \tilde R > 0\\
\end{aligned}
\]
We can define the enlarged system with the noise:
\[
\begin{aligned}
\begin{bmatrix} \dot x \\ \dot v \end{bmatrix} &=
\begin{bmatrix} -a&1 \\ 0&-\gamma \end{bmatrix}
\begin{bmatrix} x\\v \end{bmatrix} +
\begin{bmatrix} b\\0 \end{bmatrix}u +
\begin{bmatrix} 0\\ \gamma \end{bmatrix}v_x \\
y &= \begin{bmatrix} 1&0 \end{bmatrix}\begin{bmatrix} x\\v \end{bmatrix} + v_y
\end{aligned}
\]
Where, \(\tilde v_x = \mathcal G(0,\tilde Q)\), \(\tilde Q = E\begin{bmatrix} \tilde v_x & \tilde v_x^T \end{bmatrix} = Mq^2M^T = q^2\begin{bmatrix} 0&0\\ 0&\delta^2 \end{bmatrix}\)
\[
M_o = \begin{bmatrix} 1&0\\ -Q&1 \end{bmatrix}
\]
\((A,C)\) is observable
\[
\begin{aligned}
\tilde Q &= B_q B_q^T = \tilde q M \tilde q M^T \\
B_q &= \begin{bmatrix} 0\\ \delta \tilde q \end{bmatrix} \\
M_r &= \begin{bmatrix} 0 & \delta \tilde q \\ \delta \tilde q & -\gamma \delta \tilde q\end{bmatrix}
\end{aligned}
\]
\((A,B_q)\) is reachable
\[
\begin{aligned}
\begin{bmatrix} \hat{\dot x} \\ \hat{\dot v} \end{bmatrix} &=
A \begin{bmatrix} \hat{x} \\ \hat{v} \end{bmatrix} + Bu + \bar L(y - C
\begin{bmatrix} \hat{x} \\ \hat{v} \end{bmatrix}) \\
\hat{\dot x} &= -a\hat{x} + \hat{v} + bu + \bar l_1(y - \hat x) \\
\hat{\dot v} &= -\gamma\hat{v} + \bar l_2(y - \hat x)
\end{aligned}
\]
Consider \(a=1\), \(\tilde R = 1\), \(\delta = \gamma > 0\), \(\tilde q = 1\)
- \(\gamma = 0.1\) \(\rightarrow\) slow
- \(\gamma = 10\) \(\rightarrow\) fast
We look at the TF of \(y\rightarrow x\),
\[
C(sI - (A-\bar LC))^{-1} \bar L
\]
Example
When the noise is not a gaussian noise:
\[
\dot x = Ax + Bu + v
\]
we can use spectral factorization
\(v\) is stationary
\[
\Phi_v(s) = \int_{-\infty}^{+\infty}c_v(\tau)e^{-s\tau}d\tau
\]
3. LQG Control
\[
\begin{aligned}
\dot x(t) = Ax(t) + Bu(t) + v_x(t) \\
y(t) = Cx(t) + v_y(t)
\end{aligned}
\]
Same assumption as for kalman filter on \(v_x(t)\), \(v_y(t)\), \(x(0)\).
The loss function is:
\[
J = \lim_{T\to \infty}\frac{1}{T}E[\int_0^T(x(t)^TQx(t) + u(t)^TRu(t)dt)]
\]
In the long run, if the system is A.S., the loss function becomes:
\[
J = E[(x_\infty(t)^TQx_\infty(t) + u_\infty(t)^TRu_\infty(t)dt)]
\]
The solution is:
if \((A,B)\) is reachable, and \((A,C_q)\) is observable (\((A,C)\) is observable, and \((A,B_q)\) is reachable)
- \(R \succ 0\), \(\tilde R \succ 0\)
- \(u(t) = -\bar K \hat x(t)\)
- \(\dot{\hat x}(t) = A\hat x(t) + Bu(t) + \bar L(y(t)-C\hat x(t))\)
It combined LQ controller and Kalman filter together, we can find the transfer function of \(y \to u\).
\[
\begin{aligned}
\dot{\hat x}(t) &= \underbrace{(A-B\bar K - \bar LC)}_{\tilde A} \hat x + \underbrace{\bar L}_{\tilde B} y(t) \\
u(t) &= \underbrace{-K}_{\tilde C}\hat x(t) \\
R(s) &= \bar K(sI - (A-B\bar K - \bar LC))^{-1}\bar L
\end{aligned}
\]