Control Engineering Simulator

LQR Inverted Pendulum Simulator — Optimal State Feedback

Stabilize a cart-mounted inverted pendulum with LQR state feedback. Tune Q and R weights to explore the trade-off between response speed and control effort.

Parameters
Position weight Q_p
Angle weight Q_θ
Input penalty R
Initial angle θ_0 (°)

Cart mass M=1.0 kg, pendulum mass m=0.1 kg, length L=0.5 m, gravity g=9.81 m/s², simulation time 5 s (dt=0.01 s, RK4 integration).
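The fixed-step RK4 integration mentioned above can be sketched as follows. This is a minimal illustration, not the simulator's own code: the names `rk4_step` and `simulate` are hypothetical, and the test problem $\dot x=-x$ is chosen only because its exact solution is known.

```python
import numpy as np


def rk4_step(f, t, x, dt):
    """Advance x' = f(t, x) by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, x)
    k2 = f(t + 0.5 * dt, x + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, x + 0.5 * dt * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def simulate(f, x0, t_end=5.0, dt=0.01):
    """Integrate from t = 0 to t_end with a fixed step; returns (times, states)."""
    n_steps = int(round(t_end / dt))
    times = np.linspace(0.0, n_steps * dt, n_steps + 1)
    states = np.empty((n_steps + 1, len(x0)))
    states[0] = x0
    for i in range(n_steps):
        states[i + 1] = rk4_step(f, times[i], states[i], dt)
    return times, states


# Sanity check on x' = -x, whose exact solution is exp(-t).
times, states = simulate(lambda t, x: -x, np.array([1.0]), t_end=1.0)
rk4_error = abs(states[-1, 0] - np.exp(-1.0))
print(f"RK4 error at t = 1 s: {rk4_error:.2e}")
```

For the closed-loop pendulum, `f` would be something like `lambda t, x: (A - B @ K) @ x`, with `A`, `B`, and `K` supplied by the caller.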

Results
Gain K
Settling time t_s (5%)
Peak input |u|_max
Angle overshoot
Time response and control input

Top: cart position p(t) (blue) and pendulum angle θ(t) (orange, deg) / Bottom: control input u(t) (green, N)

Theory & Key Formulas

Small-angle linearized cart-pole model with state $x=[p,\dot p,\theta,\dot\theta]^\top$ and input $u$ (horizontal cart force):

$$\dot x = A x + B u,\quad A=\begin{bmatrix}0&1&0&0\\0&0&-\tfrac{mg}{M}&0\\0&0&0&1\\0&0&\tfrac{(M+m)g}{ML}&0\end{bmatrix},\ B=\begin{bmatrix}0\\ \tfrac{1}{M}\\0\\ -\tfrac{1}{ML}\end{bmatrix}$$
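As a quick numerical check, the matrices above can be built from the parameter values quoted on this page. The sketch below also computes the open-loop poles and the rank of the controllability matrix; the variable names are illustrative only.

```python
import numpy as np

# Parameter values quoted in the Parameters section of this page.
M, m, L, g = 1.0, 0.1, 0.5, 9.81

# State order: [p, p_dot, theta, theta_dot]; input: horizontal cart force u.
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -m * g / M, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, (M + m) * g / (M * L), 0.0],
])
B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * L)]])

# Open-loop poles: two at the origin (free cart) and a +/- pair from the pendulum.
open_loop_poles = np.linalg.eigvals(A)

# Controllability matrix [B, AB, A^2 B, A^3 B]; rank 4 means every mode can be steered by u.
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(4)])
ctrb_rank = np.linalg.matrix_rank(ctrb)

print("open-loop poles:", np.round(open_loop_poles, 3))
print("controllability rank:", ctrb_rank)
```

The positive real pole at $\sqrt{(M+m)g/(ML)}$ is the falling pendulum; full controllability rank is what allows state feedback to stabilize it.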

LQR finds the state-feedback gain that minimizes the quadratic cost $J=\int_0^\infty(x^\top Q x + u^\top R u)\,dt$:

$$u = -K x,\qquad K = R^{-1} B^\top P$$

$P$ is the symmetric positive-definite solution of the algebraic Riccati equation $A^\top P + PA - PBR^{-1}B^\top P + Q = 0$.
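For comparison with the simplified gain used by this tool, the Riccati equation can be solved offline with SciPy. This is a minimal sketch under stated assumptions: $Q=\mathrm{diag}(Q_p,0,Q_\theta,0)$ with the default weights $Q_p=1$, $Q_\theta=100$, zero weight on the velocities, and $R=1$. The zero velocity weights and $R=1$ are assumptions, not values confirmed by the simulator.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

M, m, L, g = 1.0, 0.1, 0.5, 9.81
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -m * g / M, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, (M + m) * g / (M * L), 0.0],
])
B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * L)]])

# Assumed weights: Q_p = 1, Q_theta = 100, no velocity penalty, R = 1.
Q = np.diag([1.0, 0.0, 100.0, 0.0])
R = np.array([[1.0]])

# Solve A^T P + P A - P B R^{-1} B^T P + Q = 0 for the stabilizing P.
P = solve_continuous_are(A, B, Q, R)

# K = R^{-1} B^T P, shape (1, 4), used as u = -K x.
K = np.linalg.solve(R, B.T @ P)

# Checks: the Riccati residual should vanish and the closed loop should be stable.
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
closed_loop_poles = np.linalg.eigvals(A - B @ K)

print("K =", np.round(K, 3))
print("closed-loop poles:", np.round(closed_loop_poles, 3))
```

The sign pattern of the computed `K` depends on the sign conventions chosen for $\theta$ and $u$ in the model, so it need not match a gain written under a different convention.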

For pedagogy this tool uses a baseline gain K=[1, 1.5, 30, 5] linearly scaled by √(Q/R) rather than solving the Riccati equation in the browser.

What is the LQR Inverted Pendulum Simulator?

🙋
I remember LQR from my control class, but what is it actually doing here?
🎓
Roughly, LQR balances two goals: drive the state to zero quickly, but don't burn too much control effort. It minimizes the cost $J=\int_0^\infty(x^\top Q x + u^\top R u)\,dt$. A larger Q hates state error; a larger R hates control effort. Try setting Q_θ=1000 and R=0.01 above: the pendulum snaps upright almost instantly, but the input u(t) goes wild.
🙋
Got it. So increasing the angle weight Q_θ makes the controller stricter about pendulum tilt?
🎓
Exactly. Higher Q_θ grows the angle-related entries of K, so the slightest tilt is pushed back hard. Raising Q_p does the same for cart position. In practice the pendulum must never fall, while the cart can wander a bit, so Q_θ ≫ Q_p is the usual choice. The defaults here (Q_θ=100, Q_p=1) reflect that bias.
🙋
The bottom plot is the input u(t), right? It changes a lot when I move R.
🎓
R is the cost of control. Make R small and the optimizer thinks input is free, so it slams the cart and the response is fast but |u|_max spikes. Make R large and the optimizer becomes thrifty, so the input stays gentle while the pendulum takes longer to settle. On real hardware you pick R based on actuator saturation — if your motor caps at 200 N, you raise R so |u|_max never reaches that limit.
🙋
What does "settling time" measure?
🎓
It is the time after which |θ| stays within 5% of the initial tilt and never leaves that band again. It is a common scalar metric for "how fast and how cleanly did the controller calm everything down". Raising Q_θ or Q_p shortens settling time, but at the cost of a bigger |u|_max, and that trade-off is exactly the heart of LQR design.
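The settling-time definition in the last answer can be sketched in a few lines. The function name `settling_time` and the decaying test signal are hypothetical; the simulator's own implementation may differ in details such as how it treats a response that never settles.

```python
import numpy as np


def settling_time(t, y, band=0.05):
    """First time after which |y| stays within band * |y[0]|, or None if it never does."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    tol = band * abs(y[0])
    outside = np.flatnonzero(np.abs(y) > tol)
    if outside.size == 0:
        return float(t[0])          # the signal never left the band
    last_outside = outside[-1]
    if last_outside == len(y) - 1:
        return None                 # still outside the band when the run ends
    return float(t[last_outside + 1])


# Hypothetical pendulum response: 10 degrees initial tilt, decaying oscillation.
t = np.linspace(0.0, 5.0, 501)
theta_deg = 10.0 * np.exp(-2.0 * t) * np.cos(6.0 * t)

ts_5pct = settling_time(t, theta_deg)
print("5% settling time [s]:", ts_5pct)
```

Scanning for the last sample outside the band, rather than the first sample inside it, is what enforces the "never leaves again" part of the definition.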

Frequently Asked Questions

The algebraic Riccati equation $A^\top P + PA - PBR^{-1}B^\top P + Q = 0$ delivers the optimal cost matrix $P$ for the infinite-horizon LQR problem. Once $P$ is known, the optimal gain follows directly as $K = R^{-1}B^\top P$. Under the usual stabilizability and detectability assumptions, the stabilizing solution $P$ is unique and symmetric positive semidefinite (positive definite when the weighted states make the system observable), and it has a physical meaning: $x^\top P x$ is the minimum total cost remaining when the state is $x$. Standard numerical solvers use a Schur decomposition or an eigendecomposition of the Hamiltonian matrix; MATLAB's lqr() returns $K$ in one call, and Python's scipy.linalg.solve_continuous_are() returns $P$, from which $K$ takes one more line.
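The Hamiltonian eigendecomposition route mentioned in this answer can be illustrated as below. This is a teaching sketch (the helper name `care_via_hamiltonian` is hypothetical, and $R=1$ with zero velocity weights is an assumption); production solvers prefer the Schur form because it is numerically more robust.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def care_via_hamiltonian(A, B, Q, R):
    """Solve the continuous-time ARE from the stable eigenvectors of the Hamiltonian."""
    n = A.shape[0]
    G = B @ np.linalg.solve(R, B.T)               # B R^{-1} B^T
    H = np.block([[A, -G], [-Q, -A.T]])           # 2n x 2n Hamiltonian matrix
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]         # basis of the stable invariant subspace
    if stable.shape[1] != n:
        raise ValueError("expected exactly n stable eigenvalues")
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))           # P = X2 X1^{-1}
    return 0.5 * (P + P.T)                        # symmetrize away round-off


M, m, L, g = 1.0, 0.1, 0.5, 9.81
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -m * g / M, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, (M + m) * g / (M * L), 0.0],
])
B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * L)]])
Q = np.diag([1.0, 0.0, 100.0, 0.0])   # assumed: no velocity penalty
R = np.array([[1.0]])                 # assumed input penalty

P_hamiltonian = care_via_hamiltonian(A, B, Q, R)
P_scipy = solve_continuous_are(A, B, Q, R)
print("max difference:", np.abs(P_hamiltonian - P_scipy).max())
```

The identity behind it: if $P$ solves the Riccati equation, then the columns of $[I;\,P]$ span an invariant subspace of the Hamiltonian whose eigenvalues are the closed-loop poles of $A-BK$.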
Real plants have sensor noise and disturbances, and they rarely expose every state directly. For a linear system with Gaussian noise, the Kalman filter is the minimum-variance state estimator: it fuses noisy measurements with an internal model to produce $\hat x$. Combining it with the LQR gain as $u=-K\hat x$ yields LQG (Linear Quadratic Gaussian) control. The separation principle states that the optimal regulator and the optimal estimator can be designed independently and still produce the jointly optimal stochastic controller. LQG is how the LQR ideal survives in the messy real world.
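A steady-state LQG design for the cart-pole can be sketched by solving two Riccati equations, one for the regulator and one (the dual) for the estimator. Everything beyond the plant model is an assumption made for illustration: the measurement matrix `C` (only $p$ and $\theta$ are measured), the noise covariances `W` and `V`, and the weights `Q` and `R`.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

M, m, L, g = 1.0, 0.1, 0.5, 9.81
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -m * g / M, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, (M + m) * g / (M * L), 0.0],
])
B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * L)]])

# Regulator (LQR) design with assumed weights.
Q = np.diag([1.0, 0.0, 100.0, 0.0])
R = np.array([[1.0]])
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# Estimator design. Hypothetical sensors: cart position and pendulum angle only.
C = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
W = 0.01 * np.eye(4)    # hypothetical process-noise covariance
V = 0.001 * np.eye(2)   # hypothetical measurement-noise covariance

# Dual Riccati equation: A S + S A^T - S C^T V^{-1} C S + W = 0.
S = solve_continuous_are(A.T, C.T, W, V)
L_gain = S @ C.T @ np.linalg.inv(V)   # steady-state Kalman gain

# In (x, e = x - x_hat) coordinates the closed loop is block triangular,
# so its poles are the regulator poles together with the estimator poles.
A_cl = np.block([[A - B @ K, B @ K],
                 [np.zeros((4, 4)), A - L_gain @ C]])
regulator_poles = np.linalg.eigvals(A - B @ K)
estimator_poles = np.linalg.eigvals(A - L_gain @ C)
lqg_poles = np.linalg.eigvals(A_cl)
print("regulator poles:", np.round(regulator_poles, 3))
print("estimator poles:", np.round(estimator_poles, 3))
```

The block-triangular structure of `A_cl` is the algebraic form of the separation principle: choosing `K` does not move the estimator poles, and choosing `L_gain` does not move the regulator poles.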
The same balance problem appears in humanoid robots. A standing humanoid has its center of mass above a small support polygon (the feet), which makes upright posture an unstable equilibrium that the Linear Inverted Pendulum Model (LIPM) approximates well. Ground-reaction moment becomes the input, and LQR or MPC stabilizes the posture. Humanoids such as Honda's ASIMO, Boston Dynamics' Atlas, and Agility Robotics' Digit are commonly described with variations of this approach. Complex multi-joint robots reduce to a low-dimensional, cart-pole-like model once you extract their centroidal dynamics.
The state-space model here assumes the small-angle approximation $\sin\theta\approx\theta$ and $\cos\theta\approx1$. The relative error of $\sin\theta\approx\theta$ is about 2% at $20°$ and about 5% at $30°$, and $\cos\theta\approx1$ degrades faster (about 6% at $20°$ and 15% at $30°$), so past $30°$ the nonlinear terms cannot be ignored. Setting the initial angle slider to $\pm30°$ may show "stable" recovery in this linear simulator even though a real cart-pole could fail. Practical systems combine a separate swing-up controller (often energy-based) with the LQR stabilizer, switching to LQR only once $|\theta|<15°$. For broader operating ranges engineers use gain scheduling or nonlinear MPC.
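The size of the small-angle error can be tabulated directly. The short sketch below compares $\theta$ with $\sin\theta$ and $1$ with $\cos\theta$ at a few angles; the chosen angles are arbitrary sample points.

```python
import numpy as np

angles_deg = np.array([5.0, 10.0, 15.0, 20.0, 30.0, 45.0])
theta = np.deg2rad(angles_deg)

# Relative error, in percent, of each approximation against the exact value.
sin_err_pct = (theta - np.sin(theta)) / np.sin(theta) * 100.0
cos_err_pct = (1.0 - np.cos(theta)) / np.cos(theta) * 100.0

print(" deg   sin err %   cos err %")
for deg, s_err, c_err in zip(angles_deg, sin_err_pct, cos_err_pct):
    print(f"{deg:4.0f}   {s_err:9.2f}   {c_err:9.2f}")
```

Both errors grow roughly with $\theta^2$, which is why the linear model degrades quickly once the tilt leaves the neighborhood of upright.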

Real-World Applications

Rocket and missile attitude control: Just after liftoff a rocket is pushed from below its center of mass, which makes it dynamically similar to an inverted pendulum. SpaceX's Falcon 9 vertical landings feed state estimates into gimbal-angle commands that keep the booster upright on the way down. LQR and its extensions (LQG, MPC) are widely used in attitude controllers for launchers and missiles.

Two-wheeled self-balancing robots: Segway, Ninebot, and hobby balance bots are literal inverted pendulums. Gyros and accelerometers measure body tilt and a feedback law on the wheel motors keeps the bot upright. Educational kits like LEGO Mindstorms balance bots use LQR or PID. The Q–R tuning experience you get in this simulator transfers directly to real hardware.

Active mass dampers in skyscrapers: Tall buildings and long bridges fight earthquakes and wind with Active Mass Dampers, large weights that actuators move near the top of the structure to counteract sway. The vibration modes are written in state-space form and LQR sets the actuator command. Well-known passive relatives, such as Taipei 101's tuned mass damper and Tokyo Skytree's central column, target the same sway without a feedback controller.

Humanoid robots and exoskeletons: Walking control for humanoids such as ASIMO, Atlas, and Digit is often described as a blend of ZMP (Zero Moment Point) frameworks and LQR-family posture controllers. Rehabilitation exoskeletons can model the wearer as an inverted pendulum and assist with LQR-style torques at the joints, helping patients regain standing balance.

Common Pitfalls and Cautions

The most common mistake is believing that "bigger Q or bigger R always means better performance." Only the ratio Q/R matters for LQR — multiplying both by the same constant gives an identical K. What actually matters are the relative weights between state components (Q_p vs Q_θ) and between state and input (Q_θ vs R). Cranking Q without thought just blows up |u|_max. Try dropping R to 0.01 in the simulator: settling time falls but the input peaks in the hundreds of newtons. Choosing realistic R values in light of actuator limits is where the real engineering judgment lives.
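The scale-invariance claim is easy to verify numerically. In the sketch below, `lqr_gain` is a hypothetical helper around SciPy's Riccati solver, and the baseline weights ($Q_p=1$, $Q_\theta=100$, no velocity penalty, $R=1$) are assumptions chosen for illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def lqr_gain(A, B, Q, R):
    """Return K = R^{-1} B^T P for the continuous-time LQR problem."""
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)


M, m, L, g = 1.0, 0.1, 0.5, 9.81
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -m * g / M, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, (M + m) * g / (M * L), 0.0],
])
B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * L)]])
Q = np.diag([1.0, 0.0, 100.0, 0.0])
R = np.array([[1.0]])

K_base = lqr_gain(A, B, Q, R)
K_scaled = lqr_gain(A, B, 10.0 * Q, 10.0 * R)   # same Q/R ratio
K_cheap = lqr_gain(A, B, Q, 0.01 * R)           # input made 100x cheaper

print("base   K:", np.round(K_base, 3))
print("scaled K:", np.round(K_scaled, 3))
print("cheap  K:", np.round(K_cheap, 3))
```

Scaling `Q` and `R` together scales `P` by the same factor, and that factor cancels in $R^{-1}B^\top P$, so `K_scaled` matches `K_base`. Lowering `R` alone changes the ratio and produces a larger gain.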

The second pitfall is treating the linearized model as valid for any initial condition. The A matrix here is derived under $\sin\theta\approx\theta$ and is only accurate for $|\theta|<20°$ or so. With the initial-angle slider at $30°$ the linear model still reports "stabilizes," but on a real cart-pole the neglected nonlinear terms become significant and the controller can fail. Practical inverted-pendulum systems use a swing-up controller for large angles and only switch to LQR once $|\theta|<15°$.

Finally, remember that this tool is a teaching LQR, not the exact optimum. The true LQR gain requires solving the algebraic Riccati equation for a 4×4 matrix $P$, which this tool deliberately avoids doing in the browser. Instead we scale a known-good baseline gain K=[1, 1.5, 30, 5] by $\sqrt{Q/R}$ per state. You can still observe qualitatively how Q and R reshape the response, but for production work use MATLAB's `K = lqr(A,B,Q,R)` or Python's `scipy.linalg.solve_continuous_are`. The simulator's job is to make LQR's behavior intuitive, not to replace a proper numerical solver.