The input dimension is fixed at D = 10. The selected signal determines the intrinsic dimension of the data; noise is added, and the model is trained with SGD on N = 50 samples. As training proceeds, the loss falls and the reconstruction x̂ approaches the input x.
Top: input signal x (blue, solid) and reconstruction x̂ (orange, dashed). Bottom left: bottleneck code z (K values). Bottom right: MSE loss versus training step. As the epochs progress, x̂ comes to overlap x and the loss drops.
A 1D linear autoencoder compresses an input $x \in \mathbb{R}^D$ with the encoder matrix $W_e \in \mathbb{R}^{K \times D}$ into a low-dimensional latent code $z \in \mathbb{R}^K$ (with $K < D$), then decodes it back through the decoder matrix $W_d \in \mathbb{R}^{D \times K}$.
Encoding, decoding, and the reconstruction loss:
$$z = W_e\,x, \qquad \hat{x} = W_d\,z = W_d\,W_e\,x$$
$$L = \lVert x - \hat{x} \rVert^2$$
SGD updates (learning rate $\eta$):
$$W_e \leftarrow W_e + 2\eta\,W_d^{\top}(x - \hat{x})\,x^{\top}$$
$$W_d \leftarrow W_d + 2\eta\,(x - \hat{x})\,z^{\top}$$
At convergence, $W_d W_e \approx V V^{\top}$, where the columns of $V \in \mathbb{R}^{D \times K}$ are the top $K$ principal components of the data, so the linear AE acts as an orthogonal projection onto the leading PCA subspace.
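The update rules and the PCA claim above can be checked numerically. The following is a minimal NumPy sketch, not the demo's own code. D = 10 and N = 50 come from the text, but the rest is assumed for illustration: the synthetic data (a K = 2 dimensional signal plus small Gaussian noise, centered), the learning rate, the epoch count, the initialization scale, and the random seed.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, N = 10, 2, 50        # input dim, bottleneck size (assumed), sample count
eta, epochs = 0.01, 200    # learning rate and epoch count (assumed values)

# Synthetic data (assumption): a K-dimensional signal embedded in R^D plus noise.
basis, _ = np.linalg.qr(rng.standard_normal((D, K)))   # orthonormal D x K
X = rng.standard_normal((N, K)) @ basis.T + 0.05 * rng.standard_normal((N, D))
X -= X.mean(axis=0)        # center the data so the bias-free AE and PCA agree

W_e = 0.01 * rng.standard_normal((K, D))   # encoder, K x D
W_d = 0.01 * rng.standard_normal((D, K))   # decoder, D x K

losses = []
for _ in range(epochs):
    total = 0.0
    for i in rng.permutation(N):           # one pass over shuffled samples
        x = X[i]
        z = W_e @ x                        # encode: z = W_e x
        r = x - W_d @ z                    # residual: x - x_hat
        total += r @ r                     # L = ||x - x_hat||^2
        # Compute both steps from the pre-update weights, then apply them.
        step_e = 2 * eta * np.outer(W_d.T @ r, x)
        step_d = 2 * eta * np.outer(r, z)
        W_e += step_e
        W_d += step_d
    losses.append(total / N)               # mean loss over the epoch

# PCA projector built from the top-K right singular vectors of the centered data.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:K].T                               # D x K
err = np.linalg.norm(W_d @ W_e - V @ V.T)  # Frobenius distance to V V^T

print(f"mean loss: first epoch {losses[0]:.4f}, last epoch {losses[-1]:.4f}")
print(f"||W_d W_e - V V^T||_F = {err:.4f}")
```

In this setup the agreement with $V V^{\top}$ should be approximate rather than exact: the encoder's components along low-variance directions receive very little gradient, so they stay close to their small initial values, which is why the sketch starts from a small initialization.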