The input dimension is fixed at D = 10. The dataset is a signal with 3 intrinsic dimensions plus noise, with N = 50 training samples.
Top: input x (blue) / Middle: bottleneck z / Bottom: reconstruction x̂ (orange, overlaid on input) / Right: iterations vs MSE
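A dataset of the kind described above could be generated roughly as follows. This is only a sketch: the function name `make_dataset`, the random orthonormal embedding, the Gaussian latent signal, and the noise level `noise_std=0.1` are all assumptions for illustration, since the text only fixes D = 10, 3 intrinsic dimensions, and N = 50.

```python
import numpy as np

def make_dataset(n_samples=50, input_dim=10, intrinsic_dim=3, noise_std=0.1, seed=0):
    """Hypothetical generator: a low-dimensional signal embedded in input_dim
    dimensions, plus isotropic Gaussian noise (noise_std is an assumed value)."""
    rng = np.random.default_rng(seed)
    # Orthonormal basis (columns) spanning the assumed intrinsic subspace.
    basis, _ = np.linalg.qr(rng.standard_normal((input_dim, intrinsic_dim)))
    # Latent coordinates of each sample inside that subspace.
    latent = rng.standard_normal((n_samples, intrinsic_dim))
    signal = latent @ basis.T
    noise = noise_std * rng.standard_normal((n_samples, input_dim))
    return signal + noise, basis

X, basis = make_dataset()
print(X.shape)  # (50, 10)
```

With these assumed settings the three leading singular values of `X` are much larger than the rest, which is what makes a bottleneck of size 3 a natural choice.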
A 1D linear autoencoder compresses an input $x \in \mathbb{R}^D$ with the encoder matrix $W_e \in \mathbb{R}^{K \times D}$ into a low-dimensional latent code $z \in \mathbb{R}^K$ (with $K < D$), then decodes it back through the decoder matrix $W_d \in \mathbb{R}^{D \times K}$.
Encoding, decoding, and the reconstruction loss:
$$z = W_e\,x, \qquad \hat{x} = W_d\,z = W_d\,W_e\,x$$
$$L = \lVert x - \hat{x} \rVert^2$$
SGD updates (learning rate $\eta$):
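$$W_e \leftarrow W_e + 2\eta\,W_d^{\top}(x - \hat{x})\,x^{\top}$$
$$W_d \leftarrow W_d + 2\eta\,(x - \hat{x})\,z^{\top}$$
At convergence, for zero-mean data, $W_d W_e \approx V V^{\top}$, where the columns of $V \in \mathbb{R}^{D \times K}$ are the top $K$ principal components, so the linear AE acts as a projection onto the leading PCA subspace. Only the product is pinned down: $W_e$ and $W_d$ individually are determined only up to an invertible $K \times K$ transform.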
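The update rules and the PCA claim can be checked numerically with a short NumPy sketch. Everything beyond D = 10 and N = 50 is an assumption made for illustration: the bottleneck size K = 3, the learning rate, the epoch count, the noise level, the small random initialization, and the synthetic data itself. The data are centered so that the comparison with PCA applies.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, N = 10, 3, 50        # D and N from the text; K = 3 is an assumed bottleneck size
eta, epochs = 0.01, 500    # assumed learning rate and number of passes over the data

# Assumed synthetic data: 3-dim signal in a random subspace plus small Gaussian noise.
basis, _ = np.linalg.qr(rng.standard_normal((D, 3)))
X = rng.standard_normal((N, 3)) @ basis.T + 0.1 * rng.standard_normal((N, D))
X -= X.mean(axis=0)        # center the data

# Small random initialization (an assumption; the text does not specify one).
W_e = 0.1 * rng.standard_normal((K, D))
W_d = 0.1 * rng.standard_normal((D, K))

def mean_loss(W_e, W_d, X):
    """Average of L = ||x - W_d W_e x||^2 over all rows of X."""
    X_hat = X @ W_e.T @ W_d.T
    return np.mean(np.sum((X - X_hat) ** 2, axis=1))

initial_loss = mean_loss(W_e, W_d, X)
for _ in range(epochs):
    for i in rng.permutation(N):           # one sample per SGD step
        x = X[i]
        z = W_e @ x                        # encode
        r = x - W_d @ z                    # residual x - x_hat
        step_e = np.outer(W_d.T @ r, x)    # W_d^T (x - x_hat) x^T
        step_d = np.outer(r, z)            # (x - x_hat) z^T
        W_e += 2 * eta * step_e
        W_d += 2 * eta * step_d
final_loss = mean_loss(W_e, W_d, X)

# Projector onto the top-K principal subspace, from the SVD of the centered data.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:K].T
gap = np.linalg.norm(W_d @ W_e - V @ V.T)  # Frobenius distance to V V^T

print(f"loss {initial_loss:.4f} -> {final_loss:.4f}, ||W_d W_e - V V^T||_F = {gap:.4f}")
```

Both gradients are computed from the same `W_e` and `W_d` before either matrix is modified, matching the simultaneous form of the updates above. Because the step size is constant, the product `W_d @ W_e` is expected to hover near the PCA projector rather than match it exactly.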