Data is generated deterministically by an LCG with the given seed (30 points per class, 60 total, sigma=1.2). Gradient descent updates use the full batch.
Red = class 0 / Blue = class 1 / Green = decision boundary (p=0.5) / Shading = probability contours (p=0.3, 0.5, 0.7) / Bottom-left = (w₁, w₂, b)
Logistic regression is a linear classifier that turns a linear score $z = \mathbf{w}\cdot\mathbf{x} + b$ into a probability with the sigmoid function and minimizes the cross-entropy loss by gradient descent.
Predicted probability via the sigmoid function:
$$p(y=1 \mid \mathbf{x}) = \sigma(z) = \frac{1}{1 + e^{-z}}$$
Cross-entropy loss ($n$ samples, $y_i \in \{0,1\}$):
$$L = -\frac{1}{n}\sum_{i=1}^{n}\bigl[y_i \log p_i + (1 - y_i)\log(1 - p_i)\bigr]$$
Gradient and weight update with L2 regularization $\lambda$:
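$$\frac{\partial L}{\partial \mathbf{w}} = \frac{1}{n}\mathbf{X}^{\top}(\mathbf{p}-\mathbf{y}) + \lambda\mathbf{w}, \quad \mathbf{w} \leftarrow \mathbf{w} - \eta\frac{\partial L}{\partial \mathbf{w}}$$
Here $L$ stands for the regularized objective, i.e. the cross-entropy plus the penalty $\frac{\lambda}{2}\lVert\mathbf{w}\rVert^2$, which is where the $\lambda\mathbf{w}$ term comes from; $\eta$ is the learning rate.
The decision boundary (the p=0.5 contour) is the straight line $\mathbf{w}\cdot\mathbf{x}+b=0$, i.e. $w_1 x_1 + w_2 x_2 + b = 0$.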
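The formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not the demo's actual code. The function names, the six toy points, and the values of `eta` and `lam` are assumptions made for the example, and it does not reproduce the LCG-generated data. It also assumes the bias $b$ is left unregularized and updated with $\frac{1}{n}\sum_i (p_i - y_i)$, which the text does not state.

```python
import math

def sigmoid(z):
    # Two branches so that exp() never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def cross_entropy(X, y, w, b, eps=1e-12):
    # Mean cross-entropy over all samples; p is clamped to avoid log(0).
    total = 0.0
    for (x1, x2), yi in zip(X, y):
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        p = min(max(p, eps), 1.0 - eps)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return -total / len(X)

def gradient_step(X, y, w, b, eta=0.1, lam=0.0):
    # One full-batch update: average (p - y) * x over every sample,
    # then add the L2 term lam * w to the weight gradient.
    n = len(X)
    g1 = g2 = gb = 0.0
    for (x1, x2), yi in zip(X, y):
        r = sigmoid(w[0] * x1 + w[1] * x2 + b) - yi
        g1 += r * x1
        g2 += r * x2
        gb += r
    g1 = g1 / n + lam * w[0]
    g2 = g2 / n + lam * w[1]
    gb = gb / n  # assumption: the bias is not regularized
    return [w[0] - eta * g1, w[1] - eta * g2], b - eta * gb

def boundary_x2(w, b, x1):
    # Solve w1*x1 + w2*x2 + b = 0 for x2 (requires w2 != 0).
    return -(w[0] * x1 + b) / w[1]

# Hypothetical toy data: class 0 in the lower left, class 1 in the upper right.
X = [(-2.0, -1.5), (-1.0, -2.0), (-1.5, -0.5), (1.0, 2.0), (2.0, 1.0), (1.5, 1.5)]
y = [0, 0, 0, 1, 1, 1]

w, b = [0.0, 0.0], 0.0
initial_loss = cross_entropy(X, y, w, b)  # equals ln 2 when w = 0 and b = 0
for _ in range(200):
    w, b = gradient_step(X, y, w, b, eta=0.1, lam=0.01)
final_loss = cross_entropy(X, y, w, b)

print("w1, w2, b:", w[0], w[1], b)
print("loss:", initial_loss, "->", final_loss)
print("boundary x2 at x1 = 0:", boundary_x2(w, b, 0.0))
```

With all parameters at zero every prediction is 0.5, so the starting loss is $\ln 2$. Any point returned by `boundary_x2` has score $z = 0$ and therefore predicted probability 0.5, matching the decision boundary described above.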