Pipeline: Logits → Softmax → Probabilities → Cross-Entropy → Loss

Worked example with logits z = [z₀, z₁, z₂] = [2.00, 1.00, 0.10], correct class y = 0, and temperature T = 1:

1. Exponentiate, e^(z/T) with T = 1: e^z₀ = 7.39, e^z₁ = 2.72, e^z₂ = 1.11, giving Σ = 11.21.
2. Softmax: p₀ = 7.39 / 11.21 = 0.659, p₁ = 0.242, p₂ = 0.099, and the probabilities sum to 1.000.
3. Take -log(p) of the correct class: p_correct = p₀ = 0.659, so log(p₀) = -0.42 and -log(p₀) = 0.42.
4. Loss: L = -log(p_y) = 0.42. The -log penalty is mild when the correct class gets high probability, but it grows harshly as p_y shrinks, so confident mistakes are punished heavily.
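The forward pass above can be reproduced in a few lines of NumPy. This is a minimal sketch, assuming class 0 is the correct label and T = 1 as in the example; the variable names are illustrative.

```python
import numpy as np

# Forward pass: logits -> softmax -> cross-entropy loss.
# Assumes the correct class is index 0 and temperature T = 1, as in the example above.
z = np.array([2.00, 1.00, 0.10])          # logits
exp_z = np.exp(z)                         # [7.39, 2.72, 1.11]
p = exp_z / exp_z.sum()                   # softmax: [0.659, 0.242, 0.099]
y = 0                                     # index of the correct class
loss = -np.log(p[y])                      # cross-entropy: 0.42

print(p.round(3), round(float(loss), 2))  # [0.659 0.242 0.099] 0.42
```

In practice the exponentials are usually computed as exp(z - max(z)) so that large logits cannot overflow; the resulting probabilities are identical because the shift cancels in the ratio.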
Gradients ∂L/∂z = p - y, with one-hot y = [1, 0, 0]:

∂L/∂z₀ = 0.659 - 1 = -0.341
∂L/∂z₁ = +0.242
∂L/∂z₂ = +0.099

A gradient-descent step therefore pushes the correct logit z₀ up (its gradient is negative) and pushes the incorrect logits down, in proportion to how much probability they were wrongly assigned.