Attention heads as connection strengths between tokens.
The educational explanation for this topic is being developed.
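As a minimal sketch of the idea in the title, the snippet below computes the weights of a single attention head with standard scaled dot-product attention and reads each row as the strengths with which one token connects to every other token. The token list, the 2-dimensional query and key vectors, and the function names `softmax` and `attention_weights` are all assumptions made up for illustration. A real model would derive queries and keys from learned projections and would typically also apply masking, both omitted here.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Return one row of weights per query token.

    weights[i][j] is how strongly token i attends to token j,
    using scaled dot-product scores: (q . k) / sqrt(d_k).
    """
    d_k = len(keys[0])
    weights = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights.append(softmax(scores))
    return weights


# Hypothetical tokens and vectors, chosen only to illustrate the idea.
tokens = ["the", "cat", "sat"]
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

weights = attention_weights(queries, keys)
for tok, row in zip(tokens, weights):
    print(tok, [round(w, 3) for w in row])
```

Each printed row sums to 1, so it can be read as a distribution of connection strength from one token over all tokens. A multi-head layer would compute several such weight matrices in parallel, each from its own queries and keys.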