## Contents

- Transformer
- Attention structure
- Self-Attention structure
- Multi-head Self-Attention
- BERT: Bidirectional Encoder Representations from Transformers
- Summary
- Reference

## Transformer

The Transformer is an architecture built entirely from Attention...
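As a concrete illustration of the attention mechanism the Transformer is built on, here is a minimal NumPy sketch of scaled dot-product self-attention (the function name and the toy shapes are illustrative choices, not from the original post):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)        # pairwise similarity of queries and keys
    weights = softmax(scores, axis=-1)     # each row is a distribution over tokens
    return weights @ V, weights

# Toy self-attention: 3 tokens with dimension 4; Q = K = V = X.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(X, X, X)
```

In self-attention, the same input matrix supplies queries, keys, and values, so each output row is a weighted mix of all token representations.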