Exclusive Self Attention
arXiv:2603.09078v1 Announce Type: new

Abstract: We introduce exclusive self attention (XSA), a simple modification of self attention (SA) that improves the Transformer's sequence modeling performance. The …
Shuangfei Zhai
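For reference, the self attention (SA) that XSA modifies is the standard scaled dot-product attention. The truncated abstract does not describe XSA's actual mechanism, so the sketch below shows only the baseline SA; the function and parameter names are hypothetical.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Standard scaled dot-product self attention (SA) baseline.

    x:             (batch, seq_len, d_model) input sequence
    w_q, w_k, w_v: (d_model, d_head) projection weights (hypothetical names)
    """
    q = x @ w_q                                  # queries
    k = x @ w_k                                  # keys
    v = x @ w_v                                  # values
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5  # (batch, seq_len, seq_len)
    attn = F.softmax(scores, dim=-1)             # attention weights over positions
    return attn @ v                              # weighted sum of values
```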