@_akhaliq@x.good.news
DiJiang: Efficient Large Language Models through Compact Kernelization
In an effort to reduce the computational load of Transformers, research on linear attention has gained significant momentum. However, the improvement strategies for attention mechanisms typically necessitate extensive retraining, which is impractical for large language models with a vast array of parameters.
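For context on the linear attention the abstract refers to, here is a minimal NumPy sketch of generic kernelized attention, which replaces the quadratic softmax with a feature map so attention costs O(n·d²) instead of O(n²·d). This is not DiJiang's frequency-domain kernel; the `phi` feature map (an elu(x)+1 choice common in the linear-attention literature) and all shapes and names are illustrative assumptions.

```python
import numpy as np

def phi(x):
    # Positive feature map, elu(x) + 1; any kernel feature map could go here.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Kernelized (linear) attention: softmax(QK^T)V approximated as
    phi(Q) (phi(K)^T V) / normalizer, computed in O(n * d^2).

    Q, K: (n, d) queries/keys; V: (n, d_v) values.
    """
    Qp, Kp = phi(Q), phi(K)          # map queries/keys into feature space
    KV = Kp.T @ V                    # (d, d_v): aggregate keys/values once
    Z = Qp @ Kp.sum(axis=0)          # (n,): per-query normalizer
    return (Qp @ KV) / Z[:, None]    # (n, d_v)

# Usage: n=1024 tokens, head dimension d=64 (illustrative sizes)
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((1024, 64)) for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (1024, 64)
```

The key design point is the reordering of the matrix products: computing phi(K)^T V first yields a small (d, d_v) summary, so the cost grows linearly in sequence length n rather than quadratically.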