New DeepSeek paper: Natively Trainable Sparse Attention mechanism
5 points by redlock 4 months ago | 1 comment