fovc
1168 karma
NoLiMa: Long-Context Evaluation Beyond Literal Matching
2 points by fovc 3 months ago | 0 comments
DeltaNet Explained
2 points by fovc 4 months ago | 1 comment
Mamba-Shedder: Post-Transformer Compression for Efficient SSMs
1 point by fovc 4 months ago | 0 comments
Reflections on 'The Bitter Lesson' (2021)
1 point by fovc 4 months ago | 0 comments
Theoretical limitations of multi-layer Transformer
107 points by fovc 5 months ago | 22 comments