Writing an LLM from scratch, part 16 – layer normalisation

1 point by gpjt 1 day ago | 0 comments