New Anthropic research: Alignment faking in large language models
8 points by casslin 6 months ago | 0 comments