TensorRT-LLM runtime now open-source
4 points by mmoskal 3 months ago | 1 comment

- mmoskal 3 months ago
  Previously, the "Executor" runtime shipped only as binary blobs. This is the component that schedules requests and manages the KV cache (similar to the vLLM or SGLang server).