ExLlama
by turboderp
3.2k stars
Memory-efficient LLM inference
About
Memory-efficient LLM inference. ExLlama is an inference library for running quantized large language models locally on consumer GPUs.
Installation
pip install exllamav2
Details
- Category: Self-Hosted AI
- Added: Aug 14, 2024
- GitHub Stars: 3.2k
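Verifying the installation
After running the pip command listed under Installation, it can be useful to confirm that the package is visible to the current Python environment. The sketch below uses only the standard library and does not call any ExLlama API. The helper name installed_version is hypothetical and introduced here for illustration; the distribution name exllamav2 is taken from the install command.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "exllamav2") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    `installed_version` is a hypothetical helper, not part of ExLlama.
    """
    try:
        # Reads the version from the package's installed metadata.
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("exllamav2 is not installed; run: pip install exllamav2")
    else:
        print(f"exllamav2 {version} is installed")
```

This check only confirms that the package metadata is present. It does not verify that a compatible GPU or driver is available.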