ExLlama

by turboderp

Memory-efficient LLM inference

About

ExLlama is a library for memory-efficient LLM inference.

Installation

pip install exllamav2
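
To confirm the install succeeded, one option is to query the package metadata from Python. The sketch below uses only the standard library, so it runs whether or not the package is present. The helper name `installed_version` is hypothetical, and the distribution name `exllamav2` is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    `installed_version` is an illustrative helper, not part of ExLlama.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution name assumed from the pip command above.
version = installed_version("exllamav2")
if version is None:
    print("exllamav2 is not installed in this environment")
else:
    print(f"exllamav2 {version} is installed")
```

If the check reports that the package is missing, the likely cause is that pip installed it into a different Python environment than the one running the script.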

Details

Added: Aug 14, 2024
GitHub Stars: 3.2k