Local LLMs, private AI tools, and self-hosted solutions
58 tools
casper-hansen
AWQ quantization for efficient inference
AutoGPTQ
Easy-to-use GPTQ quantization
ggerganov
Tensor library for ML on consumer hardware
File format for LLM weights
Apple
Apple's ML framework for Apple silicon
LLM inference optimized for Apple silicon
Hugging Face
Rust ML framework with minimalist approach
OpenNMT
Fast inference engine for Transformer models
Microsoft
Cross-platform ML inferencing
NVIDIA
NVIDIA's optimized LLM inference
BentoML
Operating LLMs in production
vLLM
High-throughput LLM serving
Text Generation Inference from Hugging Face
Anyscale
Scalable model serving with Ray
NVIDIA's inference serving platform
abetlen
Python bindings for llama.cpp
Hugging Face's model library
Easy distributed training and inference
Microsoft's deep learning optimization library
Parameter-efficient fine-tuning methods
Low-Rank Adaptation for fine-tuning
artidoro
Efficient fine-tuning of quantized LLMs
unslothai
2x faster LLM fine-tuning
OpenAccess-AI
Tool for fine-tuning various AI models