Self-Hosted AI

Local LLMs, private AI tools, and self-hosted solutions

58 tools

AutoAWQ

casper-hansen

1.4k

AWQ quantization for efficient inference

quantization, awq, efficient

AutoGPTQ

3.2k

Easy-to-use GPTQ quantization

quantization, gptq, easy

GGML

ggerganov

9.2k

Tensor library for ML on consumer hardware

inference, efficient, c

GGUF

ggerganov

4.2k

File format for LLM weights

format, weights, llama

MLX

Apple

14.0k

Apple's ML framework for Apple silicon

apple, silicon, inference

Featured

MLX-LM

Apple

2.8k

LLM inference optimized for Apple silicon

apple, silicon, llm

Candle

Hugging Face

13.0k

Minimalist ML framework for Rust

rust, inference, fast

CTranslate2

OpenNMT

2.8k

Fast inference engine for Transformer models

inference, fast, optimized

ONNX Runtime

Microsoft

12.0k

Cross-platform ML inferencing

inference, cross-platform, microsoft

TensorRT-LLM

NVIDIA

6.8k

NVIDIA's optimized LLM inference

nvidia, inference, optimized

OpenLLM

BentoML

8.2k

Operating LLMs in production

production, serving, bentoml

vLLM

18.0k

High-throughput LLM serving

serving, throughput, paged

Featured

TGI

Hugging Face

7.5k

Text Generation Inference from Hugging Face

serving, huggingface, production

Ray Serve

Anyscale

3.2k

Scalable model serving with Ray

serving, ray, scalable

Triton Inference Server

NVIDIA

7.2k

NVIDIA's inference serving platform

nvidia, serving, enterprise

llama-cpp-python

abetlen

5.8k

Python bindings for llama.cpp

python, llama, bindings

Transformers

Hugging Face

124.0k

Hugging Face's model library

huggingface, models, library

Featured

Accelerate

Hugging Face

6.8k

Easy distributed training and inference

distributed, training, huggingface

DeepSpeed

Microsoft

32.0k

Microsoft's deep learning optimization library

optimization, microsoft, training

PEFT

Hugging Face

14.0k

Parameter-efficient fine-tuning methods

fine-tuning, efficient, lora

LoRA

Microsoft

8.5k

Low-Rank Adaptation for fine-tuning

fine-tuning, efficient, adaptation

QLoRA

artidoro

9.2k

Efficient fine-tuning of quantized LLMs

fine-tuning, quantized, efficient

Unsloth

unslothai

8.5k

2x faster LLM fine-tuning

fine-tuning, fast, efficient

Axolotl

OpenAccess-AI

5.8k

Tool for fine-tuning various AI models

fine-tuning, tool, flexible

Showing 25-48 of 58 tools