Open source alternatives to AI products
Common Crawl
Open repository of web crawl data
Hugging Face
Hugging Face's 15-trillion-token web dataset