llama.cpp is an open-source software library that performs inference on various large language models such as Llama. It is co-developed alongside the GGML project, a general-purpose tensor library.