# cxos/vendor/llama-cpp/ — pinned llama.cpp / ggml inference engine

CxLLM-Arch's Core inference backend embeds [llama.cpp](https://github.com/ggerganov/llama.cpp)
through this vendor shim. We do **not** commit the multi-hundred-megabyte
upstream source tree; this directory holds:

* **`PINNED.json`** — the exact upstream tag, tarball URL, and SHA-256
  that CxLLM trusts (sample below). Bumping is a single-commit operation:
  update `version` and `sha256` together, ideally with a co-located CI run
  that proves the build is reproducible.
* **`fetch.sh`** — downloads the tarball, verifies its SHA-256, and extracts
  it to `src/llama.cpp-<ver>/` (gitignored); see the sketch after this list.
  Refuses to run while `PINNED.json` still carries the placeholder
  all-zeros sha.
* **`build.sh`** — invokes `cmake … --install` into
  `dist/cxllm-arch/llama-cpp/`, with backend toggles via
  `--backend {cpu,vulkan,cuda,hip,opencl}` (repeatable, so several
  backends can be built in one pass).
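
The pin file looks roughly like this. The `version` and `sha256` keys are
named in this README; the `url` key name and the tag value are illustrative:

```json
{
  "version": "b1234",
  "url": "https://github.com/ggerganov/llama.cpp/archive/refs/tags/b1234.tar.gz",
  "sha256": "0000000000000000000000000000000000000000000000000000000000000000"
}
```

The all-zeros value above is the placeholder that `fetch.sh` refuses to run
against until a real release is pinned.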
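
For orientation, the fetch-and-verify step amounts to the following. This is a
minimal sketch, not the actual script; the `jq` usage, the `url` key, and the
tarball's top-level directory name are assumptions:

```sh
#!/bin/sh
set -eu

ver=$(jq -r .version PINNED.json)
sha=$(jq -r .sha256 PINNED.json)
url=$(jq -r .url PINNED.json)        # key name assumed

# Refuse to fetch against the placeholder pin.
case "$sha" in
  0000000000000000000000000000000000000000000000000000000000000000)
    echo "PINNED.json still has the placeholder sha; pin a real release first" >&2
    exit 1 ;;
esac

tarball="llama.cpp-$ver.tar.gz"
curl -fsSL -o "$tarball" "$url"

# Hard-fail unless the download matches the pinned digest.
echo "$sha  $tarball" | sha256sum -c -

mkdir -p src
tar -xzf "$tarball" -C src           # expected to land in src/llama.cpp-<ver>/
```

Note the two spaces between digest and filename: that is the checksum-list
format GNU `sha256sum -c` expects.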

Run via the top-level Makefile:

```sh
make cxos-vendor-llama # fetch + verify
make cxos-vendor-llama-build # CPU only
make cxos-vendor-llama-build BACKENDS="vulkan cuda"
```
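
`BACKENDS` presumably fans out to repeated `--backend` flags, so the last line
above would be equivalent to calling the script directly (invocation path
assumed):

```sh
cxos/vendor/llama-cpp/build.sh --backend vulkan --backend cuda
```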

CxLLM-Arch's `Core/CMakeLists.txt` consumes the install prefix produced
here when `CXLLM_USE_LLAMA_CPP=ON` (the default for production builds).
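
One plausible way to point a Core configure run at that prefix; the option
name comes from this README, while the source/build paths and the use of
`CMAKE_PREFIX_PATH` are assumptions about the wiring:

```sh
cmake -S Core -B build \
  -DCXLLM_USE_LLAMA_CPP=ON \
  -DCMAKE_PREFIX_PATH="$PWD/dist/cxllm-arch/llama-cpp"
```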

Trust model: upstream tarballs are not GPG-signed, so we anchor trust on the
SHA-256 in `PINNED.json`. Bumps are reviewed and reproduced in CI before
merging.