MerlionOS Inference is a from-scratch operating system that does exactly one thing: serve LLM inference as fast as the hardware allows.
No scheduler overhead, no syscall boundary, no unnecessary abstractions. Boot in under 5 seconds, load a model, serve an OpenAI-compatible API.
Linux is a general-purpose OS. When running LLM inference, you pay for that generality:
MerlionOS Inference eliminates these by construction:
OpenAI-Compatible API
Drop-in replacement for any OpenAI client. POST /v1/chat/completions with streaming SSE support.
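As a rough illustration of what "drop-in" means for a client, the sketch below builds a streaming request and decodes the server-sent events by hand using only the Python standard library. It assumes the server is listening on `localhost:8080` (the port used in the quick start) and that streamed chunks follow the OpenAI convention of `data: {json}` lines carrying `choices[0].delta.content`, terminated by `data: [DONE]`. The helper names `build_request`, `iter_sse_text`, and `stream_chat` are hypothetical and not part of MerlionOS.

```python
import json
import urllib.request

# Assumed endpoint: the port matches `ai-serve 8080` from the quick start.
API_URL = "http://localhost:8080/v1/chat/completions"


def build_request(prompt, model="smollm-135m", stream=True):
    """Return a urllib Request for a chat completion call."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_sse_text(lines):
    """Yield text fragments from an iterable of SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # OpenAI-style end-of-stream marker (assumed here)
        chunk = json.loads(payload)
        text = chunk["choices"][0].get("delta", {}).get("content")
        if text:
            yield text


def stream_chat(prompt):
    """Send the request and print fragments as they arrive (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        for text in iter_sse_text(resp):
            print(text, end="", flush=True)
    print()
```

Calling `stream_chat("Hello")` against a running instance should print the reply incrementally. An existing OpenAI SDK client pointed at the same base URL is the more typical route; the manual parser is only meant to show what travels over the wire.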
GGUF Model Support
Load quantized models (Q4_0, Q8_0) directly. Supports Llama, SmolLM, and compatible architectures.
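To make the two quantization formats concrete, here is a minimal sketch of how Q8_0 and Q4_0 blocks decode back to floats. It follows the block layout used by ggml as commonly documented (32 elements per block, a little-endian fp16 scale, then either 32 signed bytes or 16 bytes of packed 4-bit values with an offset of 8). It is not MerlionOS's loader, and the function names are hypothetical.

```python
import struct

BLOCK = 32  # elements per block in both formats


def dequantize_q8_0(data):
    """Decode Q8_0 blocks: 2-byte fp16 scale followed by 32 signed 8-bit values."""
    out = []
    for off in range(0, len(data), 2 + BLOCK):
        (scale,) = struct.unpack_from("<e", data, off)
        quants = struct.unpack_from("<32b", data, off + 2)
        out.extend(scale * q for q in quants)
    return out


def dequantize_q4_0(data):
    """Decode Q4_0 blocks: 2-byte fp16 scale followed by 16 bytes of packed nibbles."""
    out = []
    for off in range(0, len(data), 2 + BLOCK // 2):
        (scale,) = struct.unpack_from("<e", data, off)
        packed = data[off + 2 : off + 2 + BLOCK // 2]
        low = [(b & 0x0F) - 8 for b in packed]   # elements 0..15 of the block
        high = [(b >> 4) - 8 for b in packed]    # elements 16..31 of the block
        out.extend(scale * q for q in low + high)
    return out
```

Each Q8_0 block occupies 34 bytes and each Q4_0 block 18 bytes, so the two formats cost roughly 8.5 and 4.5 bits per weight respectively.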
AVX2/AVX-512 Kernels
Hand-optimized SIMD kernels for x86_64. Automatic runtime detection and dispatch.
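The detection-and-dispatch idea can be sketched as follows: read the feature bits once, pick the widest supported kernel, and call through that reference afterwards. The bit positions are the architectural ones reported by `CPUID` leaf 7 (sub-leaf 0) in `EBX`: bit 5 for AVX2 and bit 16 for AVX-512 Foundation. Everything else is an assumption for illustration: the kernel names are hypothetical stand-ins written in plain Python, and the scalar fallback exists only for the example. A real kernel must also enable the corresponding extended register state before executing these instructions, which this sketch omits.

```python
AVX2_BIT = 1 << 5       # CPUID.(EAX=7,ECX=0):EBX bit 5
AVX512F_BIT = 1 << 16   # CPUID.(EAX=7,ECX=0):EBX bit 16


def dot_scalar(a, b):
    """Reference dot product; the SIMD variants must match its result."""
    return sum(x * y for x, y in zip(a, b))


def dot_avx2(a, b):
    """Stand-in for a hand-written AVX2 kernel (hypothetical)."""
    return dot_scalar(a, b)


def dot_avx512(a, b):
    """Stand-in for a hand-written AVX-512 kernel (hypothetical)."""
    return dot_scalar(a, b)


def select_dot_kernel(leaf7_ebx):
    """Pick the widest kernel the CPU reports, given the raw EBX value of leaf 7."""
    if leaf7_ebx & AVX512F_BIT:
        return dot_avx512
    if leaf7_ebx & AVX2_BIT:
        return dot_avx2
    return dot_scalar


# Selection happens once; the hot path then calls through `dot` directly.
dot = select_dot_kernel(AVX2_BIT)
```

Resolving the kernel once at startup keeps the feature check out of the inner loop, which is the usual motivation for this pattern.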
AMD GPU Compute
Native RDNA3 driver — no ROCm, no Linux, no DRM. Direct hardware access for maximum GPU utilization.
```shell
# Build
git clone https://github.com/MerlionOS/merlion-infer.git
cd merlion-infer
make build

# Download a model
./tools/download_model.sh

# Run in QEMU with disk + network
make run-full
```

```
# In the shell:
merlion> ai-load          # Load GGUF model from disk
merlion> ai Hello         # Generate text
merlion> ai-serve 8080    # Start OpenAI API server
```

```shell
# From any client:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"smollm-135m","messages":[{"role":"user","content":"Hello"}]}'
```

| | merlion-kernel | merlion-infer |
|---|---|---|
| Purpose | General-purpose hobby OS | LLM inference server |
| Architectures | x86_64, aarch64, RISC-V, LoongArch | x86_64 only |
| User mode | Ring 3 processes, syscalls | Everything in Ring 0 |
| Shell commands | 358 | ~25 (essentials only) |
| Modules | 253 | ~35 (stripped to minimum) |
| Networking | Full stack (HTTP, SSH, MQTT, …) | TCP + HTTP (API only) |
| GPU | Software shaders | AMD RDNA3 compute |
| Focus | Feature completeness | Performance per watt |

| Component | Supported |
|---|---|
| CPU | AMD Ryzen 7000/9000, Intel 12th gen+ (AVX2 required) |
| GPU | AMD Radeon RX 7000 series (RDNA3) |
| RAM | DDR5, 32GB+ recommended |
| Storage | NVMe SSD, virtio-blk (QEMU) |
| Network | virtio-net (QEMU), Intel e1000e |
| Boot | UEFI via Limine bootloader |