local-llm-server
A robust, production-ready API for managing and serving local language models with comprehensive performance monitoring. It provides an OpenAI-compatible API layer over local inference engines (llama.cpp, etc.), enabling secure, air-gapped AI capabilities for the enterprise.
Key Features
- OpenAI-compatible API Interface
- Real-time GPU/TPS Performance Monitoring
- Model Management & Switching UI
- Efficiency Mode (No-Log Inference)
- RBAC Integration for Model Access Controls
- Support for ROCm/CUDA and CPU Inference
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | `/v1/models` | List currently loaded and available models |
| POST | `/v1/chat/completions` | OpenAI-compatible chat completion endpoint |
| POST | `/api/orchestrate/load` | Load a specific model into VRAM |
| POST | `/api/orchestrate/unload` | Unload current model to free VRAM |
| GET | `/api/performance/metrics` | Get real-time token generation and GPU stats |
Usage Example
import requests
# Example interaction
response = requests.get(
url="https://api.arcore.internal/v1/models",
headers={"Authorization": "Bearer <token>"}
)
print(response.json())Tech Stack
Authentication
- •**Header:** `Authorization: Bearer <token>`
- •**Scopes:** RBAC is enforced at the object level via `ArcoreCodex` policies.
Compliance & Security
Compliance
- ✓Network: Air-gap capable
- ✓Access: API Key auth
- ✓Data Privacy: No external data egress
Security
- ✓Access: API Key auth
Related Services
Arcore Maestro
Arcore Maestro is a hybrid, agent-based orchestration conductor for AI and data workflows. It intelligently routes tasks to efficient local LLMs or secure, sandboxed worker tools, reserving large external models for planning, creative generation, and self-healing analysis. It serves as the central nervous system for autonomous agents within the Arcore ecosystem.
Chapterize
An intelligent document processing engine that converts static PDFs into structured, web-ready HTML chapters. Chapterize uses AI to detect logical breaks, clean content, and make legacy documentation accessible, searchable, and mobile-friendly.
Career Forge
A Career Knowledge Graph System that treats individual career data (skills, achievements, roles) as a queryable database. CareerForge maps skills to market demand, enables dynamic resume generation, provides interview preparation, and facilitates skill gap analysis.