API · v0.2.0

LocalEngine Local API Reference

LocalEngine ships an opt-in HTTP API served entirely from your Mac. It's designed as a low-latency bridge for browser extensions and Privy apps — a Chrome extension can ask the on-device model to translate a selection without anything leaving the device.

Overview

The local API is opt-in and disabled by default. Enable it from the app's Dashboard, and the engine starts listening on the loopback interface. All endpoints are served by the active runtime, so the same Metal-accelerated GGUF model that powers the Chat tab also answers API requests.

Base URL

http://127.0.0.1:8765

The API binds to loopback only. To keep traffic on-device, callers (extensions, Privy apps) connect over 127.0.0.1 — there is no cloud endpoint and no telemetry.
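Because the API is loopback-only, a caller may want to verify a configured base URL before sending text to it. The following is a minimal sketch of such a guard; the function name `is_loopback_url` is hypothetical and not part of LocalEngine, and it deliberately rejects hostnames (including `localhost`) rather than resolving them.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True only if the URL's host is a literal loopback IP address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. "localhost" or "example.com"): reject
        # instead of resolving, so DNS can never redirect the traffic.
        return False
```

With this guard, `is_loopback_url("http://127.0.0.1:8765")` passes, while a LAN address or a public hostname does not.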

Authentication

When a token is configured, send it as a bearer token on every /v1/* request:

Authorization: Bearer <token>

/health stays unauthenticated for local readiness checks. Tokens are stored in the macOS Keychain and never written to disk in plain text.
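The rule above (bearer token on /v1/* only, none on /health) could be captured in a small helper. This is a sketch under assumptions: `build_headers` is a hypothetical name, and how the caller obtains the token (for example, from the Keychain) is left out.

```python
from typing import Dict, Optional


def build_headers(path: str, token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, attaching the bearer token only to /v1/* paths."""
    headers = {"Accept": "application/json"}
    # /health is unauthenticated, so the token is never sent there.
    if token and path.startswith("/v1/"):
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Passing `token=None` models the case where no token is configured, in which case no Authorization header is added.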

GET /health

Unauthenticated readiness probe.

{
  "status": "ok",
  "engine": "LocalEngine",
  "version": "0.2.0"
}
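A caller can use this probe to decide whether LocalEngine is reachable before attempting a translation. The sketch below assumes the response shape shown above and uses only the Python standard library; `health_is_ok` and `engine_is_up` are hypothetical helper names, and the one-second timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8765"


def health_is_ok(body: bytes) -> bool:
    """Return True if a /health response body reports status "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def engine_is_up(base_url: str = BASE_URL, timeout: float = 1.0) -> bool:
    """Probe GET /health; any connection error counts as 'not running'."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return health_is_ok(resp.read())
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts all land here.
        return False
```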

GET /v1/status

Returns the active runtime and model readiness.

{
  "runtime": "llama",
  "active_model": "local-gguf",
  "backend": "metal",
  "ready": true
}
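Since the model may still be loading when the app starts, a client might poll this endpoint until `ready` is true. The sketch below is one way to do that; `wait_until_ready` is a hypothetical helper, the retry count and delay are arbitrary, and `fetch_status` stands for any caller-supplied function that returns the parsed /v1/status JSON.

```python
import time
from typing import Any, Callable, Mapping


def wait_until_ready(
    fetch_status: Callable[[], Mapping[str, Any]],
    attempts: int = 10,
    delay_s: float = 0.5,
) -> bool:
    """Poll until the status payload has ready == true, or give up."""
    for attempt in range(attempts):
        if fetch_status().get("ready") is True:
            return True
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

Injecting `fetch_status` keeps the polling logic independent of the HTTP client and easy to exercise without a running engine.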

POST /v1/translate

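Translate a single string. Supported fields:

  • text: the string to translate
  • source: source language code, or "auto" as in the request below
  • target: target language code, e.g. "zh-CN"
  • mode: request mode, e.g. "selection"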

Request

curl http://127.0.0.1:8765/v1/translate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token-if-configured>" \
  -d '{"text":"hello world","source":"auto","target":"zh-CN","mode":"selection"}'

Response

{
  "translation": "你好,世界",
  "source": "en",
  "target": "zh-CN",
  "latency_ms": 120,
  "engine": "LocalEngine"
}
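The same request can be issued from Python with the standard library. This is a sketch mirroring the curl call above, not an official client: `build_translate_request` and `translate` are hypothetical names, the defaults copy the example values, and the 10-second timeout is an assumption.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:8765"


def build_translate_request(
    text: str,
    target: str,
    source: str = "auto",
    mode: str = "selection",
    token: Optional[str] = None,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build the POST /v1/translate request without sending it."""
    payload = {"text": text, "source": source, "target": target, "mode": mode}
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:  # Only sent when a token is configured.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url}/v1/translate", data=body, headers=headers, method="POST"
    )


def translate(text: str, target: str, **kwargs) -> str:
    """Send the request and return the "translation" field of the response."""
    req = build_translate_request(text, target, **kwargs)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["translation"]
```

For example, `translate("hello world", "zh-CN")` would send the same JSON body as the curl command, provided the API is enabled and the engine is running.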

POST /v1/translate/batch

Translate many strings in one round-trip — ideal for whole-page translation in a browser extension, where each text node maps to one array entry.

curl http://127.0.0.1:8765/v1/translate/batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token-if-configured>" \
  -d '{"items":["hello","world"],"source":"auto","target":"zh-CN"}'
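For whole-page translation, a caller needs to turn a list of text nodes into `items` arrays and remember which node each entry came from. The sketch below shows one possible approach. `build_batch_payloads` is a hypothetical helper, and the `chunk_size` of 50 is an assumption (this reference does not state a batch limit). The batch response format is not shown in this section, so the sketch stops at building request payloads.

```python
from typing import Any, Dict, List, Sequence, Tuple


def build_batch_payloads(
    texts: Sequence[str],
    target: str,
    source: str = "auto",
    chunk_size: int = 50,
) -> List[Tuple[List[int], Dict[str, Any]]]:
    """Split text nodes into batch payloads, keeping each node's original index."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Skip whitespace-only nodes; there is nothing to translate in them.
    kept = [(i, t) for i, t in enumerate(texts) if t.strip()]
    batches = []
    for start in range(0, len(kept), chunk_size):
        chunk = kept[start:start + chunk_size]
        indices = [i for i, _ in chunk]
        payload = {"items": [t for _, t in chunk], "source": source, "target": target}
        batches.append((indices, payload))
    return batches
```

Each returned pair holds the original node indices and the JSON body to POST to /v1/translate/batch, so translated entries can later be written back to the right nodes.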

Extension priority

The Privy Chrome extension selects a translation engine in this order — LocalEngine first, so on-device translation wins whenever the app is running:

1 · LocalEngine App: On-device, private, lowest latency
2 · Ollama: Local fallback runtime
3 · LM Studio: Local fallback runtime
4 · Custom OpenAI-compatible: Remote provider of last resort
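The priority order above amounts to a first-available fallback. The following sketch illustrates that logic only and is not the extension's actual code: `select_engine` and `ENGINE_ORDER` are hypothetical names, and `is_available` stands for whatever per-engine probe the caller uses (for LocalEngine, that could be a GET /health check).

```python
from typing import Callable, Optional, Sequence

# Engine names in the documented priority order.
ENGINE_ORDER = (
    "LocalEngine App",
    "Ollama",
    "LM Studio",
    "Custom OpenAI-compatible",
)


def select_engine(
    is_available: Callable[[str], bool],
    order: Sequence[str] = ENGINE_ORDER,
) -> Optional[str]:
    """Return the first engine in priority order that reports as available."""
    for name in order:
        if is_available(name):
            return name
    return None  # No engine is reachable.
```

Because the loop stops at the first match, lower-priority engines are never probed while LocalEngine is running.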