On-device · Zero cloud · Zero account
Zeruel listens, reasons, and taps through your real apps — WhatsApp, your inbox, your rides — with the wake word, the speech-to-text, and the model all running on the phone. No server, no account, no telemetry. Every action is hash-chained so you can prove what it did.
The loop
A single request runs the whole way through without leaving the device. Each stage hands off to the next — the order is the product.
An always-on wake word (Vosk) starts on-device speech-to-text. The mic only records while listening.
Gemini Nano via AICore plans on-device, with a deterministic heuristic fallback when the model is absent.
The request becomes a typed, validated plan of skill actions — never a free-form script.
The three-tier action ladder drives the target app. Payment and biometric surfaces are never bypassed.
Personal data is embedded and indexed locally for semantic plus keyword recall — and forgotten on demand.
An MCP server lets Claude Code drive the phone over your LAN — API-key-gated, every call ledgered.
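The "typed, validated plan" stage above can be pictured as a closed set of action types plus a validator that runs before anything touches an app. The sketch below is only an illustration under assumptions: `SkillAction`, `Plan`, `validate`, and the package names are hypothetical and not taken from Zeruel's code.

```kotlin
// Hypothetical types for illustration; the real plan schema is not shown in the text.
sealed interface SkillAction {
    data class OpenApp(val packageName: String) : SkillAction
    data class SendMessage(val contact: String, val body: String) : SkillAction
    data class Tap(val label: String) : SkillAction
}

data class Plan(val actions: List<SkillAction>)

/** Returns human-readable problems; an empty list means the plan passed validation. */
fun validate(plan: Plan, knownPackages: Set<String>): List<String> {
    val problems = mutableListOf<String>()
    if (plan.actions.isEmpty()) problems += "plan has no actions"
    for ((i, action) in plan.actions.withIndex()) {
        // The sealed interface makes this `when` exhaustive: an action type
        // the validator does not know about cannot compile.
        when (action) {
            is SkillAction.OpenApp ->
                if (action.packageName !in knownPackages) {
                    problems += "step $i: unknown app ${action.packageName}"
                }
            is SkillAction.SendMessage -> {
                if (action.contact.isBlank()) problems += "step $i: empty contact"
                if (action.body.isBlank()) problems += "step $i: empty message body"
            }
            is SkillAction.Tap ->
                if (action.label.isBlank()) problems += "step $i: empty tap label"
        }
    }
    return problems
}
```

A planner that can only emit these types cannot produce a free-form script, and a plan with any reported problem would simply not be executed.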
How it acts
Zeruel tries the most reliable, most private method first and climbs down only when it must. One rung of the ladder is the live workhorse; the other two are feature-detected.
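The fallback order described above can be sketched as a loop that tries each tier in turn and stops at the first one that succeeds. This is a minimal sketch under assumptions: `Tier`, `TierResult`, `runLadder`, and the tier names are hypothetical, and a real tier would talk to the platform rather than return a canned result.

```kotlin
// Hypothetical outcome of one tier's attempt; names are illustrative only.
enum class TierResult {
    DONE,         // the tier performed the action
    UNAVAILABLE,  // the capability was not detected on this device
    UNRESOLVED    // the tier ran but could not find the target
}

class Tier(val name: String, val attempt: (target: String) -> TierResult)

/** Tries tiers in order; returns the name of the tier that handled the target, or null. */
fun runLadder(tiers: List<Tier>, target: String): String? {
    for (tier in tiers) {
        if (tier.attempt(target) == TierResult.DONE) return tier.name
    }
    return null
}
```

With a ladder such as `listOf(appFunctions, accessibility, screenUnderstanding)`, a device that lacks the first capability falls through to the accessibility tier without any special-case code.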
Tier 1, AppFunctions (feature-detected): Calls an app's own exposed functions when the platform offers them. Cleanest path; feature-detected on newer Android.
Tier 2 (live): Drives any app through the accessibility tree with per-app YAML drivers. The workhorse that needs no app cooperation.
Tier 3 (feature-detected): Falls back to on-device screen understanding when the tree is opaque. Used only when Tier 2 cannot resolve a target.
The signature
Each action Zeruel takes is written to an on-device trust ledger. Every entry's SHA-256 hash covers the previous entry's hash, so editing any past record invalidates every record after it. Tampering is not hidden; it is provable.
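A hash chain of this kind can be sketched in a few lines. The entry layout, the `|` field separator, and the all-zero genesis value below are assumptions made for illustration; the text does not specify the ledger's real fields or encoding. Only `java.security.MessageDigest` is a real API here.

```kotlin
import java.security.MessageDigest

// Hypothetical entry layout, for illustration only.
data class LedgerEntry(val index: Int, val action: String, val prevHash: String, val hash: String)

// Assumed placeholder "previous hash" for the very first entry.
val GENESIS_HASH: String = "0".repeat(64)

fun sha256Hex(text: String): String =
    MessageDigest.getInstance("SHA-256")
        .digest(text.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }

// The hash covers the entry's own content plus the previous entry's hash.
fun entryHash(index: Int, action: String, prevHash: String): String =
    sha256Hex("$index|$action|$prevHash")

/** Returns a new ledger with one more entry chained to the current tail. */
fun append(ledger: List<LedgerEntry>, action: String): List<LedgerEntry> {
    val prev = ledger.lastOrNull()?.hash ?: GENESIS_HASH
    val index = ledger.size
    return ledger + LedgerEntry(index, action, prev, entryHash(index, action, prev))
}

/** Returns the index of the first entry that fails verification, or null if the chain is intact. */
fun firstBrokenIndex(ledger: List<LedgerEntry>): Int? {
    var expectedPrev = GENESIS_HASH
    for ((i, e) in ledger.withIndex()) {
        val intact = e.index == i &&
            e.prevHash == expectedPrev &&
            e.hash == entryHash(e.index, e.action, e.prevHash)
        if (!intact) return i
        expectedPrev = e.hash
    }
    return null
}
```

Editing an entry's action makes its stored hash wrong; recomputing that hash to cover the edit makes the next entry's `prevHash` wrong instead. Either way verification fails at or after the edited record.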
What it carries
Wake word and speech-to-text via Vosk — open, no key, no account. Foreground service so it works with the screen off.
ONNX MiniLM embeddings plus full-text search over your SMS and messages. Per-source retention and a one-tap forget.
A Ktor HTTP+SSE server on your LAN. Claude Code can query memory, list skills, and act — every call authenticated and ledgered.
Per-app skills described in YAML and run against the accessibility tree. WhatsApp ships first; the rest are just manifests.
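The "embeddings plus full-text search" recall mentioned above can be approximated by blending a cosine similarity with a keyword-overlap score. The sketch below is a simplification under assumptions: `MemoryItem`, `recall`, and the 50/50 weighting are hypothetical, the embedding is a plain `FloatArray` supplied by the caller rather than produced by a MiniLM model, and the keyword score is a crude stand-in for a real full-text index.

```kotlin
import kotlin.math.sqrt

// Hypothetical record: text plus a precomputed embedding vector.
class MemoryItem(val text: String, val embedding: FloatArray)

fun cosine(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "vectors must have equal length" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return if (normA == 0f || normB == 0f) 0f else dot / (sqrt(normA) * sqrt(normB))
}

/** Fraction of query words that occur in the text, ignoring case and punctuation. */
fun keywordScore(query: String, text: String): Float {
    val splitter = Regex("\\W+")
    val words = query.lowercase().split(splitter).filter { it.isNotEmpty() }
    if (words.isEmpty()) return 0f
    val haystack = text.lowercase().split(splitter).toSet()
    return words.count { it in haystack }.toFloat() / words.size
}

/** Ranks items by a weighted blend of semantic and keyword scores and keeps the top `limit`. */
fun recall(
    query: String,
    queryEmbedding: FloatArray,
    items: List<MemoryItem>,
    semanticWeight: Float = 0.5f,
    limit: Int = 3
): List<MemoryItem> =
    items.sortedByDescending {
        semanticWeight * cosine(queryEmbedding, it.embedding) +
            (1 - semanticWeight) * keywordScore(query, it.text)
    }.take(limit)
```

Blending the two scores lets an exact name match surface even when its embedding is a poor fit, and lets a paraphrase surface even when no keyword matches.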
Under the hood
Native Kotlin and Jetpack Compose, a Gradle multi-module graph with convention plugins, Hilt, and Room. Every device-only capability sits behind an interface with a fake, so the logic is unit-tested on the JVM — 144 tests and counting.
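The "interface with a fake" pattern might look like the following. All names here (`SpeechToText`, `FakeSpeechToText`, `CommandParser`) are hypothetical and chosen for illustration; the point is only that the logic depends on an interface, so a test can swap in a scripted implementation and run on a plain JVM.

```kotlin
// Hypothetical device-only capability hidden behind an interface.
interface SpeechToText {
    fun transcribe(audio: ByteArray): String
}

/** Test double: returns a scripted transcript instead of touching a microphone or model. */
class FakeSpeechToText(private val transcript: String) : SpeechToText {
    var calls = 0
        private set

    override fun transcribe(audio: ByteArray): String {
        calls++
        return transcript
    }
}

/** Logic under test: it sees only the interface, never the device implementation. */
class CommandParser(private val stt: SpeechToText) {
    private val pattern = Regex("""message (\w+) on (\w+)""", RegexOption.IGNORE_CASE)

    /** Returns (contact, app) for a "message X on Y" request, or null if it does not match. */
    fun parse(audio: ByteArray): Pair<String, String>? {
        val match = pattern.find(stt.transcribe(audio)) ?: return null
        return match.groupValues[1] to match.groupValues[2]
    }
}
```

In a unit test, `CommandParser(FakeSpeechToText("message Amma on WhatsApp"))` exercises the parsing logic with no Android runtime involved, while the production build would inject an on-device implementation of the same interface.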
Get it running
No Play Store, no account — broad-accessibility apps are sideload-only by design. Build the APK from source, then walk through the one-time grants. Tuned for the Galaxy S23.
Sideload the debug build and open Zeruel.
Settings → Services & access → enable the Zeruel accessibility service.
Zeruel asks on first launch. Both stay on the device.
The voice + memory models are bundled in the APK and install themselves on first launch.
“message Amma on WhatsApp…”, confirm, and watch the ledger.
90-second on-device demo
airplane mode · no SIM · zero network