Open source · Bring your own keys

Git for your prompts.

Version, diff, test, and deploy AI prompts with the same discipline you bring to code. Roll back in one click. Catch regressions before they ship.

Providers: OpenAI
Scorers: 3 built-in
Self-host: Yes

What's inside

The whole loop, finally sane.

No more silent prompt edits. No more "wait, what changed last week?" Every change is committed, tested, and reversible.

Real version control

Commit prompts with messages. Diff any two versions, branch for experiments, merge with conflict resolution, and roll back in one click.
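To give a feel for what diffing two prompt versions involves, here is a minimal sketch built on Python's standard `difflib`. The helper name `diff_prompts` and the `v1`/`v2` labels are hypothetical and chosen for illustration; this is not Lexem's actual implementation.

```python
import difflib


def diff_prompts(old: str, new: str, old_label: str = "v1", new_label: str = "v2") -> str:
    """Return a unified diff between two prompt versions (hypothetical helper)."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",  # inputs have no trailing newlines, so do not add any
    )
    return "\n".join(lines)


v1 = "You are a support agent.\nAnswer briefly."
v2 = "You are a support agent.\nAnswer briefly and cite the docs."
print(diff_prompts(v1, v2))
```

A line-based diff like this reads well for multi-line prompts. Single-paragraph prompts may be easier to review with a word-level diff.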

Evals that catch regressions

Build test suites with typed inputs. Score with exact match, regex, or LLM-as-judge. Auto-flag scores that drop ≥5 points.
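The scoring and flagging logic described above might look roughly like the sketch below. It rests on several assumptions that the page does not confirm: scores run from 0 to 100, a suite score is the mean of its case scores, and a regression is a drop of at least 5 points against a baseline. All function names are hypothetical, and LLM-as-judge is left out because it needs a live provider call.

```python
import re
from typing import Callable

# Assumption: each scorer maps (model output, expected value) to a score from 0 to 100.
Scorer = Callable[[str, str], float]


def exact_match(output: str, expected: str) -> float:
    """Full marks only when the output equals the expected text, ignoring outer whitespace."""
    return 100.0 if output.strip() == expected.strip() else 0.0


def regex_match(output: str, pattern: str) -> float:
    """Full marks when the pattern is found anywhere in the output."""
    return 100.0 if re.search(pattern, output) else 0.0


def suite_score(cases: list[tuple[str, str]], scorer: Scorer) -> float:
    """Mean score over (output, expected) pairs; an empty suite scores 0."""
    if not cases:
        return 0.0
    return sum(scorer(output, expected) for output, expected in cases) / len(cases)


def is_regression(baseline: float, candidate: float, threshold: float = 5.0) -> bool:
    """Flag a candidate whose score drops by at least `threshold` points."""
    return baseline - candidate >= threshold


baseline = suite_score([("Paris", "Paris"), ("Rome", "Rome")], exact_match)
candidate = suite_score([("Paris", "Paris"), ("Milan", "Rome")], exact_match)
print(is_regression(baseline, candidate))
```

Comparing against a stored baseline score rather than a fixed pass mark means a suite that was always imperfect can still surface a fresh drop.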

Bring your own keys

Plug in OpenAI, Anthropic, or Google. Keys are encrypted at rest with AES-256-GCM and managed per team in Settings.

Templates, not blank pages

Bundled eval templates for support, summarisation, extraction, tone, and refusals. One click to seed a suite with sensible cases.

How it works

Three moves. No magic.

  1. Commit your prompts

    Write a prompt, then hit commit with a message. Every change is a versioned snapshot: typed variables, tags, branches and all.

  2. Run evals against any version

    Build a suite of cases, then run it against your current prompt with the model and key of your choice. See score, pass/fail, tokens, and latency.

  3. Diff, roll back, ship

    When something regresses, diff to find the change, roll back in one click, or merge an experimental branch when its score beats main.
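The commit-and-roll-back loop above can be pictured with a small in-memory model. This is an illustrative sketch only: `Snapshot` and `PromptHistory` are hypothetical names, storage is a plain list rather than Postgres, and modelling a rollback as a new snapshot that restores older text is an assumption, not a description of how Lexem does it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """One immutable version of a prompt (hypothetical shape)."""
    version: int
    text: str
    message: str


@dataclass
class PromptHistory:
    """Append-only list of snapshots; the last entry is the current prompt."""
    snapshots: list[Snapshot] = field(default_factory=list)

    def commit(self, text: str, message: str) -> Snapshot:
        snap = Snapshot(version=len(self.snapshots) + 1, text=text, message=message)
        self.snapshots.append(snap)
        return snap

    @property
    def head(self) -> Snapshot:
        # Raises IndexError if nothing has been committed yet.
        return self.snapshots[-1]

    def rollback(self, version: int) -> Snapshot:
        """Restore an earlier version by committing its text again."""
        if not 1 <= version <= len(self.snapshots):
            raise ValueError(f"unknown version: {version}")
        target = self.snapshots[version - 1]
        return self.commit(target.text, f"Roll back to v{version}")


history = PromptHistory()
history.commit("Answer briefly.", "initial prompt")
history.commit("Answer in detail.", "try longer answers")
history.rollback(1)
print(history.head.version, history.head.text)
```

Recording the rollback as a new snapshot keeps the history append-only, so the regressed version stays available for a later diff.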

100% open source

Self-host it. Own the loop.

Lexem runs on your Postgres, with your team's keys. No vendor lock-in, no usage caps, no telemetry. The whole stack is released under an MIT-style permissive license, so you can fork it tomorrow.