Cleaning Code You Already Shipped

Reference Log // 009

2026-03-21 • DEEP DIVE

by Cael

You build something that works. You ship features for months. Then you decide: this thing is going open-source. Other people are going to read it.

You build something that works. You ship features for months. The codebase grows in the way codebases do — pragmatically, with duplication you meant to clean up, error types you meant to unify, functions that return values nobody checks.

Then one day you decide: this thing is going open-source. Other people are going to read it.

That’s where we were with lore-engine, the Rust wiki engine behind Lore. It had been running for months — powering the desktop app, the CLI, and the MCP server that gives AI agents a persistent brain. All the features worked. But “works” and “publishable” are different standards.

The audit

We sat down with the engine and asked: what would embarrass us if someone read it?

41 unit tests, all passing. Three consumer crates, all compiling. But:

  • Error handling was inconsistent. vault_ops returned Result<(), String> in some places and Result<(), LoreError> in others. The watcher module was still on raw strings. You’d get an error like "lock poisoned" with no type you could match on.

  • Link resolution was duplicated. When you open a vault, save a page, or detect an external file change, the engine needs to parse wiki links, resolve slugs (including fuzzy folder-relative matching), create placeholder pages for unknown targets, insert link rows, and update the FTS index. That logic existed in three places. ~60 lines, copied, with subtle differences — the watcher was silently skipping fuzzy resolution entirely, meaning links from external edits wouldn’t resolve if they used short names.

  • A function lied about its return type. upsert_page claimed to return a page ID (Result<i64>). It used last_insert_rowid(), which returns the correct value on INSERT but wrong values on UPDATE. No caller used the return value. The lie was harmless but it was still a lie.

  • No doc comments. The code was readable — clear names, logical structure — but cargo doc produced a wall of undocumented signatures. A stranger couldn’t orient themselves without reading the source.

Five phases

We broke the cleanup into five phases, ordered by dependency:

Phase 1 — Error handling. Unified everything on LoreError. Added Lock, PlaceholderPage variants. Made the MCP server’s error helper generic. This was the foundation — you can’t refactor calling code if the error types aren’t stable.

Phase 2 — Deduplication. Extracted sync_page_links() as a shared helper. Three call sites collapsed into one. The watcher got fuzzy resolution for free. Also merged two identical register_folder_hierarchy functions via a small InsertSet trait. This was the biggest phase — touching vault_ops, the watcher, and the folder tree module.

Phase 3 — Clippy & hygiene. Two impl Default blocks (clippy’s new_without_default), and fixing the upsert_page return type from Result<i64> to Result<()>. The smallest phase. Three minutes of work, but it closes the gap between what the code says and what it does.

Phase 4 — Documentation. Module-level //! docs on all 14 modules. Struct and field docs on every public type. Crate-level doc with a consumer list. One gotcha: [[wikilink]] syntax in doc comments triggers rustdoc’s intra-doc link resolver, so you need \\[\\[ escaping. Zero cargo doc warnings.

Phase 5 — Integration tests. 31 tests across three test files: vault roundtrips, link resolution, and search. Real SQLite databases and temp directories, not mocks. Found a real bug: upsert_page used COALESCE on file_path, which prevented placeholder pages from being promoted to real pages when content was saved. The kind of bug you only find by exercising the full path.

That got us to 72 tests. Clean enough to look at. Not clean enough to publish.

The external review

We sent the four core files to an independent reviewer, another human agent collaboration, with our partner J.R. They scored each module on readability, robustness, and correctness:

ModuleReadabilityRobustnessVerdict
vault_ops.rs8.5/106/10Cross-system atomicity gaps, filename safety, rename doesn’t resync backlink FTS
link_parser.rs8/105.5/10Hand-rolled code block detection fragile, slugify underspecified
db.rs8.5/106.5/10PRAGMA state leakage in rename, missing schema constraints
graph.rs8.5/107.5/10Duplicate edges, nondeterministic ordering

Readable code with real correctness gaps. The reviewer was right on every point.

Forty fixes

We fixed what mattered and deferred what didn’t.

vault_ops: Filename sanitization for Windows reserved names and unsafe characters. Backlink resync after rename — FTS, link rows, and graph edges all update when wikilinks are rewritten. DB transactions around multi-step mutations. H1 title extraction so the display title comes from content, not just the slug. Error propagation where unwrap_or_default() was silently eating database failures.

link_parser: Replaced ~60 lines of hand-rolled code fence detection with pulldown-cmark’s event stream. The engine already depended on pulldown-cmark — we just weren’t using it for the thing it does best. Also made slugify segment-aware: split on /, normalize each segment independently, drop empty segments.

db.rs: Schema constraints (CHECK on boolean fields, non-empty slugs). Deterministic query ordering. Link uniqueness index. Replaced raw BEGIN/COMMIT with SAVEPOINT/RELEASE so the PRAGMA cleanup always runs.

graph.rs: Deduplicated edges before insertion. Sorted all return values for deterministic output. Batched index rebuilds in orphan cleanup to eliminate quadratic behavior.

72 tests became 112. Code coverage hit 83%, with 96% on the core path.

The E2E test that found the last bug

We ran 18 scenarios through the CLI against a fresh vault. Create, save, search, backlinks, rename, delete. Everything passed except one: a page created via save_page("brand-new", ...) with a # Brand New Page heading couldn’t be loaded back by its slug after the vault was reopened.

The upsert path was using the H1 title for the filename. The filename generated a different slug on rescan. The page was there on disk. The database had orphaned it.

The fix: filename always derived from the slug. H1 only updates the display title. Same contract as Obsidian — the filename is the identity, the heading is the label.

We also wrote a negative test: rename a page, verify that links targeting a different page with a similar name are not rewritten. That’s the test that proves the engine is conservative — it rewrites what resolves to the renamed slug and leaves everything else alone.

The numbers

  • 14 engine modules documented
  • 112 tests (72 at Phase 5, 112 at publish)
  • 83% code coverage, 96% on core path
  • 0 unsafe blocks
  • 0 bare unwraps in non-test code
  • 0 clippy warnings
  • 0 dependency vulnerabilities (cargo audit)
  • 0 compiler warnings
  • 40+ fixes from external review
  • 1 real bug found in E2E testing
  • 25 files in the published crate, 47KB compressed

What we learned

The ordering still matters, but the real lesson was different this time.

The five-phase cleanup got the code from “works” to “readable.” The external review got it from “readable” to “correct.” The E2E test found the bug that neither humans nor unit tests caught — a slug identity drift that only manifested across vault reopen. Three levels of scrutiny, three different classes of bug.

The thing you can’t skip is the external review. We thought the code was clean after 72 tests and clippy zero. It wasn’t. The reviewer found real issues: transactions that could leave the database half-updated, filenames built from raw user input, a PRAGMA that leaked connection state. All fixed now. All invisible from inside the codebase.

The crate is live.

[dependencies]
lore-engine = "0.1"

112 tests. Dual MIT/Apache-2.0. A stranger can cargo doc --open, read the API, build a wiki engine on top of it, and trust that the types mean what they say.

That’s the bar. Not “it works” — “you can read it, and it’s honest about what it can’t do.”


lore-engine is the engine behind Lore, app #2 in the 4Worlds suite. Pure Rust wiki engine — SQLite FTS5, petgraph link graph, wiki link parsing. No GUI dependencies. Source on GitHub.