Changelog
What shipped, when, and why
Entry-by-entry notes for augment-it, following the Lossless changelog conventions.
-
Sources capture their own bibliography — authors, publisher, date
Fetch a source and it now fills in its own author(s), publisher, and publication date, straight from Jina's structured metadata — authors as an array, all editable, all written to both the file and the registry. Plus a green save-pulse on every field, and a fix so renamed files actually stay connected.
-
Strategy Curator ships its source surface — curate sources by domain:type
Pick a strategy, gather sources for it, and fully control each one: fetch full content, retry past anti-bot pages, edit the bibliographic fields, rename the file, attach a PDF you downloaded yourself, tag it — all writing straight to the corpus filesystem and a SurrealDB registry that never decouples. The landing-page-vs-PDF problem finally has an answer.
-
DB Resolver v0.0.0.3 — opportunities, auto-minted per record, many per org
Resolving a record now mints an opportunity: a first-class, client-scoped entity 1:1 with the source record and many:1 to the canonical org. Three pipeline rows for Accelerate the Future become three opportunities on one org — none merged, none lost. This is also where the client's CRM data finally lives.
-
Edit the canonical entity on the DB — with traceability back to the source records
Confirmed working in-app: you can now correct a canonical organization's name and slug after research, and the match to the original record holds. The bond is the immutable id; the slug is just its editable face, and every match rides back onto the source row.
-
DB Resolver v0.0.0.2 — the match rides home, and the canonical name becomes yours to fix
The resolver now stamps each match straight back onto the source record (so it survives a crash and exports back to the client's CSV), and lets the operator rename the canonical org's name/slug without ever breaking the bond — the immutable id holds, old slugs become aliases.
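A minimal sketch of the rename rule described above, assuming a hypothetical `CanonicalOrg` shape (the field names and the `renameOrg` helper are illustrative, not the resolver's actual code). It shows the id staying fixed while the previous slug is retained as an alias.

```typescript
// Hypothetical shape: the changelog only names an immutable id,
// an editable name/slug, and aliases for retired slugs.
interface CanonicalOrg {
  readonly id: string;
  name: string;
  slug: string;
  aliases: string[];
}

// Returns a new org value; the id is copied through untouched.
function renameOrg(org: CanonicalOrg, name: string, slug: string): CanonicalOrg {
  // The new slug must not also appear as an alias of itself.
  const aliases = org.aliases.filter((a) => a !== slug);
  // Keep the old slug reachable so earlier references still resolve.
  if (org.slug !== slug && aliases.indexOf(org.slug) === -1) {
    aliases.push(org.slug);
  }
  return { id: org.id, name, slug, aliases };
}

const original: CanonicalOrg = {
  id: "org:example-01", // placeholder id
  name: "Example Fdn",
  slug: "example-fdn",
  aliases: [],
};
const renamed = renameOrg(original, "Example Foundation", "example-foundation");
console.log(renamed.id, renamed.slug, renamed.aliases);
```

Because matches are keyed on `id`, nothing that points at the org needs rewriting when the name or slug changes.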
-
Record · DB Resolver — the first bridge from row-store records to canonical orgs
augment-it grew in two eras that never met: the row-store records and the SurrealDB canonical orgs. A new per-record surface bridges them — match each record to a canonical organization (or create one) and land its web presence as additive, deduped facts. First iteration works out of the box; a little UI polish before it ships.
-
Decile Hub connector — augment-it's first per-client API integration, built from the OpenAPI spec out
A VC client runs their fund + CRM on Decile Hub. Tonight augment-it learns to talk to it — a skill that codifies the contract and a TypeScript MCP server that exposes it as agent tools, both grounded in Decile's own OpenAPI spec rather than guesswork. It's the first connector in the per-client seam we've been building toward.
-
The corpus meets its org — a content_items ledger connects 500 fetched files to canonical entities, and a publisher / about / mentions model says how
528 markdown files on a laptop and 129 organizations in the cloud database knew nothing about each other. Tonight a reconcile pass joins them: every fetched URL becomes a content_items row that records who published it, who it's about, and (soon) everyone it mentions — 500 rows, zero leakage, idempotent and re-runnable.
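One way to get the idempotent, re-runnable behaviour described above is to key the ledger on the fetched URL. The sketch below assumes that keying and a hypothetical `ContentItem` shape; the real `content_items` schema may differ.

```typescript
// Hypothetical row shape based on the publisher / about / mentions model.
interface ContentItem {
  url: string;
  publisherOrgId: string | null;
  aboutOrgId: string | null;
  mentions: string[];
}

// Upsert by URL: re-running with the same input leaves the ledger unchanged.
function reconcile(
  ledger: Record<string, ContentItem>,
  fetched: ContentItem[],
): Record<string, ContentItem> {
  const next: Record<string, ContentItem> = { ...ledger };
  for (const item of fetched) {
    next[item.url] = item;
  }
  return next;
}

const fetchedFiles: ContentItem[] = [
  { url: "https://example.org/a", publisherOrgId: "org:1", aboutOrgId: "org:1", mentions: [] },
  { url: "https://example.org/b", publisherOrgId: "org:2", aboutOrgId: "org:1", mentions: [] },
];
const firstPass = reconcile({}, fetchedFiles);
const secondPass = reconcile(firstPass, fetchedFiles);
console.log(Object.keys(firstPass).length, Object.keys(secondPass).length);
```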
-
Affiliations go many-per-person — the single-org slot becomes a stack of cards, orgs autocomplete instead of retype, and one org can own many email domains
The one-org-per-person relationship is a lie at the data-model level. That is not how careers work: people sit on boards, advise, change jobs, keep a past CFO title. Tonight the single org slot becomes an open-ended stack of affiliation cards, each with its own free-text role written straight onto the graph edge, each able to find-or-create its org. Start typing a name and existing orgs autocomplete from canonical (so you pick 'Institute for Humane Studies' once, not retype it on every attendee).
-
Person-enrichment surface ships — the pulse pattern lands as our second flow shape, per-field Enter-to-save with cross-doc UUID verification, and orgs auto-detect across attendees
Last night the canonical layer landed in SurrealDB; tonight the operator-facing surface that turns sparse rows into rich ones starts running in the browser. The new `apps/person-enrichment/` remote is a *pulse-surface* — a per-event UI where the operator iterates one attendee at a time and fills as many dimensions as one Google search surfaces: name (paste a full name and it splits surname + first_name; or edit either field directly), additional emails, personal links (LinkedIn, X, Substack, GitHub — kind auto-inferred from URL), personal corpus (the LLM/RAG ingest target — articles, podcasts, interviews), an organization (find-or-create by slug), org links, and org corpus. Every input commits on Enter; every save announces the doc-table AND relational-table writes it caused; corpus URLs get a shared UUID across the `content_items` relational row and any number of `personal_corpus` / `org_corpus` array entries, with in-browser verification confirming the cross-doc id matches; orgs auto-detect on the next attendee from the same domain so the operator stops typing 'Stand Together Foundation' fourteen times. Two new context-v docs — the spec naming the pulse pattern, the issue file capturing the query lenses we'll want once `has_personal_link` observations accumulate. 14 of 177 reach-edu attendees enriched in real use this evening before the operator paused for a safety commit.
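The "kind auto-inferred from URL" step for personal links could look roughly like the sketch below. The `LinkKind` union, the `inferLinkKind` name, and the treatment of `twitter.com` as X are assumptions for illustration; the remote's actual rules are not shown in this entry.

```typescript
type LinkKind = "linkedin" | "x" | "substack" | "github" | "other";

// Classifies a pasted URL by hostname only; unparseable input is "other".
function inferLinkKind(raw: string): LinkKind {
  let host: string;
  try {
    host = new URL(raw).hostname.toLowerCase();
  } catch {
    return "other";
  }
  host = host.replace(/^www\./, "");
  if (host === "linkedin.com" || host.endsWith(".linkedin.com")) return "linkedin";
  // Assumption: legacy twitter.com links are filed under X.
  if (host === "x.com" || host === "twitter.com") return "x";
  if (host === "substack.com" || host.endsWith(".substack.com")) return "substack";
  if (host === "github.com") return "github";
  return "other";
}

console.log(inferLinkKind("https://www.linkedin.com/in/someone"));
console.log(inferLinkKind("https://someone.substack.com/p/a-post"));
```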
-
SurrealDB canonical layer lands with cross-client visibility — 1,059 persons, 10 orgs, 1 event, 1,245 observations, all client-tagged from day one
Yesterday's [[Joined-People-UI-and-the-Network-First-Pivot]] exploration ended with a clear conclusion: the filesystem-as-substrate posture that has carried augment-it this far stops working the moment we try to blend canonical entity data (LinkedIn truth, refreshable, shareable across clients) with proprietary per-engagement commentary. Tonight that posture changes. SurrealDB Cloud is live on `main/main`, wired through the Node SDK with a working connector, seeded with the 882 humain-vc LinkedIn-network persons from last night's tagged-briefing pipeline AND the 177 reach-edu attendees from a Stand Together event held last month — two clients sharing one canonical schema with not a single row of leakage between them, courtesy of a `client_access` array materialized on every entity. Plus the spec for the UI that closes the loop — a per-event enrichment surface where the operator turns 177 email-only sparse rows into named persons + properly-tagged organizations one row at a time. The fact log (`observations` table) captures every claim with provenance — who said what, when, on behalf of which client — so the next time we refresh from LinkedIn or pull a new attendee list, nothing overwrites an operator's hand-curated note. The bus that left yesterday with a network-first pivot is back tonight with the data substrate to make the pivot real.
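As an illustration of how a `client_access` array on every entity keeps two clients apart, here is a small in-memory sketch. The `Entity` interface and `visibleTo` helper are hypothetical; in the real system the filter would presumably run in the database query rather than in application code.

```typescript
// Hypothetical minimal entity: only the fields needed for the visibility check.
interface Entity {
  id: string;
  client_access: string[];
}

// Returns only the rows the given client is tagged on.
function visibleTo<T extends Entity>(rows: T[], clientId: string): T[] {
  return rows.filter((row) => row.client_access.indexOf(clientId) !== -1);
}

const persons: Entity[] = [
  { id: "person:1", client_access: ["humain-vc"] },
  { id: "person:2", client_access: ["reach-edu"] },
  { id: "person:3", client_access: ["humain-vc", "reach-edu"] },
];
console.log(visibleTo(persons, "reach-edu").map((p) => p.id));
```

A person known to both clients carries both tags, which is how one canonical row can be shared without either client seeing the other's exclusive rows.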
-
LinkedIn network → tagged briefing pipeline lands end-to-end; augment-it's network-first sibling-flow gets named
A real consulting engagement walked in asking for a curated NYC dinner list, and the org-first frame augment-it has been living in didn't fit. Tonight: ~250 LinkedIn profiles captured via Crawlbase land as flat CSV + full-shape JSONL (skeletons flagged for retry, no fields silently dropped), ~200 LinkedIn 'Save to PDF' downloads get slug-renamed via pdftotext extraction, a browser snippet writes a forward manifest so we never need to reconcile by hand again, the deep-profile DOM extractor catches up to LinkedIn's late-2025 hashed-class layout (pronouns / followers / connections / website / cover photo / robust headline classifier), and three composition scripts join everything into a tag-grouped markdown briefing and a brand-themed HTML export. The exploration sitting alongside the code names what just happened structurally: org-first and people-first are two pivots over the same org↔people join, and a canonical layer (the LinkedIn pipeline) belongs cleanly separated from a proprietary layer (per-engagement commentary). Sub-scale (~30K enumerable venture entities) is the differentiator the big CRMs structurally can't occupy.
-
Workspaces become a tenant primitive — top-right switcher in the shell, per-workspace `.env` loaded scoped (not into `process.env`), `client_id` flows through every chat turn, hardcoded reach-edu retires
augment-it was filing all its work under one tenant (reach-edu) and the slab that told the chat what `client_id` to use literally hardcoded the string. Tonight that ends. The shell grows a workspace switcher in the top-right of the header that lists every directory found under `clients/` (today: humain-vc, reach-edu), and clicking one persists the choice to localStorage, broadcasts it via `augment-it:workspace-changed`, and tells the workspace-service which `client_id` is process-wide active. The chat slab now reads the operator's pick from `ctx.client_id` (forwarded on every chat turn) and falls back to the server's process-wide active value — the hardcoded reach-edu string is gone. The load-bearing piece is the env-var seam: `clients/<slug>/.env` gets loaded into a frozen, in-memory map keyed by slug at workspace-service boot. We deliberately do NOT merge into `process.env` because the workspace-service runs in one process and serves all tenants — `process.env` would bleed Decile keys from humain-vc into a reach-edu chat turn. The connector-config seam (LLM / search / CRM / MCP / storage) the spec calls for lands in step 2; tonight surfaces the raw env, hides nothing, and gives Decile a clean home to slot into next.
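The env-var seam described above can be sketched as a frozen map keyed by workspace slug. This is a simplified illustration: the parser handles only plain `KEY=value` lines, the file contents are passed in as strings rather than read from `clients/<slug>/.env`, and the key names (`DECILE_API_KEY`, `JINA_API_KEY`) are made up for the example.

```typescript
type EnvMap = Readonly<Record<string, string>>;

// Minimal KEY=value parser; skips blank lines and # comments.
// Quoting and escapes are deliberately not handled in this sketch.
function parseDotenv(text: string): EnvMap {
  const out: Record<string, string> = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue;
    out[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return Object.freeze(out);
}

// Builds one frozen map per slug. Nothing is written to process.env,
// so one tenant's keys cannot leak into another tenant's turn.
function buildWorkspaceEnv(
  filesBySlug: Record<string, string>,
): Readonly<Record<string, EnvMap>> {
  const bySlug: Record<string, EnvMap> = {};
  for (const slug of Object.keys(filesBySlug)) {
    bySlug[slug] = parseDotenv(filesBySlug[slug]);
  }
  return Object.freeze(bySlug);
}

const workspaceEnv = buildWorkspaceEnv({
  "humain-vc": "DECILE_API_KEY=example-key\n",
  "reach-edu": "# no Decile key for this tenant\nJINA_API_KEY=example-key\n",
});
console.log(workspaceEnv["reach-edu"]["DECILE_API_KEY"]); // undefined
```

A lookup always goes through the slug first, so a reach-edu chat turn has no path to the humain-vc keys.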
-
Corpus chips stop lying — the records sheet column the operator was already editing becomes the join key; and Jina's `Published Time:` stops sitting unread in the body of every capture, lifting to top-level frontmatter on every write going forward and backfilled into 244 .md files retroactively
Tonight started with a small symptom — three rows in the Sort & Filter Lens on the v10 record set showing `corpus 0` despite per-funder corpus directories full of files on disk: hewlett-foundation (2 files), howard-schultz-foundation (6), kellogg-foundation (6). The yesterday-night issue file had named four others (sobrato, stand-together, cohen, todd-fisher) and proposed a per-client `corpus-overrides.yaml` as the defense-in-depth fix. Hard-refresh resolved those four cleanly — but exposed these three. A second NATS probe confirmed the backend was returning the correct counts (2 / 6 / 6) end-to-end for the new symptoms too. The actual bug was in the wire, not the join: `corpus.list_for_record` had no `CAPABILITY_TIMEOUTS_MS` entry so dispatch fell through to the 5000ms default, the content-ingest handler is a serial `for await` loop, and each call walked every funder directory under `clients/<id>/corpus/` (~40 dirs × ~300 .md files). With 96 visible rows firing 96 parallel requests on view load, late ones in the burst timed out and the lens swallowed the error silently — race timing explained why sobrato et al. recovered post-refresh and hewlett/schultz/kellogg didn't. The operator's framing: *'shouldn't there be a pretend join where the column in the records sheet for corpus references a folder name/relative path?'* The records sheet already populates `corpus_funder_slug` per row. Wiring that into `listForRecord` as the primary join key dissolved THREE things at once: the bug (operator's explicit assertion routes around every lineage edge case), the timeout race (per-request walk went from 300 files to ≤13), and the planned `corpus-overrides.yaml` proposal (the cell IS the override surface — no new file format needed). Lineage stays as the fallback for rows without a slug. 
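The slug-first join described above can be sketched as follows, assuming a hypothetical `RecordRow` shape and a lineage lookup passed in as a function. The real `listForRecord` signature is not shown in this entry, so the names here are illustrative only.

```typescript
// Hypothetical row shape: corpus_funder_slug is the operator-edited cell.
interface RecordRow {
  record_uuid: string;
  corpus_funder_slug?: string;
}

// Primary key: the operator's explicit slug. Fallback: lineage lookup.
function resolveCorpusDir(
  row: RecordRow,
  lineageLookup: (recordUuid: string) => string | null,
): string | null {
  const slug = (row.corpus_funder_slug ?? "").trim();
  if (slug !== "") return slug; // one directory to read, no tree walk
  return lineageLookup(row.record_uuid);
}

// Stand-in for the slower lineage walk.
const lineage = (uuid: string): string | null =>
  uuid === "uuid-legacy" ? "legacy-funder" : null;

console.log(resolveCorpusDir({ record_uuid: "uuid-1", corpus_funder_slug: "hewlett-foundation" }, lineage));
console.log(resolveCorpusDir({ record_uuid: "uuid-legacy" }, lineage));
```

Resolving to a single directory up front is what shrinks the per-request walk, since only that funder's files need to be listed.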
While the door was open we noticed something else: Jina's response body has a `Published Time:` preamble line on every capture, and we'd been writing it to the body but never lifting it into frontmatter. The sort/filter UIs had no first-class authored-date field. A six-line addition to `jina.ts` parses the preamble + a `liftPublishedAt` helper in `corpus.ts` promotes it between `fetched_at:` and `record_id:` on both writers. A sibling backfill script rescued 244 dates from existing files (some going back to Wikipedia article creation dates from 2004 — the corpus carries deep history we couldn't see). Also tonight: 226 hand-curated captures across 20 funder directories + 86 inbox triages from the operator's v10 hand-search rhythm, including the first `manual-local-pdf` convention (operator drops a ResearchGate PDF from a local download; sidecar .md flags the missing canonical URL for later replacement). The slug-join means each of those 20 new funder dirs surfaces on its row's chip the moment the dir exists, no record_uuid plumbing required.
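A rough sketch of the preamble lift, under stated assumptions: frontmatter is modelled as an array of `key: value` lines, the lifted key is called `published_at`, and the function is named `liftPublishedAtSketch` to make clear it is not the actual `liftPublishedAt` helper in `corpus.ts`.

```typescript
// Pulls the value from a "Published Time: ..." preamble line, if present.
function parsePublishedTime(body: string): string | null {
  const match = /^Published Time:\s*(.+)$/m.exec(body);
  return match ? match[1].trim() : null;
}

// Inserts published_at directly after fetched_at (so it sits before
// record_id when that line follows). No-op if nothing to lift or
// the key is already present.
function liftPublishedAtSketch(frontmatter: string[], body: string): string[] {
  const published = parsePublishedTime(body);
  if (published === null) return frontmatter;
  if (frontmatter.some((line) => line.startsWith("published_at:"))) return frontmatter;
  const fetchedIdx = frontmatter.findIndex((line) => line.startsWith("fetched_at:"));
  const insertAt = fetchedIdx === -1 ? frontmatter.length : fetchedIdx + 1;
  return [
    ...frontmatter.slice(0, insertAt),
    `published_at: ${published}`,
    ...frontmatter.slice(insertAt),
  ];
}

const lifted = liftPublishedAtSketch(
  ["url: https://example.org/post", "fetched_at: 2026-06-01T00:00:00Z", "record_id: abc"],
  "Title: Example\nPublished Time: 2004-03-01T12:00:00Z\n\nBody text",
);
console.log(lifted.join("\n"));
```

The same function works for a backfill pass over existing files, since it leaves files that already carry the key untouched.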
-
Inbox verb stops losing PDFs — the binary lands alongside Jina's markdown, sha256-verified, LFS-tracked; the chat composer grows an upward-opening commands popover so the verb is discoverable from the surface itself
Yesterday `/inbox <url>` shipped and the very first capture exposed the load-bearing failure: the DOL workforce-strategy PDF Jina-extracted into a 5KB markdown stub and the original binary was *gone* — no page-numbered citation, no figure reference, no re-extraction by a better tool, no copy to hand a co-researcher. Today the inbox handler grew a HEAD-probe → URL-suffix → streamed-GET binary downloader behind the same NATS subject; the .pdf lands as a sibling of the .md with a matching dated slug, sha256-verified end-to-end (the response envelope hash matches the on-disk hash, byte for byte), under a per-client `.gitattributes` that LFS-tracks `*.pdf` so the per-client repo's git pack stays sane. Verified live: `/inbox https://insights.hanoverresearch.com/.../Top-Career-Skills-for-2-Year-Grads-2026.pdf?_gl=…` (full Hanover URL with eight tracking params) landed a 1.15 MB PDF (`file` confirms PDF v1.7) next to a markdown index with the `binary_asset:` frontmatter block populated. Same scaffolding extends to docx/pptx/xlsx the moment those become operator pain. While we were in `apps/chat/` we added a small unrelated affordance the inbox verb had been quietly begging for: a Commands popover under the composer that opens *upward*, lists the slash-verb registry, and inserts the chosen verb at the start of the textarea — discoverability for a surface where the only way to know `/inbox` existed was to read the changelog.
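The HEAD-probe then URL-suffix decision can be illustrated with a pure function that takes the probed content type (or `null` when the probe is inconclusive) and the URL. This is a sketch of the decision order only; the `isPdfCandidate` name is hypothetical, and the streamed download and sha256 verification are not shown.

```typescript
// Decides whether a URL should be fetched as a binary PDF.
// Step 1: trust the HEAD content type when it says PDF.
// Step 2: otherwise fall back to the path suffix, ignoring query params.
function isPdfCandidate(url: string, headContentType: string | null): boolean {
  if (headContentType !== null) {
    const mime = headContentType.toLowerCase().split(";")[0].trim();
    if (mime === "application/pdf") return true;
  }
  try {
    // pathname excludes the query string, so tracking params do not matter.
    return new URL(url).pathname.toLowerCase().endsWith(".pdf");
  } catch {
    return false;
  }
}

console.log(isPdfCandidate("https://example.com/reports/skills.pdf?_gl=abc&utm_source=x", null));
console.log(isPdfCandidate("https://example.com/download?id=42", "application/pdf"));
```

Checking the parsed pathname rather than the raw string is what lets a URL with many tracking parameters still be recognized by its `.pdf` suffix.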
-
Agent-chat ships its first non-prompt verb — `/inbox <url>` saves to the operator's Corpus Inbox via Claude Sonnet 4.6 with active-client context; companion architecture spec defines the three-layer cleanup path
The chat surface (`apps/chat/`) has been mechanically functional for weeks — Anthropic SDK in `services/prompt-runner/src/chat-turn.ts`, four-cache-eligible system slabs assembled in `services/workspace/src/chat.ts`, three-mode tool dispatch (chat_answer / chat_propose / chat_invoke), Svelte transcript with accept-proposal cards — but it knew about three verbs (`prompt.draft`, `prompt.improve`, `prompt.apply`) and one context dimension (`record_set_id`). For the funder content corpus workflow the operator is actually in, that meant the chat had nothing to offer. This commit ships the first non-prompt verb: `/inbox <url>` recognized by the model, dispatched as `corpus.inbox.add`, writes to `clients/reach-edu/corpus/inbox/` with the extended frontmatter contract from the Corpus-Inbox-Capture-and-Triage spec. Claude is the default and was the default; what changed is the chat now has something useful to do with the API key it already had. Companion spec `Chat-Context-Awareness-Architecture.md` lands alongside, naming the three v0.0.1 design holes the inbox verb exposed (hand-written verb roster, hardcoded active_client_id in the context slab, ActiveView covers only half the microfrontends) and defining a three-layer migration plan (workspace as context broker → slab-assembly contract → verb registry) that each layer ships independently. Path B per the user's pick — ship the verb tonight, spec the architecture once it's working.
-
Content Reader manual URL add — operator pastes a URL from any search, lands in the funder's corpus regardless of domain; spec clarifies Rule 1 binds pack outputs only; exploration filed for in-app browser vs plugin
The Funder Content Corpus Workflow's pack layer is doing its job for the records that have crawlable indexes, but for the records where the pack finds nothing the operator was stuck. Manual URL add closes the loop: every Content Reader card now has a collapsed '+ add URL manually' affordance that takes any URL the operator found via their own search, Jina-fetches it, and lands it as a corpus markdown file with `pack_id: manual`. Same-host (Rule 1) is NOT enforced — that rule binds pack outputs only; Rule 5 (operator decides per item) trumps for manual flows. Verified end-to-end with `lawserver.com/law/state/alabama/al-code/alabama_code_41-29-333` posting into `alabama-state-legislature-appropriations-funds/` despite the URL being off-domain. The spec was updated (v0.0.0.2) to make this scope explicit so a future agent doesn't re-add the same-host filter to manual paths. A separate exploration captures the next-step trade-off — embedded in-app browser vs browser plugin/bookmarklet — for shrinking the operator's search → paste round-trip further.
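The scope of Rule 1 can be expressed as a small guard: same-host is checked for pack outputs and skipped when `pack_id` is `manual`. The sketch assumes the comparison is against the record's own host and that a leading `www.` is ignored; both are illustrative choices, as is the `passesSameHostRule` name.

```typescript
// Strips a leading "www." so example.org and www.example.org compare equal.
function bareHost(host: string): string {
  return host.toLowerCase().replace(/^www\./, "");
}

// Rule 1 binds pack outputs only; manual adds are the operator's call.
function passesSameHostRule(candidateUrl: string, recordHost: string, packId: string): boolean {
  if (packId === "manual") return true;
  try {
    return bareHost(new URL(candidateUrl).hostname) === bareHost(recordHost);
  } catch {
    return false; // unparseable pack output is rejected
  }
}

console.log(
  passesSameHostRule(
    "https://lawserver.com/law/state/alabama/al-code/alabama_code_41-29-333",
    "example-funder.org", // placeholder record host
    "manual",
  ),
);
```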
-
Funder Corpus First Session Failed — 75 markdown files across 15 of 96 funders, six hours invested, multiple workflow failure modes catalogued for tomorrow's restart
An end-of-day shipping note that is not about success. The reach-edu funder content corpus workflow was supposed to land tonight against the 96-row Master-Pipeline-Tracker. Final state: 75 markdown files written across 15 funder subdirectories under `clients/reach-edu/corpus/`. 81 of 96 funder records have zero corpus content. The operator's framing at session-end: 'this is all fucking bullshit, nothing is working the way it should, I was only able to process like 1/3 of the plausible records.' This changelog is the honest version. The goals spec landed (Funder-Content-Corpus-Workflow.md, eight rules + six-step workflow), the implementation rebuild against it landed in code (content-ingest service, Content Reader UI, pack honors operator curation, dispatch refuses invalid URLs, fire_id stamping), and 25 broken-URL rows were force-repaired across 5 promotion generations. But the end-to-end validation — fix URLs, re-fire entity-blog, evaluate fresh data — never happened because the session consumed itself patching display-layer symptoms of yesterday's broken-pack-broken-URL data instead of producing fresh clean data via a re-fire. The honest accounting: substantial code shipped, partial data landed, root causes mostly unfixed, eight named failure modes documented in `context-v/issues/Funder-Corpus-First-Session-Failed-Most-Records-Unprocessable.md` for morning-self to attack as a punch list.
-
URL auto-detector ships clickable links for socials / helpful_links / official_updates_index_urls; Per-Client-Privacy exploration lands; reach-edu becomes the first client submodule
Three coupled drops in the same session. First, the URL auto-detector + clickable rendering plan dropped earlier today was implemented end-to-end in Record Collector: `apps/record-collector/src/logic/format.ts` grew a `FieldShape` discriminated union (`empty | scalar | scalar_url | url_list | json`) that picks by VALUE-shape, never field-name, plus a `UrlEntry` extractor that surfaces `url + chip (from pack_id) + label (from label or display_name)` from `Array<{ url, ... }>` payloads. The structured-value branch in `App.svelte` switches on `shape.kind`: `url_list` renders as a vertical flex of clickable `<a target="_blank" rel="noopener noreferrer">` entries showing hostname+path (truncated at 60 chars with full URL in `title=`) so the operator can verify links resolve without leaving Record Collector. Auxiliary metadata (display_name, confidence, source_metadata, response_id, accepted_at, ...) drops from the rendered view but stays on the row and round-trips through CSV. Read-only by design; the per-entry remove affordance lives in a sibling plan, not this one. The keyed each block was unkeyed mid-implementation when `(entry.url)` was identified as a Svelte 5 duplicate-key crash risk against any row whose URL array has repeats; for read-only lists, positional iteration is the right call. Second, a Response Reviewer drill-through audit on the live response-store revealed 1,982 responses with 1,109 (56%) from the three Entity-Pulse OfficialPulse packs at a combined 3-accept rate (`official-blog-pack` 99.5% reject, `official-social-posts-pack` 100% reject, `official-pressrelease-pack` 100% reject), confirming the operator's 'tons of junk links' read of the surface. Domain breakdown showed 65% of pressrelease responses come from `news.google.com` (the google-news-rss connector with weak entity-name binding); 45% of social-posts responses come from `youtube.com` (weakly bound).
The packs with strong identity binding — wikipedia (matches article title to entity), linkedin (returns entity's own page) — reject at the normal 73-82% rate for a triage workflow. Third, the operator's response to the junk-volume reframed the next-feature direction entirely: STOP triaging URLs one-at-a-time in Response Reviewer; START treating those URLs as a seed list for a Jina.ai content-ingestion pipeline that writes deduped markdown per funder; build a per-client content corpus that powers cross-funder fundraising-strategy synthesis and per-funder outreach customization. That direction surfaced the broader ai-labs architectural question the operator named explicitly — when does the single-operator-on-localhost posture stop scaling, what stack do we reach for, and how do we architect today so the move is cheap when it comes? — and produced `context-v/explorations/Per-Client-Privacy-and-the-Path-Off-Local.md` (v0.0.0.1, Draft) mapping five candidate stacks (defer-everything, per-client Railway single-tenant, multi-tenant SaaS shape, hybrid posture, managed BaaS) across five axes (repo topology, storage substrate, identity/auth, multi-tenant data model, sensitivity constraints), naming six decision-forcing functions, and five architecture choices that cost almost nothing today but preserve cheap optionality for every path-flip. Operator signed off on Path D — hybrid posture — and committed to start local: `lossless-group/augment-reach-edu` created as a private repo (default README); `clients/reach-edu` registered as a git submodule pointing at it. The reach-edu corpus + operational data + Jina ingest pipeline land in that submodule in follow-up sessions; the augment-it parent stays on the docker-local posture until one of the named forcing functions actually fires.
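To make the value-shape idea from the first drop concrete, here is a hedged sketch of a classifier over the five kinds named above. The payload fields on each variant, the `classify` name, and the exact URL test are assumptions; the real `format.ts` may carry different data per kind.

```typescript
// The five kinds come from the entry; the payload fields are illustrative.
type FieldShape =
  | { kind: "empty" }
  | { kind: "scalar"; text: string }
  | { kind: "scalar_url"; url: string }
  | { kind: "url_list"; urls: string[] }
  | { kind: "json"; json: string };

function isHttpUrl(v: unknown): v is string {
  return typeof v === "string" && /^https?:\/\/\S+$/i.test(v.trim());
}

// Decides by the shape of the value only; the field name is never consulted.
function classify(value: unknown): FieldShape {
  if (value === null || value === undefined || value === "") return { kind: "empty" };
  if (Array.isArray(value)) {
    if (value.length === 0) return { kind: "empty" };
    const urls: string[] = [];
    for (const entry of value) {
      if (typeof entry === "object" && entry !== null && isHttpUrl((entry as { url?: unknown }).url)) {
        urls.push((entry as { url: string }).url);
      }
    }
    // Only a list where every entry carries a url counts as a url_list.
    if (urls.length === value.length) return { kind: "url_list", urls };
    return { kind: "json", json: JSON.stringify(value) };
  }
  if (typeof value === "object") {
    if (Object.keys(value as object).length === 0) return { kind: "empty" };
    return { kind: "json", json: JSON.stringify(value) };
  }
  if (isHttpUrl(value)) return { kind: "scalar_url", url: value.trim() };
  return { kind: "scalar", text: String(value) };
}

console.log(classify([{ url: "https://example.org/a", pack_id: "p1" }]).kind);
console.log(classify({ note: "x" }).kind);
```

Repeated URLs are kept in order rather than deduplicated, which matches the positional (unkeyed) iteration chosen for the read-only list.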
-
Record-Set Family Grouping ships — the spec's six capabilities, family-tree rendering, post-ingest suggestion prompt, and uniform newest-on-top sidebar
The spec dropped earlier today (`context-v/specs/Record-Set-Family-Grouping.md`, v0.0.0.1) was implemented end-to-end against the live :3002 surface. RecordSet picked up the two new fields (`variant_family_id`, `variant_family_label`); the row-store grew a top-level `variant_families` slot plus five mutators (create / update / add / remove / dissolve) and one read-only heuristic (`suggest_variant_family`) that normalizes a filename stem (strips date prefix, `_vN` suffix, extension) and matches against existing sets with the same stem and ≤ 3-column schema delta. The workspace WS bridge registered the six new capabilities. Browser-side: `apps/record-collector/src/logic/family.ts` walks every leaf back through `promoted_from` to assemble lineage chains, groups leaves by `variant_family_id`, and sorts every list (groups, members, archived ancestors) newest-on-top per the user's mid-implementation correction (spec edited to match: §Sidebar rendering now reads 'Newest-first is uniform across the sidebar'). `RecordSetsList.svelte` renders a collapsible family card with chevron header, member count, optional generation count, two inline action icons (✎ rename, × dissolve), and an 'Earlier generations (N archived)' sub-section under any leaf with promoted predecessors. Default sidebar query excludes archived sets at the top level. `App.svelte` calls `record_set.suggest_variant_family` after every ingest; if a match comes back AND the user hasn't dismissed that stem before (sticky per-stem in localStorage), a non-blocking 'Looks like a variant of X (N existing sets) — link as family?' prompt appears under the ingest status with Link / Dismiss buttons. Backwards-compatible: every new field defaults `undefined`; pre-existing stores deserialize without a migration. Existing `…_v4`–`…_v8` sets render as solo cards until the next ingest fires the suggestion.
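The filename-stem heuristic could be approximated as below. The regular expressions, the helper names, and the way schema delta is counted (columns present in one set but not the other) are assumptions made for the sketch, not the row-store's actual `suggest_variant_family` logic.

```typescript
// Strips extension, then a trailing _vN, then a leading YYYY-MM-DD prefix.
function normalizeStem(filename: string): string {
  return filename
    .replace(/\.[A-Za-z0-9]+$/, "")
    .replace(/_v\d+$/i, "")
    .replace(/^\d{4}-\d{2}-\d{2}[_-]/, "");
}

// Counts columns that appear in exactly one of the two schemas.
function schemaDelta(a: string[], b: string[]): number {
  const onlyA = a.filter((col) => b.indexOf(col) === -1).length;
  const onlyB = b.filter((col) => a.indexOf(col) === -1).length;
  return onlyA + onlyB;
}

// Suggestion only: same stem and at most 3 differing columns.
function looksLikeVariant(
  fileA: string, colsA: string[],
  fileB: string, colsB: string[],
): boolean {
  return normalizeStem(fileA) === normalizeStem(fileB) && schemaDelta(colsA, colsB) <= 3;
}

console.log(normalizeStem("2026-06-05_Master-Pipeline-Tracker--Active-Pipeline_v8.csv"));
// Master-Pipeline-Tracker--Active-Pipeline
```

The order of the three replacements matters: the extension has to go first so that `_v8` is at the end of the string when the suffix pattern runs.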
-
Record Collector renders every field regardless of type, splits the Augment-this-Set CTA into Prompt vs Bundle/Packs, and drops a spec for record-set family grouping
Three things shipped in one short session against the live :3002 surface. First, the per-record field renderer in Record Collector stopped silently hiding structured values — arrays and objects now serialize through the same path as the CSV export (`logic/format.ts`, mirroring `download.ts:csvEscape`), and any empty value (scalar `''`, `null`, `undefined`, `[]`, `{}`) renders a muted italic `(empty)` placeholder so the operator can confirm the prior enrichment pass populated the field before kicking off the next one. The 11px muted JSON style bumped to a readable 12px, normal weight, on the field background. Second, the single `Augment This Set →` button became a small `Augment this Set` header with two stacked CTAs — `Run a Prompt →` (navigates to `promptTemplateManager`) and `Run a Bundle / Packs →` (navigates to `packRunner`) — so the user picks the divergence at the Record Collector surface instead of landing on a default and toggling in-slot. The set-header also restacked vertically so long versioned filenames (`2026-06-05_Master-Pipeline-Tracker--Active-Pipeline_v8.csv`) wrap above the CTA panel instead of squeezing it offscreen. Third, when the operator surfaced that the sidebar treats five sequentially-uploaded variants of the same external tracker (`…_v4.csv` through `…_v8.csv`) as five unrelated peers, that produced `context-v/specs/Record-Set-Family-Grouping.md` (v0.0.0.1, Draft) — a spec that separates internal-lineage families (already in `promoted_from`) from external-variant families (new `variant_family_id` field), names five new workspace capabilities, defines the suggestion-only filename-stem heuristic, and locks that the Augment-this-Set CTAs target the active leaf, never the family.
-
Records Surface ships an end-to-end per-record augmentation loop — fire a connector, accept inline, save as the next version, navigate to download or augment-again
After a long arc that re-spec'd the bundle/pack flow from scratch and ended a stack of architectural overreach, augment-it now has a working per-record surface where the operator picks ONE record, fires ONE connector (Firecrawl scan, Firecrawl + Haiku agent, or SerpApi `site:` search), watches the candidate URLs render inline, picks one (or pastes/edits a custom URL), and moves on. Acceptances land in an array column so an entity with multiple canonical paths (Arthur Blank Foundation's /news/ + /blogs/, Ascendium's /newsroom + /our-stories) carries all of them. A save bar at top AND bottom of the records list invokes the existing `record_set.promote` capability with confirmation, the new version auto-becomes the active set, and a post-promote nav surface offers Enhanced Records (download), Augment (next pass), or stay. The same session also produced three new context-v documents that explain the rearchitecture, audit the v3 → v4 → v5 chain to clear promote of a wrongly-pinned bug, and lock the row-count-stable-across-versions invariant. Record Collector got decomposed into proper components with a descending-by-created_at sort and a per-card CSV download button. The user ran the full loop end-to-end on the 96-row pipeline tracker, accepted 98 URLs across 45 rows, and promoted v5 → v6 cleanly.
-
Entity Pulse step 1 — official-blog-pack standalone + iso-helper for date normalization across packs
First implementation step of the Entity-Pulse-Bundle spec lands as a working two-stage pack (find-index → extract) plus a generic ISO-8601 normalizer that solves date extraction across heterogeneous web sources. Smoke-tested against reach.edu: 5 real blog posts pulled end-to-end with ISO publish dates + computed age_days in 4.4s. SerpApi + Firecrawl wired as new connectors; new NATS subject `pack.entity_pulse.requested`; standalone CLI driver for one-off fires and future Agent Chat integration.
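As a rough picture of what a date normalizer plus `age_days` computation involves, the sketch below handles two input styles (ISO-like strings and "Month D, YYYY"). The function names and the supported formats are assumptions; the actual iso-helper presumably covers many more source formats.

```typescript
const MONTHS = [
  "january", "february", "march", "april", "may", "june",
  "july", "august", "september", "october", "november", "december",
];

function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

// Returns YYYY-MM-DD, or null when the input matches neither supported style.
// Calendar validity (e.g. month 13) is not checked in this sketch.
function toIsoDate(raw: string): string | null {
  const s = raw.trim();
  const iso = /^(\d{4})-(\d{2})-(\d{2})/.exec(s);
  if (iso) return `${iso[1]}-${iso[2]}-${iso[3]}`;
  const long = /^([A-Za-z]+)\.?\s+(\d{1,2}),?\s+(\d{4})$/.exec(s);
  if (!long) return null;
  const name = long[1].toLowerCase();
  // Accept the full month name or its three-letter abbreviation.
  const month = MONTHS.findIndex((m) => m === name || m.slice(0, 3) === name);
  if (month === -1) return null;
  return `${long[3]}-${pad2(month + 1)}-${pad2(Number(long[2]))}`;
}

// Whole days elapsed between the ISO date (at 00:00 UTC) and `now`.
function ageDays(isoDate: string, now: Date): number {
  const then = Date.parse(`${isoDate}T00:00:00Z`);
  return Math.floor((now.getTime() - then) / 86400000);
}

console.log(toIsoDate("March 5, 2026"));
console.log(ageDays("2026-03-05", new Date("2026-03-15T12:00:00Z")));
```

Passing `now` in explicitly keeps the age computation deterministic, which makes it easy to check against known dates.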
-
Composite Slots & the Flow Rename — One Toggle, Every Layout Mode
We thought the spec called for two patches and a rename. Halfway through the second patch the architecture broke in our face — a Split view with a working ✎/⊞ toggle but a stubbornly empty pane next to it — and we realized the toggle didn't belong inside either remote, it belonged to the slot. The fix turned into a shell-level concept that pays off in every layout mode.
-
Shell & micro-frontend UX coherence — a demo-prep session pivots from patching into a broad-scope spec, four sibling stubs, and the framing of augment-it's dual identity
What started as live demo-prep — patching a sticky Fire button, wiring a dead 'Do another round' affordance, adding a navigation to where results land — became a verdict: augment-it's affordances hide, die, mismatch, and (a newly emerging fourth shape) misname. Rather than patching one more thing, the session pivoted to spec mode. The result is a whole-shell UX coherence audit with eight locked decisions, twelve evidence items, four surfaces audited, and three open architectural questions; four sibling context-v stubs (per-remote in-app API docs, initial user experience, the dual-identity blueprint, the auth-patterns-following-Astro-Knots blueprint); and a substantive pickup doc so the next session can resume cold. No code shipped from the spec arc itself — everything held at the sign-off gate — except three small forward-affordance fixes that landed alongside as a separate fix commit and are recorded in the spec's evidence log.
-
SearXNG joins as a peer provider — social packs default to the free metasearch container; Tavily stays for content-RAG; provider_override seam wired through the stack
The reframe that took shape in late May lands as code: search-provider choice is now an architectural axis of the stack, not a Tavily-vs-anything-else swap. SearXNG arrives as a self-hosted container peer to Tavily, with its own connector under a new connector registry; the common-seven social packs flip their default to SearXNG (no key needed); Tavily stays wired in as the peer for content-RAG packs that need it; and a `provider_override` parameter threads from the response-reviewer's per-record buttons through workspace and social-search so the user can fire any pack through either provider on any row without touching the pack definition. The pre-flight surface for per-row iteration is now in place — the iteration loop itself still pending.
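The `provider_override` seam amounts to a per-call parameter that wins over the pack's default. A minimal sketch, with a hypothetical `Pack` shape and made-up pack ids:

```typescript
type Provider = "searxng" | "tavily";

// Hypothetical pack definition: only the default provider matters here.
interface Pack {
  id: string;
  defaultProvider: Provider;
}

// The override, when supplied, wins for this call only;
// the pack definition itself is never modified.
function resolveProvider(pack: Pack, providerOverride?: Provider): Provider {
  return providerOverride ?? pack.defaultProvider;
}

const socialPack: Pack = { id: "example-social-pack", defaultProvider: "searxng" };
const contentPack: Pack = { id: "example-content-rag-pack", defaultProvider: "tavily" };

console.log(resolveProvider(socialPack));            // searxng
console.log(resolveProvider(socialPack, "tavily"));  // tavily
console.log(resolveProvider(contentPack));           // tavily
```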
-
Packs and Bundles, end-to-end — the extension pattern goes from exploration to a live microfrontend + microservice + triage cockpit in two days
Ten commits between 2026-05-25 17:08 and 2026-05-26 02:53 carry augment-it through a complete arc: a tidy-and-status-sweep across the existing context-v tree, a new exploration (Entity-Profile-Augmentation-Workflow) that surfaced a two-tier abstraction, a forked blueprint (Packs-and-Bundles-Pattern) that locked the contracts, a new shared package (@augment-it/shared-ui with the first reusable component — ConfidencePill), a structured-output extension to the response-store schema, a new federated remote (pack-runner at :3009), a new backend microservice (social-search) with six (then seven) packs and Tavily as the v1 connector, a row-store write-back surface (row.fields.socials), a paired-authoring shell rewire that took pack-runner out of the rotation and made it a peer to prompt-template-manager, a complete by-record triage view in response-reviewer (:3005) with inline URL / display_name / entity-name editing, two design pivots that got documented honestly (the profiles.<source> column scrap, the Run-entity plan deferred to its own arc), and emergent-requirements promoted into the blueprint as a §Triage Surface UX Requirements section for the rest of the Lossless family. The system that exists at the end of these two days is recognizably the same shape as the system at the start and recognizably a different product.
-
augment-it's response-reviewer remote (localhost:3005) is in a good place — the triage surface for the structured-output pattern stabilizes after many commits
Over the last two days the response-reviewer microfrontend at localhost:3005 went from a basic review UI to the triage surface for every pack augment-it will ever run. The structured-output extension that landed on 5/25 was the seam; what followed was a fast sequence of commits — many attempts, two scraps, real-data smokes — that turned the remote into a per-row, per-pack triage cockpit with inline correction (URL), human-supply (display_name), and entity-name cleanup (the row's identity column) all riding the same response.set_structured subject. Two emergent-requirements rounds also got captured into the Packs-and-Bundles blueprint so future packs across the Lossless family inherit the discipline. The SearXNG-substrate decision is filed as an issue but deliberately not yet executed — the remote is good enough that the next session can start somewhere new.
-
All Data Continues — the rule augment-it actually obeys now
augment-it is a multi-tenant pipeline tool. Different clients will upload spreadsheets with wildly different columns. The code cannot have opinions about specific field names. Today we ripped out every place where it did: no hardcoded reserved-field lists, no field-name-aware special rendering, no schema decisions imposed on tenant data. The cell renderer is type-driven (scalar → text, object/array → JSON), the canonical-schema union is presence-driven (any key with values is a column), the promote-fold is type-driven (arrays merge, objects merge, scalars overwrite). One rule everywhere: whatever a tenant puts in row.fields stays in row.fields and ends up in the promoted record. This entry shows the messy middle — what we got wrong, how the user pushed back twice, and what the corrected discipline reads like.
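
The three type-driven rules above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the `FieldValue` type and the function names `renderCell`, `columnsOf`, and `foldFields` are hypothetical, arrays are assumed to merge by concatenation, objects by shallow merge, and "has values" is assumed to mean neither null nor an empty string.

```typescript
// Hypothetical value model for row.fields; no field names are special-cased.
type FieldValue = string | number | boolean | null | FieldValue[] | { [key: string]: FieldValue };
type Fields = { [key: string]: FieldValue };

function isPlainObject(v: FieldValue): v is { [key: string]: FieldValue } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Type-driven cell rendering: scalars as text, objects and arrays as JSON.
function renderCell(value: FieldValue): string {
  if (value === null) return "";
  if (typeof value === "object") return JSON.stringify(value);
  return String(value);
}

// Presence-driven schema: any key that holds a value in any row is a column.
// Assumption: null and "" do not count as values.
function columnsOf(rows: Fields[]): string[] {
  const columns: string[] = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      const value = row[key];
      const hasValue = value !== undefined && value !== null && value !== "";
      if (hasValue && columns.indexOf(key) === -1) columns.push(key);
    }
  }
  return columns;
}

// Type-driven fold: arrays merge, objects merge, scalars overwrite.
function foldFields(base: Fields, incoming: Fields): Fields {
  const out: Fields = { ...base };
  for (const key of Object.keys(incoming)) {
    const prev = out[key];
    const next = incoming[key];
    if (next === undefined) continue;
    if (Array.isArray(prev) && Array.isArray(next)) {
      out[key] = [...prev, ...next];
    } else if (prev !== undefined && isPlainObject(prev) && isPlainObject(next)) {
      out[key] = { ...prev, ...next };
    } else {
      out[key] = next;
    }
  }
  return out;
}
```

None of the three functions inspects a key's name, only the runtime type of its value, which is the property the rule depends on.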
-
Enhanced Records List and the Promotion Mechanic — the iterative-enrichment loop closes
augment-it now has a checkpoint surface between enrichment rounds. You finish a pass of LLM-driven enrichment, open Enhanced Records, see every record in one place with its accumulated columns, and click Promote — the system snapshots the current state into a new canonical record set under a deterministic, versioned name, archives the predecessors, and gives you a clear success message with a one-click path back to the start of the arc to author the next round. record_uuid follows each record across every promotion so a single conceptual record stays traceable from original CSV through every round of refinement. This is the loop the walking skeleton has been building toward: enrich, triage, checkpoint, re-enrich on a cleaner base.
-
The post-flight triage round — the loop survives contact with 207 real records
Last night's commit shipped Request Reviewer + Response Reviewer as code that typechecked. Tonight they survived a 207-record enrichment run against a real CSV with the user as the human in the loop, and this round repaired every place the loop was thin. The repairs: Save / Apply ergonomics on Prompt Templates; a firing indicator, Cancel button, and coverage strip on Request Reviewer; a per-row 90-second timeout and run-level cancellation in prompt-runner; autosave-on-blur on Response Reviewer's edit textarea (after the first run lost 75 manually typed URLs); helpful_links as a side-channel field on every row; a needs-human triage flag; an icon action row with instant CSS tooltips; per-bucket counts on the filter chips; and a fire-time blank-row filter at CSV ingest. The spec for the next checkpoint, enhanced-records-list and the promotion loop, is captured in context-v as the build target for the next session.
-
The review UIs go live — and the request/response line gets drawn hard
This morning, Request Reviewer and Response Reviewer were empty placeholder panels. Tonight they work. Request Reviewer shows the resolved request — the prompt with its {{tokens}} filled from a real record row — lets you pick the model, see the literal JSON body, and fire it. Response Reviewer steps through what came back and lets you flag or accept each response. And the boundary between the two is now strict: the request belongs to Request Reviewer, the response to Response Reviewer, and authoring stays in Prompt Templates — which lost its run button entirely. Four stages, four remotes, one gate each.
-
Request Reviewer and Response Reviewer — two new remotes mount in the Deck
The shell's tiling Deck now shows the whole enrichment pipeline. request-reviewer and response-reviewer join record-collector and prompt-template-manager as federated remotes — scaffolded, wired into the shell, connecting to the workspace, and tiling in pipeline order. The panels are deliberately placeholders: this entry is the federation wiring, not the review UIs. But for the first time the Deck reads end to end — collect, author, pre-flight, post-flight — in one window.
-
Pre-flight and post-flight — the service layer beneath two new review stages
augment-it is growing two new pipeline stages: a request-reviewer that shows you exactly what is about to be sent to the model, and a response-reviewer that lets you triage what came back. This entry is the service layer beneath both — the no-UI plumbing. Its load-bearing idea is small and strict: there is now exactly one function that assembles a model request, so the request you preview is byte-identical to the request that fires. Alongside it, a new response-store service turns every fired response from a throwaway cell into a first-class object you can inspect and flag. The two frontends that sit on top are their own story — coming next.
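
A minimal sketch of the single-assembler idea, assuming a simple chat-style request body. The `ModelRequest` shape, the `{{token}}` regular expression, and the names `fillTokens`, `assembleRequest`, `previewBody`, and `fireBody` are all hypothetical; the point is only that preview and fire share one assembly path.

```typescript
// Hypothetical request shape; the real body depends on the model API in use.
interface ModelRequest {
  model: string;
  messages: { role: "user"; content: string }[];
}

type Row = { [column: string]: string };

// Replace each {{token}} with the row's value; unknown tokens stay visible.
function fillTokens(template: string, row: Row): string {
  return template.replace(/\{\{\s*([\w.-]+)\s*\}\}/g, (match: string, token: string) => {
    const value = row[token];
    const bound = Object.prototype.hasOwnProperty.call(row, token) && value !== undefined;
    return bound ? value : match;
  });
}

// The single assembly point: nothing else builds a request.
function assembleRequest(template: string, row: Row, model: string): ModelRequest {
  return { model, messages: [{ role: "user", content: fillTokens(template, row) }] };
}

// What the pre-flight view would display.
function previewBody(template: string, row: Row, model: string): string {
  return JSON.stringify(assembleRequest(template, row, model));
}

// What would be sent; a real implementation would transmit this string.
function fireBody(template: string, row: Row, model: string): string {
  return JSON.stringify(assembleRequest(template, row, model));
}
```

Because both paths serialize the output of the same function, the previewed and fired bodies cannot drift apart unless someone adds a second assembler.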
-
The shell becomes a window manager — peek-deck, co-existence split, and a draggable seam
augment-it's shell stops taking turns. Where it used to mount one federated frontend at a time and destroy it to show another, it now mounts several at once and tiles them like a desktop window manager. Three layout modes: a peek-deck where the focused app sits at 90% with its neighbours peeking from the edges and hover-expanding, a co-existence split where two apps share the viewport behind a draggable seam, and plain full-width. Layout state persists per-user behind a seam the shared-auth store will back later.
-
Light, dark, vibrant — augment-it adopts the Lossless theme religion
augment-it had its colours hardcoded as hex scattered across five CSS surfaces. Now it has a theme system: a shared @augment-it/theme package with one theme.css holding a two-tier token architecture and three mode blocks, a mode-switcher, and a sun/moon/star toggle in the shell chrome. Light, dark, and vibrant all work across both federated remotes — switched by one data-mode attribute on a shared <html>. The Astro Knots theme blueprint, adapted for a non-Astro, non-Tailwind, federated app.
-
The enrichment loop gets a face — prompt-template-manager, and the walking skeleton is whole
Yesterday's question was 'how do I see it?' — the enrichment loop worked but only from a smoke script. Now it has a UI. prompt-template-manager is augment-it's second federated remote: a prompt editor with a live {{token}} strip, a bind check that turns green or red against whichever record set you pick, and a run control that streams progress. It mounts as the second tab in the shell. The five-phase walking skeleton from the plan is complete — you can author a prompt, run it, and watch a derived record set appear, all without touching a terminal.
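
One way a bind check like the one described above could work, sketched under assumptions: the token syntax is taken to be `{{name}}` with optional inner whitespace, and `extractTokens`, `checkBinding`, and the `BindCheck` shape are hypothetical names rather than the remote's actual API.

```typescript
// Extract the distinct {{token}} names from a prompt template, in order.
function extractTokens(template: string): string[] {
  const tokens: string[] = [];
  const pattern = /\{\{\s*([\w.-]+)\s*\}\}/g;
  let match: RegExpExecArray | null = pattern.exec(template);
  while (match !== null) {
    const name = match[1];
    if (name !== undefined && tokens.indexOf(name) === -1) tokens.push(name);
    match = pattern.exec(template);
  }
  return tokens;
}

interface BindCheck {
  ok: boolean;       // true would render green, false red
  missing: string[]; // tokens with no matching column in the record set
}

// A template binds when every token names a column of the chosen record set.
function checkBinding(template: string, columns: string[]): BindCheck {
  const missing = extractTokens(template).filter((token) => columns.indexOf(token) === -1);
  return { ok: missing.length === 0, missing };
}
```

Returning the missing tokens, not just a boolean, would let the UI say which placeholder failed to bind.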
-
augment-it augments — the first LLM call, and the loop that turns a column name into a column of answers
augment-it now does the thing it is named for. Two new microservices land — prompt-store holds prompt templates, prompt-runner makes the LLM calls — and together they close the augmentation loop: write a prompt with {{column}} placeholders, run it per-row against a record set, get back a derived record set with the model's answers as a new column. The proof-of-life was a web-search-backed prompt that turned a column of organisation names into a column of website URLs. This is the first time augment-it has called an LLM.
-
Federation host meets Svelte 5 — three lessons we couldn't find in the docs
augment-it now has a real Module Federation host. The shell at :3000 mounts the record-collector remote at :3002, and the architecture has a federation boundary for the first time. Getting there meant fighting three subtle Svelte-5-plus-MF interactions that the official docs gloss over: shared-singleton factories that wouldn't register, $effect_orphan errors from cross-runtime mounts, and component-scoped CSS that quietly disappears across federation chunk boundaries. Each one has a fix worth keeping.
-
augment-it goes end-to-end — your spreadsheets, your browser, four phases in one push
Phases 2 through 5 of the walking skeleton landed in a single session. Five containers running, the two real fundraise spreadsheets ingested, cells editable from a Svelte 5 frontend that consumes the workspace singleton, and the architecture demo's strongest claim — 'adding a new service is cheap' — has gone from slide to docker-compose diff. The 27th service made its first appearance as xlsx-ingest, with zero changes to row-store.
-
augment-it grows a backbone — workspace, services, and a message bus that boots
Phase 1 of the augment-it rewrite landed today. Three containers come up cleanly with one command: a NATS message bus, a thin Workspace Service that owns no domain data, and a row-store microservice that owns its own rows. The shape of every future service is set; the wire is now visibly a microservices wire instead of point-to-point HTTP. The in-app chat surface has something real to plug into.
-
Add a GitHub Pages splash for augment-it
A small Astro site at splash/ that ships to GitHub Pages on push to main and renders the repo's changelog/ and context-v/ alongside curated copy about the six module-federated microfrontends.
-
RecordCollector analysis spec — 479 lines on the first microfrontend's shape, data contract, and integration points
Before writing the next line of code in the record-collector submodule, we wrote down what it actually has to do — input shapes, output contract, edge cases, and where it sits in the broader workshop.
-
Scope the next phase: five GitHub issues opened across specs, visualization, and the legacy bolt monolith
A morning of issue-triage turned the next month of augment-it work into five named threads — and named the loose ends honestly.
-
Move five loose specs into a specs/ folder — the start of a real documentation structure
Five planning docs that had been sitting at the repo root got consolidated into specs/. A small move, but it set the precedent for where future specs would land.
-
Comms documentation push — vision spec, architecture explainer, and prompts that travel with the project
A 3,064-line documentation drop that gave augment-it a written voice — the kind of artifacts a new collaborator can read end-to-end before touching code.
-
Promote RecordCard out of the microfrontends and into shared/ui — one Card, many consumers
The Card component was duplicated across two microfrontends. Tanuj lifted it into packages/ui as RecordCard, the first non-trivial component to live in the shared package — and the proof point for how cross-MFE components will work going forward.
-
Microfrontends become real submodules: record-collector, prompt-manager, request-reviewer get their own repos
The two placeholder microfrontends got promoted out of the parent repo and into three named, independently versioned submodules, the first concrete instance of the federation contract.
-
Module-federation starter lands: host shell, two microfrontends, shared UI package
Tanuj brought up a minimal-but-real module-federation scaffold — a host shell that loads two remote microfrontends from the same monorepo, each exposing a Card component, all sharing a single ui package.
-
Restart augment-it on turbo, rsbuild, and Docker — fresh scaffold, brand assets, monorepo discipline
The bolt.new monolith had served its purpose. We tore the build system down to the studs and rebuilt on a turborepo + pnpm + Docker base ready to host federated microfrontends.
-
First extraction attempt: record-collector as a standalone frontend — turso, then Supabase, then frozen
Two months after the Bolt monolith hit working-demo state, we tried to peel record-collection off into its own frontend repo. Two days, three commits, one DB swap (turso out, Supabase in), then we set it down to think harder.
-
Bolt-era Augmenter monolith reaches a working demo — React + Supabase + three LLM providers in one repo
Before there was a federation, there was one big React app: Bolt-vibed, Supabase-backed, three LLM providers wired in, every screen in the workshop already shipping as a component. This is where we found out what the workshop wanted to be.
-
Augmenter App progress — API editor, response list, user profiles, highlighter, group-by-record
A late-January progress update with embedded GIF demos from the pre-restart Augmenter App: API options and editor, response list, user profiles, highlighter, and grouping by customer.