URL auto-detector ships clickable links for socials / helpful_links / official_updates_index_urls; Per-Client-Privacy exploration lands; reach-edu becomes the first client submodule

Three coupled drops in the same session.

First, the URL auto-detector and clickable-rendering plan dropped earlier today was implemented end-to-end in Record Collector. `apps/record-collector/src/logic/format.ts` grew a `FieldShape` discriminated union (`empty | scalar | scalar_url | url_list | json`) that picks by value shape, never by field name, plus a `UrlEntry` extractor that surfaces `url + chip (from pack_id) + label (from label or display_name)` from `Array<{ url, ... }>` payloads. The structured-value branch in `App.svelte` switches on `shape.kind`: `url_list` renders as a vertical flex of clickable `<a target="_blank" rel="noopener noreferrer">` entries showing hostname+path (truncated at 60 chars, with the full URL in `title=`) so the operator can verify that links resolve without leaving Record Collector. Auxiliary metadata (display_name, confidence, source_metadata, response_id, accepted_at, ...) drops from the rendered view but stays on the row and round-trips through CSV. Read-only by design; the per-entry remove affordance lives in a sibling plan, not this one. The `{#each}` block started out keyed and was unkeyed mid-implementation once `(entry.url)` was identified as a Svelte 5 duplicate-key crash risk against any row whose URL array has repeats; for read-only lists, positional iteration is the right call.

Second, a Response Reviewer drill-through audit on the live response-store revealed 1,982 responses, 1,109 of them (56%) from the three Entity-Pulse OfficialPulse packs, with just 3 accepts across all three: `official-blog-pack` 99.5% reject, `official-social-posts-pack` 100% reject, `official-pressrelease-pack` 100% reject. That confirms the operator's 'tons of junk links' read of the surface. The domain breakdown showed that 65% of pressrelease responses come from `news.google.com` (the google-news-rss connector, with weak entity-name binding) and 45% of social-posts responses come from `youtube.com` (also weakly bound).
The packs with strong identity binding, wikipedia (matches article title to entity) and linkedin (returns the entity's own page), reject at the normal 73-82% rate for a triage workflow.

Third, the operator's response to the junk volume reframed the next-feature direction entirely: STOP triaging URLs one at a time in Response Reviewer; START treating those URLs as a seed list for a Jina.ai content-ingestion pipeline that writes deduped markdown per funder; build a per-client content corpus that powers cross-funder fundraising-strategy synthesis and per-funder outreach customization. That direction surfaced the broader ai-labs architectural question the operator named explicitly: when does the single-operator-on-localhost posture stop scaling, what stack do we reach for, and how do we architect today so the move is cheap when it comes? It produced `context-v/explorations/Per-Client-Privacy-and-the-Path-Off-Local.md` (v0.0.0.1, Draft), which maps five candidate stacks (defer-everything, per-client Railway single-tenant, multi-tenant SaaS shape, hybrid posture, managed BaaS) across five axes (repo topology, storage substrate, identity/auth, multi-tenant data model, sensitivity constraints), names six decision-forcing functions, and identifies five architecture choices that cost almost nothing today but preserve cheap optionality for every path-flip.

The operator signed off on Path D (hybrid posture) and committed to start local: `lossless-group/augment-reach-edu` was created as a private repo (default README), and `clients/reach-edu` was registered as a git submodule pointing at it. The reach-edu corpus, operational data, and Jina ingest pipeline land in that submodule in follow-up sessions; the augment-it parent stays on the docker-local posture until one of the named forcing functions actually fires.
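To make the first drop concrete, here is a minimal sketch of value-shape detection in the spirit of the `FieldShape` union. It is not the code in `format.ts`: only the five kind names and the `url` / `chip` / `label` fields come from this entry. The payload properties on each variant, the function names (`classify`, `extractUrlEntries`, `displayText`), the http(s)-only check, the all-or-nothing rule for arrays, and the trailing ellipsis on truncation are all assumptions made for illustration.

```typescript
// Hypothetical reconstruction; see the assumptions listed above.
interface UrlEntry {
  url: string;
  chip?: string;  // from pack_id, when it is a string
  label?: string; // from label, falling back to display_name
}

type FieldShape =
  | { kind: "empty" }
  | { kind: "scalar"; text: string }
  | { kind: "scalar_url"; url: string }
  | { kind: "url_list"; entries: UrlEntry[] }
  | { kind: "json"; value: unknown };

// Assumption: only absolute http(s) URLs count as links.
function isHttpUrl(value: unknown): value is string {
  if (typeof value !== "string") return false;
  try {
    const parsed = new URL(value);
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false;
  }
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns null unless every item is an object carrying a valid `url`
// (assumed all-or-nothing rule), so mixed arrays fall back to `json`.
function extractUrlEntries(items: unknown[]): UrlEntry[] | null {
  const entries: UrlEntry[] = [];
  for (const item of items) {
    if (!isRecord(item)) return null;
    const { url, pack_id, label, display_name } = item;
    if (!isHttpUrl(url)) return null;
    const text = label ?? display_name;
    entries.push({
      url,
      chip: typeof pack_id === "string" ? pack_id : undefined,
      label: typeof text === "string" ? text : undefined,
    });
  }
  return entries;
}

// Classifies purely by the shape of the value; the field name is never consulted.
function classify(value: unknown): FieldShape {
  if (value === null || value === undefined) return { kind: "empty" };
  if (typeof value === "string") {
    const text = value.trim();
    if (text === "") return { kind: "empty" };
    return isHttpUrl(text) ? { kind: "scalar_url", url: text } : { kind: "scalar", text };
  }
  if (typeof value === "number" || typeof value === "boolean") {
    return { kind: "scalar", text: String(value) };
  }
  if (Array.isArray(value)) {
    if (value.length === 0) return { kind: "empty" };
    const entries = extractUrlEntries(value);
    if (entries !== null) return { kind: "url_list", entries };
  }
  return { kind: "json", value };
}

// Visible link text: hostname + path, capped at maxLength characters.
// The caller is expected to put the full URL in the title attribute.
function displayText(url: string, maxLength = 60): string {
  const { hostname, pathname } = new URL(url);
  const text = hostname + (pathname === "/" ? "" : pathname);
  return text.length > maxLength ? text.slice(0, maxLength - 1) + "…" : text;
}
```

Under these assumptions, `classify([{ url: "https://example.org/news", pack_id: "official-blog-pack", display_name: "Example" }])` yields a `url_list` with one entry, while an array containing any item without a valid `url` falls through to `json`. Because repeated URLs in one array are legitimate input here, a renderer consuming `entries` would iterate positionally rather than keying on `entry.url`, matching the unkeyed decision described above.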