feat(phase2): fact/inference labeling, change-driven alerts, admin cleanup
- Add label: cited_fact | inference to LLM brief schema (all 4 providers)
- Inferred badge in AIBriefCard for inference-labeled points
- backfill_brief_labels Celery task: classifies existing cited points in-place
- POST /api/admin/backfill-labels + unlabeled_briefs stat counter
- Expand milestone keywords: markup, conference
- Add is_referral_action() for committee referrals (referred to)
- Two-tier milestone notifications: progress tier (all follow modes) and referral tier (pocket_veto/boost only, neutral suppressed)
- Topic followers now receive bill_updated milestone notifications via latest brief topic_tags lookup in _update_bill_if_changed()
- Admin Manual Controls: collapsible Maintenance section for backfill tasks
- Update ARCHITECTURE.md and roadmap for Phase 2 completion

Co-Authored-By: Jack Levy
104
ARCHITECTURE.md
@@ -238,9 +238,11 @@ Indexes: `bill_id`, `topic_tags` (GIN for JSONB containment queries)
 {
   "text": "The bill allocates $50B for defense",
   "citation": "Section 301(a)(2)",
-  "quote": "There is hereby appropriated for fiscal year 2026, $50,000,000,000 for the Department of Defense..."
+  "quote": "There is hereby appropriated for fiscal year 2026, $50,000,000,000 for the Department of Defense...",
+  "label": "cited_fact"
 }
 ```
+`label` is `"cited_fact"` when the claim is explicitly stated in the quoted text, or `"inference"` when it is an analytical interpretation. Old briefs without this field render without a badge (backward compatible).

 ---
@@ -324,6 +326,7 @@ News articles correlated to a specific member of Congress.
 | user_id | int (FK → users, CASCADE) | |
 | follow_type | varchar | `bill`, `member`, `topic` |
 | follow_value | varchar | bill_id, bioguide_id, or topic name |
+| follow_mode | varchar | `neutral` \| `pocket_veto` \| `pocket_boost` (default `neutral`) |
 | created_at | timestamptz | |

 Unique constraint: `(user_id, follow_type, follow_value)`
@@ -397,12 +400,13 @@ Stores notification events for dispatching to user channels (ntfy, RSS).
 | id | int (PK) | |
 | user_id | int (FK → users, CASCADE) | |
 | bill_id | varchar (FK → bills, SET NULL) | nullable |
-| event_type | varchar | e.g. `new_brief`, `bill_updated`, `new_action` |
-| headline | text | Short description for ntfy title |
-| body | text | Longer description for ntfy message / RSS content |
-| dispatched_at | timestamptz (nullable) | NULL = not yet sent |
+| event_type | varchar | `new_document`, `new_amendment`, `bill_updated` |
+| payload | jsonb | `{bill_title, bill_label, brief_summary, bill_url, milestone_tier}` |
+| dispatched_at | timestamptz (nullable) | NULL = pending dispatch |
 | created_at | timestamptz | |

+`milestone_tier` in payload: `"progress"` (passed, signed, markup, conference, etc.) or `"referral"` (committee referral). Neutral follows silently skip referral-tier events; pocket_veto and pocket_boost receive them as early warnings.
+
 ---

 ## Alembic Migrations
@@ -442,11 +446,12 @@ Auth header: `Authorization: Bearer <jwt>`

 | Method | Path | Auth | Description |
 |---|---|---|---|
-| GET | `/` | — | Paginated bill list. Query: `chamber`, `topic`, `sponsor_id`, `q`, `page`, `per_page`, `sort`. |
+| GET | `/` | — | Paginated bill list. Query: `chamber`, `topic`, `sponsor_id`, `q`, `page`, `per_page`, `sort`. Includes `has_document` flag per bill via a single batch query. |
 | GET | `/{bill_id}` | — | Full bill detail with sponsor, actions, briefs, news, trend scores. |
 | GET | `/{bill_id}/actions` | — | Action timeline, newest first. |
 | GET | `/{bill_id}/news` | — | Related news articles, limit 20. |
 | GET | `/{bill_id}/trend` | — | Trend score history. Query: `days` (7–365, default 30). |
+| POST | `/{bill_id}/draft-letter` | — | Generate a constituent letter draft via the configured LLM. Body: `{stance, recipient, tone, selected_points, include_citations, zip_code?}`. Returns `{draft: string}`. ZIP code is used in the prompt only — never stored or logged. |

 ### `/api/members`
@@ -503,7 +508,7 @@ Auth header: `Authorization: Bearer <jwt>`
 | GET | `/users` | Admin | All users with follow counts. |
 | DELETE | `/users/{id}` | Admin | Delete user (cannot delete self). Cascades follows. |
 | PATCH | `/users/{id}/toggle-admin` | Admin | Promote/demote admin status (cannot change self). |
-| GET | `/stats` | Admin | Pipeline counters: total bills, docs fetched, briefs generated, pending LLM, missing metadata/sponsors/actions, uncited briefs. |
+| GET | `/stats` | Admin | Pipeline counters: total bills, docs fetched, briefs generated, pending LLM, missing metadata/sponsors/actions, uncited briefs, unlabeled briefs (cited objects without a fact/inference label). |
 | GET | `/api-health` | Admin | Test each external API in parallel; returns status + latency for Congress.gov, GovInfo, NewsAPI, Google News. |
 | POST | `/trigger-poll` | Admin | Queue immediate Congress.gov poll. |
 | POST | `/trigger-member-sync` | Admin | Queue member sync. |
@@ -513,6 +518,7 @@ Auth header: `Authorization: Bearer <jwt>`
 | POST | `/backfill-sponsors` | Admin | Queue one-off task to populate `sponsor_id` on bills where it is NULL. |
 | POST | `/backfill-metadata` | Admin | Fill null `introduced_date`, `chamber`, `congress_url` by re-fetching bill detail. |
 | POST | `/backfill-citations` | Admin | Delete pre-citation briefs and re-queue LLM using stored document text. |
+| POST | `/backfill-labels` | Admin | Classify existing cited brief points as `cited_fact` or `inference` in-place — one compact LLM call per brief, no re-generation. |
 | POST | `/resume-analysis` | Admin | Re-queue LLM for docs with no brief; re-queue doc fetch for bills with no doc. |
 | POST | `/bills/{bill_id}/reprocess` | Admin | Queue document + action fetches for a specific bill (debugging). |
 | GET | `/task-status/{task_id}` | Admin | Celery task status and result. |
@@ -570,6 +576,12 @@ Auth header: `Authorization: Bearer <jwt>`
      has no sponsor data), upserts Member, sets bill.sponsor_id
    ↳ New bills → fetch_bill_documents.delay(bill_id)
    ↳ Updated bills → fetch_bill_documents.delay(bill_id) if changed
+   ↳ Updated bills → emit bill_updated notification if action is a milestone:
+       - "progress" tier: passed/failed, signed/vetoed, enacted, markup, conference,
+         reported from committee, placed on calendar, cloture, roll call
+         → all follow types (bill, sponsor, topic) receive notification
+       - "referral" tier: referred to committee
+         → pocket_veto and pocket_boost only; neutral follows silently skip

 2. document_fetcher.fetch_bill_documents(bill_id)
    ↳ Gets text versions from Congress.gov (XML preferred, falls back to HTML/PDF)
@@ -641,6 +653,7 @@ All providers implement:
 ```python
 generate_brief(doc_text, bill_metadata) → ReverseBrief
 generate_amendment_brief(new_text, prev_text, bill_metadata) → ReverseBrief
+generate_text(prompt) → str  # free-form text, used by draft letter generator
 ```

 ### ReverseBrief Dataclass
@@ -649,8 +662,8 @@ generate_amendment_brief(new_text, prev_text, bill_metadata) → ReverseBrief
 @dataclass
 class ReverseBrief:
     summary: str
-    key_points: list[dict]  # [{text, citation, quote}]
-    risks: list[dict]  # [{text, citation, quote}]
+    key_points: list[dict]  # [{text, citation, quote, label}]
+    risks: list[dict]  # [{text, citation, quote, label}]
     deadlines: list[dict]  # [{date, description}]
     topic_tags: list[str]
     llm_provider: str
@@ -664,16 +677,28 @@ class ReverseBrief:
 {
   "summary": "2-4 paragraph plain-language explanation",
   "key_points": [
-    {"text": "claim", "citation": "Section X(y)", "quote": "verbatim excerpt ≤80 words"}
+    {
+      "text": "claim",
+      "citation": "Section X(y)",
+      "quote": "verbatim excerpt ≤80 words",
+      "label": "cited_fact"
+    }
   ],
   "risks": [
-    {"text": "concern", "citation": "Section X(y)", "quote": "verbatim excerpt ≤80 words"}
+    {
+      "text": "concern",
+      "citation": "Section X(y)",
+      "quote": "verbatim excerpt ≤80 words",
+      "label": "inference"
+    }
   ],
   "deadlines": [{"date": "YYYY-MM-DD or null", "description": "..."}],
   "topic_tags": ["healthcare", "taxation"]
 }
 ```

+`label` classification rules baked into the system prompt: `"cited_fact"` if the claim is explicitly stated in the quoted text; `"inference"` if it is an analytical interpretation, projection, or implication not literally stated. The UI shows a neutral "Inferred" badge on inference items only (cited_fact is the clean default).
+
 **Amendment brief prompt** focuses on what changed between document versions.

 **Smart truncation:** Bills exceeding the token budget are trimmed — 75% of budget from the start (preamble/purpose), 25% from the end (enforcement/effective dates), with an omission notice in the middle.
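The 75/25 split described under "Smart truncation" can be sketched as follows. This is a minimal illustration, not the project's code: the function name `smart_truncate`, the character-based budget (standing in for a token budget), and the wording of the omission notice are all assumptions.

```python
def smart_truncate(
    text: str,
    budget: int,
    notice: str = "\n[... middle of bill omitted ...]\n",  # hypothetical wording
) -> str:
    """Keep 75% of the budget from the start and 25% from the end.

    Assumption: the budget is counted in characters here; the document
    describes a token budget, which would need a tokenizer.
    """
    if len(text) <= budget:
        return text  # short bills pass through untouched
    head_len = int(budget * 0.75)
    tail_len = budget - head_len
    head = text[:head_len]
    # Slice from an explicit index so tail_len == 0 yields "" rather than
    # the whole string (text[-0:] would return everything).
    tail = text[len(text) - tail_len:]
    return head + notice + tail
```

Keeping the tail matters because, as the text notes, enforcement clauses and effective dates tend to sit at the end of a bill.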
@@ -715,6 +740,7 @@ Renders the LLM brief. For cited items (new format), shows a `§ Section X(y)` c
 - Blockquoted verbatim excerpt from the bill
 - "View source →" link to GovInfo (opens in new tab)
 - One chip open at a time per card
+- Inference items show a neutral "Inferred" badge (analytical interpretation, not a literal quote)
 - Old plain-string briefs render without chips (graceful backward compat)

 **`ActionTimeline.tsx`**
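The badge rule covers three generations of brief data: plain strings, cited objects without a label, and labelled objects. The sketch below expresses that rule in Python for illustration; the real logic lives in the `AIBriefCard` frontend component, and the helper names `badge_for` and `is_unlabeled` are hypothetical.

```python
from typing import Optional, Union

Point = Union[str, dict]  # legacy plain string, or a cited object


def badge_for(point: Point) -> Optional[str]:
    """Return the badge text for a brief point, or None for no badge."""
    if isinstance(point, str):
        return None  # legacy plain-string brief: no chip, no badge
    if point.get("label") == "inference":
        return "Inferred"
    return None  # cited_fact, or a cited object that predates labels


def is_unlabeled(point: Point) -> bool:
    """True for cited objects that still lack a fact/inference label."""
    return isinstance(point, dict) and point.get("label") not in (
        "cited_fact",
        "inference",
    )
```

A check like `is_unlabeled` is also one plausible way to drive the `unlabeled_briefs` counter, though the commit does not show how that counter is computed.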
@@ -729,8 +755,11 @@ Client component wrapping the entire app. Waits for Zustand hydration, then redi
 **`Sidebar.tsx`**
 Navigation with: Home, Bills, Members, Following, Topics, Settings (admin only). Shows current user email + logout button at the bottom. Accepts optional `onClose` prop — when provided (mobile drawer context), renders an X close button in the header and calls `onClose` on every nav link click.

+**`DraftLetterPanel.tsx`**
+Collapsible panel rendered below `BriefPanel` on the bill detail page (only when a brief exists). Lets users select up to 3 cited points from the brief, choose stance (YES/NO), tone (short/polite/firm), and optionally enter a ZIP code (not stored). Stance auto-populates from the user's follow mode (`pocket_boost` → YES, `pocket_veto` → NO); clears if they unfollow. Recipient (house/senate) is derived from the bill's chamber. Calls `POST /{bill_id}/draft-letter` and renders the plain-text draft in a readonly textarea with a copy-to-clipboard button.
+
 **`BillCard.tsx`**
-Compact bill preview showing bill ID, title, sponsor with party badge, latest action date, and status.
+Compact bill preview showing bill ID, title, sponsor with party badge, latest action date, status, and a text availability indicator: `Brief` (green, analysis done) / `Pending` (amber, text retrieved but not yet analysed) / `No text` (muted, nothing published on Congress.gov).

 **`TrendChart.tsx`**
 Line chart of `composite_score` over time with tooltip breakdown of each data source.
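Two small mappings are described in this hunk: the text availability indicator on `BillCard` and the default stance in `DraftLetterPanel`. Both are frontend behaviour; the Python sketch below only restates the rules, and the function names and the `has_brief` parameter are assumptions made for illustration.

```python
from typing import Optional


def text_status(has_brief: bool, has_document: bool) -> str:
    """Map brief/document state to the indicator shown on a bill card."""
    if has_brief:
        return "Brief"    # analysis done
    if has_document:
        return "Pending"  # text retrieved but not yet analysed
    return "No text"      # nothing published yet


def default_stance(follow_mode: Optional[str]) -> Optional[str]:
    """Pre-fill the letter stance from the follow mode, if it implies one."""
    return {"pocket_boost": "YES", "pocket_veto": "NO"}.get(follow_mode)
```

A neutral follow, or no follow at all, yields no default stance, which matches the stance being cleared on unfollow.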
@@ -805,7 +834,7 @@ Separate queues prevent a flood of LLM tasks from blocking time-sensitive pollin
 All LLM providers implement the same interface. Switching providers is a single admin setting change — no code changes, no restart required (the factory reads from DB on each task invocation).

 ### JSONB for Flexible Brief Storage
-`key_points`, `risks`, `deadlines`, `topic_tags` are stored as JSONB. This means the schema change from `list[str]` to `list[{text, citation, quote}]` required no migration — only the LLM prompt and application code changed. Old string-format briefs and new cited-object briefs coexist in the same column.
+`key_points`, `risks`, `deadlines`, `topic_tags` are stored as JSONB. This means schema changes (adding `citation`/`quote` in v0.2.0, adding `label` in v0.6.0) required no migrations — only the LLM prompt and application code changed. Old string-format briefs, cited-object briefs without labels, and fully-labelled briefs all coexist in the same column and render correctly at each fidelity level.

 ### Redis-backed Beat Schedule (RedBeat)
 The Celery Beat schedule is stored in Redis rather than in memory. This means the beat scheduler can restart without losing schedule state or double-firing tasks.
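Because labels live inside the JSONB objects, existing briefs can be labelled in place with one batched classification call per brief. The sketch below shows one way that could look, using plain dicts. It is not the `backfill_brief_labels` task itself: `build_label_prompt`, `backfill_labels`, the prompt wording, and the `classify` callable (a stand-in for an LLM call that returns a JSON array of labels) are all hypothetical.

```python
import json
from typing import Callable

VALID_LABELS = {"cited_fact", "inference"}


def build_label_prompt(points: list[dict]) -> str:
    """Batch every point of one brief into a single compact prompt."""
    items = [
        {"i": i, "text": p.get("text", ""), "quote": p.get("quote", "")}
        for i, p in enumerate(points)
    ]
    return (
        "Classify each item as cited_fact or inference. "
        "Reply with a JSON array of labels, in order.\n" + json.dumps(items)
    )


def backfill_labels(brief: dict, classify: Callable[[str], str]) -> int:
    """Label unlabeled cited points in place; return how many were updated."""
    targets = [
        p
        for field in ("key_points", "risks")
        for p in brief.get(field, [])
        if isinstance(p, dict) and p.get("label") not in VALID_LABELS
    ]
    if not targets:
        return 0  # nothing to do: skip the LLM call entirely
    labels = json.loads(classify(build_label_prompt(targets)))
    updated = 0
    for point, label in zip(targets, labels):
        if label in VALID_LABELS:  # ignore anything outside the schema
            point["label"] = label
            updated += 1
    return updated
```

When the dicts belong to a SQLAlchemy JSONB column, an in-place mutation like this is not detected automatically, which is presumably why the commit mentions `flag_modified` after the update.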
@@ -915,6 +944,55 @@ Nginx uses `resolver 127.0.0.11 valid=10s` (Docker's internal DNS) so upstream c
 - `introduced_date` shown conditionally (not rendered when null, preventing "Introduced: —")
 - Admin reprocess endpoint: `POST /api/admin/bills/{bill_id}/reprocess`

+### v0.5.0 — Follow Modes, Public Browsing & Draft Letter Generator
+
+**Follow Modes:**
+- `follow_mode` column on `follows` table: `neutral | pocket_veto | pocket_boost`
+- `FollowButton` replaced with a mode-selector dropdown (shield/zap/heart icons, descriptions for each mode)
+- `pocket_veto` — alert only on advancement milestones; `pocket_boost` — all changes + action prompts
+- Mode stored per-follow; respected by notification dispatcher
+
+**Public Browsing:**
+- Unauthenticated guests can browse bills, members, topics, and the trending dashboard
+- `AuthModal` gates follow and other interactive actions
+- Sidebar and nav adapt to guest state (no email/logout shown)
+- All public endpoints already auth-free; guard refactored to allow guest reads
+
+**Draft Constituent Letter Generator (email_gen):**
+- `DraftLetterPanel.tsx` — collapsible UI below `BriefPanel` for bills with a brief
+- User selects up to 3 cited points from the brief, picks stance (YES/NO), tone, optional ZIP (not stored)
+- Stance pre-fills from follow mode; clears on unfollow (ref-tracked, not effect-guarded)
+- Recipient derived from bill chamber — no dropdown needed
+- `POST /api/bills/{bill_id}/draft-letter` endpoint: reads LLM provider/model from `AppSetting` (respects Settings page), wraps LLM errors with human-readable messages (quota, rate limit, auth)
+- `generate_text(prompt) → str` added to `LLMProvider` ABC and all four providers
+
+**Bill Text Status Indicators:**
+- `has_document` field added to `BillSchema` (list endpoint) via a single batch `SELECT DISTINCT` — no per-card queries
+- `BillCard` shows: `Brief` (green) / `Pending` (amber) / `No text` (muted) based on brief + document state
+
+### v0.6.0 — Phase 2: Change-driven Alerts & Fact/Inference Labeling
+
+**Change-driven Alerts:**
+- `notification_utils.py` milestone keyword list expanded: added `"markup"` (markup sessions) and `"conference"` (conference committee)
+- New `is_referral_action()` classifier for committee referrals (`"referred to"`)
+- Two-tier notification system: `milestone_tier` field in `NotificationEvent.payload`
+  - `"progress"` — high-signal milestones (passed, signed, markup, etc.): all follow types notified
+  - `"referral"` — committee referral: pocket_veto and pocket_boost notified; neutral silently dropped
+- **Topic followers now receive `bill_updated` milestone notifications** — previously they only received `new_document`/`new_amendment` events. Fixed by querying the bill's latest brief for `topic_tags` inside `_update_bill_if_changed()`
+- All three follow types (bill, sponsor, topic) covered for both tiers
+
+**Fact vs Inference Labeling:**
+- `label: "cited_fact" | "inference"` added to every cited key_point and risk in the LLM JSON schema
+- System prompt updated for all four providers (OpenAI, Anthropic, Gemini, Ollama)
+- UI: neutral "Inferred" badge shown next to inference items in `AIBriefCard`; cited_fact items render cleanly without a badge
+- `backfill_brief_labels` Celery task: classifies existing cited points in-place — one compact LLM call per brief (all points batched), updates JSONB with `flag_modified`, no brief re-generation
+- `POST /api/admin/backfill-labels` endpoint + "Backfill Fact/Inference Labels" button in Admin panel
+- `unlabeled_briefs` counter added to `/api/admin/stats` and pipeline breakdown table
+
+**Admin Panel Cleanup:**
+- Manual Controls split into two sections: always-visible recurring controls (Poll, Members, Trends, Actions, Resume) and a collapsible **Maintenance** section for one-time backfill tasks
+- Maintenance section header shows "⚠ action needed" when any backfill has a non-zero count
+
 ### v0.2.2 — Sponsor Linking & Search Fixes
 - **Root cause fixed:** Congress.gov list API does not return sponsor data — only the detail endpoint does. Poller now calls the detail endpoint for each new bill to get the sponsor and populate `bill.sponsor_id`
 - **Backfill task:** `backfill_sponsor_ids` Celery task + `/api/admin/backfill-sponsors` endpoint + "Backfill Sponsors" button in Admin UI — fixes existing bills with `NULL` sponsor_id (~10 req/sec, ~3 min for 1,600 bills)
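The two-tier rule from the v0.6.0 notes can be sketched as a keyword classifier plus a follow-mode filter. This is an illustration under assumptions, not the contents of `notification_utils.py`: the function names `milestone_tier` and `should_notify` are hypothetical, and the keyword tuples are reconstructed from the milestones listed in this diff, so the real list may differ.

```python
from typing import Optional

# Assumed keyword lists, taken from the milestones named in the changelog.
PROGRESS_KEYWORDS = (
    "passed", "failed", "signed", "vetoed", "enacted", "markup",
    "conference", "reported", "placed on", "cloture", "roll call",
)
REFERRAL_KEYWORDS = ("referred to",)


def milestone_tier(action_text: str) -> Optional[str]:
    """Classify an action as "progress", "referral", or no milestone."""
    text = action_text.lower()
    if any(keyword in text for keyword in PROGRESS_KEYWORDS):
        return "progress"
    if any(keyword in text for keyword in REFERRAL_KEYWORDS):
        return "referral"
    return None


def should_notify(tier: Optional[str], follow_mode: str) -> bool:
    """Progress reaches every follow mode; referral skips neutral follows."""
    if tier == "progress":
        return True
    if tier == "referral":
        return follow_mode in ("pocket_veto", "pocket_boost")
    return False
```

Checking progress keywords first means an action that mentions both a referral and a higher-signal milestone is treated as progress, which is the more conservative choice for a neutral follower.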