docs: add comprehensive architecture documentation

Covers full stack, database schema, API endpoints, Celery pipeline,
LLM service design, frontend structure, auth, deployment, and feature
history through v0.2.0.

Authored-By: Jack Levy
Jack Levy
2026-02-28 23:07:43 -05:00
parent a111731cb4
commit 795385dcba

796
ARCHITECTURE.md Normal file

@@ -0,0 +1,796 @@
# PocketVeto — Architecture & Feature Documentation
> **App brand:** PocketVeto
> **Repo:** civicstack
> **Purpose:** Citizen-grade US Congress monitoring with AI-powered bill analysis, per-claim citations, and personalized tracking.
---
## Table of Contents
1. [Overview](#overview)
2. [Tech Stack](#tech-stack)
3. [Infrastructure & Docker](#infrastructure--docker)
4. [Configuration & Environment](#configuration--environment)
5. [Database Schema](#database-schema)
6. [Alembic Migrations](#alembic-migrations)
7. [Backend API](#backend-api)
8. [Celery Workers & Pipeline](#celery-workers--pipeline)
9. [LLM Service](#llm-service)
10. [Frontend](#frontend)
11. [Authentication](#authentication)
12. [Key Architectural Patterns](#key-architectural-patterns)
13. [Feature History](#feature-history)
14. [Deployment](#deployment)
---
## Overview
PocketVeto is a self-hosted, full-stack application that automatically tracks US Congress legislation, fetches bill text, generates AI summaries with per-claim source citations, correlates bills with news and Google Trends, and presents everything through a personalized dashboard. Users follow bills, members of Congress, and policy topics; the system surfaces relevant activity in their feed.
```
Congress.gov API → Poller → DB → Document Fetcher → GovInfo
                                        ↓
                                  LLM Processor
                                        ↓
                                    BillBrief
                                 (cited AI brief)
                                        ↓
                           News Fetcher + Trend Scorer
                                        ↓
                                 Next.js Frontend
```
---
## Tech Stack
| Layer | Technology |
|---|---|
| Reverse Proxy | Nginx (alpine) |
| Backend API | FastAPI + SQLAlchemy (async) |
| Task Queue | Celery 5 + Redis |
| Task Scheduler | Celery Beat + RedBeat (Redis-backed) |
| Database | PostgreSQL 16 |
| Cache / Broker | Redis 7 |
| Frontend | Next.js 15, React, Tailwind CSS, TypeScript |
| Auth | JWT (python-jose) + bcrypt (passlib) |
| LLM | Multi-provider factory: OpenAI, Anthropic, Gemini, Ollama |
| Bill Metadata | Congress.gov API (api.data.gov key) |
| Bill Text | GovInfo API (same api.data.gov key) |
| News | NewsAPI.org (100 req/day free tier) |
| Trends | Google Trends via pytrends |
---
## Infrastructure & Docker
### Services (`docker-compose.yml`)
```
postgres:16-alpine
DB: pocketveto
User: congress
Port: 5432 (internal)
redis:7-alpine
Port: 6379 (internal)
Role: Celery broker, result backend, RedBeat schedule store
api (civicstack-api image)
Port: 8000 (internal)
Command: alembic upgrade head && uvicorn app.main:app --host 0.0.0.0 --port 8000
Depends: postgres (healthy), redis (healthy)
worker (civicstack-worker image)
Command: celery -A app.workers.celery_app worker -Q polling,documents,llm,news -c 4
Depends: postgres (healthy), redis (healthy)
beat (civicstack-beat image)
Command: celery -A app.workers.celery_app beat -S redbeat.RedBeatScheduler
Depends: redis (healthy)
frontend (civicstack-frontend image)
Port: 3000 (internal)
Build: Next.js standalone output
nginx:alpine
Port: 80 → public
Routes: /api/* → api:8000 | /* → frontend:3000
```
### Nginx Config (`nginx/nginx.conf`)
- `resolver 127.0.0.11 valid=10s` — re-resolves Docker DNS after container restarts (prevents stale-IP 502s on redeploy)
- `/api/` → FastAPI, 120s read timeout
- `/_next/static/` → frontend with 1-day cache header
- `/` → frontend with WebSocket upgrade support
---
## Configuration & Environment
Copy `.env.example` to `.env` and fill in keys before first run.
```env
# Network
LOCAL_URL=http://localhost
PUBLIC_URL= # optional, e.g. https://yourapp.com
# Auth
JWT_SECRET_KEY= # python -c "import secrets; print(secrets.token_hex(32))"
# PostgreSQL
POSTGRES_USER=congress
POSTGRES_PASSWORD=congress
POSTGRES_DB=pocketveto
# Redis
REDIS_URL=redis://redis:6379/0
# Congress.gov + GovInfo (shared key from api.data.gov)
DATA_GOV_API_KEY=
CONGRESS_POLL_INTERVAL_MINUTES=30
# LLM — pick one provider
LLM_PROVIDER=openai # openai | anthropic | gemini | ollama
OPENAI_API_KEY=
OPENAI_MODEL=gpt-4o
ANTHROPIC_API_KEY=
ANTHROPIC_MODEL=claude-opus-4-6
GEMINI_API_KEY=
GEMINI_MODEL=gemini-1.5-pro
OLLAMA_BASE_URL=http://host.docker.internal:11434
OLLAMA_MODEL=llama3.1
# News & Trends
NEWSAPI_KEY=
PYTRENDS_ENABLED=true
```
**Runtime overrides:** LLM provider/model and poll interval can be changed live through the Admin page. These values are stored in the `app_settings` table and take precedence over env vars.
---
## Database Schema
### `bills`
Primary key: `bill_id` — natural key in format `{congress}-{type}-{number}` (e.g. `119-hr-1234`).
| Column | Type | Notes |
|---|---|---|
| bill_id | varchar (PK) | |
| congress_number | int | |
| bill_type | varchar | `hr`, `s`, `hjres`, `sjres` (tracked); `hres`, `sres`, `hconres`, `sconres` (not tracked) |
| bill_number | int | |
| title | text | |
| short_title | text | |
| sponsor_id | varchar (FK → members) | bioguide_id |
| introduced_date | date | |
| latest_action_date | date | |
| latest_action_text | text | |
| status | varchar | |
| chamber | varchar | House / Senate |
| congress_url | varchar | congress.gov link |
| govtrack_url | varchar | |
| last_checked_at | timestamptz | |
| actions_fetched_at | timestamptz | |
| created_at / updated_at | timestamptz | |
Indexes: `congress_number`, `latest_action_date`, `introduced_date`, `chamber`, `sponsor_id`
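The natural key format can be illustrated with a small pair of helpers. This is a minimal sketch, assuming the `{congress}-{type}-{number}` layout described above; `BillKey`, `make_bill_id`, and `parse_bill_id` are hypothetical names, not taken from the codebase.

```python
from typing import NamedTuple


class BillKey(NamedTuple):
    """The three parts of a bill_id (hypothetical helper type)."""
    congress: int
    bill_type: str
    number: int


def make_bill_id(congress: int, bill_type: str, number: int) -> str:
    """Build the natural key, lower-casing the bill type."""
    return f"{congress}-{bill_type.lower()}-{number}"


def parse_bill_id(bill_id: str) -> BillKey:
    """Split a key such as "119-hr-1234" back into its three parts."""
    congress, bill_type, number = bill_id.split("-")
    return BillKey(int(congress), bill_type, int(number))
```

For example, `make_bill_id(119, "HR", 1234)` yields `"119-hr-1234"`, and `parse_bill_id` reverses it.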
---
### `bill_actions`
| Column | Type | Notes |
|---|---|---|
| id | int (PK) | |
| bill_id | varchar (FK → bills, CASCADE) | |
| action_date | date | |
| action_text | text | |
| action_type | varchar | |
| chamber | varchar | |
| created_at | timestamptz | |
---
### `bill_documents`
Stores fetched bill text versions from GovInfo.
| Column | Type | Notes |
|---|---|---|
| id | int (PK) | |
| bill_id | varchar (FK → bills, CASCADE) | |
| doc_type | varchar | `bill_text`, `committee_report`, `amendment` |
| doc_version | varchar | Introduced, Enrolled, etc. |
| govinfo_url | varchar | Source URL on GovInfo |
| raw_text | text | Full extracted text |
| fetched_at | timestamptz | |
| created_at | timestamptz | |
---
### `bill_briefs`
AI-generated analysis. `key_points` and `risks` are JSONB arrays of cited objects.
| Column | Type | Notes |
|---|---|---|
| id | int (PK) | |
| bill_id | varchar (FK → bills, CASCADE) | |
| document_id | int (FK → bill_documents, SET NULL) | |
| brief_type | varchar | `full` (first version) or `amendment` (diff from prior version) |
| summary | text | 2-4 paragraph plain-language summary |
| key_points | jsonb | `[{text, citation, quote}]` |
| risks | jsonb | `[{text, citation, quote}]` |
| deadlines | jsonb | `[{date, description}]` |
| topic_tags | jsonb | `["healthcare", "taxation", ...]` |
| llm_provider | varchar | Which provider generated this brief |
| llm_model | varchar | Specific model name |
| govinfo_url | varchar (nullable) | Source document URL (from bill_documents) |
| created_at | timestamptz | |
Indexes: `bill_id`, `topic_tags` (GIN for JSONB containment queries)
**Citation structure** — each `key_points`/`risks` item:
```json
{
"text": "The bill allocates $50B for defense",
"citation": "Section 301(a)(2)",
"quote": "There is hereby appropriated for fiscal year 2026, $50,000,000,000 for the Department of Defense..."
}
```
---
### `members`
Primary key: `bioguide_id` (Congress.gov canonical identifier).
| Column | Type |
|---|---|
| bioguide_id | varchar (PK) |
| name | varchar |
| first_name / last_name | varchar |
| party | varchar |
| state | varchar |
| chamber | varchar |
| district | varchar (nullable, House only) |
| photo_url | varchar (nullable) |
| created_at / updated_at | timestamptz |
---
### `users`
| Column | Type | Notes |
|---|---|---|
| id | int (PK) | |
| email | varchar (unique) | |
| hashed_password | varchar | bcrypt |
| is_admin | bool | First registered user = true |
| notification_prefs | jsonb | Future: ntfy, Telegram, RSS config |
| created_at | timestamptz | |
---
### `follows`
| Column | Type | Notes |
|---|---|---|
| id | int (PK) | |
| user_id | int (FK → users, CASCADE) | |
| follow_type | varchar | `bill`, `member`, `topic` |
| follow_value | varchar | bill_id, bioguide_id, or topic name |
| created_at | timestamptz | |
Unique constraint: `(user_id, follow_type, follow_value)`
---
### `news_articles`
| Column | Type | Notes |
|---|---|---|
| id | int (PK) | |
| bill_id | varchar (FK → bills, CASCADE) | |
| source | varchar | News outlet |
| headline | varchar | |
| url | varchar (unique) | Deduplication key |
| published_at | timestamptz | |
| relevance_score | float | Default 1.0 |
| created_at | timestamptz | |
---
### `trend_scores`
One record per bill per day.
| Column | Type | Notes |
|---|---|---|
| id | int (PK) | |
| bill_id | varchar (FK → bills, CASCADE) | |
| score_date | date | |
| newsapi_count | int | Articles from NewsAPI (30-day window) |
| gnews_count | int | Articles from Google News RSS |
| gtrends_score | float | Google Trends interest 0–100 |
| composite_score | float | Weighted combination 0–100 |
| created_at | timestamptz | |
**Composite score formula:**
```
newsapi_pts = min(newsapi_count / 20, 1.0) × 40 # saturates at 20 articles
gnews_pts = min(gnews_count / 50, 1.0) × 30 # saturates at 50 articles
gtrends_pts = (gtrends_score / 100) × 30
composite = newsapi_pts + gnews_pts + gtrends_pts # range 0–100
```
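The formula above translates directly into Python. This is a sketch for illustration; the function name `composite_score` is an assumption and may not match the name used in `trend_scorer`.

```python
def composite_score(newsapi_count: int, gnews_count: int, gtrends_score: float) -> float:
    """Combine the three signals into a single score using the weights above."""
    newsapi_pts = min(newsapi_count / 20, 1.0) * 40  # saturates at 20 articles
    gnews_pts = min(gnews_count / 50, 1.0) * 30      # saturates at 50 articles
    gtrends_pts = (gtrends_score / 100) * 30         # gtrends_score is already 0 to 100
    return newsapi_pts + gnews_pts + gtrends_pts
```

A bill with 10 NewsAPI articles, 25 Google News articles, and a Trends score of 50 sits at half of each component, giving 20 + 15 + 15 = 50.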
---
### `committees` / `committee_bills`
| Table | Columns |
|---|---|
| `committees` | committee_id (PK), name, chamber, type |
| `committee_bills` | id, committee_id (FK), bill_id (FK), referred_date |
---
### `app_settings`
Key-value store for runtime-configurable settings.
| Key | Purpose |
|---|---|
| `congress_last_polled_at` | ISO timestamp of last successful poll |
| `llm_provider` | Overrides `LLM_PROVIDER` env var |
| `llm_model` | Overrides provider default model |
| `congress_poll_interval_minutes` | Overrides env var |
---
## Alembic Migrations
| File | Description |
|---|---|
| `0001_initial_schema.py` | All initial tables |
| `0002_widen_chamber_party_columns.py` | Wider varchar for Bill.chamber, Member.party |
| `0003_widen_member_state_district.py` | Wider varchar for Member.state, Member.district |
| `0004_add_brief_type.py` | BillBrief.brief_type column (`full`/`amendment`) |
| `0005_add_users_and_user_follows.py` | users table + user_id FK on follows; drops global follows |
| `0006_add_brief_govinfo_url.py` | BillBrief.govinfo_url for frontend source links |
Migrations run automatically on API startup: `alembic upgrade head`.
---
## Backend API
Base URL: `/api`
Auth header: `Authorization: Bearer <jwt>`
### `/api/auth`
| Method | Path | Auth | Description |
|---|---|---|---|
| POST | `/register` | — | Create account. First user → admin. Returns token + user. |
| POST | `/login` | — | Returns token + user. |
| GET | `/me` | Required | Current user info. |
### `/api/bills`
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/` | — | Paginated bill list. Query: `chamber`, `topic`, `sponsor_id`, `q`, `page`, `per_page`, `sort`. |
| GET | `/{bill_id}` | — | Full bill detail with sponsor, actions, briefs, news, trend scores. |
| GET | `/{bill_id}/actions` | — | Action timeline, newest first. |
| GET | `/{bill_id}/news` | — | Related news articles, limit 20. |
| GET | `/{bill_id}/trend` | — | Trend score history. Query: `days` (7–365, default 30). |
### `/api/members`
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/` | — | Paginated members. Query: `chamber`, `party`, `state`, `q`, `page`, `per_page`. |
| GET | `/{bioguide_id}` | — | Member detail. |
| GET | `/{bioguide_id}/bills` | — | Member's sponsored bills, paginated. |
### `/api/follows`
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/` | Required | Current user's follows. |
| POST | `/` | Required | Add follow `{follow_type, follow_value}`. Idempotent. |
| DELETE | `/{id}` | Required | Remove follow (ownership checked). |
### `/api/dashboard`
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/` | Required | Personalized feed from followed bills/members/topics + trending. Returns `{feed, trending, follows}`. |
### `/api/search`
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/` | — | Full-text search. Query: `q` (min 2 chars). Returns `{bills, members}`. |
### `/api/settings`
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/` | Required | Current settings (DB overrides env). |
| PUT | `/` | Admin | Update `{key, value}`. Allowed keys: `llm_provider`, `llm_model`, `congress_poll_interval_minutes`. |
| POST | `/test-llm` | Admin | Test LLM connection. Returns `{status, provider, model, summary_preview}`. |
### `/api/admin`
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/users` | Admin | All users with follow counts. |
| DELETE | `/users/{id}` | Admin | Delete user (cannot delete self). Cascades follows. |
| PATCH | `/users/{id}/toggle-admin` | Admin | Promote/demote admin status (cannot change self). |
| GET | `/stats` | Admin | Pipeline progress: total bills, docs fetched, briefs generated, remaining. |
| POST | `/trigger-poll` | Admin | Queue immediate Congress.gov poll. |
| POST | `/trigger-member-sync` | Admin | Queue member sync. |
| POST | `/trigger-trend-scores` | Admin | Queue trend score calculation. |
| GET | `/task-status/{task_id}` | Admin | Celery task status and result. |
### `/api/health`
| Method | Path | Description |
|---|---|---|
| GET | `/` | Simple health check `{status: "ok", timestamp}`. |
| GET | `/detailed` | Tests PostgreSQL + Redis. Returns per-service status. |
---
## Celery Workers & Pipeline
**Celery app name:** `pocketveto`
**Broker / Backend:** Redis
### Queue Routing
| Queue | Workers | Tasks |
|---|---|---|
| `polling` | worker | `poll_congress_bills`, `sync_members` |
| `documents` | worker | `fetch_bill_documents` |
| `llm` | worker | `process_document_with_llm` |
| `news` | worker | `fetch_news_for_bill`, `fetch_news_for_active_bills`, `calculate_all_trend_scores` |
**Worker settings:**
- `task_acks_late = True` — task removed from queue only after completion, not on pickup
- `worker_prefetch_multiplier = 1` — prevents workers from hoarding LLM tasks
- Serialization: JSON
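The routing table and worker settings can be expressed as plain Python data in the shape Celery's configuration accepts. This is a sketch only: the short task names are an assumption (the registered names may carry a module prefix), and `queue_for` is a hypothetical helper.

```python
# Task name -> queue, in the dict form accepted by Celery's `task_routes` setting.
# Short task names are assumed here; real registered names may be module-qualified.
TASK_ROUTES: dict[str, dict[str, str]] = {
    "poll_congress_bills": {"queue": "polling"},
    "sync_members": {"queue": "polling"},
    "fetch_bill_documents": {"queue": "documents"},
    "process_document_with_llm": {"queue": "llm"},
    "fetch_news_for_bill": {"queue": "news"},
    "fetch_news_for_active_bills": {"queue": "news"},
    "calculate_all_trend_scores": {"queue": "news"},
}

# The worker settings listed above, keyed by their Celery configuration names.
WORKER_SETTINGS = {
    "task_acks_late": True,           # acknowledge only after the task finishes
    "worker_prefetch_multiplier": 1,  # reserve one task at a time per process
    "task_serializer": "json",
    "task_routes": TASK_ROUTES,
}


def queue_for(task_name: str) -> str:
    """Look up which queue a task is routed to (hypothetical helper)."""
    return TASK_ROUTES[task_name]["queue"]
```

A dict like `WORKER_SETTINGS` could be applied with `app.conf.update(**WORKER_SETTINGS)` on the Celery app instance.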
### Beat Schedule (RedBeat, stored in Redis)
| Schedule | Task | When |
|---|---|---|
| Configurable (default 30 min) | `poll_congress_bills` | Continuous |
| Every 6 hours | `fetch_news_for_active_bills` | Ongoing |
| Daily 2 AM UTC | `calculate_all_trend_scores` | Nightly |
---
### Pipeline Flow
```
1. congress_poller.poll_congress_bills()
↳ Fetches bills updated since last poll (fromDateTime param)
↳ Filters: only hr, s, hjres, sjres (legislation that can become law)
↳ First run: seeds from 60 days back
↳ New bills → fetch_bill_documents.delay(bill_id)
↳ Updated bills → fetch_bill_documents.delay(bill_id) if changed
2. document_fetcher.fetch_bill_documents(bill_id)
↳ Gets text versions from Congress.gov (XML preferred, falls back to HTML/PDF)
↳ Fetches raw text from GovInfo
↳ Idempotent: skips if doc_version already stored
↳ Stores BillDocument with govinfo_url + raw_text
↳ → process_document_with_llm.delay(document_id)
3. llm_processor.process_document_with_llm(document_id)
↳ Rate limited: 10/minute
↳ Idempotent: skips if brief exists for document
↳ Determines type:
- No prior brief → "full" brief
- Prior brief exists → "amendment" brief (diff vs previous)
↳ Calls configured LLM provider
↳ Stores BillBrief with cited key_points and risks
↳ → fetch_news_for_bill.delay(bill_id)
4. news_fetcher.fetch_news_for_bill(bill_id)
↳ Queries NewsAPI using bill title + topic_tags
↳ Deduplicates by URL
↳ Stores NewsArticle records
5. trend_scorer.calculate_all_trend_scores() [nightly]
↳ Bills active in last 90 days
↳ Skips bills already scored today
↳ Fetches: NewsAPI count + Google News RSS count + Google Trends score
↳ Calculates composite_score (0–100)
↳ Stores TrendScore record
```
---
## LLM Service
**File:** `backend/app/services/llm_service.py`
### Provider Factory
```python
get_llm_provider() -> LLMProvider
```
Reads the provider name from the `app_settings` table (key `llm_provider`) first, then falls back to the `LLM_PROVIDER` env var. Instantiates the matching provider class.
| Provider | Class | Key Setting |
|---|---|---|
| `openai` | `OpenAIProvider` | `OPENAI_API_KEY`, `OPENAI_MODEL` |
| `anthropic` | `AnthropicProvider` | `ANTHROPIC_API_KEY`, `ANTHROPIC_MODEL` |
| `gemini` | `GeminiProvider` | `GEMINI_API_KEY`, `GEMINI_MODEL` |
| `ollama` | `OllamaProvider` | `OLLAMA_BASE_URL`, `OLLAMA_MODEL` |
All providers implement:
```python
generate_brief(doc_text, bill_metadata) -> ReverseBrief
generate_amendment_brief(new_text, prev_text, bill_metadata) -> ReverseBrief
```
### ReverseBrief Dataclass
```python
@dataclass
class ReverseBrief:
summary: str
key_points: list[dict] # [{text, citation, quote}]
risks: list[dict] # [{text, citation, quote}]
deadlines: list[dict] # [{date, description}]
topic_tags: list[str]
llm_provider: str
llm_model: str
```
### Prompt Design
**Full brief prompt** instructs the LLM to produce:
```json
{
"summary": "2-4 paragraph plain-language explanation",
"key_points": [
{"text": "claim", "citation": "Section X(y)", "quote": "verbatim excerpt ≤80 words"}
],
"risks": [
{"text": "concern", "citation": "Section X(y)", "quote": "verbatim excerpt ≤80 words"}
],
"deadlines": [{"date": "YYYY-MM-DD or null", "description": "..."}],
"topic_tags": ["healthcare", "taxation"]
}
```
**Amendment brief prompt** focuses on what changed between document versions.
**Smart truncation:** Bills exceeding the token budget are trimmed — 75% of budget from the start (preamble/purpose), 25% from the end (enforcement/effective dates), with an omission notice in the middle.
**Token budgets:**
- OpenAI / Anthropic / Gemini: 6,000 tokens
- Ollama: 3,000 tokens (local models have smaller context windows)
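The head/tail truncation can be sketched as follows. This is an illustration under stated assumptions: it counts whitespace-separated words as a stand-in for tokens, and the names `smart_truncate` and `OMISSION_NOTICE` (including the notice wording) are hypothetical rather than taken from `llm_service.py`.

```python
OMISSION_NOTICE = "[... middle of bill omitted for length ...]"  # wording is assumed


def smart_truncate(text: str, budget: int, head_share: float = 0.75) -> str:
    """Keep the start and end of an over-budget text, dropping the middle.

    `budget` is measured in whitespace-separated words here, which only
    approximates a real tokenizer.
    """
    words = text.split()
    if len(words) <= budget:
        return text  # within budget: leave untouched

    head_len = int(budget * head_share)  # 75% from the start by default
    tail_len = budget - head_len         # the remainder from the end
    head = " ".join(words[:head_len])
    tail = " ".join(words[-tail_len:]) if tail_len > 0 else ""
    return f"{head}\n\n{OMISSION_NOTICE}\n\n{tail}"
```

With a budget of 8 words, the first 6 and the last 2 words survive, separated by the omission notice.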
---
## Frontend
**Framework:** Next.js 15 (App Router), TypeScript, Tailwind CSS
**State:** Zustand (auth), TanStack Query (server state)
**HTTP:** Axios with JWT interceptor
### Pages
| Route | Description |
|---|---|
| `/` | Dashboard — personalized feed + trending bills |
| `/bills` | Browse all bills with search, chamber/topic filters, pagination |
| `/bills/[id]` | Bill detail — brief with § citations, action timeline, news, trend chart |
| `/members` | Browse members of Congress, filter by chamber/party/state |
| `/members/[id]` | Member profile + sponsored bills |
| `/following` | User's followed bills, members, and topics |
| `/topics` | Browse and follow policy topics |
| `/settings` | Admin panel (admin only) |
| `/login` | Email + password sign-in |
| `/register` | Account creation |
### Key Components
**`AIBriefCard.tsx`**
Renders the LLM brief. For cited items (new format), shows a `§ Section X(y)` chip next to each bullet. Clicking the chip expands an inline panel with:
- Blockquoted verbatim excerpt from the bill
- "View source →" link to GovInfo (opens in new tab)
- One chip open at a time per card
- Old plain-string briefs render without chips (graceful backward compat)
**`AuthGuard.tsx`**
Client component wrapping the entire app. Waits for Zustand hydration, then redirects unauthenticated users to `/login`. Public paths (`/login`, `/register`) bypass the guard.
**`Sidebar.tsx`**
Navigation with: Home, Bills, Members, Following, Topics, Settings (admin only). Shows current user email + logout button at the bottom.
**`BillCard.tsx`**
Compact bill preview showing bill ID, title, sponsor with party badge, latest action date, and status.
**`TrendChart.tsx`**
Line chart of `composite_score` over time with tooltip breakdown of each data source.
### Utility Functions (`lib/utils.ts`)
```typescript
// partyBadgeColor(party) → Tailwind classes
//   "Republican" → "bg-red-600 text-white"
//   "Democrat"   → "bg-blue-600 text-white"
//   other        → "bg-slate-500 text-white"
// partyColor(party) → text color class (used inline)
// trendColor(score) → color class based on score thresholds
// billLabel(type, number) → "H.R. 1234", "S. 567", etc.
// formatDate(date) → "Feb 28, 2026"
```
### Auth Store (`stores/authStore.ts`)
```typescript
interface AuthState {
token: string | null
user: { id: number; email: string; is_admin: boolean } | null
setAuth(token, user): void
logout(): void
}
// Persisted to localStorage as "pocketveto-auth"
```
---
## Authentication
- **Algorithm:** HS256 JWT, 7-day expiry
- **Storage:** Zustand store persisted to `localStorage` key `pocketveto-auth`
- **Injection:** Axios request interceptor reads from localStorage and adds `Authorization: Bearer <token>` to every request
- **First user:** The first account registered automatically receives `is_admin = true`
- **Admin role:** Required for PUT/POST `/api/settings`, all `/api/admin/*` endpoints, and viewing the Settings page in the UI
- **No email verification:** Accounts are active immediately on registration
- **Public endpoints:** `/api/bills`, `/api/members`, `/api/search`, `/api/health` — no auth required
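To make the token format concrete, here is a standard-library sketch of issuing and checking an HS256 JWT with a 7-day expiry. It is not the application's code (the backend uses python-jose), and the claim layout (`sub` holding the user id) plus the function names are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_token(user_id: int, secret: str, now: Optional[float] = None, ttl_days: int = 7) -> str:
    """Issue a token of the form header.payload.signature."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": str(user_id), "exp": issued + ttl_days * 86400}  # assumed claims
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature matches and the token is unexpired."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None
    current = time.time() if now is None else now
    return payload if payload.get("exp", 0) > current else None
```

A production verifier would also check the `alg` header; a maintained library such as python-jose handles that along with the other edge cases.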
---
## Key Architectural Patterns
### Idempotent Workers
Every Celery task checks for existing records before processing. Combined with `task_acks_late=True`, this means:
- Tasks can be retried without creating duplicates
- Worker crashes don't lose work (task stays in queue until acknowledged)
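The check-before-processing pattern can be sketched with an in-memory stand-in for the `bill_briefs` table. Everything here is illustrative: `BriefStore` and `process_document` are hypothetical names, and the real task talks to PostgreSQL and an LLM provider rather than a list.

```python
from dataclasses import dataclass, field


@dataclass
class BriefStore:
    """In-memory stand-in for the bill_briefs table (illustration only)."""
    briefs: list[dict] = field(default_factory=list)

    def has_brief_for_document(self, document_id: int) -> bool:
        return any(b["document_id"] == document_id for b in self.briefs)

    def has_brief_for_bill(self, bill_id: str) -> bool:
        return any(b["bill_id"] == bill_id for b in self.briefs)


def process_document(store: BriefStore, bill_id: str, document_id: int) -> str:
    """Create a brief once per document; a retry of the same task is a no-op."""
    if store.has_brief_for_document(document_id):
        return "skipped"  # already processed, so a redelivered task does nothing

    # First brief for a bill is "full"; later document versions get "amendment".
    brief_type = "amendment" if store.has_brief_for_bill(bill_id) else "full"
    store.briefs.append(
        {"bill_id": bill_id, "document_id": document_id, "brief_type": brief_type}
    )
    return brief_type
```

Running the same call twice leaves a single row, which is what makes redelivery after a worker crash safe.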
### Incremental Polling
The Congress.gov poller uses `fromDateTime` to fetch only recently updated bills, tracking the last poll timestamp in `app_settings`. On first run it seeds 60 days back to avoid processing thousands of old bills.
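A sketch of how the poll window could be derived from the stored timestamp. The function name `poll_window_start` is hypothetical, and the `YYYY-MM-DDTHH:MM:SSZ` output format is an assumption about what the `fromDateTime` parameter expects.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SEED_WINDOW = timedelta(days=60)  # first-run lookback described above


def poll_window_start(last_polled_at: Optional[str], now: datetime) -> str:
    """Return the fromDateTime value for the next poll.

    `last_polled_at` is the ISO timestamp kept under `congress_last_polled_at`
    in app_settings, or None on a fresh install. Timestamps are expected to be
    timezone-aware.
    """
    if last_polled_at:
        start = datetime.fromisoformat(last_polled_at)
    else:
        start = now - SEED_WINDOW  # no previous poll: seed 60 days back
    return start.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

On a fresh install at noon UTC on 2026-02-28, the window starts at `2025-12-30T12:00:00Z`; afterwards it starts at the last recorded poll time.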
### Bill Type Filtering
Only tracks legislation that can become law:
- `hr` (House Bill)
- `s` (Senate Bill)
- `hjres` (House Joint Resolution)
- `sjres` (Senate Joint Resolution)
Excluded (procedural, cannot become law): `hres`, `sres`, `hconres`, `sconres`
### Queue Specialization
Separate queues prevent a flood of LLM tasks from blocking time-sensitive polling tasks. Worker prefetch of 1 prevents any single worker from hoarding slow LLM jobs.
### LLM Provider Abstraction
All LLM providers implement the same interface. Switching providers is a single admin setting change — no code changes, no restart required (the factory reads from DB on each task invocation).
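The lookup order (database setting first, then environment) can be sketched as below. `resolve_provider_name` is a hypothetical helper, and falling back to `openai` when neither source is set is an assumption based on the example `.env`.

```python
import os
from typing import Mapping, Optional

SUPPORTED_PROVIDERS = ("openai", "anthropic", "gemini", "ollama")


def resolve_provider_name(
    db_settings: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the provider: app_settings value wins, then the LLM_PROVIDER env var."""
    env = os.environ if env is None else env
    # An empty or missing DB value falls through to the environment.
    name = db_settings.get("llm_provider") or env.get("LLM_PROVIDER", "openai")
    name = name.strip().lower()
    if name not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported LLM provider: {name!r}")
    return name
```

Because the name is resolved on each call, a change made on the Admin page applies to the next task without restarting the worker.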
### JSONB for Flexible Brief Storage
`key_points`, `risks`, `deadlines`, `topic_tags` are stored as JSONB. This means the schema change from `list[str]` to `list[{text, citation, quote}]` required no migration — only the LLM prompt and application code changed. Old string-format briefs and new cited-object briefs coexist in the same column.
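Code that reads these columns has to accept both shapes. A minimal sketch of normalizing them, assuming old items are bare strings and new items are `{text, citation, quote}` objects; `CitedItem` and `normalize_item` are hypothetical names.

```python
from typing import Optional, TypedDict, Union


class CitedItem(TypedDict):
    text: str
    citation: Optional[str]
    quote: Optional[str]


def normalize_item(item: Union[str, dict]) -> CitedItem:
    """Coerce an old plain-string item or a new cited object into one shape."""
    if isinstance(item, str):
        # Pre-citation brief: only the claim text exists.
        return {"text": item, "citation": None, "quote": None}
    return {
        "text": item["text"],
        "citation": item.get("citation"),
        "quote": item.get("quote"),
    }
```

A consumer can then decide whether to show a citation chip by checking whether `citation` is `None`.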
### Redis-backed Beat Schedule (RedBeat)
The Celery Beat schedule is stored in Redis rather than in Celery Beat's default local schedule file, which would be lost whenever the beat container is recreated. The beat scheduler can therefore restart without losing schedule state or double-firing tasks.
### Docker DNS Re-resolution
Nginx uses `resolver 127.0.0.11 valid=10s` (Docker's internal DNS) so upstream container IPs are refreshed every 10 seconds. Without this, nginx caches the IP at startup and returns 502 errors after any container is recreated.
---
## Feature History
### v0.1.0 — Foundation
- Docker Compose stack: PostgreSQL, Redis, FastAPI, Celery, Next.js, Nginx
- Congress.gov API integration: bill polling, member sync
- GovInfo document fetching with intelligent truncation
- Multi-provider LLM service (OpenAI, Anthropic, Gemini, Ollama)
- AI brief generation: summary, key points, risks, deadlines, topic tags
- Amendment-aware processing: diffs new bill versions against prior
- NewsAPI + Google News RSS article correlation
- Google Trends (pytrends) scoring
- Composite trend score (0–100) with weighted formula
- Full-text bill search (PostgreSQL tsvector)
- Member of Congress browsing
- Global follows (bill / member / topic)
- Personalized dashboard feed
- Admin settings page (LLM provider selection, data source status)
- Manual Celery task triggers from UI
- Bill type filtering: only legislation that can become law
- 60-day seed window on fresh install
**Multi-User Auth (added to v0.1.0):**
- Email + password registration/login (JWT, bcrypt)
- Per-user follow scoping
- Admin role (first user = admin)
- Admin user management: list, delete, promote/demote
- AuthGuard with login/register pages
- Analysis status dashboard (auto-refresh every 30s)
### v0.2.0 — Citations
- **Per-claim citations on AI briefs:** every key point and risk includes:
- `citation` — section reference (e.g., "Section 301(a)(2)")
- `quote` — verbatim excerpt ≤80 words from that section
- `§` citation chip UI on each bullet — click to expand quote + GovInfo source link
- `govinfo_url` stored on `BillBrief` for direct frontend access
- Old briefs (plain strings) render without chips — backward compatible
- Migration 0006: `govinfo_url` column on `bill_briefs`
- Party badges redesigned: solid `red-600` / `blue-600` / `slate-500` with white text, readable in both light and dark mode
- Tailwind content scan extended to include `lib/` directory
- Nginx DNS resolver fix: prevents stale-IP 502s after container restarts
---
## Deployment
### First Deploy
```bash
cp .env.example .env
# Edit .env — add API keys, generate JWT_SECRET_KEY
docker compose up --build -d
```
Migrations run automatically. Navigate to the app, register the first account (it becomes admin).
### Updating
```bash
git pull origin main
docker compose up --build -d
docker compose exec nginx nginx -s reload # if nginx wasn't recreated
```
### Useful Commands
```bash
# Check all service status
docker compose ps
# View logs
docker compose logs api --tail=50
docker compose logs worker --tail=50
# Force a bill poll now
# → Admin page → Manual Controls → Trigger Poll
# Check DB column layout
docker compose exec postgres psql -U congress -d pocketveto -c "\d bill_briefs"
# Tail live worker output
docker compose logs -f worker
# Restart a specific service
docker compose restart worker
```
### Bill Regeneration (Optional)
Existing briefs generated before v0.2.0 use plain strings (no citations). To regenerate with citations:
1. Delete existing `bill_briefs` rows (keeps `bill_documents` intact)
2. Re-queue all documents via a one-off script similar to `queue_docs.py`
3. Worker will regenerate using the new cited prompt at 10/minute
4. ~1,000 briefs ≈ 2 hours
This is **optional** — old string briefs render correctly in the UI with no citation chips.