Create a pipeline that checks the health of climate tech repos on a daily schedule and updates their status in the database.

For each project in `climate_projects`, the pipeline should:

1. Call the GitHub API (`/repos/{owner}/{repo}`) to fetch `stargazers_count`, `pushed_at`, and `open_issues_count`
2. Call the GitHub Issues API to count issues labeled `good-first-issue`
3. Check whether a `CONTRIBUTING.md` file exists (`/repos/{owner}/{repo}/contents/CONTRIBUTING.md`)

Health score (0-100):

- Activity (0-40): score based on days since last commit. >90 days = 0, 0-7 days = 40, interpolate linearly in between.
- Maintenance (0-30): issues with responses / total issues (cap at 30). Use the `comments` count as a proxy for responses.
- Contributor friendliness (0-30): 15 pts if the `good-first-issue` count > 0, 15 pts if `CONTRIBUTING.md` exists.

Update `climate_projects` with `{ health_score, stars, has_good_first_issues, last_commit_at, health_checked_at }`.

Infrastructure:

- Run as a Vercel Cron Job (`/api/cron/health-check`, daily at 02:00 UTC)
- Protect the route with a `CRON_SECRET` header check
- Rate limit GitHub API calls: max 1 req/sec, exponential backoff on 429
- Authenticate with the `GITHUB_TOKEN` env var (optional; without it, requests degrade to the unauthenticated limit of 60 req/hr)

Write integration tests with mocked GitHub API responses covering: healthy repo, stale repo, rate limited response, missing repo (404).
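The scoring rules above can be sketched as a pure function, which keeps them testable without any GitHub mocks. This is a minimal sketch rather than a required implementation. The `RepoSignals` shape and all function names are hypothetical. It also assumes that `pushed_at` stands in for the last commit date, that the maintenance ratio is scaled to 30 points, that a repo with no sampled issues earns 0 maintenance points, and that the total is rounded to an integer.

```typescript
// Hypothetical input shape: field names are illustrative, not part of the spec.
interface RepoSignals {
  pushedAt: string;             // ISO timestamp from GitHub's `pushed_at`
  issueCommentCounts: number[]; // `comments` value for each sampled issue
  goodFirstIssueCount: number;  // issues labeled `good-first-issue`
  hasContributing: boolean;     // whether CONTRIBUTING.md was found
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Activity (0-40): 40 within 7 days, 0 beyond 90 days, linear in between.
function activityScore(daysSincePush: number): number {
  if (daysSincePush <= 7) return 40;
  if (daysSincePush >= 90) return 0;
  return (40 * (90 - daysSincePush)) / (90 - 7);
}

// Maintenance (0-30): share of issues with at least one comment, scaled to 30.
function maintenanceScore(commentCounts: number[]): number {
  // Assumption: with no issues to sample, award 0 rather than guess.
  if (commentCounts.length === 0) return 0;
  const responded = commentCounts.filter((c) => c > 0).length;
  return Math.min(30, (responded / commentCounts.length) * 30);
}

// Contributor friendliness (0-30): 15 pts per signal present.
function friendlinessScore(goodFirstIssueCount: number, hasContributing: boolean): number {
  return (goodFirstIssueCount > 0 ? 15 : 0) + (hasContributing ? 15 : 0);
}

// Total health score (0-100), rounded to an integer (an assumption).
function healthScore(signals: RepoSignals, now: Date = new Date()): number {
  const elapsedMs = now.getTime() - Date.parse(signals.pushedAt);
  const days = Math.max(0, elapsedMs / MS_PER_DAY);
  const total =
    activityScore(days) +
    maintenanceScore(signals.issueCommentCounts) +
    friendlinessScore(signals.goodFirstIssueCount, signals.hasContributing);
  return Math.round(total);
}
```

Passing `now` as a parameter lets the healthy-repo and stale-repo tests pin the clock instead of depending on the date the suite runs.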