# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

**Run tests:**
```bash
python -m pytest
```

**Run a single test:**
```bash
python -m pytest tests/test_service.py::test_crawl_service_bootstrap_saves_posts_without_returning_them
```

**Run the app locally:**
```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000
```

**Docker build:**
```bash
docker build -t your-dockerhub-id/hufs-notice-crawler:latest .
```

**Setup (first time):**
```bash
python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
```

## Architecture

FastAPI web service that crawls three HUFS Computer Science Department notice boards and returns only the posts that are new since the last crawl. State is persisted in PostgreSQL.

**Request flow:** n8n (scheduler) → `POST /api/v1/crawl` → `CrawlService` → `HufsCrawler` → PostgreSQL

**Layer responsibilities:**
- `app/crawler.py`: HTTP + BeautifulSoup scraping. No DB access. Returns raw `PostStub` and `PostDetail` objects. Handles URL encoding to the user-facing `subview.do?enc=...` format.
- `app/service.py`: Orchestration. Compares scraped `article_id`s against the DB to find new posts, fetches details only for new ones, persists results, handles bootstrap mode.
- `app/main.py`: FastAPI entrypoint. Two routes: `GET /health`, `POST /api/v1/crawl`. Auto-creates tables on startup via lifespan.
- `app/models.py` / `app/db.py`: SQLAlchemy ORM + session management.

**Bootstrap mode:** On first run (empty `scraped_posts` table), the service saves all found posts but returns `new_posts: []` to prevent flooding Discord/n8n notifications with old posts. Subsequent runs return only genuinely new posts.

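The new-post detection and bootstrap behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual `CrawlService` code: the function name `select_new_posts`, its parameters, and the use of a plain set in place of the `scraped_posts` table are assumptions made for the example. The real service may, for instance, decide bootstrap per board rather than globally.

```python
from __future__ import annotations

from typing import Iterable


def select_new_posts(
    scraped_ids: Iterable[str],
    known_ids: set[str],
) -> tuple[list[str], list[str]]:
    """Return (ids_to_save, ids_to_report) for one crawl run.

    Hypothetical helper: `known_ids` stands in for the article IDs
    already stored in the `scraped_posts` table.
    """
    # Keep scrape order, drop IDs already stored, and ignore duplicates.
    seen: set[str] = set()
    unseen: list[str] = []
    for article_id in scraped_ids:
        if article_id not in known_ids and article_id not in seen:
            seen.add(article_id)
            unseen.append(article_id)

    # Bootstrap: an empty store means this is the first run, so every
    # post is saved but none is reported to the caller.
    is_bootstrap = not known_ids
    to_report: list[str] = [] if is_bootstrap else list(unseen)
    return unseen, to_report
```

In this sketch, only the IDs in the first returned list would need a detail fetch, which mirrors the "fetch details only for new ones" behaviour of the service layer.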

**Three boards crawled:**

| Key | Name | Board ID |
|-----|------|----------|
| `notice` | 공지사항 (Notices) | 1926 |
| `archive` | 자료실 (Archive) | 1927 |
| `jobs` | 취업정보 (Job Information) | 1929 |

## Tests

Tests use an in-memory SQLite DB (`conftest.py`) and a `FakeCrawler` stub, so no real HTTP calls or PostgreSQL instance are required.

- `test_api.py`: endpoint shape/status tests (service is mocked)
- `test_service.py`: new-post detection logic, bootstrap mode, zero-new-posts path

## CI/CD

GitHub Actions (`.github/workflows/deploy.yml`) triggers on push to `main`:
1. SSH into Gitea, clone repo
2. Build and push Docker image to DockerHub (tagged `latest` + optional `[x.y.z]` version from commit message)
3. Deploy via `docker compose -p nkeys-apps -f /nkeysworld/compose.apps.yml pull hufs-notice-crawler`
4. Notify Discord via webhook

Required secrets: `NKEY_SSH_PRIVATE_KEY`, `DOCKERHUB_USERNAME`, `DOCKERHUB_TOKEN`, `DISCORD_WEBHOOK`

The app runs on an internal Docker network (`nkeysworld-network`) with no exposed ports; n8n calls it as `http://hufs-notice-crawler:8000`.
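The optional version tag in step 2 of the CI/CD list can be illustrated with a short sketch. The workflow itself most likely does this in shell; the Python below only shows the idea. It assumes the version appears literally in square brackets in the commit message (for example `[1.2.3]`) and that the brackets are dropped from the Docker tag. The name `image_tags` and the regular expression are hypothetical and not taken from `deploy.yml`.

```python
from __future__ import annotations

import re

# Assumed pattern: three dot-separated numbers inside square brackets.
VERSION_TAG = re.compile(r"\[(\d+\.\d+\.\d+)\]")


def image_tags(commit_message: str) -> list[str]:
    """Return the tags to push: always `latest`, plus a version if present."""
    tags = ["latest"]
    match = VERSION_TAG.search(commit_message)
    if match:
        # group(1) is the version without the surrounding brackets.
        tags.append(match.group(1))
    return tags
```

A commit message without a bracketed version would then produce only the `latest` tag.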