Feat: [main] exclude md file changes from CI/CD triggers
All checks were successful
hufs-notice-crawler-cicd / build_push_deploy (push) Successful in 5m50s
.github/workflows/deploy.yml (2 changes)
```diff
@@ -3,6 +3,8 @@ name: hufs-notice-crawler-cicd
 on:
   push:
     branches: ["main"]
+    paths-ignore:
+      - "**/*.md"

 jobs:
   build_push_deploy:
```
CLAUDE.md (new file, 70 lines)
@@ -0,0 +1,70 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

**Run tests:**
```bash
python -m pytest
```

**Run a single test:**
```bash
python -m pytest tests/test_service.py::test_crawl_service_bootstrap_saves_posts_without_returning_them
```

**Run the app locally:**
```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000
```

**Docker build:**
```bash
docker build -t your-dockerhub-id/hufs-notice-crawler:latest .
```

**Setup (first time):**
```bash
python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
```
## Architecture

FastAPI web service that crawls three HUFS Computer Science Department notice boards and returns only new posts since the last crawl. The state is persisted in PostgreSQL.

**Request flow:** n8n (scheduler) → `POST /api/v1/crawl` → `CrawlService` → `HufsCrawler` → PostgreSQL

**Layer responsibilities:**

- `app/crawler.py` — HTTP + BeautifulSoup scraping. No DB access. Returns raw `PostStub` and `PostDetail` objects. Handles URL encoding to the user-facing `subview.do?enc=...` format.
- `app/service.py` — Orchestration. Compares scraped `article_id`s against the DB to find new posts, fetches details only for new ones, persists results, handles bootstrap mode.
- `app/main.py` — FastAPI entrypoint. Two routes: `GET /health`, `POST /api/v1/crawl`. Auto-creates tables on startup via lifespan.
- `app/models.py` / `app/db.py` — SQLAlchemy ORM + session management.
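The layer split can be illustrated with a minimal, self-contained sketch (all class and method names below are illustrative assumptions, not the repository's actual code): the service asks the crawler for stubs, diffs their `article_id`s against what the persistence layer already knows, and saves only the new ones.

```python
from dataclasses import dataclass

@dataclass
class PostStub:
    board: str
    article_id: str
    title: str

class FakeRepo:
    """Stands in for the SQLAlchemy layer (app/models.py / app/db.py)."""
    def __init__(self):
        self.rows = {}
    def known_article_ids(self, board):
        return {aid for (b, aid) in self.rows if b == board}
    def save(self, stubs):
        for s in stubs:
            self.rows[(s.board, s.article_id)] = s

class CrawlService:
    """Sketch of app/service.py's orchestration role."""
    def __init__(self, crawler, repo):
        self.crawler = crawler
        self.repo = repo
    def crawl_board(self, board):
        stubs = self.crawler.list_posts(board)      # crawler: scraping only, no DB
        known = self.repo.known_article_ids(board)  # repo: persistence only
        new = [s for s in stubs if s.article_id not in known]
        self.repo.save(new)                         # persist just the new posts
        return new

class FakeCrawler:
    def list_posts(self, board):
        return [PostStub(board, "101", "old post"),
                PostStub(board, "102", "new post")]

repo = FakeRepo()
repo.save([PostStub("notice", "101", "old post")])  # already seen on a prior crawl
new = CrawlService(FakeCrawler(), repo).crawl_board("notice")
print([s.article_id for s in new])  # ['102']
```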

**Bootstrap mode:** On first run (empty `scraped_posts` table), the service saves all found posts but returns `new_posts: []` to prevent flooding Discord/n8n notifications with old posts. Subsequent runs return only genuinely new posts.
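Bootstrap mode boils down to a single branch in the response path; a minimal sketch, with the function and field names assumed for illustration:

```python
def build_crawl_response(found_posts, db_was_empty):
    """On the first run everything found is persisted, but new_posts
    is reported empty so old posts don't flood Discord/n8n."""
    new_posts = [] if db_was_empty else list(found_posts)
    return {"saved": len(found_posts), "new_posts": new_posts}

print(build_crawl_response(["a", "b", "c"], db_was_empty=True))   # {'saved': 3, 'new_posts': []}
print(build_crawl_response(["d"], db_was_empty=False))            # {'saved': 1, 'new_posts': ['d']}
```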
**Three boards crawled:**

| Key | Name | Board ID |
|-----|------|----------|
| `notice` | 공지사항 (Notices) | 1926 |
| `archive` | 자료실 (Resources) | 1927 |
| `jobs` | 취업정보 (Job postings) | 1929 |
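The table maps naturally onto a small configuration dict; a hypothetical sketch (the dict and function names are illustrative assumptions, not the repository's actual code):

```python
# Board registry matching the table above.
BOARDS = {
    "notice":  {"name": "공지사항 (Notices)",      "board_id": 1926},
    "archive": {"name": "자료실 (Resources)",      "board_id": 1927},
    "jobs":    {"name": "취업정보 (Job postings)", "board_id": 1929},
}

def board_id(key: str) -> int:
    """Look up the numeric board ID for a crawl target."""
    return BOARDS[key]["board_id"]

print(board_id("jobs"))  # 1929
```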
## Tests

Tests use an in-memory SQLite DB (`conftest.py`) and a `FakeCrawler` stub — no real HTTP calls or PostgreSQL required.

- `test_api.py` — endpoint shape/status tests (service is mocked)
- `test_service.py` — new-post detection logic, bootstrap mode, zero-new-posts path
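The stub approach can be sketched as follows (names are illustrative, not the repository's actual test code): the detection logic runs against a fake that returns canned posts, so no HTTP or database is involved.

```python
class FakeCrawler:
    """Returns canned post stubs instead of scraping over HTTP."""
    def __init__(self, stubs):
        self.stubs = stubs
    def list_posts(self, board):
        return [s for s in self.stubs if s["board"] == board]

def detect_new(crawler, known_ids, board):
    """The logic under test: keep only posts whose id is unseen."""
    return [s for s in crawler.list_posts(board) if s["id"] not in known_ids]

def test_returns_only_unseen_posts():
    crawler = FakeCrawler([{"board": "notice", "id": "1"},
                           {"board": "notice", "id": "2"}])
    new = detect_new(crawler, known_ids={"1"}, board="notice")
    assert new == [{"board": "notice", "id": "2"}]

test_returns_only_unseen_posts()  # passes silently, as under pytest
```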
## CI/CD

GitHub Actions (`.github/workflows/deploy.yml`) triggers on push to `main`:

1. SSH into Gitea, clone repo
2. Build and push Docker image to DockerHub (tagged `latest` + optional `[x.y.z]` version from commit message)
3. Deploy via `docker compose -p nkeys-apps -f /nkeysworld/compose.apps.yml pull hufs-notice-crawler`
4. Notify Discord via webhook

Required secrets: `NKEY_SSH_PRIVATE_KEY`, `DOCKERHUB_USERNAME`, `DOCKERHUB_TOKEN`, `DISCORD_WEBHOOK`

The app runs on an internal Docker network (`nkeysworld-network`) with no exposed ports — n8n calls it as `http://hufs-notice-crawler:8000`.