yt-dlp was blocked on the OCI server by YouTube's bot detection. Replaced it with a transcript-only library so that it also works from cloud IP addresses.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
20 lines
652 B
Python
from youtube_transcript_api import YouTubeTranscriptApi


def extract_video_id(url: str) -> str:
    """Extract the video ID from a YouTube URL."""
    # Short links: https://youtu.be/<id>?...
    if "youtu.be/" in url:
        return url.split("youtu.be/")[1].split("?")[0]
    # Watch links: https://www.youtube.com/watch?v=<id>&...
    if "v=" in url:
        return url.split("v=")[1].split("&")[0]
    raise ValueError(f"Invalid YouTube URL: {url}")


def fetch_transcript(video_id: str) -> str:
    """Extract a YouTube transcript as plain text."""
    ytt_api = YouTubeTranscriptApi()
    # Prefer Korean; fall back to English if no Korean transcript exists.
    transcript = ytt_api.fetch(video_id, languages=["ko", "en"])

    # Drop empty snippets and join the rest into one string.
    texts = [entry.text for entry in transcript if entry.text.strip()]
    return " ".join(texts)
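
The substring checks in `extract_video_id` accept any URL containing `v=`, so a non-YouTube link such as `https://example.com/?rev=1` would yield `1` as a video ID. As a minimal sketch of a stricter alternative (not part of this commit), the hypothetical helper `extract_video_id_strict` below parses the URL with the standard library's `urllib.parse` and checks the host first. It assumes URLs include a scheme (`https://...`) and only covers the same two URL shapes as the original function.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id_strict(url: str) -> str:
    """Hypothetical stricter variant of extract_video_id (illustrative only)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/").split("/")[0]
        if video_id:
            return video_id

    # Watch links: https://www.youtube.com/watch?v=<id>
    if host == "youtube.com" or host.endswith(".youtube.com"):
        values = parse_qs(parsed.query).get("v")
        if values and values[0]:
            return values[0]

    raise ValueError(f"Invalid YouTube URL: {url}")
```

Because the query string is parsed rather than split, the position of `v` among the parameters and any trailing `#fragment` no longer matter, and unrelated hosts are rejected with the same `ValueError` the original raises.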