Agent YouTube Journalism is an open-source investigative AI assistant that transcribes, summarizes, analyzes, and answers questions about YouTube videos, especially those related to Brazilian politics and the public interest.
The system uses multi-agent reasoning and Retrieval-Augmented Generation (RAG) to:
- Transcribe YouTube videos in Brazilian Portuguese (using the OpenAI Whisper API)
- Summarize the transcript with DeepSeek via Groq Cloud
- Search the web for context (via DuckDuckGo)
- Highlight journalistically relevant parts of the video
- Index the transcript and answer questions based on it, falling back to general LLM knowledge when needed
🟢 Try it online: https://agentytjournalism.streamlit.app/
- The user enters a YouTube video URL + API keys
- The app:
  - Downloads the audio via `yt-dlp`
  - Transcribes it with Whisper (`openai.Audio.transcribe`), as sketched after this list
  - Summarizes the transcript with the Groq LLM (DeepSeek)
  - Searches the web for context
  - Highlights journalistic investigation leads
  - Indexes the transcript with FAISS
- The user can:
  - View the analysis
  - Ask questions based on the video (with RAG + LLM knowledge fallback)
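For illustration, here is a minimal sketch of the download-and-transcribe step, assuming the legacy `openai.Audio.transcribe` interface named above; the helper name and output path are placeholders, and the project's actual implementation lives in `tools/youtube_transcriber.py`.

```python
# Minimal sketch of the download-and-transcribe step (illustrative, not the project's code).
# Assumes the legacy openai<1.0 client exposing openai.Audio.transcribe, as referenced above.
import openai
import yt_dlp

def download_and_transcribe(url: str, api_key: str) -> str:
    """Download a video's audio with yt-dlp and transcribe it with Whisper."""
    openai.api_key = api_key

    # Extract the best audio stream and convert it to MP3 with ffmpeg
    ydl_opts = {
        "format": "bestaudio/best",
        "outtmpl": "video_audio.%(ext)s",  # placeholder output path
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])

    # Send the audio to the Whisper API, hinting that the content is Brazilian Portuguese
    with open("video_audio.mp3", "rb") as audio_file:
        result = openai.Audio.transcribe("whisper-1", audio_file, language="pt")
    return result["text"]
```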
The system uses `smolagents` to structure the reasoning with a clear cycle:

Thought → Code → Observation
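As a rough illustration of how a tool plugs into that cycle, here is a minimal `smolagents` setup; the tool body, model id, and key handling are assumptions, and the project's real configuration lives in `agent_config.py`.

```python
# Minimal smolagents sketch (assumed wiring; the real setup is in agent_config.py).
from smolagents import CodeAgent, LiteLLMModel, tool

@tool
def journalistic_highlight(transcript: str) -> str:
    """Return journalistically relevant passages from a transcript.

    Args:
        transcript: Full transcript text of the video.
    """
    # Placeholder logic; the real tool prompts the LLM with a template from prompts.yaml
    return transcript[:500]

# Model id and API key handling are illustrative assumptions
model = LiteLLMModel(model_id="groq/deepseek-r1-distill-llama-70b", api_key="YOUR_GROQ_KEY")
agent = CodeAgent(tools=[journalistic_highlight], model=model)

answer = agent.run("Highlight the public-interest claims in this video transcript.")
```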
- `app.py`: Main Streamlit app with two tabs (Analysis & Questions)
- `process_video.py`: Orchestrates the full pipeline (transcription → summary → highlight → indexing)
- `rag_question_tab.py`: Handles the RAG-based Q&A flow with session state
- `agent_config.py`: Defines tools and setup for the `smolagents` agent
- `streamlit_app.yaml`: Config file for deployment (Streamlit Community Cloud)
- `prompts.yaml`: Prompt templates used for summarization, analysis, and code reasoning
- `groq_model.py`: Executes prompts with the Groq LLM and truncates long prompts when needed (sketched after this file list)
- `list_groq_models.py`: Lists all Groq-hosted models available for querying
- `tools/youtube_transcriber.py`: Downloads and transcribes video audio via the Whisper API
- `tools/summarization.py`: Summarizes the transcript using DeepSeek
- `tools/web_search.py`: Searches DuckDuckGo for current context
- `tools/journalistic_highlight.py`: Generates public-interest highlights
- `tools/index_transcript.py`: Splits the transcript and indexes it with FAISS
- `tools/rag_query.py`: Performs the RAG query and allows fallback to general LLM knowledge (both sketched after this file list)
- `tools/__init__.py`: Makes the tools importable as a module
- `requirements.txt`: All Python dependencies (tested with Python 3.12)
- `packages.txt`: System dependencies (e.g., ffmpeg, build tools)
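To make the role of `groq_model.py` concrete, here is a hedged sketch of prompt execution with truncation; the character budget, model name, and function name are assumptions, not the file's actual code.

```python
# Illustrative sketch of groq_model.py's role: run a prompt on a Groq-hosted model,
# truncating it first if it is too long. The limit and model name are assumptions.
from groq import Groq

MAX_PROMPT_CHARS = 24_000  # rough character budget, not the project's real limit

def run_groq_prompt(prompt: str, api_key: str, model: str = "deepseek-r1-distill-llama-70b") -> str:
    if len(prompt) > MAX_PROMPT_CHARS:
        prompt = prompt[:MAX_PROMPT_CHARS]  # keep the beginning, where instructions usually live

    client = Groq(api_key=api_key)
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```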
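And here is a rough sketch of what `tools/index_transcript.py` and `tools/rag_query.py` do together: chunk the transcript, index it with FAISS via LangChain, and fall back to general LLM knowledge when the transcript does not help. The embedding model, chunk sizes, and helper names are assumptions.

```python
# Hedged sketch of transcript indexing + RAG query with a general-knowledge fallback.
# Embedding model, chunk sizes, and helper names are assumptions, not the project's code.
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

def index_transcript(transcript: str) -> FAISS:
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_text(transcript)
    return FAISS.from_texts(chunks, OpenAIEmbeddings())

def rag_query(vectorstore: FAISS, question: str, ask_llm) -> str:
    # ask_llm stands in for the Groq/DeepSeek call sketched above
    docs = vectorstore.similarity_search(question, k=4)
    if docs:
        context = "\n\n".join(doc.page_content for doc in docs)
        return ask_llm(f"Answer using this transcript excerpt:\n{context}\n\nQuestion: {question}")
    # Fallback path; the real project may use a relevance threshold or explicit tags instead
    return ask_llm(f"Answer from general knowledge: {question}")
```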
Recent fixes include:
- Fixed repeated video download when switching tabs
- Added session state to persist the transcript/vectorstore between tabs
- Enabled mixed-source RAG answers (video + general knowledge)
- Adjusted `requirements.txt` and `packages.txt` for compatibility

Here are potential enhancements for the project:
- Caching
  - Save the FAISS vectorstore, summary, and highlights to disk (`.save_local()`) or use `st.cache_data()` (sketched after this list)
- Better Streamlit UX
  - Enable chat-style Q&A with memory
  - Show progress for each processing step
  - Add a "Download PDF" report button
- Model prompting
  - Add clear tags like `[FACT FROM VIDEO]` vs `[LLM KNOWLEDGE]` (sketched after this list)
- Testing
  - Add unit/integration tests using `pytest` (sketched after this list)
- Agent orchestration
  - Split into two agents: `VideoAnalysisAgent` and `QAAgent`
  - Optionally adopt CrewAI or LangGraph for more complex flows
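For the caching idea above, here is a hedged sketch combining `st.cache_data()` for text results with FAISS's `save_local()`/`load_local()` for the vectorstore; the paths, helper names, and embedding model are assumptions.

```python
# Hedged caching sketch: persist the FAISS index to disk and cache text results in Streamlit.
# Paths, helper names, and the embedding model are illustrative assumptions.
import os

import streamlit as st
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

INDEX_DIR = "faiss_index"  # placeholder location

def load_or_build_index(transcript: str) -> FAISS:
    embeddings = OpenAIEmbeddings()
    if os.path.isdir(INDEX_DIR):
        # allow_dangerous_deserialization is required by recent LangChain versions
        return FAISS.load_local(INDEX_DIR, embeddings, allow_dangerous_deserialization=True)
    # In the real pipeline the transcript would be chunked first, as in the indexing sketch
    vectorstore = FAISS.from_texts([transcript], embeddings)
    vectorstore.save_local(INDEX_DIR)
    return vectorstore

@st.cache_data
def cached_summary(transcript: str) -> str:
    # Stand-in for the real summarization call, cached so switching tabs does not re-run the LLM
    return transcript[:300]
```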
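The source-tagging idea could look something like the prompt below; the wording is only a suggestion, not a template taken from `prompts.yaml`.

```python
# Suggested prompt wording for tagging answer sources; not taken from prompts.yaml.
ANSWER_PROMPT = """You are answering a question about a YouTube video.

Context from the transcript:
{context}

Question: {question}

Rules:
- Prefix every statement supported by the transcript with [FACT FROM VIDEO].
- Prefix anything drawn from your own training data with [LLM KNOWLEDGE].
- If the transcript does not cover the question, say so before using [LLM KNOWLEDGE].
"""

prompt = ANSWER_PROMPT.format(
    context="...transcript excerpt...",
    question="Who is quoted in the video?",
)
```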
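A `pytest` suite could start with something as small as the sketch below, which only exercises the chunking step assumed in the indexing sketch; real tests would also cover the `tools/` modules.

```python
# Minimal pytest sketch: checks the transcript-chunking step assumed in the indexing sketch.
from langchain_text_splitters import RecursiveCharacterTextSplitter

def test_transcript_chunks_respect_size_limit():
    splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
    transcript = "palavra " * 200  # fake transcript text
    chunks = splitter.split_text(transcript)
    assert len(chunks) > 1
    assert all(len(chunk) <= 100 for chunk in chunks)
```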
MIT License. See the `LICENSE` file.
Developed by Reinaldo Chaves (@reichaves) — journalist, data scientist, and investigative technologist.
Open an issue or contact via GitHub.