by BryanChasko • Uncategorized
An MCP server agent that transcribes and visually analyzes videos from multiple platforms into unified documents.
Transcribe speech from videos hosted on ScreenPal, YouTube, Twitch, or S3.
Perform detailed visual analysis of video frames to extract UI elements and content.
Generate unified, timestamp-correlated documents combining audio transcription and visual descriptions.
This MCP server agent takes a video URL from ScreenPal, YouTube, Twitch, or S3 and processes it in three steps: it extracts the audio and transcribes it with OpenAI Whisper, extracts key frames with FFmpeg, and analyzes those frames with the Moondream2 vision-language model (VLM). It then produces JSON and Markdown documents in which the transcription and the visual descriptions are aligned by timestamp. The agent runs locally, so transcription and visual analysis happen on the user's machine, and it integrates with the Kiro CLI architecture.
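To illustrate what a timestamp-correlated document might look like, here is a minimal sketch of the merge step only. It is not the agent's actual code: the `Segment` and `Frame` types, the `correlate`, `to_markdown`, and `fmt` helpers, and the output field names are all hypothetical, and the sketch assumes transcription yields timed text segments while frame analysis yields one description per frame timestamp.

```python
import json
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed stretch of speech (hypothetical shape)."""
    start: float  # seconds from the start of the video
    end: float
    text: str


@dataclass
class Frame:
    """One analyzed key frame (hypothetical shape)."""
    timestamp: float  # seconds from the start of the video
    description: str


def fmt(seconds: float) -> str:
    """Format seconds as MM:SS (fractions of a second are dropped)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def correlate(segments: list[Segment], frames: list[Frame]) -> dict:
    """Attach each frame to the speech segment whose time range contains it."""
    entries = [
        {"start": s.start, "end": s.end, "speech": s.text, "visuals": []}
        for s in sorted(segments, key=lambda s: s.start)
    ]
    unmatched = []
    for frame in sorted(frames, key=lambda f: f.timestamp):
        visual = {"timestamp": frame.timestamp, "description": frame.description}
        for entry in entries:
            # Half-open interval: a frame on a boundary goes to the later segment.
            if entry["start"] <= frame.timestamp < entry["end"]:
                entry["visuals"].append(visual)
                break
        else:
            # No speech at this moment; keep the frame rather than drop it.
            unmatched.append(visual)
    return {"segments": entries, "unmatched_frames": unmatched}


def to_markdown(doc: dict) -> str:
    """Render the correlated document as simple Markdown."""
    lines = []
    for entry in doc["segments"]:
        lines.append(f"## {fmt(entry['start'])} - {fmt(entry['end'])}")
        lines.append(entry["speech"])
        for visual in entry["visuals"]:
            lines.append(f"- [{fmt(visual['timestamp'])}] {visual['description']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    doc = correlate(
        [Segment(0.0, 5.0, "Welcome to the demo."), Segment(5.0, 10.0, "Now click Save.")],
        [Frame(2.0, "Title slide"), Frame(7.5, "Save button highlighted")],
    )
    print(json.dumps(doc, indent=2))
    print(to_markdown(doc))
```

Frames that fall outside every speech segment are collected under `unmatched_frames` instead of being discarded, which is one possible design choice for silent stretches of a video.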