by pragmar • Analytics & Monitoring
Advanced search and retrieval for web crawler data, enabling LLMs and agents to filter, analyze, and extract information from archived crawls.
Search and filter large web-archive datasets with boolean and field-specific queries to locate pages, images, or other resource types.
Extract and condense HTML into Markdown or snippets (or request thumbnails) to reduce token usage for summarization, analysis, or QA tasks.
Run automated site audits (SEO, 404, performance) or scrape structured data via XPath/regex across archived crawls.
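To make the condensing and extraction ideas above concrete, here is a minimal sketch using only the Python standard library. It is not the server's implementation or API: the names `TextExtractor`, `html_to_text`, `snippet`, `extract_all`, and the sample HTML are all hypothetical, and it shows only plain-text reduction, keyword snippets, and regex extraction (no Markdown conversion or XPath).

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Reduce an HTML document to its visible text (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def snippet(text, term, width=30):
    """Return a short window of text around the first match of term."""
    idx = text.lower().find(term.lower())
    if idx == -1:
        return ""
    start = max(0, idx - width)
    end = min(len(text), idx + len(term) + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix


def extract_all(html, pattern):
    """Regex extraction: return every capture of pattern in the raw HTML."""
    return re.findall(pattern, html)


# Hypothetical archived page used only for illustration.
SAMPLE_HTML = """<html><head><title>Pricing</title><style>p{color:red}</style></head>
<body><h1>Plans</h1><p>The basic plan costs $10 per month.</p>
<script>var x = 1;</script><a href="/signup">Sign up</a></body></html>"""

text = html_to_text(SAMPLE_HTML)
links = extract_all(SAMPLE_HTML, r'href="([^"]+)"')
short = snippet(text, "costs", width=10)
```

Sending `short` or `text` to a model instead of `SAMPLE_HTML` is the token-saving idea: markup, styles, and scripts are dropped before the content reaches the LLM.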
MCP Webcrawl Server provides a full-text search interface with boolean operators and resource filtering for web crawler archives, and works with multiple crawl formats (ArchiveBox, HTTrack, WARC, wget, Katana, etc.). It supports field-specific queries plus extras such as Markdown conversion, snippets, regex/XPath extraction, and thumbnails, which keep results token-efficient for LLMs. The package is Claude Desktop ready, installs via pip, includes copy-and-paste prompt routines for automated audits (SEO, 404, performance, file audits), and offers an interactive terminal mode for searching remote or local archives without downloads.
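As a rough illustration of what boolean, field-specific filtering over crawl records means, the sketch below filters an in-memory list. It makes no claim about the server's actual query syntax or data model: the `Record` fields, the `matches` and `search` helpers, and the sample records are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical shape of one archived resource."""
    url: str
    type: str
    status: int
    content: str


def matches(record, all_of=(), any_of=(), none_of=(), **fields):
    """Return True if the record satisfies the field and keyword constraints."""
    # Field-specific constraints: exact match on named attributes.
    for name, expected in fields.items():
        if getattr(record, name) != expected:
            return False
    text = record.content.lower()
    # AND: every term must appear.
    if any(term.lower() not in text for term in all_of):
        return False
    # OR: at least one term must appear, if any were given.
    if any_of and not any(term.lower() in text for term in any_of):
        return False
    # NOT: no excluded term may appear.
    if any(term.lower() in text for term in none_of):
        return False
    return True


def search(records, **query):
    """Return the records that satisfy the query."""
    return [r for r in records if matches(r, **query)]


# Hypothetical crawl records used only for illustration.
records = [
    Record("https://example.com/", "html", 200, "Privacy policy and contact details"),
    Record("https://example.com/old", "html", 404, "Page not found"),
    Record("https://example.com/logo.png", "img", 200, ""),
    Record("https://example.com/terms", "html", 200, "Terms of service, privacy notice"),
]

# HTML pages that returned 200, mention "privacy", and do not mention "contact".
privacy_pages = search(records, type="html", status=200,
                       all_of=["privacy"], none_of=["contact"])

# A 404 audit reduces to a single field constraint.
broken = search(records, status=404)
```

Combining a field constraint with keyword terms in one call is what lets an agent narrow a large archive before requesting any page content.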