Web Content Fetch MCP

MCP • Open Source • Apache-2.0 • 47.4

by cyberchitta • Search & Retrieval

An MCP server designed to fetch text content from websites with bot protection for AI assistants.

Example Use Cases

1. Fetch documentation and reference materials from protected websites
2. Extract specific information using regex patterns
3. Access web content on sites that require adaptive bot-protection mechanisms

Description

The scrapling-fetch-mcp is an MCP server that enables AI assistants to access text content from websites that implement bot detection. It is optimized for low-volume retrieval of documentation and reference materials, bridging the gap between browser-visible content and AI access. The server provides fetching and pattern extraction capabilities, allowing agents to retrieve complete web pages or specific content based on natural language requests. With different protection modes, it adapts to various levels of site protection.
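The protection modes lend themselves to a simple escalation loop: try the fastest mode first and move to a heavier one only on failure. The following is a minimal sketch of that idea, not the server's own code. The `fetch` callable is a hypothetical stand-in for whatever invokes the tool in your client, and it is assumed to raise an exception when a mode fails.

```python
from typing import Callable, Optional, Tuple

# Mode names come from the listing; ordered fastest first.
MODES = ("basic", "stealth", "max-stealth")


def fetch_with_escalation(
    fetch: Callable[[str, str], str], url: str
) -> Tuple[str, str]:
    """Try each mode in order and return (mode_used, response_text).

    `fetch(url, mode)` is a hypothetical wrapper around the tool call.
    Assumption: a failed attempt surfaces as a raised exception.
    """
    last_error: Optional[Exception] = None
    for mode in MODES:
        try:
            return mode, fetch(url, mode)
        except Exception as exc:
            last_error = exc  # remember the failure and try the next mode
    raise RuntimeError("all modes failed for " + url) from last_error
```

Keeping the transport behind a callable means the same loop works with any MCP client library.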

Capabilities (2 total)

Tools (2)

s_fetch_page

Fetches a complete web page with pagination support. Retrieves content from websites with bot-detection avoidance. For best performance, start with 'basic' mode (fastest) and escalate to 'stealth' or 'max-stealth' only if basic mode fails. Content is returned as 'METADATA: {json}\n\n[content]', where the metadata includes length information and truncation status.

Args:
url: URL to fetch
mode: Fetching mode (basic, stealth, or max-stealth)
format: Output format (html or markdown)
max_length: Maximum number of characters to return
start_index: Character index at which the returned output starts; useful when a previous fetch was truncated and more content is required

url: string*, mode: string, format: string, max_length: integer, start_index: integer
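The 'METADATA: {json}\n\n[content]' response and the start_index argument together allow a truncated page to be read in several calls. The sketch below shows one way a client might split the response and loop until the page is complete. The listing does not name the metadata fields, so the `is_truncated` key is an assumption, and `fetch_page` is a hypothetical wrapper around the s_fetch_page tool call.

```python
import json
from typing import Any, Callable, Dict, Tuple

PREFIX = "METADATA: "


def split_response(raw: str) -> Tuple[Dict[str, Any], str]:
    # Split a 'METADATA: {json}' header from the content that follows it.
    if not raw.startswith(PREFIX):
        raise ValueError("response does not start with 'METADATA: '")
    # raw_decode parses one JSON value and reports where it ends.
    metadata, end = json.JSONDecoder().raw_decode(raw, len(PREFIX))
    content = raw[end:]
    if content.startswith("\n\n"):
        content = content[2:]  # drop the blank-line separator
    return metadata, content


def fetch_all(
    fetch_page: Callable[[str, int, int], str], url: str, max_length: int = 5000
) -> str:
    # `fetch_page(url, start_index, max_length)` is a hypothetical wrapper.
    # Assumption: the metadata carries a boolean under the key "is_truncated".
    parts = []
    start_index = 0
    while True:
        metadata, content = split_response(fetch_page(url, start_index, max_length))
        parts.append(content)
        if not metadata.get("is_truncated") or not content:
            break
        start_index += len(content)  # resume where this chunk ended
    return "".join(parts)
```

The empty-content check guards against looping forever if a server keeps reporting truncation without returning more text.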

s_fetch_pattern

Extracts content matching regex patterns from web pages. Retrieves specific content from websites with bot-detection avoidance. For best performance, start with 'basic' mode (fastest) and escalate to 'stealth' or 'max-stealth' only if basic mode fails. Returns matched content as 'METADATA: {json}\n\n[content]', where the metadata includes match statistics and truncation information. Each matched content chunk is delimited with '॥๛॥' and prefixed with '[Position: start-end]', indicating its byte position in the original document. This allows targeted follow-up requests with s_fetch_page using specific start_index values.

Args:
url: URL to fetch
search_pattern: Regular expression pattern to search for in the content
mode: Fetching mode (basic, stealth, or max-stealth)
format: Output format (html or markdown)
max_length: Maximum number of characters to return
context_chars: Number of characters to include before and after each match

url: string*, search_pattern: string*, mode: string, format: string, max_length: integer, context_chars: integer
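The chunk format described for s_fetch_pattern can be turned back into structured matches on the client side. The sketch below assumes the content portion has already been separated from its METADATA header, and that optional whitespace may sit between the '[Position: start-end]' prefix and the matched text; the listing does not specify that detail. The `Chunk` type and `parse_chunks` function are illustrative names, not part of the server.

```python
import re
from typing import List, NamedTuple

DELIMITER = "॥๛॥"
# Assumption: whitespace around the position prefix is optional.
POSITION = re.compile(r"\s*\[Position: (\d+)-(\d+)\]\s*")


class Chunk(NamedTuple):
    start: int
    end: int
    text: str


def parse_chunks(content: str) -> List[Chunk]:
    """Split delimited pattern output into (start, end, text) records."""
    chunks = []
    for piece in content.split(DELIMITER):
        match = POSITION.match(piece)
        if match is None:
            continue  # skip empty or unprefixed fragments
        text = piece[match.end():].strip()  # trims surrounding whitespace
        chunks.append(Chunk(int(match.group(1)), int(match.group(2)), text))
    return chunks
```

A chunk's `start` value can then be passed as start_index to s_fetch_page to read the surrounding section of the page.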

Security

Scanned 4 month(s) ago

Risk Level

MINIMAL

Read-only data retrieval, no side effects

Trust Score

C (52/100)

5/17 checks passed

Scores are informational only and provided “as is” without warranty. AgentHotspot assumes no liability for actions taken based on these ratings.

Quick Stats

Service Type: MCP

Pricing Model: Free

Capabilities: 2 Tools / 0 Prompts / 0 Resources

Owner: cyberchitta

Category: Search & Retrieval

Dependencies: Standalone
