by Pickle-Pixel • Uncategorized
An MCP server that connects multiple LLM agents to query, compare, vote, and synthesize across various models from one terminal.
Compare outputs from multiple LLM providers simultaneously.
Synthesize and vote on responses from diverse language models.
Leverage local and cloud-based LLMs in a unified interface.
HydraMCP enables seamless interaction between different large language models, including GPT, Gemini, Claude, and local models, by aggregating their responses. It works with API-key providers, subscription-based CLI tools, and local models, so backends can be mixed freely. Features include model comparison, consensus polling with local judging, response distillation, caching, and failure handling, enhancing multi-model collaboration and efficiency.
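The failure handling mentioned above can be sketched as a simple per-model circuit breaker: after repeated failures a model is taken out of rotation, then retried after a cooldown. This is an illustrative sketch only; the threshold and timeout values are assumptions, not HydraMCP's actual settings.

```python
import time

class CircuitBreaker:
    """Illustrative per-model circuit breaker (values are assumptions)."""

    def __init__(self, failure_threshold=3, recovery_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.opened_at = None  # timestamp when the breaker tripped

    def allow_request(self):
        if self.opened_at is None:
            return True
        # After the cooldown, permit one trial request (half-open state).
        return time.monotonic() - self.opened_at >= self.recovery_timeout

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()

    def record_success(self):
        # A success closes the breaker and resets the failure count.
        self.failures = 0
        self.opened_at = None
```

A client seeing "circuit breaker open" would either pick a different model or wait out the recovery window, matching the automatic-recovery behavior described for the query tool below.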
List all available models across all providers. Run this first to see what you can query.
Query any AI model with a prompt. Returns the model's response with metadata.

OUTPUT: Markdown with the model's response, latency, and token usage. If max_response_tokens is set and compression occurred, includes distillation metadata (original tokens, compressed tokens, compressor model, compressor latency). Shows "Saved: X tokens (Y% smaller)" when compression is active. Shows "(cached)" when the response is served from cache.

WHEN TO USE: When you need another model's perspective, analysis, or capabilities. Set max_response_tokens to control how much of your context window this response consumes; the response will be distilled by a fast model to fit the budget while preserving code, file paths, errors, and actionable details. Set include_raw=true to see both the compressed and original responses for quality verification.

FAILURE MODES:
- "Model query failed (4xx/5xx)" → The model or provider is unavailable. Try a different model or check that CLIProxyAPI/Ollama is running.
- "circuit breaker open" → The model failed too many times recently. Try a different model or wait for automatic recovery.
- Compression silently skipped → If the compressor model is unavailable or the response already fits the budget, the raw response is returned unchanged. This is not an error.
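The distillation behavior described for max_response_tokens can be sketched as a budget check: compress only when the response exceeds the budget, otherwise return it unchanged. The helper names (compress_fn, count_tokens) are assumptions for illustration, not HydraMCP's API.

```python
def maybe_distill(response, max_response_tokens, compress_fn, count_tokens):
    """Sketch of the distillation decision. compress_fn and count_tokens
    are hypothetical stand-ins for the compressor model and tokenizer.
    Returns (text, metadata); metadata is None when no compression ran."""
    if max_response_tokens is None:
        return response, None
    original = count_tokens(response)
    if original <= max_response_tokens:
        # Already fits the budget: skip compression silently (not an error).
        return response, None
    compressed = compress_fn(response, max_response_tokens)
    compressed_tokens = count_tokens(compressed)
    saved = original - compressed_tokens
    pct = round(100 * saved / original)
    meta = {
        "original_tokens": original,
        "compressed_tokens": compressed_tokens,
        "summary": f"Saved: {saved} tokens ({pct}% smaller)",
    }
    return compressed, meta
```

With a whitespace tokenizer and a truncating compressor as stubs, a 6-token response against a 3-token budget yields "Saved: 3 tokens (50% smaller)", while a response under budget passes through untouched.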
Query 2-5 models in parallel with the same prompt. Returns side-by-side comparison with latency and token metrics.
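A side-by-side comparison like this amounts to fanning the same prompt out to several models concurrently and recording per-model latency. The sketch below uses a thread pool with a caller-supplied query_fn standing in for the real provider call (an assumption for illustration).

```python
import time
from concurrent.futures import ThreadPoolExecutor

def compare(models, prompt, query_fn):
    """Query several models in parallel with the same prompt.
    query_fn(model, prompt) is a hypothetical stand-in for the provider call.
    Returns a list of (model, response_text, latency_seconds) in input order."""
    def run(model):
        start = time.perf_counter()
        text = query_fn(model, prompt)
        return model, text, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        return list(pool.map(run, models))
```

Running all queries in parallel means the wall-clock cost is roughly that of the slowest model rather than the sum of all of them, which is what makes a 2-5 model comparison practical from one terminal.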
Query 3-7 models and aggregate responses using voting strategy (majority/supermajority/unanimous). Returns consensus answer with confidence score.
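The three voting strategies named above can be sketched as thresholds on the winning answer's share of votes. The exact thresholds and the answer-canonicalization step here are assumptions for illustration; HydraMCP's judging logic may differ.

```python
from collections import Counter

# Assumed cutoffs: majority = strictly more than half,
# supermajority = at least two thirds, unanimous = all votes.
THRESHOLDS = {"majority": 0.5, "supermajority": 2 / 3, "unanimous": 1.0}

def vote(answers, strategy="majority"):
    """Aggregate model answers under a voting strategy.
    Returns (winner, confidence); winner is None when the top answer's
    share of votes does not meet the strategy's threshold."""
    # Naive canonicalization so trivially different answers match.
    counts = Counter(a.strip().lower() for a in answers)
    winner, n = counts.most_common(1)[0]
    confidence = n / len(answers)
    threshold = THRESHOLDS[strategy]
    met = confidence > threshold if strategy == "majority" else confidence >= threshold
    return (winner if met else None, confidence)
```

With answers ["Paris", "paris", "London"], the winner "paris" holds a 2/3 share: it passes majority and supermajority but fails unanimous, so the confidence score doubles as the consensus strength reported to the caller.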
Query 2-5 models in parallel, then combine their best ideas into one answer. Returns a single synthesized response that draws on the strongest elements of each model's output.
Scores are informational only and provided “as is” without warranty. AgentHotspot assumes no liability for actions taken based on these ratings.