by longevity-genie • Healthcare & Bioinformatics
An MCP server that exposes the gget bioinformatics toolkit to AI assistants and agents for standardized programmatic access to genomic tools and databases.
Search for genes and fetch detailed gene/protein sequences and metadata from Ensembl and related databases.
Run sequence analyses like BLAST/BLAT, multiple sequence alignment, or predict protein structures (AlphaFold) programmatically.
Perform functional analyses and cancer mutation lookups (enrichment analysis, COSMIC queries) and save results in standardized formats via STDIO or filesystem integrations.
This server implements the Model Context Protocol (MCP) to wrap the gget toolkit, providing structured, type-safe access to gene search, sequence retrieval, BLAST, alignment, protein structure prediction (AlphaFold), enrichment analysis, COSMIC mutation queries, and more. It supports multiple runtime modes (stdio, HTTP, SSE) and integrates with AI clients (Claude Desktop, Cursor, Windsurf) via uv/uvx tooling. The server adds input validation, error handling, and access to external services that respects their rate limits, which makes it suitable for automated agents that need repeatable, validated bioinformatics workflows.
General search for any biological terms using gene symbols, names, or synonyms. This search looks broadly across gene names and descriptions. For specific gene symbol searches, use search_genes_simple instead.

Args:
  search_terms: Search terms, names, or synonyms (e.g., 'cancer' or ['apoptosis', 'death'])
  species: Target species (e.g., 'homo_sapiens', 'mus_musculus')
Returns:
  SearchResult: DataFrame with search results containing Ensembl IDs and descriptions
Example:
  Input: search_terms='apoptosis', species='homo_sapiens'
  Output: DataFrame with genes related to apoptosis
Note: Searches broadly in the "gene name" and "description" sections of the Ensembl database. Results are limited to avoid overwhelming the LLM context.
Search for specific genes using gene symbols with enhanced search strategy.

🚀 **BATCH PROCESSING SUPPORTED**: This function can process multiple genes in a single call!

Use this tool FIRST when you have gene names/symbols and need to find their Ensembl IDs. Returns Ensembl IDs which are required for get_gene_info and get_sequences tools.

IMPORTANT: Due to limitations in Ensembl search, short gene names often fail to find results. For best results, provide descriptive terms along with gene symbols.

RECOMMENDED FORMAT: "GENE_SYMBOL descriptive_terms"
Examples:
  - Instead of: "APP"
  - Use: "APP amyloid precursor" or "APP amyloid beta precursor protein"
  - Instead of: ["BACE1", "MAPT"]
  - Use: ["BACE1 beta secretase", "MAPT microtubule tau"]

This function uses AND search for multi-word terms and OR search for single words.

Args:
  search_terms: SINGLE gene symbol OR LIST of gene symbols with descriptive terms
    Single: 'APP amyloid precursor'
    Batch: ['BACE1 beta secretase', 'MAPT tau', 'APOE apolipoprotein']
  species: Target species (e.g., 'homo_sapiens', 'mus_musculus')
  id_type: "gene" (default) or "transcript" - whether to return genes or transcripts
Returns:
  SearchResult: DataFrame with gene search results containing Ensembl IDs and descriptions
  Results from ALL search terms are combined in a single response
Example (SINGLE GENE):
  Input: search_terms='APP amyloid precursor', species='homo_sapiens'
  Output: DataFrame with APP gene and related genes
Example (BATCH PROCESSING, limit number of queries to 3-5 to avoid timeouts):
  Input: search_terms=['APOE apolipoprotein', 'APP amyloid', 'PSEN1 presenilin'], species='homo_sapiens'
  Output: DataFrame with ALL three genes and their Ensembl IDs in one response
Downstream tools that need the Ensembl IDs from this search:
  - get_gene_info: Get detailed gene information
  - get_sequences: Get DNA/protein sequences
Note: For general biological term searches without gene focus, use search_simple.
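The recommended "GENE_SYMBOL descriptive_terms" format and the 3-5 query batch limit above can be prepared on the client side before calling the tool. The following is a minimal sketch in plain Python; the helper names `build_search_terms` and `batch` are hypothetical and are not part of the server.

```python
from typing import Dict, Iterator, List


def build_search_terms(genes: Dict[str, str]) -> List[str]:
    """Join each gene symbol with its descriptive words, e.g. 'APP amyloid'."""
    return [f"{symbol} {description}".strip() for symbol, description in genes.items()]


def batch(terms: List[str], size: int = 3) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` terms (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(terms), size):
        yield terms[start:start + size]


# Symbols and descriptive terms taken from the examples in the tool description.
genes = {
    "APOE": "apolipoprotein",
    "APP": "amyloid",
    "PSEN1": "presenilin",
    "BACE1": "beta secretase",
    "MAPT": "microtubule tau",
}
terms = build_search_terms(genes)
batches = list(batch(terms, size=3))
```

Each element of `batches` could then be passed as `search_terms` in one call, keeping every request within the suggested limit.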
Get detailed gene and transcript metadata using Ensembl IDs.

PREREQUISITE: Use search_genes first to get Ensembl IDs from gene names/symbols.

Args:
  ensembl_ids: One or more Ensembl gene IDs (e.g., 'ENSG00000141510' or ['ENSG00000141510']). Also supports WormBase and FlyBase IDs.
Returns:
  Dict[str, Any]: DataFrame with gene information containing metadata from multiple databases
Example workflow:
  1. search_genes('TP53', 'homo_sapiens') → get Ensembl ID 'ENSG00000141510'
  2. get_gene_info('ENSG00000141510')
Example output:
  DataFrame with columns like 'ensembl_id', 'symbol', 'biotype', 'chromosome', 'start', 'end', plus NCBI, UniProt, and optionally PDB information
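At the protocol level, the two-step workflow above corresponds to two MCP `tools/call` requests sent as JSON-RPC 2.0 messages. The sketch below only builds the request payloads. It assumes the server registers the tools under the names used in these descriptions (`search_genes`, `get_gene_info`), which may differ in practice, and it hardcodes the Ensembl ID from the example instead of parsing a real response.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Step 1: look up the Ensembl ID (tool name assumed from the description).
search_request = tool_call_request(
    1, "search_genes", {"search_terms": "TP53", "species": "homo_sapiens"}
)

# Step 2: fetch metadata for the ID returned by step 1.
# The ID is copied from the example workflow, not read from a live response.
info_request = tool_call_request(
    2, "get_gene_info", {"ensembl_ids": ["ENSG00000141510"]}
)
```

In practice an MCP client library would normally build and send these messages; the raw payloads are shown only to make the order of calls and the argument names explicit.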
Fetch sequences and save to local file in stdio mode.

PREREQUISITE: Use search_genes first to get Ensembl IDs from gene names/symbols.

Args:
  ensembl_ids: One or more Ensembl gene IDs (e.g., 'ENSG00000141510' or ['ENSG00000141510'])
  translate: If True, returns amino acid sequences; if False, returns nucleotide sequences
  output_path: ABSOLUTE path to output file (e.g., '/home/user/sequences.fasta'). AVOID relative paths as they cause file location issues. Auto-generated if not provided.
  format: Output format (currently supports 'fasta')
Returns:
  LocalFileResult: Contains ABSOLUTE path, format, and success information instead of sequence data
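Because relative paths are discouraged for `output_path`, a caller may want to reject them before the request is sent. This is a minimal client-side sketch using `pathlib`; the function name `require_absolute_output_path` is hypothetical, and the server's own validation may behave differently.

```python
from pathlib import Path


def require_absolute_output_path(output_path: str) -> Path:
    """Return the path unchanged if it is absolute, otherwise raise ValueError."""
    path = Path(output_path)
    if not path.is_absolute():
        raise ValueError(f"output_path must be absolute, got: {output_path!r}")
    return path


# Path taken from the example in the tool description (POSIX-style).
checked = require_absolute_output_path("/home/user/sequences.fasta")
```

The check is purely syntactic: it does not touch the filesystem or confirm that the parent directory exists.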
Fetch FTP links for reference genomes and annotations by species from Ensembl.

Args:
  species: Species in format "genus_species" (e.g., "homo_sapiens"). Shortcuts supported: "human", "mouse", "human_grch37"
  which: Which results to return. Options: 'gtf', 'cdna', 'dna', 'cds', 'cdrna', 'pep', 'all'
Returns:
  Union[Dict[str, Any], List[str]]: Dictionary with URLs, versions, and metadata
Example:
  Input: species="homo_sapiens", which="gtf"
  Output: Dictionary containing GTF URLs with Ensembl version and release info
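The accepted `species` and `which` values listed above can be checked before a call is made. The sketch below is a hypothetical client-side helper: the option set and shortcuts are copied from the description, while the pattern used for "genus_species" names is an assumption about what the server accepts.

```python
import re
from typing import Dict

# Values copied from the tool description.
VALID_WHICH = {"gtf", "cdna", "dna", "cds", "cdrna", "pep", "all"}
SPECIES_SHORTCUTS = {"human", "mouse", "human_grch37"}

# Assumed shape of a "genus_species" name: lowercase words joined by underscores.
SPECIES_PATTERN = re.compile(r"[a-z]+(_[a-z0-9]+)+")


def check_ref_arguments(species: str, which: str = "all") -> Dict[str, str]:
    """Validate the arguments and return them as a dict ready to send."""
    if species not in SPECIES_SHORTCUTS and not SPECIES_PATTERN.fullmatch(species):
        raise ValueError(f"unexpected species format: {species!r}")
    if which not in VALID_WHICH:
        raise ValueError(f"which must be one of {sorted(VALID_WHICH)}, got {which!r}")
    return {"species": species, "which": which}


arguments = check_ref_arguments("homo_sapiens", "gtf")
```

Validating locally gives an agent a clear error message without spending a round trip to Ensembl on a malformed request.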