gget MCP Server

MCP • Open Source • MIT • 39.1

by longevity-genie • Healthcare & Bioinformatics

An MCP server that exposes the gget bioinformatics toolkit to AI assistants and agents for standardized programmatic access to genomic tools and databases.

Example Use Cases

1. Search for genes and fetch detailed gene/protein sequences and metadata from Ensembl and related databases.
2. Run sequence analyses like BLAST/BLAT, multiple sequence alignment, or predict protein structures (AlphaFold) programmatically.
3. Perform functional analyses and cancer mutation lookups (enrichment analysis, COSMIC queries) and save results in standardized formats via STDIO or filesystem integrations.

Description

This server implements the Model Context Protocol (MCP) to wrap the gget toolkit, providing structured, type-safe access to gene search, sequence retrieval, BLAST, alignment, protein structure prediction (AlphaFold), enrichment analysis, COSMIC mutation queries, and more. It supports multiple runtime modes (stdio, HTTP, SSE) and integrates with AI clients (Claude Desktop, Cursor, Windsurf) via uv/uvx tooling. The server adds input validation, error handling, and rate-respectful access to external services, making it suitable for automated agents that need repeatable, validated bioinformatics workflows.

Capabilities (20 total)

Tools (20)

gget_search

General search for any biological terms using gene symbols, names, or synonyms. This is a general search that looks broadly across gene names and descriptions. For specific gene symbol searches, use search_genes_simple instead.

Args:
  search_terms: Search terms, names, or synonyms (e.g., 'cancer' or ['apoptosis', 'death'])
  species: Target species (e.g., 'homo_sapiens', 'mus_musculus')

Returns:
  SearchResult: DataFrame with search results containing Ensembl IDs and descriptions

Example:
  Input: search_terms='apoptosis', species='homo_sapiens'
  Output: DataFrame with genes related to apoptosis

Note: Searches broadly in the "gene name" and "description" sections of the Ensembl database. Results are limited to prevent overwhelming LLM context.

search_terms: any*, species: string
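As a rough illustration of how a client might invoke this tool, the sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP uses for tool invocation. The helper name `build_tool_call` is hypothetical and not part of the server; the argument names come from the signature above, and how the request is transported (stdio, HTTP, SSE) is left to the client.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP `tools/call` request (JSON-RPC 2.0 envelope).

    `build_tool_call` is a hypothetical helper used only for illustration.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Arguments mirror the example in the tool description above.
request = build_tool_call(
    "gget_search",
    {"search_terms": "apoptosis", "species": "homo_sapiens"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally construct this envelope for you; the sketch only shows what travels over the wire.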

gget_search_genes

Search for specific genes using gene symbols with enhanced search strategy.

BATCH PROCESSING SUPPORTED: This function can process multiple genes in a single call.

Use this tool FIRST when you have gene names/symbols and need to find their Ensembl IDs. Returns Ensembl IDs, which are required for the get_gene_info and get_sequences tools.

IMPORTANT: Due to limitations in Ensembl search, short gene names often fail to find results. For best results, provide descriptive terms along with gene symbols.

RECOMMENDED FORMAT: "GENE_SYMBOL descriptive_terms"

Examples:
  - Instead of: "APP"
    Use: "APP amyloid precursor" or "APP amyloid beta precursor protein"
  - Instead of: ["BACE1", "MAPT"]
    Use: ["BACE1 beta secretase", "MAPT microtubule tau"]

This function uses AND search for multi-word terms and OR search for single words.

Args:
  search_terms: SINGLE gene symbol OR LIST of gene symbols with descriptive terms
    Single: 'APP amyloid precursor'
    Batch: ['BACE1 beta secretase', 'MAPT tau', 'APOE apolipoprotein']
  species: Target species (e.g., 'homo_sapiens', 'mus_musculus')
  id_type: "gene" (default) or "transcript"; whether to return genes or transcripts

Returns:
  SearchResult: DataFrame with gene search results containing Ensembl IDs and descriptions. Results from ALL search terms are combined in a single response.

Example (SINGLE GENE):
  Input: search_terms='APP amyloid precursor', species='homo_sapiens'
  Output: DataFrame with APP gene and related genes

Example (BATCH PROCESSING, limit number of queries to 3-5 to avoid timeouts):
  Input: search_terms=['APOE apolipoprotein', 'APP amyloid', 'PSEN1 presenilin'], species='homo_sapiens'
  Output: DataFrame with ALL three genes and their Ensembl IDs in one response

Downstream tools that need the Ensembl IDs from this search:
  - get_gene_info: Get detailed gene information
  - get_sequences: Get DNA/protein sequences

Note: For general biological term searches without gene focus, use search_simple.

search_terms: any*, species: string, id_type: string
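The recommended "GENE_SYMBOL descriptive_terms" format and the 3-5 query batch limit can be prepared on the client side before calling the tool. The following is a minimal sketch under that assumption; `describe_terms` and `in_batches` are hypothetical helpers, and the batch size of 3 is simply one value within the range the description suggests.

```python
def describe_terms(hints_by_symbol):
    """Join each gene symbol with its descriptive hint, e.g. 'APP amyloid precursor'."""
    return [f"{symbol} {hint}".strip() for symbol, hint in hints_by_symbol.items()]


def in_batches(terms, size=3):
    """Split the term list into chunks of at most `size` items."""
    return [terms[i:i + size] for i in range(0, len(terms), size)]


# Symbols and hints taken from the examples in the tool description.
hints = {
    "APP": "amyloid precursor",
    "BACE1": "beta secretase",
    "MAPT": "microtubule tau",
    "APOE": "apolipoprotein",
    "PSEN1": "presenilin",
}

# One argument dict per batch, ready to pass to gget_search_genes.
calls = [
    {"search_terms": batch, "species": "homo_sapiens", "id_type": "gene"}
    for batch in in_batches(describe_terms(hints))
]

for call in calls:
    print(call["search_terms"])
```

Keeping batches small trades a few extra round trips for a lower chance of hitting the timeouts the description warns about.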

gget_info

Get detailed gene and transcript metadata using Ensembl IDs.

PREREQUISITE: Use search_genes first to get Ensembl IDs from gene names/symbols.

Args:
  ensembl_ids: One or more Ensembl gene IDs (e.g., 'ENSG00000141510' or ['ENSG00000141510']). Also supports WormBase and FlyBase IDs.

Returns:
  Dict[str, Any]: DataFrame with gene information containing metadata from multiple databases

Example workflow:
  1. search_genes('TP53', 'homo_sapiens') → get Ensembl ID 'ENSG00000141510'
  2. get_gene_info('ENSG00000141510')

Example output: DataFrame with columns like 'ensembl_id', 'symbol', 'biotype', 'chromosome', 'start', 'end', plus NCBI, UniProt, and optionally PDB information

ensembl_ids: any*
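The search-then-info workflow described above hinges on passing clean Ensembl IDs from one tool to the next. The sketch below shows one way a client might collect them. It assumes the search result has already been converted to a list of row dictionaries with an 'ensembl_id' key, which is an assumption about the response shape rather than something the listing confirms. The regular expression only recognizes Ensembl gene IDs (such as 'ENSG00000141510'), so WormBase and FlyBase IDs would need their own handling.

```python
import re

# Ensembl gene stable IDs: 'ENS', optional species letters, 'G', 11 digits.
ENSEMBL_GENE_ID = re.compile(r"^ENS[A-Z]*G\d{11}$")


def gene_ids_from_rows(rows):
    """Collect unique, well-formed Ensembl gene IDs, preserving order.

    `rows` is assumed to be a list of dicts with an 'ensembl_id' key.
    """
    ids = []
    for row in rows:
        gene_id = row.get("ensembl_id", "")
        if ENSEMBL_GENE_ID.match(gene_id) and gene_id not in ids:
            ids.append(gene_id)
    return ids


# Illustrative rows only; not real server output.
rows = [
    {"ensembl_id": "ENSG00000141510", "gene_name": "TP53"},
    {"ensembl_id": "ENSG00000141510", "gene_name": "TP53"},
    {"ensembl_id": "not-an-id"},
]

info_arguments = {"ensembl_ids": gene_ids_from_rows(rows)}
print(info_arguments)
```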

gget_seq

Fetch sequences and save to local file in stdio mode.

PREREQUISITE: Use search_genes first to get Ensembl IDs from gene names/symbols.

Args:
  ensembl_ids: One or more Ensembl gene IDs (e.g., 'ENSG00000141510' or ['ENSG00000141510'])
  translate: If True, returns amino acid sequences; if False, returns nucleotide sequences
  output_path: ABSOLUTE path to output file (e.g., '/home/user/sequences.fasta'). AVOID relative paths as they cause file location issues. Auto-generated if not provided.
  format: Output format (currently supports 'fasta')

Returns:
  LocalFileResult: Contains ABSOLUTE path, format, and success information instead of sequence data

ensembl_ids: any*, translate: boolean, output_path: any, format: string
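Because the tool asks for an absolute `output_path`, a client can normalize the path before sending the call. The following is a small sketch of that idea using only the standard library; `absolute_output_path` is a hypothetical helper, and resolving relative paths against the current working directory is a design choice of this example, not documented server behavior.

```python
from pathlib import Path


def absolute_output_path(path_like):
    """Return an absolute path, anchoring relative input at the current directory."""
    path = Path(path_like).expanduser()
    if not path.is_absolute():
        path = Path.cwd() / path
    return path


output_path = absolute_output_path("sequences.fasta")

# Argument names follow the tool signature above.
seq_arguments = {
    "ensembl_ids": ["ENSG00000141510"],
    "translate": True,  # request amino acid sequences
    "output_path": str(output_path),
    "format": "fasta",
}
print(seq_arguments["output_path"])
```

Note that the path is interpreted on the machine where the server runs, so this normalization is only meaningful when client and server share a filesystem, as in stdio mode.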

gget_ref

Fetch FTP download links for reference genomes and annotations by species from Ensembl.

Args:
  species: Species in format "genus_species" (e.g., "homo_sapiens"). Shortcuts supported: "human", "mouse", "human_grch37"
  which: Which results to return. Options: 'gtf', 'cdna', 'dna', 'cds', 'cdrna', 'pep', 'all'

Returns:
  Union[Dict[str, Any], List[str]]: Dictionary with URLs, versions, and metadata

Example:
  Input: species="homo_sapiens", which="gtf"
  Output: Dictionary containing GTF URLs with Ensembl version and release info

species: string, which: any
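Since `which` accepts a fixed set of options, a client can check the value before calling the tool. The sketch below illustrates this; `normalize_which` is a hypothetical helper, the option set is copied from the description above, and accepting either a single string or a list is an assumption based on the `any` type in the signature.

```python
WHICH_OPTIONS = {"gtf", "cdna", "dna", "cds", "cdrna", "pep", "all"}


def normalize_which(which="all"):
    """Return `which` as a list of options, rejecting unknown values."""
    values = [which] if isinstance(which, str) else list(which)
    unknown = sorted(set(values) - WHICH_OPTIONS)
    if unknown:
        raise ValueError(f"Unsupported 'which' value(s): {unknown}")
    return values


# Arguments mirror the example in the tool description.
ref_arguments = {"species": "homo_sapiens", "which": normalize_which("gtf")}
print(ref_arguments)
```

Failing early on the client avoids a round trip to the server for a request that would be rejected by its input validation anyway.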


Security

Scanned 4 month(s) ago

Risk Level

MINIMAL

Read-only data retrieval, no side effects

Trust Score

C (56/100)

5/17 checks passed

Scores are informational only and provided “as is” without warranty. AgentHotspot assumes no liability for actions taken based on these ratings.

Quick Stats

Service Type: MCP
Pricing Model: Free
Capabilities: 20 Tools / 0 Prompts / 0 Resources
Owner: longevity-genie
Category: Healthcare & Bioinformatics
Dependencies: Standalone
