ensembl-mcp-server

v0.4.2 pre-1.0

Look up genes, fetch sequences, predict variant consequences, and find orthologs and cross-database xrefs through the Ensembl REST API over MCP. Transports: STDIO or Streamable HTTP.

public 7 tools 4 resources 1 prompt protocol 2025-11-25 github cyanheads/ensembl-mcp-server

ensembl.caseyjhand.com/mcp

claude mcp add --transport http ensembl-mcp-server https://ensembl.caseyjhand.com/mcp

codex mcp add ensembl-mcp-server --url https://ensembl.caseyjhand.com/mcp

{
  "mcpServers": {
    "ensembl-mcp-server": {
      "url": "https://ensembl.caseyjhand.com/mcp"
    }
  }
}

gemini mcp add --transport http ensembl-mcp-server https://ensembl.caseyjhand.com/mcp

{
  "mcpServers": {
    "ensembl-mcp-server": {
      "command": "bunx",
      "args": [
        "mcp-remote",
        "https://ensembl.caseyjhand.com/mcp"
      ]
    }
  }
}

{
  "mcpServers": {
    "ensembl-mcp-server": {
      "type": "http",
      "url": "https://ensembl.caseyjhand.com/mcp"
    }
  }
}

curl -X POST https://ensembl.caseyjhand.com/mcp \
  -H "Content-Type: application/json" \
  -H "MCP-Protocol-Version: 2025-11-25" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-11-25","capabilities":{},"clientInfo":{"name":"curl","version":"1.0.0"}}}'
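
The same initialize handshake can be sketched in Python with only the standard library. This is a minimal sketch, not the project's client: the function names and the "python-sketch" client name are assumptions made for illustration, and the headers mirror the curl command above exactly. Some Streamable HTTP servers may also expect an Accept header listing application/json and text/event-stream; the curl command omits it, so this sketch does too.

```python
import json
import urllib.request

MCP_URL = "https://ensembl.caseyjhand.com/mcp"
PROTOCOL_VERSION = "2025-11-25"


def build_initialize_request(client_name="python-sketch"):
    """Build (but do not send) the initialize POST shown in the curl example."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            # clientInfo values are illustrative, not required names.
            "clientInfo": {"name": client_name, "version": "1.0.0"},
        },
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "MCP-Protocol-Version": PROTOCOL_VERSION,
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the raw response body as text.

    The body may be plain JSON or an event stream depending on the server,
    so it is returned undecoded beyond UTF-8.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling send(build_initialize_request()) performs the network round trip; building the request separately keeps the payload easy to inspect first.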

Built on @cyanheads/mcp-ts-core v0.10.14

Tools

ensembl_list_species

List species supported by Ensembl with display name, common name, assembly, taxon ID, and division. This is a required discovery step: species names like homo_sapiens are opaque to non-biologists, and they are the input format every other Ensembl tool expects. Use division to restrict the list to one Ensembl division, and nameContains to find a species by partial name match. When division is omitted, the tool returns the endpoint's default division, the vertebrates (~356 species on the default GRCh38 endpoint).

read

invocation

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_list_species",
    "arguments": {}
  }
}

schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "division": {
      "description": "Filter to a specific Ensembl division. EnsemblVertebrates includes human, mouse, zebrafish, and other vertebrates. EnsemblPlants covers crop and model plant genomes. EnsemblFungi, EnsemblMetazoa, EnsemblProtists cover non-vertebrate model organisms. Omit to return the endpoint default division (vertebrates).",
      "type": "string",
      "enum": [
        "EnsemblVertebrates",
        "EnsemblPlants",
        "EnsemblFungi",
        "EnsemblMetazoa",
        "EnsemblProtists"
      ]
    },
    "nameContains": {
      "description": "Case-insensitive substring filter applied locally after fetching. Matches against species name, display name, and common name. Example: \"sapiens\" matches homo_sapiens; \"mouse\" matches mus_musculus.",
      "type": "string"
    }
  },
  "additionalProperties": false
}
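
The nameContains behavior described in the schema (a case-insensitive substring match over species name, display name, and common name, applied after fetching) can be sketched as below. The record keys name, display_name, and common_name are assumptions for illustration, and the sample rows are hand-written, not captured server output.

```python
def filter_species(records, name_contains=None):
    """Keep records whose name, display name, or common name contains the needle.

    Matching is case-insensitive. The field names are hypothetical; adjust
    them to whatever keys the real response uses.
    """
    if not name_contains:
        return list(records)
    needle = name_contains.lower()
    fields = ("name", "display_name", "common_name")
    return [
        record
        for record in records
        if any(needle in str(record.get(field) or "").lower() for field in fields)
    ]


# Hand-written illustrative rows.
SAMPLE_SPECIES = [
    {"name": "homo_sapiens", "display_name": "Human", "common_name": "human"},
    {"name": "mus_musculus", "display_name": "Mouse", "common_name": "house mouse"},
    {"name": "danio_rerio", "display_name": "Zebrafish", "common_name": "zebrafish"},
]
```

With these rows, filter_species(SAMPLE_SPECIES, "sapiens") keeps only homo_sapiens, and "mouse" keeps only mus_musculus, matching the examples in the schema description.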

ensembl_lookup_gene

open-world

Resolve a gene by symbol + species (or by stable ID) to its Ensembl ID, genomic location (chr:start-end:strand), biotype, description, and transcript list. Entry point for most workflows — the stable ID and coordinates returned here are inputs to other tools. Accepts both symbol lookup (BRCA2 + homo_sapiens) and direct ID lookup (ENSG00000139618). Supports batch lookup of up to 20 IDs or symbols in one call via the ids or symbols field. Provide exactly one of symbol, id, ids, or symbols. For symbol lookups species defaults to homo_sapiens (override for other organisms); for ID lookups species is not needed. Use ensembl_list_species to discover valid species names.

read

invocation

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_lookup_gene",
    "arguments": {
      "symbol": "<symbol>"
    }
  }
}

schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "symbol": {
      "description": "Gene symbol to look up (e.g. BRCA2, TP53, EGFR). Species defaults to homo_sapiens; set species for other organisms. Case-insensitive in most species.",
      "type": "string"
    },
    "id": {
      "description": "Ensembl stable gene ID (e.g. ENSG00000139618 or ENSG00000139618.7 with version). Species is not required for ID lookup.",
      "type": "string"
    },
    "species": {
      "description": "Species in Ensembl internal format: lowercase scientific name with underscores (e.g. homo_sapiens, mus_musculus, danio_rerio). Optional for symbol lookups — defaults to homo_sapiens; set it for other organisms. Use ensembl_list_species to discover valid values.",
      "type": "string"
    },
    "ids": {
      "description": "Batch lookup: up to 20 Ensembl stable IDs (ENSG…, ENST…). Returns a succeeded/failed split. Provide exactly one of symbol, id, ids, or symbols.",
      "maxItems": 20,
      "type": "array",
      "items": {
        "type": "string",
        "description": "An Ensembl stable gene or transcript ID to resolve in this batch."
      }
    },
    "symbols": {
      "description": "Batch lookup: up to 20 gene symbols. Species defaults to homo_sapiens; set species for other organisms. Returns a succeeded/failed split. Provide exactly one of symbol, id, ids, or symbols.",
      "maxItems": 20,
      "type": "array",
      "items": {
        "type": "string",
        "description": "A gene symbol to resolve in this batch (e.g. BRCA2, TP53)."
      }
    },
    "expand_transcripts": {
      "default": false,
      "description": "When true, include the full transcript list in the response. Each transcript has its ID, biotype, canonical flag, and coordinates. Default is false to keep responses compact.",
      "type": "boolean"
    }
  },
  "required": [
    "expand_transcripts"
  ],
  "additionalProperties": false
}
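
The "exactly one of symbol, id, ids, or symbols" rule is not expressed in the JSON Schema itself, so a client may want to check it before calling the tool. The following is a minimal sketch under that assumption; build_lookup_arguments and its gene_id parameter are hypothetical names, and the server's own validation may differ.

```python
MAX_BATCH = 20  # maxItems for ids and symbols in the schema


def build_lookup_arguments(symbol=None, gene_id=None, ids=None, symbols=None,
                           species=None, expand_transcripts=False):
    """Return an arguments dict for ensembl_lookup_gene, or raise ValueError."""
    selectors = {"symbol": symbol, "id": gene_id, "ids": ids, "symbols": symbols}
    given = {key: value for key, value in selectors.items() if value is not None}
    if len(given) != 1:
        raise ValueError("provide exactly one of symbol, id, ids, or symbols")
    key, value = next(iter(given.items()))
    if key in ("ids", "symbols"):
        value = list(value)
        if len(value) > MAX_BATCH:
            raise ValueError(f"{key} accepts at most {MAX_BATCH} entries")
    arguments = {key: value, "expand_transcripts": expand_transcripts}
    # species is only meaningful for symbol lookups; omit it to use the default.
    if species is not None:
        arguments["species"] = species
    return arguments
```

For example, build_lookup_arguments(symbols=["BRCA2", "TP53"], species="homo_sapiens") yields a batch request body, while passing both symbol and gene_id raises before any network call is made.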

ensembl_get_sequence

open-world

Fetch the DNA, cDNA, CDS, or protein sequence for a gene, transcript, protein, or genomic region. Returns the sequence with its stable ID, molecule type, and character count — large sequences are returned in full but the length is stated so callers can budget context. The type parameter selects which sequence is fetched: genomic (default, includes introns), cdna (spliced transcript), cds (coding sequence only), protein. For region mode, set id to a region — either species:chr:start-end (e.g. homo_sapiens:13:32315086-32400268) or a bare chr:start-end with species set (e.g. id 13:32315086-32400268, species homo_sapiens). Protein sequences require a transcript or protein stable ID (ENST…/ENSP…), not a gene ID — use ensembl_lookup_gene with expand_transcripts=true to get the canonical transcript ID first.

read

invocation

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_get_sequence",
    "arguments": {
      "id": "<id>"
    }
  }
}

schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "description": "Ensembl stable ID (ENSG…, ENST…, ENSP…) or a genomic region for region mode. Region accepts species:chr:start-end (e.g. homo_sapiens:13:32315086-32400268) or a bare chr:start-end (e.g. 13:32315086-32400268) when the species field is set."
    },
    "type": {
      "default": "genomic",
      "description": "Sequence type to retrieve. genomic: full genomic DNA including introns (default). cdna: spliced transcript sequence (requires ENST… ID). cds: coding sequence only, no UTRs (requires ENST… ID with coding transcript). protein: amino acid sequence (requires ENST… or ENSP… ID).",
      "type": "string",
      "enum": [
        "genomic",
        "cdna",
        "cds",
        "protein"
      ]
    },
    "species": {
      "description": "Species in Ensembl internal format (e.g. homo_sapiens). Required for a bare chr:start-end region; optional for the species:chr:start-end form (the embedded species is used when the field is omitted). Optional for stable ID lookups — Ensembl infers species from the ID prefix.",
      "type": "string"
    },
    "expand_5prime": {
      "default": 0,
      "description": "Number of base pairs to extend upstream (5' direction) of the requested feature. Default 0. Only applies to genomic sequences and region queries.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    },
    "expand_3prime": {
      "default": 0,
      "description": "Number of base pairs to extend downstream (3' direction) of the requested feature. Default 0. Only applies to genomic sequences and region queries.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    }
  },
  "required": [
    "id",
    "type",
    "expand_5prime",
    "expand_3prime"
  ],
  "additionalProperties": false
}
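
The two region forms accepted by id (species:chr:start-end, or a bare chr:start-end with species set) can be told apart from a stable ID with a small parser. This is a sketch of one possible client-side check, not the server's parsing logic; parse_region is a hypothetical helper, and letting the species field win when both are present is an assumption, since the description only says the embedded species is used when the field is omitted.

```python
def parse_region(id_value, species=None):
    """Return (species, chrom, start, end) for a region string.

    Returns None when id_value does not look like a region, in which case
    it can be treated as a stable ID. Raises ValueError for a bare region
    without a species or for start > end.
    """
    parts = id_value.split(":")
    if len(parts) == 3:
        embedded_species, chrom, span = parts
    elif len(parts) == 2:
        embedded_species = None
        chrom, span = parts
    else:
        return None
    start_text, dash, end_text = span.partition("-")
    if not (dash and start_text.isdigit() and end_text.isdigit()):
        return None
    # Assumption: an explicit species field takes precedence over the embedded one.
    resolved_species = species or embedded_species
    if resolved_species is None:
        raise ValueError("a bare chr:start-end region requires the species field")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError("region start must not exceed region end")
    return resolved_species, chrom, start, end
```

Both documented forms of the BRCA2 region resolve to the same tuple, and a stable ID such as ENSG00000139618 returns None.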

ensembl_query_region

open-world

Find genomic features overlapping a chromosomal region: genes, transcripts, variants, regulatory elements, or exons. Returns each feature with its stable ID, type, location, biotype, and name. Useful for "what's in this locus?" and for seeding follow-up lookups. Region format is chr:start-end (e.g. 13:32315086-32400268 for the BRCA2 locus). Ensembl normalizes chromosome names and canonical vertebrate output omits the chr prefix (13, not chr13); a chr-prefixed name like chr13 is also accepted. The feature parameter defaults to gene only to prevent overwhelming returns — requesting variation in an 85 kb region returns 44,000+ entries. Explicitly include variation, regulatory, transcript, or exon only when needed. Exon rows carry the parent transcript ID, so the same exon appears once per transcript it belongs to.

read

invocation

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_query_region",
    "arguments": {
      "species": "<species>",
      "region": "<region>"
    }
  }
}

schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "species": {
      "type": "string",
      "description": "Species in Ensembl internal format (e.g. homo_sapiens, mus_musculus). Use ensembl_list_species to discover valid values."
    },
    "region": {
      "type": "string",
      "description": "Genomic region in chr:start-end format (e.g. 13:32315086-32400268). Ensembl normalizes chromosome names and canonical vertebrate output omits the chr prefix (13, not chr13); a chr-prefixed name like chr13 is also accepted. For large regions (>100 kb), limit to gene feature type to avoid overwhelming results."
    },
    "feature": {
      "default": [
        "gene"
      ],
      "description": "Feature types to retrieve. Default is gene only. Requesting variation in a large region can return tens of thousands of features. Include variation only for targeted small regions (single gene loci or smaller).",
      "type": "array",
      "items": {
        "type": "string",
        "enum": [
          "gene",
          "transcript",
          "variation",
          "regulatory",
          "exon"
        ],
        "description": "A feature type to retrieve: gene, transcript, variation, regulatory, or exon."
      }
    },
    "biotype": {
      "description": "Optional biotype filter (e.g. protein_coding, lncRNA, SNV). Applied server-side by Ensembl. Not all feature types support biotype filtering.",
      "type": "string"
    }
  },
  "required": [
    "species",
    "region",
    "feature"
  ],
  "additionalProperties": false
}
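
Because non-gene feature types can return very large result sets, a client might measure the region before choosing features. The sketch below applies the schema's guidance (regions over 100 kb should be limited to the gene feature type) as a hard client-side rule, which is a stricter policy than the server documents; the helper names are hypothetical, and the length assumes inclusive 1-based coordinates.

```python
LARGE_REGION_BP = 100_000  # threshold taken from the region description


def region_length(region):
    """Length in base pairs of a chr:start-end region (inclusive coordinates)."""
    _chrom, _colon, span = region.partition(":")
    start_text, _dash, end_text = span.partition("-")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError("region end must not be smaller than region start")
    return end - start + 1


def build_region_arguments(species, region, feature=("gene",), biotype=None):
    """Return arguments for ensembl_query_region, refusing risky combinations."""
    features = list(feature)
    if region_length(region) > LARGE_REGION_BP and features != ["gene"]:
        raise ValueError("region exceeds 100 kb; request only the gene feature type")
    arguments = {"species": species, "region": region, "feature": features}
    if biotype is not None:
        arguments["biotype"] = biotype
    return arguments
```

The BRCA2 example region spans 85,183 bp, so it passes the size check even with variation requested; the 44,000+ entries mentioned above show why a caller may still prefer a narrower window.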

ensembl_predict_variant

open-world

Predict the functional consequences of a sequence variant using the Ensembl Variant Effect Predictor (VEP). Accepts three input formats: HGVS notation (transcript-relative, e.g. ENST00000380152.8:c.2T>A, or genomic, e.g. 13:g.32316462T>A); region+allele (chr:start:end:strand/allele, e.g. 1:65568:65568:1/T); and a dbSNP rsID (e.g. rs334). Returns the most severe consequence term, affected transcripts and genes, impact level (HIGH/MODERATE/LOW/MODIFIER), and any colocated known variants with clinical significance. HGVS input: provide the full notation including transcript version for best results. Region+allele input: Ensembl normalizes chromosome names and canonical vertebrate output omits the chr prefix (a chr-prefixed name is also accepted). By default the response caps transcript consequences (max_transcript_consequences) and per-variant PubMed IDs (max_pubmed_ids_per_variant) to keep large VEP results compact — well-studied variants like rs334 otherwise carry 60+ consequences and 100+ citations. Truthful totals are always reported; set a cap to 0 (or include_all_colocated_pubmed=true) to retrieve the full set.

read

invocation

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_predict_variant",
    "arguments": {
      "variant": "<variant>"
    }
  }
}

schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "variant": {
      "type": "string",
      "description": "Variant in one of three formats: (1) HGVS notation — transcript-relative: ENST00000380152.8:c.2T>A; genomic: 13:g.32316462T>A; (2) Region+allele: chr:start:end:strand/allele — e.g. 1:65568:65568:1/T (strand is 1 for forward or -1 for reverse); (3) dbSNP rsID — e.g. rs334. Ensembl normalizes chromosome names; canonical vertebrate output omits the \"chr\" prefix, though a chr-prefixed name is also accepted."
    },
    "species": {
      "default": "homo_sapiens",
      "description": "Species in Ensembl internal format. Default is homo_sapiens. For non-human variants, set the appropriate species (e.g. mus_musculus for mouse). Use ensembl_list_species to discover valid values.",
      "type": "string"
    },
    "max_transcript_consequences": {
      "default": 10,
      "description": "Maximum transcript consequences to return per VEP record. High-impact variants can affect 60+ transcripts; the default keeps the response focused on the top consequences. Set to 0 to return every transcript consequence uncapped. transcriptConsequencesTotal on each record always reports the true pre-cap count.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    },
    "max_pubmed_ids_per_variant": {
      "default": 10,
      "description": "Maximum PubMed IDs to return per colocated known variant. Well-studied variants (e.g. rs334) cite 100+ papers; the default trims each list. Set to 0 to return every PubMed ID uncapped. pubmedTotal on each colocated variant reports the true pre-cap count. Ignored when include_all_colocated_pubmed is true.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    },
    "include_all_colocated_pubmed": {
      "default": false,
      "description": "When true, return every PubMed ID for each colocated variant, overriding max_pubmed_ids_per_variant. Default false to keep responses compact.",
      "type": "boolean"
    }
  },
  "required": [
    "variant",
    "species",
    "max_transcript_consequences",
    "max_pubmed_ids_per_variant",
    "include_all_colocated_pubmed"
  ],
  "additionalProperties": false
}
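
The three accepted variant formats can be distinguished with coarse patterns before a call is made. This is only a rough client-side classifier written for illustration: the regular expressions are assumptions, they are not a full HGVS validator, and they do not reflect how the server or VEP parses input.

```python
import re

# Coarse patterns for the three documented input formats.
RSID_PATTERN = re.compile(r"^rs\d+$")
REGION_ALLELE_PATTERN = re.compile(r"^[^:\s]+:\d+:\d+:-?1/\S+$")
HGVS_PATTERN = re.compile(r"^[^:\s]+:[cgnmpr]\.\S+$")


def classify_variant(variant):
    """Return 'rsid', 'region_allele', or 'hgvs'; raise ValueError otherwise."""
    text = variant.strip()
    if RSID_PATTERN.match(text):
        return "rsid"
    if REGION_ALLELE_PATTERN.match(text):
        return "region_allele"
    if HGVS_PATTERN.match(text):
        return "hgvs"
    raise ValueError(f"unrecognized variant format: {variant!r}")
```

A check like this is mainly useful for catching obviously malformed input early, for example a region+allele string with a strand other than 1 or -1.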

ensembl_get_homology

open-world

Find orthologs and/or paralogs of a gene across species. Returns each homolog's stable ID, species, homology type (ortholog_one2one, ortholog_one2many, paralog_many2many, etc.), perc_id (percent identity), perc_pos (percent positives), and taxonomy level. Essential for cross-species research — for example, "what is the mouse equivalent of human TP53?" or "how conserved is BRCA2 across mammals?". Provide either symbol + species or a stable gene ID. Target species can be filtered to a single species or left open to return all available homologs.

read

invocation

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_get_homology",
    "arguments": {
      "symbol": "<symbol>"
    }
  }
}

schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "symbol": {
      "description": "Gene symbol in the source species (e.g. BRCA2, TP53). Species defaults to homo_sapiens; set species for other organisms. Cannot be combined with id.",
      "type": "string"
    },
    "id": {
      "description": "Ensembl stable gene ID (e.g. ENSG00000139618). Use ensembl_lookup_gene to get the stable ID from a symbol. Cannot be combined with symbol.",
      "type": "string"
    },
    "species": {
      "default": "homo_sapiens",
      "description": "Source species (the species the query gene belongs to) in Ensembl internal format. Default is homo_sapiens. Use ensembl_list_species to discover valid values.",
      "type": "string"
    },
    "target_species": {
      "description": "Filter to homologs in a single target species (e.g. mus_musculus for mouse). Omit to return homologs across all available species. Use ensembl_list_species to discover valid values.",
      "type": "string"
    },
    "type": {
      "default": "orthologues",
      "description": "Type of homologs to return. orthologues: genes related by speciation (cross-species equivalents). paralogues: genes related by duplication (within or across species). all: both orthologs and paralogs.",
      "type": "string",
      "enum": [
        "orthologues",
        "paralogues",
        "all"
      ]
    },
    "max_results": {
      "default": 25,
      "description": "Maximum number of homologs to return. Broad orthology queries (e.g. BRCA2 across all species) can return 150+ homologs; the default keeps responses focused. Set to 0 to return every homolog uncapped. totalCount always reports the true number available before this cap.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    }
  },
  "required": [
    "species",
    "type",
    "max_results"
  ],
  "additionalProperties": false
}
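
The max_results semantics (a cap of 0 means uncapped, and totalCount always reports the pre-cap number) can be illustrated with a small helper. Only totalCount is a name taken from the description; cap_results and the other result keys are hypothetical, and this is a sketch of the documented behavior rather than the server's code.

```python
def cap_results(items, max_results=25):
    """Apply a result cap where 0 means 'return everything'.

    totalCount is always the number of items before capping.
    """
    if max_results < 0:
        raise ValueError("max_results must be 0 or greater")
    items = list(items)
    kept = items if max_results == 0 else items[:max_results]
    return {
        "totalCount": len(items),
        "returned": len(kept),          # hypothetical key
        "truncated": len(kept) < len(items),  # hypothetical key
        "items": kept,
    }
```

Comparing the returned count with totalCount tells a caller whether a follow-up request with a higher cap, or with max_results set to 0, is worth making.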

ensembl_get_xrefs

open-world

Retrieve cross-database references for a gene or feature — HGNC, UniProt, EntrezGene, OMIM, RefSeq, Reactome, and others. Returns each xref with its database name, primary ID, display ID, and description. The dbname filter narrows to specific databases; omit to return all xrefs. IDs returned here chain to protein (pubchem via UniProt), literature (pubmed via PubMed IDs), disease (OMIM via MIM_GENE), and pathway (Reactome) resources. Requires an Ensembl stable ID — use ensembl_lookup_gene to get the ENSG… ID first. Common dbname values: HGNC, Uniprot_gn, EntrezGene, MIM_GENE, RefSeq_mRNA, RefSeq_peptide, Reactome, GO (Gene Ontology), ChEMBL.

read

invocation

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_get_xrefs",
    "arguments": {
      "id": "<id>"
    }
  }
}

schema

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "description": "Ensembl stable gene ID (ENSG…) or transcript ID (ENST…). Use ensembl_lookup_gene to get the stable ID from a gene symbol. xrefs/id returns the full cross-reference set (56+ entries for well-annotated genes like BRCA2)."
    },
    "dbname": {
      "description": "Filter to a specific external database by its Ensembl internal name. Examples: HGNC (HGNC gene ID), Uniprot_gn (UniProt gene name), EntrezGene (NCBI Gene ID), MIM_GENE (OMIM disease gene), RefSeq_mRNA (NCBI RefSeq transcript), Reactome (pathway IDs), GO (Gene Ontology terms). Omit to return all available xrefs.",
      "type": "string"
    }
  },
  "required": [
    "id"
  ],
  "additionalProperties": false
}
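
Since dbname accepts a single database name, a caller who needs several databases may prefer to omit it and group the full xref list locally. The sketch below assumes each xref is a dict with dbname and primary_id keys, which is an assumption about the response shape; the sample rows use placeholder IDs rather than real identifiers.

```python
from collections import defaultdict


def index_xrefs(xrefs, dbnames=None):
    """Group primary IDs by database name, optionally keeping only some databases.

    The 'dbname' and 'primary_id' keys are assumed field names.
    """
    wanted = set(dbnames) if dbnames else None
    grouped = defaultdict(list)
    for xref in xrefs:
        dbname = xref.get("dbname")
        if dbname is None:
            continue
        if wanted is not None and dbname not in wanted:
            continue
        grouped[dbname].append(xref.get("primary_id"))
    return dict(grouped)


# Placeholder rows for illustration only.
SAMPLE_XREFS = [
    {"dbname": "HGNC", "primary_id": "<hgnc-id>"},
    {"dbname": "EntrezGene", "primary_id": "<entrez-id>"},
    {"dbname": "RefSeq_mRNA", "primary_id": "<refseq-id-1>"},
    {"dbname": "RefSeq_mRNA", "primary_id": "<refseq-id-2>"},
]
```

index_xrefs(SAMPLE_XREFS, ["HGNC", "EntrezGene"]) then returns just the two identifiers a follow-up lookup in another resource would need.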

Resources

Ensembl Gene

Gene record by Ensembl stable ID (ENSG…). Returns location, biotype, description, and transcript list. Stable, injectable context for multi-step workflows. Use ensembl_lookup_gene to resolve a gene symbol to the stable ID first.

uri ensembl://gene/{id} mime application/json

Ensembl Transcript

Transcript record by Ensembl stable ID (ENST…). Returns parent gene, location, biotype, canonical flag, and length. Use ensembl_lookup_gene with expand_transcripts=true to discover transcript IDs for a given gene, then fetch this resource for stable, injectable context.

uri ensembl://transcript/{id} mime application/json

Ensembl Species

Ensembl species catalog for the endpoint default division — internal name, display name, assembly, taxon ID, and division. On the default GRCh38 endpoint this returns the vertebrate division (~356 species). Stable, injectable context for unfamiliar species names. For a specific division read ensembl://species/{division} (e.g. ensembl://species/EnsemblPlants); the ensembl_list_species tool also filters by division and name.

uri ensembl://species mime application/json

Ensembl Species by Division

Ensembl-supported species within a single division (EnsemblVertebrates, EnsemblPlants, EnsemblFungi, EnsemblMetazoa, or EnsemblProtists) — internal name, display name, assembly, taxon ID, and division. Read ensembl://species for the endpoint default division (vertebrates), or use the ensembl_list_species tool to also filter by name.

uri ensembl://species/{division} mime application/json
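
The four URI templates above can be filled in and wrapped in an MCP resources/read request as follows. This is a minimal sketch: the helper names are hypothetical, the division list is copied from the ensembl_list_species schema, and percent-encoding the ID is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote

DIVISIONS = (
    "EnsemblVertebrates",
    "EnsemblPlants",
    "EnsemblFungi",
    "EnsemblMetazoa",
    "EnsemblProtists",
)


def gene_uri(stable_id):
    """Fill ensembl://gene/{id}."""
    return "ensembl://gene/" + quote(stable_id, safe="")


def transcript_uri(stable_id):
    """Fill ensembl://transcript/{id}."""
    return "ensembl://transcript/" + quote(stable_id, safe="")


def species_uri(division=None):
    """Return ensembl://species, or ensembl://species/{division} when given."""
    if division is None:
        return "ensembl://species"
    if division not in DIVISIONS:
        raise ValueError(f"unknown division: {division!r}")
    return f"ensembl://species/{division}"


def read_request(uri, request_id=1):
    """Build a JSON-RPC resources/read request for the given URI."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }
```

For example, read_request(species_uri("EnsemblPlants")) produces the request body for the plant species catalog.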

Prompts

ensembl_gene_dossier

Structured research workflow for assembling a complete gene profile. Guides the agent through the Ensembl tool chain in order: symbol → ID + location → protein sequence → key variants → cross-species orthologs → xref IDs for protein and literature follow-up.

gene_symbol (required): Gene symbol to research (e.g. BRCA2, TP53, EGFR). Case-insensitive; Ensembl will resolve the canonical form.
species (optional): Species in Ensembl internal format (e.g. homo_sapiens, mus_musculus). Default is homo_sapiens. Use ensembl_list_species to discover valid values.
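
Requesting the prompt over MCP uses the prompts/get method with the two arguments listed above. The sketch below builds that request body; dossier_prompt_request is a hypothetical helper name, and rejecting an empty symbol is a client-side assumption.

```python
def dossier_prompt_request(gene_symbol, species=None, request_id=1):
    """Build a JSON-RPC prompts/get request for ensembl_gene_dossier."""
    if not gene_symbol or not gene_symbol.strip():
        raise ValueError("gene_symbol is required")
    arguments = {"gene_symbol": gene_symbol.strip()}
    # species is optional; omitting it leaves the homo_sapiens default in place.
    if species is not None:
        arguments["species"] = species
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "prompts/get",
        "params": {"name": "ensembl_gene_dossier", "arguments": arguments},
    }
```

dossier_prompt_request("TP53", species="mus_musculus") would ask for the same workflow against the mouse genome.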