@cyanheads/ensembl-mcp-server

v0.1.3 pre-1.0

Look up genes, fetch sequences, predict variant consequences, and find orthologs and cross-database references (xrefs) through the Ensembl REST API, exposed over MCP. Supports STDIO or Streamable HTTP transport.

claude mcp add --transport http ensembl-mcp-server https://ensembl.caseyjhand.com/mcp
codex mcp add ensembl-mcp-server --url https://ensembl.caseyjhand.com/mcp
{
  "mcpServers": {
    "ensembl-mcp-server": {
      "url": "https://ensembl.caseyjhand.com/mcp"
    }
  }
}
gemini mcp add --transport http ensembl-mcp-server https://ensembl.caseyjhand.com/mcp
{
  "mcpServers": {
    "ensembl-mcp-server": {
      "command": "bunx",
      "args": [
        "@cyanheads/ensembl-mcp-server@latest"
      ]
    }
  }
}
{
  "mcpServers": {
    "ensembl-mcp-server": {
      "type": "http",
      "url": "https://ensembl.caseyjhand.com/mcp"
    }
  }
}
curl -X POST https://ensembl.caseyjhand.com/mcp \
  -H "Content-Type: application/json" \
  -H "MCP-Protocol-Version: 2025-11-25" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-11-25","capabilities":{},"clientInfo":{"name":"curl","version":"1.0.0"}}}'
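The curl call above posts a JSON-RPC initialize request, and every tool invocation on this page has the same envelope with method tools/call. As a minimal sketch, the two bodies could be built in Python like this. The helper names initialize_request and tool_call are hypothetical, the client name and version are placeholders, and nothing is sent over the network.

```python
import json

PROTOCOL_VERSION = "2025-11-25"  # same value as in the curl example


def initialize_request(request_id=1, client_name="python-sketch", client_version="0.0.0"):
    """Build the JSON-RPC body for an MCP initialize request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }


def tool_call(name, arguments=None, request_id=1):
    """Build the JSON-RPC body for a tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    print(json.dumps(initialize_request(), indent=2))
    print(json.dumps(tool_call("ensembl_list_species", {"nameContains": "mouse"}, request_id=2)))
```

Either dictionary can be serialized with json.dumps and used as the -d payload of the curl command.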

Tools

7

ensembl_list_species

List species supported by Ensembl with display name, common name, assembly, taxon ID, and division. Required discovery step — species names like homo_sapiens are opaque to non-biologists and are the input format every other Ensembl tool expects. Filter by division to limit results; use nameContains to find a species by partial name match. Returns the full species catalog when no filters are applied (EnsemblVertebrates has ~250 species; all divisions combined have ~1,000+).

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_list_species",
    "arguments": {}
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "division": {
      "description": "Filter to a specific Ensembl division. EnsemblVertebrates includes human, mouse, zebrafish, and other vertebrates. EnsemblPlants covers crop and model plant genomes. EnsemblFungi, EnsemblMetazoa, EnsemblProtists cover non-vertebrate model organisms. Omit to return all divisions.",
      "type": "string",
      "enum": [
        "EnsemblVertebrates",
        "EnsemblPlants",
        "EnsemblFungi",
        "EnsemblMetazoa",
        "EnsemblProtists"
      ]
    },
    "nameContains": {
      "description": "Case-insensitive substring filter applied locally after fetching. Matches against species name, display name, and common name. Example: \"sapiens\" matches homo_sapiens; \"mouse\" matches mus_musculus.",
      "type": "string"
    }
  },
  "additionalProperties": false
}
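The nameContains description says the filter is a case-insensitive substring match over species name, display name, and common name. A rough sketch of that behaviour follows. The record field names and the sample records are assumptions made for illustration, not the server's actual response shape.

```python
def filter_species(records, name_contains):
    """Keep records whose name, display name, or common name contains the needle."""
    needle = name_contains.casefold()
    fields = ("name", "display_name", "common_name")  # assumed field names
    return [
        rec for rec in records
        if any(needle in (rec.get(field) or "").casefold() for field in fields)
    ]


# Hand-written sample records, not real server output.
sample = [
    {"name": "homo_sapiens", "display_name": "Human", "common_name": "human"},
    {"name": "mus_musculus", "display_name": "Mouse", "common_name": "house mouse"},
    {"name": "danio_rerio", "display_name": "Zebrafish", "common_name": "zebrafish"},
]

matches = [rec["name"] for rec in filter_species(sample, "MOUSE")]
print(matches)
```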

ensembl_lookup_gene

open-world

Resolve a gene by symbol + species (or by stable ID) to its Ensembl ID, genomic location (chr:start-end:strand), biotype, description, and transcript list. Entry point for most workflows — the stable ID and coordinates returned here are inputs to other tools. Accepts both symbol lookup (BRCA2 + homo_sapiens) and direct ID lookup (ENSG00000139618). Supports batch lookup of up to 20 IDs or symbols in one call via the ids or symbols field. For symbol lookup, species is required; for ID lookup, species is not needed. Use ensembl_list_species to discover valid species names.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_lookup_gene",
    "arguments": {}
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "symbol": {
      "description": "Gene symbol to look up (e.g. BRCA2, TP53, EGFR). Requires species to be set. Case-insensitive in most species.",
      "type": "string"
    },
    "id": {
      "description": "Ensembl stable gene ID (e.g. ENSG00000139618 or ENSG00000139618.7 with version). Species is not required for ID lookup.",
      "type": "string"
    },
    "species": {
      "description": "Species in Ensembl internal format: lowercase scientific name with underscores (e.g. homo_sapiens, mus_musculus, danio_rerio). Required when using symbol. Default is homo_sapiens for symbol-based lookups. Use ensembl_list_species to discover valid values.",
      "type": "string"
    },
    "ids": {
      "description": "Batch lookup: up to 20 Ensembl stable IDs (ENSG…, ENST…). Returns a succeeded/failed split. Cannot be combined with symbol or id.",
      "maxItems": 20,
      "type": "array",
      "items": {
        "type": "string",
        "description": "An Ensembl stable gene or transcript ID to resolve in this batch."
      }
    },
    "symbols": {
      "description": "Batch lookup: up to 20 gene symbols. Requires species to be set. Returns a succeeded/failed split. Cannot be combined with symbol, id, or ids.",
      "maxItems": 20,
      "type": "array",
      "items": {
        "type": "string",
        "description": "A gene symbol to resolve in this batch (e.g. BRCA2, TP53)."
      }
    },
    "expand_transcripts": {
      "default": false,
      "description": "When true, include the full transcript list in the response. Each transcript has its ID, biotype, canonical flag, and coordinates. Default is false to keep responses compact.",
      "type": "boolean"
    }
  },
  "required": [
    "expand_transcripts"
  ],
  "additionalProperties": false
}
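The schema leaves the mode rules (single versus batch, symbol versus ID) to the field descriptions. One possible client-side pre-check is sketched below. check_lookup_args is a hypothetical helper. It assumes exactly one of symbol, id, ids, or symbols is given, and it treats species as mandatory for symbol lookups, following the tool description rather than the default mentioned in the species field.

```python
MAX_BATCH = 20  # from maxItems in the schema


def check_lookup_args(args):
    """Return a list of problems with ensembl_lookup_gene arguments (empty if none)."""
    problems = []
    # Assumption: the four lookup modes are mutually exclusive.
    modes = [key for key in ("symbol", "id", "ids", "symbols") if args.get(key)]
    if len(modes) != 1:
        problems.append(f"expected exactly one of symbol, id, ids, symbols; got {modes}")
    if ("symbol" in modes or "symbols" in modes) and not args.get("species"):
        problems.append("species is required for symbol lookup")
    for key in ("ids", "symbols"):
        if len(args.get(key) or []) > MAX_BATCH:
            problems.append(f"{key} accepts at most {MAX_BATCH} entries")
    return problems


print(check_lookup_args({"symbol": "BRCA2", "species": "homo_sapiens"}))
print(check_lookup_args({"symbols": ["BRCA2", "TP53"]}))
```

The first call reports no problems; the second reports the missing species.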

ensembl_get_sequence

open-world

Fetch the DNA, cDNA, CDS, or protein sequence for a gene, transcript, protein, or genomic region. Returns the sequence with its stable ID, molecule type, and character count — large sequences are returned in full but the length is stated so callers can budget context. The type parameter selects which sequence is fetched: genomic (default, includes introns), cdna (spliced transcript), cds (coding sequence only), protein. For region mode, set id to the format species:chr:start-end (e.g. homo_sapiens:13:32315086-32400268) and set species. Protein sequences require a transcript or protein stable ID (ENST…/ENSP…), not a gene ID — use ensembl_lookup_gene with expand_transcripts=true to get the canonical transcript ID first.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_get_sequence",
    "arguments": {
      "id": "<id>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "description": "Ensembl stable ID (ENSG…, ENST…, ENSP…) or region in the format species:chr:start-end (e.g. homo_sapiens:13:32315086-32400268) for region mode. For genomic region queries, species is also required."
    },
    "type": {
      "default": "genomic",
      "description": "Sequence type to retrieve. genomic: full genomic DNA including introns (default). cdna: spliced transcript sequence (requires ENST… ID). cds: coding sequence only, no UTRs (requires ENST… ID with coding transcript). protein: amino acid sequence (requires ENST… or ENSP… ID).",
      "type": "string",
      "enum": [
        "genomic",
        "cdna",
        "cds",
        "protein"
      ]
    },
    "species": {
      "description": "Species in Ensembl internal format (e.g. homo_sapiens). Required for region mode (when id is a species:chr:start-end string). Optional for stable ID lookups — Ensembl infers species from the ID prefix.",
      "type": "string"
    },
    "expand_5prime": {
      "default": 0,
      "description": "Number of base pairs to extend upstream (5' direction) of the requested feature. Default 0. Only applies to genomic sequences and region queries.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    },
    "expand_3prime": {
      "default": 0,
      "description": "Number of base pairs to extend downstream (3' direction) of the requested feature. Default 0. Only applies to genomic sequences and region queries.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    }
  },
  "required": [
    "id",
    "type",
    "expand_5prime",
    "expand_3prime"
  ],
  "additionalProperties": false
}
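Region mode packs three pieces of information into the id string (species:chr:start-end). The sketch below shows one way a client might tell a region string from a stable ID and estimate the sequence length before requesting it. parse_region_id and its regular expression are illustrative assumptions, not the server's own parser.

```python
import re

# species:chr:start-end, e.g. homo_sapiens:13:32315086-32400268
REGION_ID = re.compile(
    r"^(?P<species>[a-z0-9_]+):(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_region_id(value):
    """Split 'species:chr:start-end' into parts, or return None for stable IDs."""
    match = REGION_ID.match(value)
    if match is None:
        return None
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return {
        "species": match["species"],
        "chrom": match["chrom"],
        "start": start,
        "end": end,
        "length": end - start + 1,  # Ensembl coordinates are 1-based and inclusive
    }


region = parse_region_id("homo_sapiens:13:32315086-32400268")
print(region["length"])  # 85183
```

The unexpanded length gives a lower bound on the characters a genomic request will return; expand_5prime and expand_3prime add to it.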

ensembl_query_region

open-world

Find genomic features overlapping a chromosomal region: genes, transcripts, variants, regulatory elements, or exons. Returns each feature with its stable ID, type, location, biotype, and name. Useful for "what's in this locus?" and for seeding follow-up lookups. Region format is chr:start-end (e.g. 13:32315086-32400268 for the BRCA2 locus). Chromosome names use Ensembl format — no "chr" prefix for vertebrates (use 13 not chr13). The feature parameter defaults to gene only to prevent overwhelming returns — requesting variation in an 85 kb region returns 44,000+ entries. Explicitly include variation, regulatory, transcript, or exon only when needed.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_query_region",
    "arguments": {
      "species": "<species>",
      "region": "<region>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "species": {
      "type": "string",
      "description": "Species in Ensembl internal format (e.g. homo_sapiens, mus_musculus). Use ensembl_list_species to discover valid values."
    },
    "region": {
      "type": "string",
      "description": "Genomic region in chr:start-end format (e.g. 13:32315086-32400268). Chromosome names use Ensembl format — no \"chr\" prefix for vertebrates (13, not chr13). For large regions (>100 kb), limit to gene feature type to avoid overwhelming results."
    },
    "feature": {
      "default": [
        "gene"
      ],
      "description": "Feature types to retrieve. Default is gene only. Requesting variation in a large region can return tens of thousands of features. Include variation only for targeted small regions (single gene loci or smaller).",
      "type": "array",
      "items": {
        "type": "string",
        "enum": [
          "gene",
          "transcript",
          "variation",
          "regulatory",
          "exon"
        ],
        "description": "A feature type to retrieve: gene, transcript, variation, regulatory, or exon."
      }
    },
    "biotype": {
      "description": "Optional biotype filter (e.g. protein_coding, lncRNA, SNV). Applied server-side by Ensembl. Not all feature types support biotype filtering.",
      "type": "string"
    }
  },
  "required": [
    "species",
    "region",
    "feature"
  ],
  "additionalProperties": false
}
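Two input mistakes follow from the notes above: a "chr" prefix on vertebrate chromosome names, and requesting variation over a large span. A small client-side guard might look like the sketch below. Both helpers are hypothetical. The 100 kb cutoff is borrowed from the region description, and stripping "chr" is only appropriate for assemblies that use bare chromosome names.

```python
def normalize_region(region):
    """Strip a leading 'chr' and return (region, span_in_bp)."""
    chrom, _, span = region.partition(":")
    start_text, _, end_text = span.partition("-")
    start, end = int(start_text), int(end_text)
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]  # assumption: the assembly uses bare names such as 13
    return f"{chrom}:{start}-{end}", end - start + 1


def region_arguments(species, region, features=("gene",)):
    """Build ensembl_query_region arguments, refusing variation on large spans."""
    clean, span = normalize_region(region)
    if "variation" in features and span > 100_000:
        raise ValueError(f"{span} bp is too large for a variation query in this sketch")
    return {"species": species, "region": clean, "feature": list(features)}


args = region_arguments("homo_sapiens", "chr13:32315086-32400268")
print(args)
```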

ensembl_predict_variant

open-world

Predict the functional consequences of a sequence variant using the Ensembl Variant Effect Predictor (VEP). Accepts HGVS notation (transcript-relative, e.g. ENST00000380152.8:c.2T>A, or genomic, e.g. 13:g.32316462T>A) and also region+allele format (chr:start:end:strand/allele, e.g. 1:65568:65568:1/T). Returns the most severe consequence term, affected transcripts and genes, impact level (HIGH/MODERATE/LOW/MODIFIER), and any colocated known variants with clinical significance. HGVS input: provide the full notation including transcript version for best results. Region+allele input: use Ensembl chromosome naming (no chr prefix for vertebrates).

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_predict_variant",
    "arguments": {
      "variant": "<variant>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "variant": {
      "type": "string",
      "description": "Variant in one of two formats: (1) HGVS notation — transcript-relative: ENST00000380152.8:c.2T>A; genomic: 13:g.32316462T>A; (2) Region+allele: chr:start:end:strand/allele — e.g. 1:65568:65568:1/T. For region+allele, strand is 1 (forward) or -1 (reverse); chromosome names use no \"chr\" prefix for vertebrates."
    },
    "species": {
      "default": "homo_sapiens",
      "description": "Species in Ensembl internal format. Default is homo_sapiens. For non-human variants, set the appropriate species (e.g. mus_musculus for mouse). Use ensembl_list_species to discover valid values.",
      "type": "string"
    }
  },
  "required": [
    "variant",
    "species"
  ],
  "additionalProperties": false
}
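Because the variant field accepts two unrelated notations, it can help to check which one a string resembles before sending it. The patterns below are a loose sketch covering only the forms shown on this page (c. and g. HGVS, and chr:start:end:strand/allele). They are not a full HGVS validator, and variant_format is a hypothetical helper.

```python
import re

# chr:start:end:strand/allele, e.g. 1:65568:65568:1/T
REGION_ALLELE = re.compile(r"^[^:\s]+:\d+:\d+:(1|-1)/[ACGTN-]+$", re.IGNORECASE)
# reference:c.<change> or reference:g.<change>, e.g. 13:g.32316462T>A
HGVS = re.compile(r"^[^:\s]+:[cg]\.\S+$")


def variant_format(variant):
    """Classify a variant string as 'region_allele', 'hgvs', or 'unknown'."""
    if REGION_ALLELE.match(variant):
        return "region_allele"
    if HGVS.match(variant):
        return "hgvs"
    return "unknown"


for example in ("ENST00000380152.8:c.2T>A", "13:g.32316462T>A", "1:65568:65568:1/T"):
    print(example, "->", variant_format(example))
```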

ensembl_get_homology

open-world

Find orthologs and/or paralogs of a gene across species. Returns each homolog's stable ID, species, homology type (ortholog_one2one, ortholog_one2many, paralog_many2many, etc.), perc_id (percent identity), perc_pos (percent positives), and taxonomy level. Essential for cross-species research — for example, "what is the mouse equivalent of human TP53?" or "how conserved is BRCA2 across mammals?". Provide either symbol + species or a stable gene ID. Target species can be filtered to a single species or left open to return all available homologs.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_get_homology",
    "arguments": {}
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "symbol": {
      "description": "Gene symbol in the source species (e.g. BRCA2, TP53). Requires species to be set. Cannot be used together with id.",
      "type": "string"
    },
    "id": {
      "description": "Ensembl stable gene ID (e.g. ENSG00000139618). Use ensembl_lookup_gene to get the stable ID from a symbol. Cannot be used together with symbol.",
      "type": "string"
    },
    "species": {
      "default": "homo_sapiens",
      "description": "Source species (the species the query gene belongs to) in Ensembl internal format. Default is homo_sapiens. Use ensembl_list_species to discover valid values.",
      "type": "string"
    },
    "target_species": {
      "description": "Filter to homologs in a single target species (e.g. mus_musculus for mouse). Omit to return homologs across all available species. Use ensembl_list_species to discover valid values.",
      "type": "string"
    },
    "type": {
      "default": "orthologues",
      "description": "Type of homologs to return. orthologues: genes related by speciation (cross-species equivalents). paralogues: genes related by duplication (within or across species). all: both orthologs and paralogs.",
      "type": "string",
      "enum": [
        "orthologues",
        "paralogues",
        "all"
      ]
    }
  },
  "required": [
    "species",
    "type"
  ],
  "additionalProperties": false
}
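The generated invocation above has empty arguments, so here is a sketch of a filled-in request for the question "what is the mouse equivalent of human TP53?". homology_call is a hypothetical helper. It enforces the symbol-or-id rule from the field descriptions and otherwise just assembles the JSON-RPC body.

```python
import json


def homology_call(symbol=None, gene_id=None, species="homo_sapiens",
                  target_species=None, kind="orthologues", request_id=1):
    """Build a tools/call body for ensembl_get_homology."""
    if (symbol is None) == (gene_id is None):
        raise ValueError("provide exactly one of symbol or gene_id")
    if kind not in ("orthologues", "paralogues", "all"):
        raise ValueError(f"unsupported type: {kind}")
    arguments = {"species": species, "type": kind}
    if symbol is not None:
        arguments["symbol"] = symbol
    else:
        arguments["id"] = gene_id
    if target_species:
        arguments["target_species"] = target_species
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ensembl_get_homology", "arguments": arguments},
    }


request = homology_call(symbol="TP53", target_species="mus_musculus")
print(json.dumps(request, indent=2))
```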

ensembl_get_xrefs

open-world

Retrieve cross-database references for a gene or feature — HGNC, UniProt, EntrezGene, OMIM, RefSeq, Reactome, and others. Returns each xref with its database name, primary ID, display ID, and description. The dbname filter narrows to specific databases; omit to return all xrefs. IDs returned here chain to protein (pubchem via UniProt), literature (pubmed via PubMed IDs), disease (OMIM via MIM_GENE), and pathway (Reactome) resources. Requires an Ensembl stable ID — use ensembl_lookup_gene to get the ENSG… ID first. Common dbname values: HGNC, Uniprot_gn, EntrezGene, MIM_GENE, RefSeq_mRNA, RefSeq_peptide, Reactome, GO (Gene Ontology), ChEMBL.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "ensembl_get_xrefs",
    "arguments": {
      "id": "<id>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "description": "Ensembl stable gene ID (ENSG…) or transcript ID (ENST…). Use ensembl_lookup_gene to get the stable ID from a gene symbol. xrefs/id returns the full cross-reference set (56+ entries for well-annotated genes like BRCA2)."
    },
    "dbname": {
      "description": "Filter to a specific external database by its Ensembl internal name. Examples: HGNC (HGNC gene ID), Uniprot_gn (UniProt gene name), EntrezGene (NCBI Gene ID), MIM_GENE (OMIM disease gene), RefSeq_mRNA (NCBI RefSeq transcript), Reactome (pathway IDs), GO (Gene Ontology terms). Omit to return all available xrefs.",
      "type": "string"
    }
  },
  "required": [
    "id"
  ],
  "additionalProperties": false
}

Resources

3

Gene record by Ensembl stable ID (ENSG…). Returns location, biotype, description, and transcript list. Stable, injectable context for multi-step workflows. Use ensembl_lookup_gene to resolve a gene symbol to the stable ID first.

uri ensembl://gene/{id} mime application/json

Transcript record by Ensembl stable ID (ENST…). Returns parent gene, location, biotype, canonical flag, and length. Use ensembl_lookup_gene with expand_transcripts=true to discover transcript IDs for a given gene, then fetch this resource for stable, injectable context.

uri ensembl://transcript/{id} mime application/json

Complete catalog of Ensembl-supported species with internal name, display name, assembly, taxon ID, and division. Addressable reference for tool bootstrapping. Contains ~350 vertebrate species plus additional non-vertebrate divisions. Use this as stable, injectable context when working with unfamiliar species names.

uri ensembl://species mime application/json
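The three URI templates above can be filled in and fetched with the standard MCP resources/read method. The following is a minimal sketch with hypothetical helper names (resource_uri, read_request). It only builds the request body and does not contact the server.

```python
def resource_uri(kind, stable_id=None):
    """Fill in one of the ensembl:// URI templates listed above."""
    if kind == "species":
        return "ensembl://species"
    if kind not in ("gene", "transcript"):
        raise ValueError(f"unknown resource kind: {kind}")
    if not stable_id:
        raise ValueError(f"{kind} resources need a stable ID")
    return f"ensembl://{kind}/{stable_id}"


def read_request(uri, request_id=1):
    """Build the JSON-RPC body for an MCP resources/read request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


print(read_request(resource_uri("gene", "ENSG00000139618")))
```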

Prompts

1

Structured research workflow for assembling a complete gene profile. Guides the agent through the Ensembl tool chain in order: symbol → ID + location → protein sequence → key variants → cross-species orthologs → xref IDs for protein and literature follow-up.

  • gene_symbol (required): Gene symbol to research (e.g. BRCA2, TP53, EGFR). Case-insensitive; Ensembl will resolve the canonical form.
  • species: Species in Ensembl internal format (e.g. homo_sapiens, mus_musculus). Default is homo_sapiens. Use ensembl_list_species to discover valid values.