RA-MCP Archive Search

MCP • Official • Open Source • Apache-2.0 • 41.3

by AI-Riksarkivet • Uncategorized

An MCP server and CLI for searching and browsing AI-transcribed historical documents from the Swedish National Archives.

Example Use Cases

1. Perform full-text search on historical documents from the Swedish National Archives.
2. Browse and view AI-transcribed pages and handwritten text recognition results.
3. Use interactive archival research guides integrated with LLM clients.

Description

RA-MCP provides full-text search across millions of AI-transcribed pages, complete page transcriptions, handwritten text recognition, interactive document viewing, and archival research guides. These features are exposed as MCP tools that work with any MCP-compatible LLM client. The server supports streamable HTTP for integration with clients such as ChatGPT and Claude, and a CLI is available for direct use from the terminal.

Capabilities (47 total)

Tools (28)

search_transcribed

Search AI-transcribed text in digitised historical documents from the Swedish National Archives. IMPORTANT: Transcriptions are AI-generated (HTR/OCR) and contain recognition errors — always use fuzzy search (~) to compensate for misread characters and increase hits. Supports Solr syntax: wildcards (troll*), fuzzy (stockholm~1), Boolean ((A AND B)), proximity ("term1 term2"~10). Always group Boolean queries with outer parentheses. Use fuzzy (~) for OCR/HTR errors and old Swedish variants (präst/prest, silver/silfver). Paginate with offset (0, 50, 100...). Session dedup: re-calling returns stubs for already-seen documents.

Parameters: keyword: string*, offset: integer*, limit: integer, max_snippets_per_record: integer, max_response_tokens: integer, sort: string, year_min: any, year_max: any, dedup: boolean, research_context: any
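The query syntax described for search_transcribed can be assembled programmatically. The following is a minimal sketch and not part of RA-MCP: the helper names (fuzzy, proximity, group_boolean, search_transcribed_args) are hypothetical, and it assumes that limit acts as the page size, so that offset advances in steps of limit (0, 50, 100...).

```python
def fuzzy(term: str, distance: int = 1) -> str:
    """Append the Solr fuzzy operator, e.g. stockholm~1."""
    return f"{term}~{distance}"


def proximity(first: str, second: str, distance: int) -> str:
    """Build a proximity query, e.g. "term1 term2"~10."""
    return f'"{first} {second}"~{distance}'


def group_boolean(operator: str, *terms: str) -> str:
    """Join terms with AND/OR and wrap them in outer parentheses."""
    if operator not in ("AND", "OR"):
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(terms) + ")"


def search_transcribed_args(keyword: str, page: int = 0, limit: int = 50) -> dict:
    """Hypothetical helper: arguments for one page of results.

    Assumption: `limit` is the page size, so page N starts at N * limit.
    """
    return {"keyword": keyword, "offset": page * limit, "limit": limit}


# Cover the old-Swedish spelling variants mentioned in the tool description.
query = group_boolean("OR", fuzzy("präst"), fuzzy("prest"))
args = search_transcribed_args(query, page=1)
```

Here args carries the grouped fuzzy query with an offset of 50, i.e. the second page of results under the stated assumption.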

search_metadata

Search document metadata (titles, names, places, descriptions) across the Swedish National Archives catalog. Covers 2M+ records when only_digitised=False, including non-digitised materials. Use the dedicated name parameter for person searches and place parameter for place searches — these can be combined with keyword. Does NOT search transcribed page text — use search_transcribed for that. Same Solr syntax as search_transcribed. Session dedup: re-calling returns stubs for already-seen documents. Important: name and place filter a dedicated metadata field that is sparsely populated. Most person/place matches are NOT digitised, so set only_digitised=False when using name or place to avoid empty results.

Parameters: keyword: string*, offset: integer*, only_digitised: boolean, limit: integer, max_response_tokens: integer, sort: string, year_min: any, year_max: any, name: any, place: any, dedup: boolean, research_context: any
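The advice to set only_digitised=False whenever name or place is used can be encoded in a small argument builder. This is a sketch only: search_metadata_args is a hypothetical helper, the example keyword and place are illustrative values, and the server's default for only_digitised is not stated in the listing, so the key is omitted unless a name or place filter is present.

```python
from typing import Optional


def search_metadata_args(
    keyword: str,
    offset: int = 0,
    name: Optional[str] = None,
    place: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build arguments for the search_metadata tool."""
    args: dict = {"keyword": keyword, "offset": offset}
    if name is not None:
        args["name"] = name
    if place is not None:
        args["place"] = place
    # Most person/place matches are not digitised, so widen the search
    # to non-digitised records whenever either filter is used.
    if name is not None or place is not None:
        args["only_digitised"] = False
    return args


plain = search_metadata_args("dombok")
filtered = search_metadata_args("dombok", place="Uppsala")
```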

browse_document

View full page transcriptions of a document by reference code. Use reference codes from search results. Returns original text (usually Swedish), links to bildvisaren (image viewer), and ALTO XML. Blank pages are normal (digitised but no text). Non-digitised materials return metadata only. Session dedup: re-browsing same pages returns stubs. Set dedup=False to force full text. TOKEN COST: ~300 tokens overhead per response + ~200-1500 tokens per page depending on content density. Dense court protocol pages average ~1000 tokens each; title/cover pages ~300. Request only the pages you need — start with 3-5 pages and paginate.

Parameters: reference_code: string*, pages: string*, highlight_term: any, max_pages: integer, dedup: boolean, research_context: any
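The token figures quoted for browse_document translate into a simple budget calculation. The sketch below uses only the numbers from the description (about 300 tokens of overhead, 200 to 1500 tokens per page, about 1000 for a dense page); the function names are hypothetical and the results are rough estimates, not guarantees.

```python
OVERHEAD_TOKENS = 300        # per-response overhead quoted in the description
MIN_TOKENS_PER_PAGE = 200    # sparse page
MAX_TOKENS_PER_PAGE = 1500   # very dense page
DENSE_PAGE_TOKENS = 1000     # average dense court protocol page


def estimate_tokens(page_count: int) -> tuple:
    """Return a (low, high) token estimate for one browse_document response."""
    low = OVERHEAD_TOKENS + page_count * MIN_TOKENS_PER_PAGE
    high = OVERHEAD_TOKENS + page_count * MAX_TOKENS_PER_PAGE
    return low, high


def pages_within_budget(budget: int, tokens_per_page: int = DENSE_PAGE_TOKENS) -> int:
    """Return how many pages fit in a token budget at the given page density."""
    return max(0, (budget - OVERHEAD_TOKENS) // tokens_per_page)


low, high = estimate_tokens(5)         # (1300, 7800)
max_pages = pages_within_budget(5000)  # 4 dense pages
```

A five-page request therefore costs somewhere between roughly 1300 and 7800 tokens, which is why starting with 3-5 pages and paginating is the recommended pattern.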

htr_htr_transcribe

Transcribe handwritten documents and return results as file URLs. Sends images to the HTRflow Gradio Space for AI-powered handwritten text recognition. Returns URLs to an interactive viewer, per-page JSON transcriptions, and an archival export file.

Parameters: image_urls: array*, language: string, layout: string, export_format: string, custom_yaml: any

view_document

Display document pages with zoomable images and text layer overlays. Takes a reference code and page specification (same as browse_document). Use after search to visually inspect document pages with transcription overlay. Use highlight_term to pre-populate the search bar and highlight matching text lines.

Parameters: reference_code: string*, pages: string*, highlight_term: any, max_pages: integer
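MCP clients invoke tools through JSON-RPC 2.0 requests with the method tools/call. The sketch below builds such a request for view_document. It omits the transport and the session initialisation an MCP client performs first; tools_call_request is a hypothetical helper, the reference code is a placeholder to be taken from search results, and the pages value "1" is an assumed format, since the listing does not document the page specification syntax.

```python
import json


def tools_call_request(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that calls an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = tools_call_request(
    1,
    "view_document",
    {
        "reference_code": "<reference code from search results>",  # placeholder
        "pages": "1",                   # assumed page specification format
        "highlight_term": "stockholm",  # pre-populates the viewer's search bar
    },
)
payload = json.dumps(request, ensure_ascii=False)
```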

Resources (19)

get_table_of_contents

Get the table of contents (Innehållsförteckning) for the Riksarkivet historical guide.

URI: riksarkivet://guide/contents/table_of_contents
MIME: text/plain
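Resources are fetched by URI using the MCP method resources/read. As a rough sketch, the JSON-RPC 2.0 request for the table of contents could be built as follows; resources_read_request is a hypothetical helper, and the transport and session setup are left out.

```python
def resources_read_request(request_id: int, uri: str) -> dict:
    """Build a JSON-RPC 2.0 request that reads an MCP resource by URI."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


toc_request = resources_read_request(
    1, "riksarkivet://guide/contents/table_of_contents"
)
```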

get_ui_resource

URI: ui://document-viewer/mcp-app.html
MIME: text/html;profile=mcp-app

get_ui_resource

URI: ui://pdf-viewer/mcp-app.html
MIME: text/html;profile=mcp-app

archival-guide/SKILL.md

URI: skill://archival-guide/SKILL.md
MIME: text/markdown

archival-guide/_manifest

File listing for archival-guide

URI: skill://archival-guide/_manifest
MIME: application/json

Security

Scanned 3 month(s) ago

Risk Level

MINIMAL

Read-only data retrieval, no side effects

Trust Score

C (57/100)

9/17 checks passed

Scores are informational only and provided “as is” without warranty. AgentHotspot assumes no liability for actions taken based on these ratings.

Quick Stats

Service Type: MCP

Pricing Model: Free

Capabilities: 28 Tools / 0 Prompts / 19 Resources

Owner: AI-Riksarkivet

Category: Uncategorized

Dependencies: Standalone
