MinerU Document Explorer

Organization
opendatalab

Agent-native knowledge engine with MCP tools for document indexing, wiki organization, fast retrieval and deep reading across PDF/DOCX/PPTX/Markdown

Publisher: opendatalab
Repository: MinerU-Document-Explorer
Language: TypeScript
Forks: 60
Stars: 554
Available tools: 0
Transport type: stdio
License: MIT

🤔 Why MinerU Document Explorer?

MinerU Document Explorer equips your agent with three tool suites (Retrieve, Deep Read, and Ingest) that together close the full knowledge loop:

  • 🔍 Retrieve: cross-collection search with BM25, vector, and hybrid modes, plus LLM reranking and query expansion
  • 📖 Deep Read: navigate inside a single document without loading the whole file, using the table of contents, section reading, inline search, and element extraction
  • 📝 Ingest: build and maintain an LLM wiki from raw documents, following the Karpathy LLM Wiki pattern
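The text above does not say how the hybrid mode combines BM25 and vector results. As a rough illustration only, the sketch below uses reciprocal rank fusion (RRF), a common fusion technique that is swapped in here as an assumption and may differ from what the project actually does. The function name, the constant k = 60, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each ID scores sum(1 / (k + rank)) over the lists that contain it."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a.md", "b.md", "c.md"]    # hypothetical keyword ranking
vector_hits = ["b.md", "c.md", "a.md"]  # hypothetical embedding ranking
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
```

A document ranked well by both retrievers rises to the top, which is the intuition behind combining keyword and embedding search before a reranker sees the candidates.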

Developed by the MinerU team, building on QMD and Karpathy's LLM Wiki.

💡 What can you do with it?

  • Research assistant: point it at a folder of PDFs and ask your AI agent to survey the literature, compare methods, or find specific results across papers
  • Project knowledge base: index your codebase, docs, and design specs; let the agent answer questions about architecture, find relevant code, or trace requirements
  • Study companion: upload textbooks or lecture notes; use doc_toc and doc_read to navigate, doc_grep to search, and build a wiki of key concepts
  • Enterprise document search: index contracts, reports, or manuals and search them with natural language queries enhanced by LLM reranking

See it in action: The demo/ folder contains a complete end-to-end example in which an AI agent automatically reads ~10 arXiv papers on RAG, builds an interlinked wiki knowledge base, and writes a research survey. See the Demo Guide for step-by-step instructions.

🚀 Quick Start

Agent-Assisted Setup: If you're using an AI agent (Claude Code, Cursor, etc.), simply ask it to help you deploy MinerU Document Explorer and install skills. The agent can handle the entire setup process for you, including MCP configuration.

Follow the quickstart guide at https://github.com/opendatalab/MinerU-Document-Explorer/blob/main/docs/quickstart.md to install MinerU Document Explorer and walk through its configuration.

📖 Document Deep Reading

Navigate and search within a single document without reading the whole file:

sh
# View document structure
qmd doc-toc papers/attention-is-all-you-need.pdf

# Read specific sections by address
qmd doc-read papers/attention-is-all-you-need.pdf "line:45-120"

# Search within one document
qmd doc-grep papers/attention-is-all-you-need.pdf "self-attention"
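When scripting these commands, it can help to build the argument list in code rather than by string concatenation. The following is a minimal sketch that only reproduces the `line:<first>-<last>` address form shown above; the helper name is hypothetical, and no other address forms are assumed.

```python
import shlex


def doc_read_cmd(path: str, first_line: int, last_line: int) -> list[str]:
    """Build the argv for: qmd doc-read <path> "line:<first>-<last>"."""
    if last_line < first_line:
        raise ValueError("last_line must not be smaller than first_line")
    return ["qmd", "doc-read", path, f"line:{first_line}-{last_line}"]


argv = doc_read_cmd("papers/attention-is-all-you-need.pdf", 45, 120)
print(shlex.join(argv))
# To run it (requires qmd on PATH): subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.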

🔌 MCP Server: 15 Tools for AI Agents

Integrate with AI agents via Model Context Protocol.

MCP Server vs CLI: The MCP server runs as a persistent process, so the models (embeddings, reranker, query expansion) are loaded once and stay in memory across requests. CLI commands like qmd query must reload all models on every invocation, adding ~5–15 s of startup overhead each time. For agent workflows, always prefer the MCP server.
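To put that overhead in perspective, the short sketch below multiplies the quoted per-invocation reload cost by a number of CLI calls. Only the 5 to 15 second range comes from the text; the function name and the 20-call workload are hypothetical.

```python
def cli_reload_overhead(calls: int, per_call_s: tuple[int, int] = (5, 15)) -> tuple[int, int]:
    """Return the (low, high) total model-reload overhead in seconds for `calls` CLI invocations."""
    low, high = per_call_s
    return calls * low, calls * high


# Hypothetical workload: an agent issuing 20 separate `qmd query` calls.
low, high = cli_reload_overhead(20)
print(f"CLI reload overhead: {low}-{high} s; a persistent MCP server loads the models once")
```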

Two transport modes:

| Mode | Command | Best for |
| --- | --- | --- |
| stdio | qmd mcp | Claude Desktop, Claude Code: the client spawns and manages the process |
| HTTP daemon | qmd mcp --http --daemon | Cursor, Windsurf, VS Code, multi-client setups: one shared persistent server |

sh
# Start the HTTP daemon (recommended: models stay loaded across all requests)
qmd mcp --http --daemon             # default port 8181
qmd mcp --http --daemon --port 8080 # custom port

# Verify server is running
curl http://localhost:8181/health

# Stop the daemon
qmd mcp stop
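A script that starts the daemon may want to wait until the health endpoint responds before sending requests. The sketch below polls the URL from the curl example. It assumes that an HTTP 200 response means the server is ready; the response body is not described here, so it is ignored. The function names and retry parameters are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(url: str = "http://localhost:8181/health",
                       attempts: int = 10, delay_s: float = 1.0,
                       probe: Callable[[str], bool] = http_ok) -> bool:
    """Poll `url` up to `attempts` times, sleeping `delay_s` between failed tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False

# Example: call wait_until_healthy() after `qmd mcp --http --daemon` has been started.
```

The `probe` parameter exists so the polling loop can be exercised without a running server.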

Client Configuration

Option A: stdio (Cursor manages the process lifecycle):

json
{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}

Option B: HTTP (run qmd mcp --http --daemon first; models stay loaded, so responses are faster):

json
{
  "mcpServers": {
    "qmd": {
      "url": "http://localhost:8181/mcp"
    }
  }
}

For stdio transport, use "command": "qmd", "args": ["mcp"] in your client's MCP configuration.

For HTTP transport, start qmd mcp --http --daemon and point your client to http://localhost:8181/mcp.
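Many MCP clients keep all servers in one JSON file, so adding the qmd entry should not overwrite servers that are already configured. The sketch below merges either entry into an existing configuration dictionary. The helper name and the `other-server` entry are hypothetical, and where the file lives depends on your client.

```python
import copy
import json

STDIO_ENTRY = {"command": "qmd", "args": ["mcp"]}
HTTP_ENTRY = {"url": "http://localhost:8181/mcp"}


def with_qmd_server(config: dict, transport: str = "stdio") -> dict:
    """Return a copy of `config` with a "qmd" entry added under "mcpServers"."""
    if transport not in ("stdio", "http"):
        raise ValueError(f"unknown transport: {transport!r}")
    entry = STDIO_ENTRY if transport == "stdio" else HTTP_ENTRY
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})["qmd"] = copy.deepcopy(entry)
    return merged


existing = {"mcpServers": {"other-server": {"command": "other"}}}  # hypothetical existing config
print(json.dumps(with_qmd_server(existing, "http"), indent=2))
```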

See the MCP setup guide for all 15 tools and HTTP transport details.

Agent Skills

MinerU Document Explorer ships with a built-in Agent Skill that teaches AI agents how to use the full tool suite effectively, with decision trees, usage patterns, and best practices for all 15 MCP tools.

sh
# Install the skill (works with both npm and source installs)
qmd skill install              # local project (.agents/skills/)
qmd skill install --global     # global (~/.agents/skills/)

# Or from source repo
claude skill add ./skills/mineru-document-explorer/SKILL.md

📊 How It Compares

| | MinerU Doc Explorer | LlamaIndex | Obsidian | NotebookLM |
| --- | --- | --- | --- | --- |
| Runs 100% locally | ✅ | ⚠️ LLM APIs | ✅ | ❌ Cloud |
| Agent integration (MCP) | 15 tools | Plugin | ❌ | ❌ |
| Deep reading within docs | ✅ | ❌ | ❌ | ✅ |
| Wiki knowledge compilation | ✅ | ❌ | Manual | ❌ |
| Formats | MD, PDF, DOCX, PPTX | Many | MD | PDF, URL |
| Search pipeline | BM25 + vec + rerank | Configurable | Basic | Proprietary |
| Zero-config search | ✅ qmd search | ❌ | Plugin | N/A |
| Open source | MIT | MIT | Partial | ❌ |

⚙️ Requirements

| Requirement | Notes |
| --- | --- |
| Node.js >= 22 or Bun | Runtime |
| Python >= 3.10 | Document processing (pymupdf, python-docx, python-pptx) |
| macOS | brew install sqlite for extension support |

📄 Document Processing Setup

Python 3.10+ is required for document processing (PDF, DOCX, PPTX):

sh
# Check Python version
python3 --version  # needs >= 3.10

# Install required Python packages
pip install pymupdf python-docx python-pptx

# Verify
python3 -c "import pymupdf; import docx; import pptx; print('OK')"
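The one-line check above stops at the first failing import. As a slightly more informative alternative, the sketch below reports the interpreter version and lists every missing module without importing them. The module names are the ones used in the check above; the function names are hypothetical.

```python
import importlib.util
import sys

REQUIRED_MODULES = ["pymupdf", "docx", "pptx"]  # import names used in the check above


def missing_modules(names: list[str]) -> list[str]:
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(version: tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True when the interpreter is at least Python 3.10."""
    return tuple(version) >= (3, 10)


print("Python >= 3.10:", python_ok())
print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```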

Optional: to use MinerU Cloud for PDF parsing, install its SDK and set an API key:

sh
pip install mineru-open-sdk
export MINERU_API_KEY="your-key"  # get from https://mineru.net

When MINERU_API_KEY is set, MinerU Cloud is automatically used as the primary PDF provider with PyMuPDF as fallback.
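The selection rule described above can be pictured as a small function. This is an illustrative sketch of the stated behavior, not the project's actual implementation: the function name is hypothetical, and the provider identifiers are borrowed from the configuration example in this section.

```python
import os
from collections.abc import Mapping


def pdf_provider_order(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return PDF providers in the order they would be tried."""
    if env.get("MINERU_API_KEY"):
        return ["mineru_cloud", "pymupdf"]  # cloud first, local PyMuPDF as fallback
    return ["pymupdf"]  # no key set: local parsing only


print(pdf_provider_order({"MINERU_API_KEY": "your-key"}))
print(pdf_provider_order({}))
```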

For advanced configuration (custom providers, local VLM models, GPT PageIndex), create ~/.config/qmd/doc-reading.json:

json
{
  "docReading": {
    "providers": {
      "fullText": { "pdf": ["mineru_cloud", "pymupdf"] }
    },
    "credentials": {
      "mineru": { "api_key": "your-api-key" }
    }
  }
}
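If you prefer not to paste the key into the file by hand, the configuration can be generated from a script. The sketch below writes the same structure as the JSON above to a path of your choice and restricts the file permissions because it holds a credential. Both function names are hypothetical, and reading the key from the MINERU_API_KEY environment variable is a suggestion rather than something the tool requires.

```python
import json
from pathlib import Path


def doc_reading_config(api_key: str) -> dict:
    """Build the same structure as the doc-reading.json example."""
    return {
        "docReading": {
            "providers": {"fullText": {"pdf": ["mineru_cloud", "pymupdf"]}},
            "credentials": {"mineru": {"api_key": api_key}},
        }
    }


def write_config(path: Path, api_key: str) -> None:
    """Write the config to `path`, creating parent directories as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(doc_reading_config(api_key), indent=2) + "\n", encoding="utf-8")
    path.chmod(0o600)  # the file contains a credential

# Example:
# write_config(Path.home() / ".config/qmd/doc-reading.json", os.environ["MINERU_API_KEY"])
```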

🤖 LLM Models (auto-downloaded on first use)

| Model | Purpose | Size |
| --- | --- | --- |
| embeddinggemma-300M | Vector embeddings | ~300 MB |
| qwen3-reranker-0.6b | Re-ranking | ~640 MB |
| qmd-query-expansion-1.7B | Query expansion | ~1.1 GB |

Models are only needed for qmd embed, qmd vsearch, and qmd query. qmd search runs BM25 retrieval only, so it works without downloading any model.
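For planning disk space and first-run download time, the approximate sizes listed above can be added up. The sketch treats 1.1 GB as 1100 MB; all figures are the rounded values from the table, so the total is an estimate.

```python
# Approximate sizes in MB, taken from the model table (1.1 GB treated as 1100 MB).
MODEL_SIZES_MB = {
    "embeddinggemma-300M": 300,
    "qwen3-reranker-0.6b": 640,
    "qmd-query-expansion-1.7B": 1100,
}

total_mb = sum(MODEL_SIZES_MB.values())
print(f"Approximate first-use download: {total_mb} MB (~{total_mb / 1000:.1f} GB)")
```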

📚 Documentation

🎯 Demo Guide: End-to-end example: agent-driven RAG research survey
📖 CLI Reference: All commands, options, output formats
🔌 MCP Server: Setup, 15 tools, HTTP transport
📦 SDK / Library: TypeScript API, types, examples
🏗️ Architecture: Search pipeline, scoring, data schema, chunking
🤝 Contributing: Development setup, code style, how to contribute

❀️ Acknowledgments

MinerU Document Explorer builds upon two foundational projects: QMD and Karpathy's LLM Wiki.


📝 Changelog

v1: 2026-04-07 (Current)

Rebuilt from an OpenClaw agent skill into a full agent-native knowledge engine: npm package (npm install -g mineru-document-explorer), qmd CLI, MCP server with 15 tools across three groups (Retrieval / Deep Reading / Knowledge Ingestion), multi-format support (MD, PDF, DOCX, PPTX), hybrid search (BM25 + vector + LLM reranking), and LLM Wiki knowledge base pattern.

v0: 2026-03-30 (Previous)

OpenClaw-native agent skill (doc-search CLI). Four capabilities: Logic Retrieval, Semantic Retrieval, Keyword Retrieval, Evidence Extraction. See the v0 repository.

Installation

TypingMind
Prerequisites:

Node.js 18+

{
  "mcpServers": {
    "opendatalab-mineru-document-explorer": {
      "command": "npx",
      "args": [
        "-y",
        "mineru-document-explorer"
      ]
    }
  }
}

Use MinerU Document Explorer MCP with multiple AI models

TypingMind connects MCP tools at the workspace level, so once MinerU Document Explorer is connected, you can use it with different AI models in TypingMind instead of setting it up separately for each model. This MCP runs locally through the TypingMind MCP connector on your device.

Setup guide to use the local connector

Use this when the MCP server needs access to local files, apps, or private resources on your computer.

Step 1: Open the MCP settings

In TypingMind, go to Settings, Advanced Settings, then Model Context Protocol and choose Setup Connector.

  1. Open TypingMind in your browser.
  2. Click the Settings icon.
  3. Go to Advanced Settings.
  4. Open the Model Context Protocol section.
  5. Click Setup Connector and choose This Device.
Step 2: Run the connector command

Choose This Device, copy the command from TypingMind, and run it in Terminal. Keep the process running while you use MCP.

  1. Copy the setup command shown by TypingMind.
  2. Open Terminal on macOS or Windows Terminal on Windows.
  3. Paste and run the command.
  4. Approve the package install if Terminal asks you to proceed.
  5. Keep the Terminal window running while using MCP tools.
Step 3: Add MinerU Document Explorer as a server

When the connector status is Ready, click Edit Servers and paste the MCP server configuration.

  1. Wait until the connector status shows Ready.
  2. Click Edit Servers.
  3. Paste the MinerU Document Explorer MCP server configuration.
  4. Save the server list.
  5. Refresh if you want to confirm the connector is still ready.
{
  "mcpServers": {
    "opendatalab-mineru-document-explorer": {
      "command": "npx",
      "args": [
        "-y",
        "mineru-document-explorer"
      ]
    }
  }
}
Step 4: Use it across models

Save the server list, open Plugins, enable the MinerU Document Explorer MCP tools, then select any supported AI model in TypingMind and use the tools in chat or assign them to an AI agent.

  1. Open the Plugins page in TypingMind.
  2. Enable the MinerU Document Explorer MCP tools.
  3. Start a chat and choose the AI model you want to use.
  4. Use the MCP tools in chat or assign them to an AI agent.
  5. Switch to another AI model whenever needed without reconnecting MCP.

Frequently asked questions

What is the MinerU Document Explorer MCP server used for?

MinerU Document Explorer is an MCP server that lets compatible AI clients connect to external tools and context. In TypingMind, you can add this MCP server once and make its tools available in your AI workspace.

Can I use MinerU Document Explorer MCP with multiple AI models in TypingMind?

Yes. TypingMind connects MCP tools at the workspace level, so you can use MinerU Document Explorer with different AI models such as Claude, ChatGPT, Gemini, or other models you have configured in TypingMind without setting up the MCP server separately for each model.

Why use MinerU Document Explorer MCP with TypingMind?

TypingMind is one of the best frontends for LLM chat because it brings multiple AI models, prompts, plugins, AI agents, API keys, and MCP tools into one workspace. With MinerU Document Explorer connected, you can use its MCP tools across your preferred models while keeping your chat workflow organized in TypingMind.

How do I connect MinerU Document Explorer MCP to TypingMind?

MinerU Document Explorer runs through the TypingMind local MCP connector. This is best when the MCP server needs access to local files, desktop apps, command-line tools, or private resources on your computer.

What tools does MinerU Document Explorer MCP provide in TypingMind?

MinerU Document Explorer exposes MCP capabilities that can be enabled from the TypingMind Plugins page and used in chat or assigned to AI agents.

Do I need to share my API keys with TypingMind to use MinerU Document Explorer MCP?

No. TypingMind is local-first and lets you keep your model providers, API keys, prompts, and MCP configuration under your control. If MinerU Document Explorer requires authentication, add the required headers, OAuth settings, or local configuration for that MCP server when you create the connection.
