Minima logo

Minima

CommunityPopular
dmayboroda

On-premises conversational RAG with configurable containers

Publisherdmayboroda
Repositoryminima
LanguagePython
Forks
103
Stars
1.1K
Available tools
0
Transport typestdio, streamable-http
Categories
LicenseMPL-2.0
Links
  • Connect tools to AI workflows

    Minima exposes MCP capabilities that can be used by compatible AI clients and agents.

  • 0 available tools

    Browse the callable actions below, including names and descriptions when provided by the server.

  • Ready-to-copy setup

    Use the installation snippets to configure this server in your preferred MCP client.

  • Open source signals

    1.1K stars and 103 forks from the linked repository.

Minima is an open source RAG on-premises containers, with ability to integrate with ChatGPT and MCP. Minima can also be used as a fully local RAG or with your own deployed LLM.

Minima currently supports four modes:

  1. Isolated installation (Ollama) – Operate fully on-premises with containers, free from external dependencies such as ChatGPT or Claude. All neural networks (LLM, reranker, embedding) run on your cloud or PC, ensuring your data remains secure.

  2. Custom LLM (OpenAI-compatible API) – Use your own deployed LLM with OpenAI-compatible API (vLLM, Ollama server, TGI, etc.). The indexer runs locally while the LLM can be on your server, cloud, or local machine. No Ollama deployment needed, lighter resource usage, and full control over your LLM infrastructure.

  3. Custom GPT – Query your local documents using ChatGPT app or web with custom GPTs. The indexer runs on your cloud or local PC, while the primary LLM remains ChatGPT.

  4. Anthropic Claude – Use Anthropic Claude app to query your local documents. The indexer operates on your local PC, while Anthropic Claude serves as the primary LLM.


Running as Containers

Quick Start with run.sh

The easiest way to start Minima is using the run.sh script:

bash
./run.sh

You'll see the following options:

Select an option:
1) Fully Local Setup (Ollama)
2) Custom LLM (OpenAI-compatible API)
3) ChatGPT Integration
4) MCP usage
5) Quit

Manual Docker Compose Commands

  1. Create a .env file in the project's root directory (where you'll find .env.sample). Place .env in the same folder and copy all environment variables from .env.sample to .env.

  2. Ensure your .env file includes the following variables:

  1. For fully local installation use: docker compose -f docker-compose-ollama.yml --env-file .env up --build.

  2. For custom LLM deployment (OpenAI-compatible API) use: docker compose -f docker-compose-custom-llm.yml --env-file .env up --build.

  3. For ChatGPT enabled installation use: docker compose -f docker-compose-chatgpt.yml --env-file .env up --build.

  4. For MCP integration (Anthropic Desktop app usage): docker compose -f docker-compose-mcp.yml --env-file .env up --build.

  5. In case of ChatGPT enabled installation copy OTP from terminal where you launched docker and use Minima GPT

  6. If you use Anthropic Claude, just add folliwing to /Library/Application\ Support/Claude/claude_desktop_config.json

{
    "mcpServers": {
      "minima": {
        "command": "uv",
        "args": [
          "--directory",
          "/path_to_cloned_minima_project/mcp-server",
          "run",
          "minima"
        ]
      }
    }
  }
  1. To use fully local installation go to cd electron, then run npm install and npm start which will launch Minima electron app.

  2. Ask anything, and you'll get answers based on local files in {LOCAL_FILES_PATH} folder.


Variables Explained

LOCAL_FILES_PATH: Specify the root folder for indexing (on your cloud or local pc). Indexing is a recursive process, meaning all documents within subfolders of this root folder will also be indexed. Supported file types: .pdf, .xls, .docx, .txt, .md, .csv.

EMBEDDING_MODEL_ID: Specify the embedding model to use. Currently, only Sentence Transformer models are supported. Testing has been done with sentence-transformers/all-mpnet-base-v2, but other Sentence Transformer models can be used.

EMBEDDING_SIZE: Define the embedding dimension provided by the model, which is needed to configure Qdrant vector storage. Ensure this value matches the actual embedding size of the specified EMBEDDING_MODEL_ID.

OLLAMA_MODEL: Set up the Ollama model, use an ID available on the Ollama site. Please, use LLM model here, not an embedding. This is only required when using Ollama (not needed when using custom LLM).

LLM_BASE_URL: (Optional) Base URL for your custom OpenAI-compatible LLM API endpoint. When this is set, Ollama will not be used and you don't need to deploy it.

LLM_MODEL: (Optional) Model name for your custom LLM. Required when LLM_BASE_URL is set.

LLM_API_KEY: (Optional) API key for your custom LLM. If your LLM doesn't require authentication, you can omit this or set it to any value.

RERANKER_MODEL: Specify the reranker model for Ollama mode. Currently, we have tested with BAAI rerankers. You can explore all available rerankers using this link. Note: This is NOT required for Custom LLM mode - the reranker model will not be downloaded if you're using LLM_BASE_URL.

USER_ID: Just use your email here, this is needed to authenticate custom GPT to search in your data.

PASSWORD: Put any password here, this is used to create a firebase account for the email specified above.


Examples

Example of .env file for on-premises/local usage with Ollama:

LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/
EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2
EMBEDDING_SIZE=768
OLLAMA_MODEL=qwen2:0.5b # must be LLM model id from Ollama models page
RERANKER_MODEL=BAAI/bge-reranker-base # please, choose any BAAI reranker model

Example of .env file for custom LLM deployment (OpenAI-compatible API):

LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/
EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2
EMBEDDING_SIZE=768
LLM_BASE_URL=http://your-llm-address:port/v1 # Your custom LLM endpoint
LLM_MODEL=Qwen/Qwen-1.7B # Your model name
LLM_API_KEY=not-needed # Optional: API key if required

# NOTE: OLLAMA_MODEL and RERANKER_MODEL are NOT needed for custom LLM mode
# The Docker build will skip reranker download automatically

Important: When using custom LLM mode, you do NOT need to set OLLAMA_MODEL or RERANKER_MODEL variables. The custom LLM workflow uses direct retrieval without reranking for better performance. The Dockerfile will automatically skip downloading the reranker model during build.

To use a chat ui, please navigate to http://localhost:3000

The custom LLM mode uses a different workflow compared to Ollama:

Ollama Workflow:

  1. User query → Query enhancement (LLM call)
  2. Document retrieval with reranking (HuggingFace CrossEncoder)
  3. Answer generation (LLM call)

Custom LLM Workflow:

  1. User query → LLM decides if document search is needed (function calling)
  2. If needed: Direct vector search (no reranking)
  3. LLM generates answer with or without retrieved context

Compatible LLM Servers:

  • vLLM - High-performance inference server (http://your-server:8000/v1)
  • Text Generation Inference (TGI) - Hugging Face's inference server
  • Ollama Server - Ollama running in API mode
  • LiteLLM - Proxy for multiple LLM providers
  • LocalAI - OpenAI-compatible local inference
  • OpenAI API - Directly use OpenAI's API
  • Any OpenAI-compatible endpoint

This will automatically use docker-compose-custom-llm.yml which deploys only the necessary services (no Ollama container).

Example of .env file for Claude app:

LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/
EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2
EMBEDDING_SIZE=768

For the Claude app, please apply the changes to the claude_desktop_config.json file as outlined above.

To use MCP with GitHub Copilot:

  1. Create a .env file in the project’s root directory (where you’ll find env.sample). Place .env in the same folder and copy all environment variables from env.sample to .env.

  2. Ensure your .env file includes the following variables:

    • LOCAL_FILES_PATH
    • EMBEDDING_MODEL_ID
    • EMBEDDING_SIZE
  3. Create or update the .vscode/mcp.json with the following configuration:

json
{
  "servers": {
    "minima": {
      "type": "stdio",
      "command": "path_to_cloned_minima_project/run_in_copilot.sh",
      "args": [
        "path_to_cloned_minima_project"
      ]
    }
  }
}

Example of .env file for ChatGPT custom GPT usage:

LOCAL_FILES_PATH=/Users/davidmayboroda/Downloads/PDFs/
EMBEDDING_MODEL_ID=sentence-transformers/all-mpnet-base-v2
EMBEDDING_SIZE=768
USER_ID=user@gmail.com # your real email
PASSWORD=password # you can create here password that you want

Also, you can run minima using run.sh.


For MCP usage, please be sure that your local machines python is >=3.10 and 'uv' installed.

Minima (https://github.com/dmayboroda/minima) is licensed under the Mozilla Public License v2.0 (MPLv2).

Installation

TypingMind
Prerequisites:

Node.js 18+

{
  "mcpServers": {
    "dmayboroda-minima": {
      "command": "npx",
      "args": [
        "-y",
        "minima"
      ]
    }
  }
}

Use Minima MCP with multiple AI models

TypingMind connects MCP tools at the workspace level, so once Minima is connected, you can use it with different AI models in TypingMind instead of setting it up separately for each model. You can run MCP locally on your device or connect to a remote MCP server URL.

Option 1: Use the local connector

Use this when the MCP server needs access to local files, apps, or private resources on your computer.

1

Open the MCP settings

In TypingMind, go to Settings, Advanced Settings, then Model Context Protocol and choose Setup Connector.

  1. Open TypingMind in your browser.
  2. Click the Settings icon.
  3. Go to Advanced Settings.
  4. Open the Model Context Protocol section.
  5. Click Setup Connector and choose This Device.
TypingMind MCP connector setup screen with This Device selected
2

Run the connector command

Choose This Device, copy the command from TypingMind, and run it in Terminal. Keep the process running while you use MCP.

  1. Copy the setup command shown by TypingMind.
  2. Open Terminal on macOS or Windows Terminal on Windows.
  3. Paste and run the command.
  4. Approve the package install if Terminal asks you to proceed.
  5. Keep the Terminal window running while using MCP tools.
3

Add Minima as a server

When the connector status is Ready, click Edit Servers and paste the MCP server configuration.

  1. Wait until the connector status shows Ready.
  2. Click Edit Servers.
  3. Paste the Minima MCP server configuration.
  4. Save the server list.
  5. Refresh if you want to confirm the connector is still ready.
TypingMind MCP settings showing active server and Edit Servers button
{
  "mcpServers": {
    "dmayboroda-minima": {
      "command": "npx",
      "args": [
        "-y",
        "minima"
      ]
    }
  }
}
4

Use it across models

Save the server list, open Plugins, enable the Minima MCP tools, then select any supported AI model in TypingMind and use the tools in chat or assign them to an AI agent.

  1. Open the Plugins page in TypingMind.
  2. Enable the Minima MCP tools.
  3. Start a chat and choose the AI model you want to use.
  4. Use the MCP tools in chat or assign them to an AI agent.
  5. Switch to another AI model whenever needed without reconnecting MCP.
TypingMind chat using enabled MCP tools with a selected AI model
Can you use Minima to help me with this task?
Minima
Sure. I read it.
Here is what I found using Minima.

Option 2: Add an MCP server URL

Use this when Minima is already hosted remotely or your team wants one shared connector that multiple users can access.

1

Open MCP connectors

In TypingMind, go to Plugins, open MCP connectors, then choose Add URL.

  1. Open TypingMind in your browser.
  2. Go to Plugins.
  3. Open MCP connectors.
  4. Click Add URL.
TypingMind Add Custom MCP Server URL form
2

Paste the server URL

Enter your server URL in the Server URL field. Add a connection name, description, icon, custom HTTP headers, or OAuth client settings if the server requires them.

  1. Paste your server URL into the Server URL field.
  2. Enter a connection name for Minima.
  3. Add a description and icon if you want it to be easier to identify.
  4. Add custom HTTP headers or OAuth client details if the server requires authentication.
3

Create the connection

Click Create connection, then return to the Plugins list and confirm the new MCP connection is active.

  1. Click Create connection.
  2. Return to the MCP connectors list.
  3. Confirm the Minima connection appears as active.
  4. Refresh the plugin list if the connection does not appear immediately.
4

Switch models without reconnecting

Start a chat with your preferred model, enable the Minima tools from Plugins, and switch to another model whenever needed. The MCP connection stays available to the TypingMind workspace.

  1. Start a new chat in TypingMind.
  2. Select the AI model you want to use.
  3. Enable the Minima tools from Plugins.
  4. Ask the model to use the tool when needed.
  5. Switch to another AI model and reuse the same MCP connection.
TypingMind chat using enabled MCP tools with a selected AI model
Can you use Minima to help me with this task?
Minima
Sure. I read it.
Here is what I found using Minima.

Frequently asked questions

What is the Minima MCP server used for?

Minima is an MCP server that lets compatible AI clients connect to external tools and context. In TypingMind, you can add this MCP server once and make its tools available in your AI workspace.

Can I use Minima MCP with multiple AI models in TypingMind?

Yes. TypingMind connects MCP tools at the workspace level, so you can use Minima with different AI models such as Claude, ChatGPT, Gemini, or other models you have configured in TypingMind without setting up the MCP server separately for each model.

Why use Minima MCP with TypingMind?

TypingMind is one of the best frontends for LLM chat because it brings multiple AI models, prompts, plugins, AI agents, API keys, and MCP tools into one workspace. With Minima connected, you can use its MCP tools across your preferred models while keeping your chat workflow organized in TypingMind.

How do I connect Minima MCP to TypingMind?

Minima can be connected in TypingMind with the local MCP connector or by adding a remote MCP server URL. Use the local connector when the server needs access to files, apps, or private resources on your device, and use a server URL when the MCP server is hosted remotely.

What tools does Minima MCP provide in TypingMind?

Minima exposes MCP capabilities that can be enabled from the TypingMind Plugins page and used in chat or assigned to AI agents.

Do I need to share my API keys with TypingMind to use Minima MCP?

No. TypingMind is local-first and lets you keep your model providers, API keys, prompts, and MCP configuration under your control. If Minima requires authentication, add the required headers, OAuth settings, or local configuration for that MCP server when you create the connection.

Related MCP Servers

View all

Set up your own AI workspace now

Get notified about new features and future giveaways by subscribing to our newsletter 👇