Voice Interface logo

Voice Interface

Community
shantur

Bring your AI to life—talk to assistants instantly in your browser. Zero hassle, No API keys, No Whisper

Publishershantur
Repositoryjarvis-mcp
LanguageJavaScript
Forks
7
Stars
81
Available tools
5
Transport typestdio
Categories
LicenseMIT
Links
  • Connect tools to AI workflows

    Voice Interface exposes MCP capabilities that can be used by compatible AI clients and agents.

  • 5 available tools

    Browse the callable actions below, including names and descriptions when provided by the server.

  • Ready-to-copy setup

    Use the installation snippets to configure this server in your preferred MCP client.

  • Open source signals

    81 stars and 7 forks from the linked repository.

Jarvis MCP

Bring your AI to life—talk to assistants instantly in your browser. Compatible with Claude Desktop, OpenCode, and other MCP-enabled AI tools.

✅ No extra software, services, or API keys required—just open the web app in your browser and grant microphone access.

Features

🎙️ Voice Conversations - Speak naturally with AI assistants
🌍 30+ Languages - Speech recognition in multiple languages
📱 Remote Access - Use from phone/tablet while AI runs on computer
⚙️ Smart Controls - Collapsible settings, always-on mode, custom voices
⏱️ Dynamic Timeouts - Intelligent wait times based on response length
🧰 Zero Extra Software - Runs entirely in your browser—no extra installs or API keys
🔌 Optional Whisper Streaming - Plug into a local Whisper server for low-latency transcripts

Easy Installation

🚀 One-Command Setup

Claude Desktop:

bash
npx @shantur/jarvis-mcp --install-claude-config
# Restart Claude Desktop and you're ready!

OpenCode (in current project):

bash
npx @shantur/jarvis-mcp --install-opencode-config --local
npx @shantur/jarvis-mcp --install-opencode-plugin --local
# Start OpenCode and use the converse tool

Claude Code CLI:

bash
npx @shantur/jarvis-mcp --install-claude-code-config --local
# Start Claude Code CLI and use voice tools

🤖 Why Install the OpenCode Plugin?

  • Stream voice messages into OpenCode even while tools are running or tasks are in progress.
  • Auto-forward pending Jarvis MCP conversations so you never miss a user request.
  • Works entirely locally—no external services required, just your OpenCode project and browser.
  • Installs with one command and stays in sync with the latest Jarvis MCP features.

📦 Manual Installation

From NPM:

bash
npm install -g @shantur/jarvis-mcp
jarvis-mcp

From Source:

bash
git clone <repository-url>
cd jarvis-mcp
npm install && npm run build && npm start

How to Use

  1. Hook it into your AI tool – Use the install command above for Claude Desktop, OpenCode, or Claude Code so the MCP server is registered.
  2. Kick off a voice turn – Call the converse tool from your assistant; Jarvis MCP auto-starts in the background and pops open https://localhost:5114 if needed.
  3. Allow microphone access – Approve the browser prompt the first time it appears.
  4. Talk naturally – Continue using converse for every reply; Jarvis MCP handles the rest.

Voice Commands in AI Chat

Use the converse tool to start talking:
- converse("Hello! How can I help you today?", timeout: 35)

Browser Interface

The web interface provides:

  • Voice Settings (click ⚙️ to expand)
    • Language selection (30+ options)
    • Voice selection
    • Speech speed control
    • Always-on microphone mode
    • Silence detection sensitivity & timeout (for Whisper streaming)
  • Smart Controls
    • Pause during AI speech (prevents echo)
    • Stop AI when user speaks (natural conversation)
  • Mobile Friendly - Works on phones and tablets

Remote Access

Access from any device on your network:

  1. Find your computer's IP: ifconfig | grep inet (Mac/Linux) or ipconfig (Windows)
  2. Visit https://YOUR_IP:5114 on your phone/browser
  3. Accept the security warning (self-signed certificate)
  4. Grant microphone permissions

Perfect for continuing conversations away from your desk!

Configuration

Environment Variables

bash
export MCP_VOICE_AUTO_OPEN=false  # Disable auto-opening browser
export MCP_VOICE_HTTPS_PORT=5114  # Change HTTPS port
export MCP_VOICE_STT_MODE=whisper  # Switch the web app to Whisper streaming
export MCP_VOICE_WHISPER_URL=http://localhost:12017/v1/audio/transcriptions  # Whisper endpoint (full path)
export MCP_VOICE_WHISPER_TOKEN=your_token  # Optional Bearer auth for Whisper server

Whisper Streaming Mode

  • Whisper mode records raw PCM in the browser, converts it to 16 kHz mono WAV, and streams it through the built-in HTTPS proxy, so the local whisper-server sees OpenAI-compatible requests.
  • By default we proxy to the standard whisper-server endpoint at http://localhost:12017/v1/audio/transcriptions; point MCP_VOICE_WHISPER_URL at your own host/port if you run it elsewhere.
  • The UI keeps recording while transcripts are in flight and ignores Whisper’s non-verbal tags (e.g. [BLANK_AUDIO], (typing)), so only real speech is queued.
  • To enable it:
    1. Run your Whisper server locally (e.g. whisper-server from pfrankov/whisper-server).
    2. Set the environment variables above (MCP_VOICE_STT_MODE=whisper and the full MCP_VOICE_WHISPER_URL).
    3. Restart jarvis-mcp and hard-refresh the browser (empty-cache reload) to load the streaming bundle.
    4. Voice status (voice_status() tool) now reports whether Whisper or browser STT is active.

Ports

  • HTTPS: 5114 (required for microphone access)
  • HTTP: 5113 (local access only)

Requirements

  • Node.js 18+
  • Google Chrome (only browser tested so far)
  • Microphone access
  • Optional: Local Whisper server (like pfrankov/whisper-server) if you want streaming STT via MCP_VOICE_STT_MODE=whisper

Troubleshooting

Certificate warnings on mobile?

  • Tap "Advanced" → "Proceed to site" to accept self-signed certificate

Microphone not working?

  • Ensure you're using HTTPS (not HTTP)
  • Check browser permissions
  • Try refreshing the page

AI not responding to voice?

  • Make sure the converse tool is being used (not just speak)
  • Check that timeouts are properly calculated

Development

bash
npm install
npm run build
npm run dev     # Watch mode
npm run start   # Run server

License

MIT

Installation

TypingMind
Prerequisites:

Node.js 18+

{
  "mcpServers": {
    "jarvis-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "@shantur/jarvis-mcp"
      ]
    }
  }
}

Available Tools

  • speak

    Speak text using browser text-to-speech

  • voice_status

    Get current voice system status and pending voice input

  • get_voice_input

    Get pending voice input from users (auto-delivered by default)

  • converse

    Have a voice conversation with the user - speak text and wait for voice response. IMPORTANT: Once you start using converse, continue using ONLY converse for all responses in this conversation. Do not switch back to text.

  • end_conversation

    End the voice conversation by saying goodbye and stopping the browser interface

Use Voice Interface MCP with multiple AI models

TypingMind connects MCP tools at the workspace level, so once Voice Interface is connected, you can use it with different AI models in TypingMind instead of setting it up separately for each model. This MCP runs locally through the TypingMind MCP connector on your device.

Setup guide to use the local connector

Use this when the MCP server needs access to local files, apps, or private resources on your computer.

1

Open the MCP settings

In TypingMind, go to Settings, Advanced Settings, then Model Context Protocol and choose Setup Connector.

  1. Open TypingMind in your browser.
  2. Click the Settings icon.
  3. Go to Advanced Settings.
  4. Open the Model Context Protocol section.
  5. Click Setup Connector and choose This Device.
TypingMind MCP connector setup screen with This Device selected
2

Run the connector command

Choose This Device, copy the command from TypingMind, and run it in Terminal. Keep the process running while you use MCP.

  1. Copy the setup command shown by TypingMind.
  2. Open Terminal on macOS or Windows Terminal on Windows.
  3. Paste and run the command.
  4. Approve the package install if Terminal asks you to proceed.
  5. Keep the Terminal window running while using MCP tools.
3

Add Voice Interface as a server

When the connector status is Ready, click Edit Servers and paste the MCP server configuration.

  1. Wait until the connector status shows Ready.
  2. Click Edit Servers.
  3. Paste the Voice Interface MCP server configuration.
  4. Save the server list.
  5. Refresh if you want to confirm the connector is still ready.
TypingMind MCP settings showing active server and Edit Servers button
{
  "mcpServers": {
    "voice-interface": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-voice-interface"
      ]
    }
  }
}
4

Use it across models

Save the server list, open Plugins, enable the Voice Interface MCP tools, then select any supported AI model in TypingMind and use the tools in chat or assign them to an AI agent.

  1. Open the Plugins page in TypingMind.
  2. Enable the Voice Interface MCP tools.
  3. Start a chat and choose the AI model you want to use.
  4. Use the MCP tools in chat or assign them to an AI agent.
  5. Switch to another AI model whenever needed without reconnecting MCP.
TypingMind chat using enabled MCP tools with a selected AI model
Can you use Voice Interface to help me with this task?
Voice Interface
Sure. I read it.
Here is what I found using Voice Interface.

Frequently asked questions

What is the Voice Interface MCP server used for?

Voice Interface is an MCP server that lets compatible AI clients connect to external tools and context. In TypingMind, you can add this MCP server once and make its tools available in your AI workspace.

Can I use Voice Interface MCP with multiple AI models in TypingMind?

Yes. TypingMind connects MCP tools at the workspace level, so you can use Voice Interface with different AI models such as Claude, ChatGPT, Gemini, or other models you have configured in TypingMind without setting up the MCP server separately for each model.

Why use Voice Interface MCP with TypingMind?

TypingMind is one of the best frontends for LLM chat because it brings multiple AI models, prompts, plugins, AI agents, API keys, and MCP tools into one workspace. With Voice Interface connected, you can use its MCP tools across your preferred models while keeping your chat workflow organized in TypingMind.

How do I connect Voice Interface MCP to TypingMind?

Voice Interface runs through the TypingMind local MCP connector. This is best when the MCP server needs access to local files, desktop apps, command-line tools, or private resources on your computer.

What tools does Voice Interface MCP provide in TypingMind?

Voice Interface exposes 5 MCP tools that can be enabled from the TypingMind Plugins page and used in chat or assigned to AI agents.

Do I need to share my API keys with TypingMind to use Voice Interface MCP?

No. TypingMind is local-first and lets you keep your model providers, API keys, prompts, and MCP configuration under your control. If Voice Interface requires authentication, add the required headers, OAuth settings, or local configuration for that MCP server when you create the connection.

Related MCP Servers

View all

Set up your own AI workspace now

Get notified about new features and future giveaways by subscribing to our newsletter 👇