Selenium WebDriver

Community
pshivapr

Selenium Tools for MCP

Publisher: pshivapr
Repository: selenium-mcp
Language: TypeScript
Forks: 5
Stars: 6
Available tools: 56
Transport type: stdio
License: MIT
  • Connect tools to AI workflows

    Selenium WebDriver exposes MCP capabilities that can be used by compatible AI clients and agents.

  • 56 available tools

    Browse the callable actions below, including names and descriptions when provided by the server.

  • Ready-to-copy setup

    Use the installation snippets to configure this server in your preferred MCP client.

  • Open source signals

    6 stars and 5 forks from the linked repository.


Selenium MCP Server

This is a server implementation that bridges the gap between MCP clients (AI assistants) and Selenium WebDriver. It exposes Selenium WebDriver's functionalities as MCP tools, allowing AI models to utilize them for tasks like:

  • Browser management (launching, navigating, closing browsers)
  • Element interaction (clicking, typing, finding elements)
  • Web scraping and automated testing
  • Advanced operations like screenshots, cookie management, and JavaScript execution

In essence, this setup lets AI assistants use Selenium WebDriver for web automation by communicating with a dedicated Selenium MCP server over the Model Context Protocol. This enables AI-controlled web interactions, testing, and data extraction.
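As a rough illustration of what communicating over the Model Context Protocol looks like on the wire, the sketch below builds the JSON-RPC 2.0 `tools/call` request a client would send to invoke `browser_navigate`. The helper name `buildToolCall`, the request id, and the example URL are illustrative assumptions; in practice an MCP client library constructs these messages for you.

```typescript
// Minimal sketch of an MCP "tools/call" request (JSON-RPC 2.0).
// `buildToolCall` is a hypothetical helper, not part of this server or any SDK.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Ask the server to navigate; `url` is the parameter documented for browser_navigate.
const navigateRequest = buildToolCall(1, "browser_navigate", {
  url: "https://example.com/",
});

// Over the stdio transport, each message travels as a single line of JSON.
const wireMessage = JSON.stringify(navigateRequest);
console.log(wireMessage);
```

The same helper works for any tool in the reference tables; only the tool name and the `arguments` object change.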

🚀 Overview

A Model Context Protocol (MCP) server for Selenium that provides comprehensive Selenium WebDriver automation tools for AI assistants and applications. This server enables automated web browser interactions, testing, and scraping through a standardized interface.

Built with TypeScript and modern ES modules, it offers type-safe browser automation capabilities through the Model Context Protocol.

✨ Key Features

  • Multi-Browser Support: Chrome, Firefox, Safari, and Edge browser automation
  • Comprehensive Element Interaction: Click, type, hover, drag & drop, file uploads
  • Advanced Navigation: Forward, backward, refresh, window management
  • Wait Strategies: Intelligent waiting for elements and page states
  • Type Safety: Full TypeScript implementation with Zod validation

🤝 Integration

MCP Client Integration

Configure your MCP client to connect to the Selenium server:

Standard Configuration (applicable to Windsurf, Warp, Gemini CLI, etc.)

json
{
  "servers": {
    "selenium-mcp": {
      "command": "npx",
      "args": ["-y", "selenium-webdriver-mcp@latest"]
    }
  }
}

Installation in VS Code

Update your mcp.json in VS Code with the configuration below.

NOTE: If you're new to MCP servers, see the VS Code guide "Use MCP servers in VS Code".

Example 'stdio' type connection

json
{
  "servers": {
    "selenium-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "selenium-webdriver-mcp@latest"
      ],
      "type": "stdio"
    }
  },
  "inputs": []
}

Example 'http' type connection

json
{
  "servers": {
    "Selenium": {
      "url": "https://smithery.ai/server/@pshivapr/selenium-mcp",
      "type": "http"
    }
  },
  "inputs": []
}

After installation, the Selenium MCP server will be available for use with your GitHub Copilot agent in VS Code.
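The two connection types differ in who starts the server: with `stdio` the client launches a local process, while with `http` it connects to an endpoint that is already running. A minimal sketch of that distinction as a TypeScript discriminated union follows; the type and function names are hypothetical and only mirror the fields shown in the examples above.

```typescript
// Sketch: model the two mcp.json connection types as a discriminated union.
// McpServerEntry and describeEntry are illustrative names, not a VS Code API.
type McpServerEntry =
  | { type: "stdio"; command: string; args: string[] }
  | { type: "http"; url: string };

function describeEntry(serverName: string, entry: McpServerEntry): string {
  switch (entry.type) {
    case "stdio":
      // The client spawns a local process and talks to it over stdin/stdout.
      return `${serverName}: run "${[entry.command, ...entry.args].join(" ")}" locally`;
    case "http":
      // The client connects to an endpoint that is already running.
      return `${serverName}: connect to ${entry.url}`;
  }
}

const stdioSummary = describeEntry("selenium-mcp", {
  type: "stdio",
  command: "npx",
  args: ["-y", "selenium-webdriver-mcp@latest"],
});
console.log(stdioSummary);
```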

To install the Selenium MCP server using the VS Code CLI:

bash
# For VS Code
code --add-mcp '{"name":"selenium-mcp","command":"npx","args":["selenium-webdriver-mcp@latest"]}'

bash
# For VS Code Insiders
code-insiders --add-mcp '{"name":"selenium-mcp","command":"npx","args":["selenium-webdriver-mcp@latest"]}'
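Because the `--add-mcp` value is JSON, shell quoting matters: inside single quotes a POSIX shell passes backslashes through literally, so the JSON itself should contain plain double quotes. The sketch below generates such a command string. `posixQuote` and `addMcpCommand` are hypothetical helper names, and it assumes the Insiders CLI is available on your PATH as `code-insiders`.

```typescript
// Sketch: build the `--add-mcp` argument as JSON and quote it for a POSIX shell.
// Both helpers are illustrative; they are not part of VS Code or this server.
function posixQuote(value: string): string {
  // Inside single quotes nothing is special except the single quote itself,
  // which is written as: close quote, escaped quote, reopen quote.
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

function addMcpCommand(cli: "code" | "code-insiders"): string {
  const server = {
    name: "selenium-mcp",
    command: "npx",
    args: ["selenium-webdriver-mcp@latest"],
  };
  return `${cli} --add-mcp ${posixQuote(JSON.stringify(server))}`;
}

console.log(addMcpCommand("code"));
// prints: code --add-mcp '{"name":"selenium-mcp","command":"npx","args":["selenium-webdriver-mcp@latest"]}'
```

Other shells, such as PowerShell or cmd.exe, have different quoting rules, so the generated string is only meant for bash-like shells.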

To install the package using either npm or Smithery:

Using npm:

bash
npm install -g selenium-webdriver-mcp@latest

Using Smithery:

To install Selenium MCP for Claude Desktop automatically via Smithery:

bash
npx @smithery/cli install @pshivapr/selenium-mcp --client claude

Claude Desktop Integration

Add to your Claude Desktop configuration:

json
{
  "mcpServers": {
    "selenium-mcp": {
      "command": "npx",
      "args": ["-y", "selenium-webdriver-mcp@latest"]
    }
  }
}
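The configuration snippets in this section differ mainly in their top-level key: `servers` in the VS Code examples and `mcpServers` in the Claude Desktop one. The sketch below shows one way to add the same launch entry under either key without overwriting servers that are already configured. All type and function names are hypothetical, and it assumes the client's configuration is plain JSON of the shape shown above.

```typescript
// Sketch: merge the Selenium MCP entry into an existing client config object.
// StdioServer, ClientConfig and addSeleniumServer are illustrative names.
interface StdioServer {
  command: string;
  args: string[];
}

type ServerMap = Record<string, StdioServer>;

interface ClientConfig {
  servers?: ServerMap;
  mcpServers?: ServerMap;
  [other: string]: unknown;
}

// The launch command used throughout this README.
const seleniumEntry: StdioServer = {
  command: "npx",
  args: ["-y", "selenium-webdriver-mcp@latest"],
};

function addSeleniumServer(
  existing: ClientConfig,
  key: "servers" | "mcpServers",
): ClientConfig {
  // Copy first so the caller's object is not mutated.
  const updated: ClientConfig = { ...existing };
  // Keep previously configured servers and add (or replace) ours.
  updated[key] = { ...(existing[key] ?? {}), "selenium-mcp": seleniumEntry };
  return updated;
}

const claudeConfig = addSeleniumServer(
  { mcpServers: { other: { command: "node", args: ["other.js"] } } },
  "mcpServers",
);
console.log(JSON.stringify(claudeConfig, null, 2));
```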


Prompts

An example prompt to start AI Agent interaction:

Using selenium mcp tools, navigate to <https://parabank.parasoft.com/>, click the 'Register' link, sign up using dynamic test data, and click Register. Then generate Selenium tests in <YOUR_FAVOURITE_PROGRAMMING_LANGUAGE> using the page object model (POM), create tests using Cucumber features and steps, and execute the tests.

Note: For more prompts, see the examples directory of the project.
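To make the link between a prompt and the tools concrete, here is a sketch of the first few tool calls an agent might plan for the example prompt. Only the tool and parameter names come from the tool reference; the `"chrome"` browser value and the XPath locator are guesses that have not been checked against the server schema or the site.

```typescript
// Sketch of the opening tool calls an agent might make for the example prompt.
// Assumptions: the "chrome" value and the XPath are illustrative guesses.
interface ToolStep {
  tool: string;
  args: Record<string, unknown>;
}

const registrationFlow: ToolStep[] = [
  { tool: "browser_open", args: { browser: "chrome" } },
  { tool: "browser_navigate", args: { url: "https://parabank.parasoft.com/" } },
  { tool: "browser_click", args: { by: "xpath", value: "//a[text()='Register']" } },
  { tool: "browser_close", args: {} },
];

// Render the plan as a numbered list for logging or review.
function formatPlan(steps: ToolStep[]): string {
  return steps
    .map((step, i) => `${i + 1}. ${step.tool} ${JSON.stringify(step.args)}`)
    .join("\n");
}

console.log(formatPlan(registrationFlow));
```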

🛠️ MCP Available Tools

Browser Management Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_open | Open a new browser session | browser, options |
| browser_navigate | Navigate to a URL | url |
| browser_navigate_back | Navigate back in history | None |
| browser_navigate_forward | Navigate forward in history | None |
| browser_title | Get the current page title | None |
| browser_refresh | Refresh the current page | None |
| browser_get_url | Get the current page URL | None |
| browser_get_page_source | Get the current page HTML source | None |
| browser_maximize | Maximize the browser window | None |
| browser_resize | Resize browser window | width, height |
| browser_close | Close current browser session | None |

Cookie Management Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_get_cookies | Get all cookies from the current browser session | None |
| browser_get_cookie_by_name | Get a specific cookie by name | cookie (cookie name) |
| browser_add_cookie_by_name | Add a new cookie to the browser | cookie (cookie name), value |
| browser_set_cookie_object | Set a cookie object in the browser | cookie (cookie object as string) |
| browser_delete_cookie | Delete a specific cookie by name | value (cookie name to delete) |
| browser_delete_cookies | Delete all cookies from the current browser session | None |

Window Management Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_switch_to_window | Switch to a different browser window by handle | windowHandle |
| browser_switch_to_original_window | Switch back to the original browser window | None |
| browser_switch_to_window_by_title | Switch to a window by its page title | title |
| browser_switch_window_by_index | Switch to a window by its index position | index |
| browser_switch_to_window_by_url | Switch to a window by its URL | url |

Element Interaction Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_find_element | Find an element on the page | by, value, timeout |
| browser_find_elements | Find multiple elements on the page | by, value, timeout |
| browser_click | Click on an element | by, value, timeout |
| browser_type | Type text into an element | by, value, text, timeout |
| browser_get_element_text | Get text content of element | by, value, timeout |
| browser_file_upload | Upload file via input element | by, value, filePath, timeout |
| browser_clear | Clear text from an element | by, value, timeout |
| browser_get_attribute | Get element attribute value | by, value, attribute, timeout |

Element State Validation Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_element_is_displayed | Check if an element is visible on the page | by, value, timeout |
| browser_element_is_enabled | Check if an element is enabled for interaction | by, value, timeout |
| browser_element_is_selected | Check if an element is selected (checkboxes, radio buttons) | by, value, timeout |

Frame Management Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_switch_to_frame | Switch to an iframe element | by, value, timeout |
| browser_switch_to_parent_frame | Switch to the parent frame (from nested iframe) | None |
| browser_switch_to_default_content | Switch back to the main page content | None |

Advanced Action Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_hover | Hover over an element | by, value, timeout |
| browser_double_click | Double-click on an element | by, value, timeout |
| browser_right_click | Right-click (context menu) | by, value, timeout |
| browser_drag_and_drop | Drag from source to target | by, value, targetBy, targetValue, timeout |
| browser_wait_for_element | Wait for element to appear | by, value, timeout |
| browser_execute_script | Execute JavaScript code | script, args |
| browser_screenshot | Take a screenshot | filename (optional) |
| browser_select_dropdown_by_text | Select dropdown option by visible text | by, value, text, timeout |
| browser_select_dropdown_by_value | Select dropdown option by value | by, value, dropdownValue, timeout |
| browser_key_press | Press a keyboard key in the browser | key, timeout |

Scrolling Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_scroll_to_element | Scroll to bring an element into view | by, value, timeout |
| browser_scroll_to_top | Scroll to the top of the page | None |
| browser_scroll_to_bottom | Scroll to the bottom of the page | None |
| browser_scroll_to_coordinates | Scroll to specific coordinates | x, y |
| browser_scroll_by_pixels | Scroll by specified number of pixels | x, y |
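The difference between `browser_scroll_to_coordinates` and `browser_scroll_by_pixels` is presumably absolute versus relative positioning, mirroring the DOM's `window.scrollTo` and `window.scrollBy`. The sketch below expresses both as `script`/`args` pairs of the kind `browser_execute_script` accepts. It assumes `args` are exposed to the page script as `arguments[0]`, `arguments[1]`, as Selenium's own `executeScript` does; how this server forwards them is not confirmed here, and the helper names are hypothetical.

```typescript
// Sketch: script/args pairs that scroll absolutely or relatively.
// Assumption: `args` reach the page script as arguments[0], arguments[1].
interface ScriptCall {
  script: string;
  args: number[];
}

// Absolute: place the document point (x, y) at the viewport's top-left corner.
function scrollToCoordinates(x: number, y: number): ScriptCall {
  return { script: "window.scrollTo(arguments[0], arguments[1]);", args: [x, y] };
}

// Relative: move the viewport by (dx, dy) from its current position.
function scrollByPixels(dx: number, dy: number): ScriptCall {
  return { script: "window.scrollBy(arguments[0], arguments[1]);", args: [dx, dy] };
}

const jumpToTop = scrollToCoordinates(0, 0);
const nudgeDown = scrollByPixels(0, 400);
console.log(jumpToTop, nudgeDown);
```

Calling the relative variant twice scrolls 800 pixels in total, whereas repeating the absolute variant leaves the page where it is.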

Form Interaction Tools

| Tool | Description | Parameters |
| --- | --- | --- |
| browser_select_checkbox | Select/check a checkbox | by, value, timeout |
| browser_unselect_checkbox | Unselect/uncheck a checkbox | by, value, timeout |
| browser_submit_form | Submit a form element | by, value, timeout |
| browser_focus_element | Focus on a specific element | by, value, timeout |
| browser_blur_element | Remove focus from a specific element | by, value, timeout |

Element Locator Strategies

  • id: Find by element ID
  • css: Find by CSS selector
  • xpath: Find by XPath expression
  • name: Find by name attribute
  • tag: Find by HTML tag name
  • class: Find by CSS class name
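The strategies above are the accepted values of the `by` parameter that most element tools take alongside `value` and `timeout`. A minimal sketch of a typed locator with a runtime check follows. The server is described as validating input with Zod; this hand-rolled check is only an illustration, the names are hypothetical, and the unit of `timeout` is an assumption.

```typescript
// Sketch: a typed locator matching the strategies listed above.
// Hand-rolled validation for illustration; not the server's own Zod schema.
const LOCATOR_STRATEGIES = ["id", "css", "xpath", "name", "tag", "class"] as const;
type LocatorStrategy = (typeof LOCATOR_STRATEGIES)[number];

interface Locator {
  by: LocatorStrategy;
  value: string;
  timeout?: number; // unit assumed to be milliseconds; check the tool schema
}

function isLocatorStrategy(input: string): input is LocatorStrategy {
  return (LOCATOR_STRATEGIES as readonly string[]).includes(input);
}

function parseLocator(by: string, value: string, timeout?: number): Locator {
  if (!isLocatorStrategy(by)) {
    throw new Error(`Unsupported locator strategy: ${by}`);
  }
  if (value.trim() === "") {
    throw new Error("Locator value must not be empty");
  }
  return timeout === undefined ? { by, value } : { by, value, timeout };
}

// A locator that could be passed as the arguments of browser_click.
const submitButton = parseLocator("css", "button[type='submit']");
console.log(submitButton);
```

Rejecting unknown strategies before a call is sent gives the agent a clear error instead of a failed lookup in the browser.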

📋 Requirements

  • Node.js: Version 18.0.0 or higher
  • Browsers: Chrome, Firefox, Safari, or Edge installed
  • WebDrivers: Automatically managed by selenium-webdriver
  • Operating System: Windows, macOS, or Linux
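If you want to verify the Node.js requirement programmatically, for example in a setup script, a check along these lines may help. `meetsMinimumNode` is a hypothetical helper; it compares only the numeric major, minor, and patch parts and ignores pre-release tags.

```typescript
// Sketch: compare a Node.js version string against the 18.0.0 minimum.
// Accepts strings such as "v20.11.1" or "18.0.0"; pre-release tags are ignored.
function meetsMinimumNode(version: string, minimum = "18.0.0"): boolean {
  const parse = (v: string): number[] =>
    v
      .replace(/^v/, "")
      .split(".")
      .map((part) => parseInt(part, 10) || 0);

  const actual = parse(version);
  const required = parse(minimum);

  for (let i = 0; i < 3; i++) {
    const a = actual[i] ?? 0;
    const r = required[i] ?? 0;
    if (a !== r) return a > r; // first differing component decides
  }
  return true; // all three components are equal
}

// In a real script you would pass process.version here.
console.log(meetsMinimumNode("v20.11.1")); // true
console.log(meetsMinimumNode("v16.20.2")); // false
```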

🚦 Development

Getting Started

Clone the repository

bash
git clone https://github.com/pshivapr/selenium-mcp.git
cd selenium-mcp

Install dependencies

bash
npm install

Build the project

bash
npm run build

Running the Server

Production Mode

bash
npm start

Development Mode (with auto-reload)

bash
npm run dev

Direct Execution

bash
node dist/index.js

Using as CLI Tool

After building, you can use the server as a global command:

bash
npx selenium-webdriver-mcp@latest

📝 License

MIT License - see LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Built with ❤️ for the Model Context Protocol ecosystem

Installation

TypingMind
Prerequisites:

Node.js 18+

{
  "mcpServers": {
    "selenium-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "selenium-webdriver-mcp@latest"
      ]
    }
  }
}

Available Tools

  • browser_open

    Open a new browser session

  • browser_navigate

    Navigate to a URL

  • browser_navigate_back

    Navigate back in the browser

  • browser_navigate_forward

    Navigate forward in the browser

  • browser_title

    Get the current page title

  • browser_get_url

    Get the current page URL

  • browser_get_page_source

    Get the current page source

  • browser_maximize

    Maximize the browser window

  • browser_resize

    Resize the browser window

  • browser_refresh

    Refresh the current page

  • browser_switch_to_window

    Switch to a different browser window

  • browser_switch_to_original_window

    Switches back to the original browser window

  • browser_switch_to_window_by_title

    Switch to a window by its title

  • browser_switch_window_by_index

    Switch to a window by its index

  • browser_switch_to_window_by_url

    Switch to a window by its URL

  • browser_close

    Close the current browser session

  • browser_find_element

    Find an element

  • browser_find_elements

    Find multiple elements

  • browser_click

    Perform a click on an element

  • browser_type

    Type into an editable field

  • browser_clear

    Clears the value of an input element

  • browser_get_element_text

    Gets the text of an element

  • browser_get_attribute

    Gets the value of an attribute from an element

  • browser_element_is_displayed

    Checks if an element is displayed

  • browser_element_is_enabled

    Checks if an element is enabled

  • browser_element_is_selected

    Checks if an element is selected

  • browser_switch_to_frame

    Switches to an iframe element

  • browser_switch_to_default_content

    Switches to the default content

  • browser_switch_to_parent_frame

    Switches to the parent iframe

  • browser_file_upload

    Uploads a file using a file input element

  • browser_hover

    Hover over an element

  • browser_wait_for_element

    Wait for an element to be present

  • browser_drag_and_drop

    Perform drag and drop between two elements

  • browser_double_click

    Perform double click on an element

  • browser_right_click

    Perform right click (context click) on an element

  • browser_select_dropdown_by_text

    Select dropdown by visible text

  • browser_select_dropdown_by_value

    Select dropdown by value

  • browser_key_press

    Press a key on the keyboard

  • browser_execute_script

    Execute JavaScript in the context of the current page

  • browser_scroll_to_element

    Scroll to an element

  • browser_scroll_to_top

    Scroll to the top of the page

  • browser_scroll_to_bottom

    Scroll to the bottom of the page

  • browser_scroll_to_coordinates

    Scroll to specific coordinates

  • browser_scroll_by_pixels

    Scroll by a specific number of pixels

  • browser_select_checkbox

    Select a checkbox

  • browser_unselect_checkbox

    Unselect a checkbox

  • browser_submit_form

    Submit a form

  • browser_focus_element

    Focus on a specific element

  • browser_blur_element

    Remove focus from a specific element

  • browser_screenshot

    Take a screenshot of the current page

  • browser_get_cookies

    Get all cookies

  • browser_get_cookie_by_name

    Get a cookie by name

  • browser_add_cookie_by_name

    Add a cookie to the browser

  • browser_set_cookie_object

    Set a cookie in the browser

  • browser_delete_cookie

    Delete a cookie from the browser

  • browser_delete_cookies

    Delete cookies from the browser

Use Selenium WebDriver MCP with multiple AI models

TypingMind connects MCP tools at the workspace level, so once Selenium WebDriver is connected, you can use it with different AI models in TypingMind instead of setting it up separately for each model. This MCP runs locally through the TypingMind MCP connector on your device.

Setup guide to use the local connector

Use this when the MCP server needs access to local files, apps, or private resources on your computer.

Step 1: Open the MCP settings

In TypingMind, go to Settings, Advanced Settings, then Model Context Protocol and choose Setup Connector.

  1. Open TypingMind in your browser.
  2. Click the Settings icon.
  3. Go to Advanced Settings.
  4. Open the Model Context Protocol section.
  5. Click Setup Connector and choose This Device.

Step 2: Run the connector command

Choose This Device, copy the command from TypingMind, and run it in Terminal. Keep the process running while you use MCP.

  1. Copy the setup command shown by TypingMind.
  2. Open Terminal on macOS or Windows Terminal on Windows.
  3. Paste and run the command.
  4. Approve the package install if Terminal asks you to proceed.
  5. Keep the Terminal window running while using MCP tools.

Step 3: Add Selenium WebDriver as a server

When the connector status is Ready, click Edit Servers and paste the MCP server configuration.

  1. Wait until the connector status shows Ready.
  2. Click Edit Servers.
  3. Paste the Selenium WebDriver MCP server configuration.
  4. Save the server list.
  5. Refresh if you want to confirm the connector is still ready.
{
  "mcpServers": {
    "selenium-webdriver": {
      "command": "npx",
      "args": [
        "-y",
        "selenium-webdriver-mcp"
      ]
    }
  }
}

Step 4: Use it across models

Save the server list, open Plugins, enable the Selenium WebDriver MCP tools, then select any supported AI model in TypingMind and use the tools in chat or assign them to an AI agent.

  1. Open the Plugins page in TypingMind.
  2. Enable the Selenium WebDriver MCP tools.
  3. Start a chat and choose the AI model you want to use.
  4. Use the MCP tools in chat or assign them to an AI agent.
  5. Switch to another AI model whenever needed without reconnecting MCP.

Frequently asked questions

What is the Selenium WebDriver MCP server used for?

Selenium WebDriver is an MCP server that lets compatible AI clients connect to external tools and context. In TypingMind, you can add this MCP server once and make its tools available in your AI workspace.

Can I use Selenium WebDriver MCP with multiple AI models in TypingMind?

Yes. TypingMind connects MCP tools at the workspace level, so you can use Selenium WebDriver with different AI models such as Claude, ChatGPT, Gemini, or other models you have configured in TypingMind without setting up the MCP server separately for each model.

Why use Selenium WebDriver MCP with TypingMind?

TypingMind is one of the best frontends for LLM chat because it brings multiple AI models, prompts, plugins, AI agents, API keys, and MCP tools into one workspace. With Selenium WebDriver connected, you can use its MCP tools across your preferred models while keeping your chat workflow organized in TypingMind.

How do I connect Selenium WebDriver MCP to TypingMind?

Selenium WebDriver runs through the TypingMind local MCP connector. This is best when the MCP server needs access to local files, desktop apps, command-line tools, or private resources on your computer.

What tools does Selenium WebDriver MCP provide in TypingMind?

Selenium WebDriver exposes 56 MCP tools that can be enabled from the TypingMind Plugins page and used in chat or assigned to AI agents.

Do I need to share my API keys with TypingMind to use Selenium WebDriver MCP?

No. TypingMind is local-first and lets you keep your model providers, API keys, prompts, and MCP configuration under your control. If Selenium WebDriver requires authentication, add the required headers, OAuth settings, or local configuration for that MCP server when you create the connection.
