Exploring the Model Context Protocol with Deno 2 and Playwright

December 2, 2024

Last week Anthropic introduced the Model Context Protocol (MCP), an attempt to set the standard for integrating data from existing applications and APIs with Large Language Models (LLMs) like Claude.

As AI assistants gain mainstream adoption, the industry has invested heavily in model capabilities, achieving rapid advances in reasoning and quality. Yet even the most sophisticated models are constrained by their isolation from data—trapped behind information silos and legacy systems. Every new data source requires its own custom implementation, making truly connected systems difficult to scale.

I’m personally most interested in the potential of a widely used protocol to jailbreak personal and private business data out of walled gardens and into the wonderful world of frontier language models (like Claude, gpt-4o and o1). So when I stumbled onto the awesome-mcp-servers directory of examples and projects, I was inspired to give it a try.

Anthropic has also published a bunch of helpful MCP server examples on GitHub. One that stood out to me was the Puppeteer example. It’s a cool one, because it gives Claude the ability to browse and interact with the web through a browser like Chrome.

Lately at work I’ve been playing with Deno 2 and Playwright, and it’s honestly been a pretty great combo. Playwright seems to be picking up steam, and it has great support for Firefox and mobile device emulation.

Since I still have playwright loaded into my brain RAM, I figured it’d be a neat experiment to fork the Puppeteer example, and create a new Deno + Playwright + MCP server implementation.

For the rest of this post, I’ll be walking through some of my experience from my vibecheck on the Model Context Protocol. My hope is for this post to be a resource I can share with friends as an onramp to MCP.

Diving into Model Context Protocol server development

At the time of writing, MCP has two official SDKs: a Python SDK and a TypeScript SDK.

The best way to get started is to visit the Quickstart guide in their docs. They cover using the SQLite MCP Server to give Claude the ability to query data out of an SQLite DB and into the context of the language model.

MCP’s architecture has three roles: hosts, clients, and servers. Hosts are LLM apps like Claude Desktop or an IDE. Clients fetch data from MCP servers and return it to the host application (like Claude Desktop). Servers act as an integration layer with external systems, exposing an interface via tool definitions and implementing the code that gathers data and responds to requests.

The MCP host (e.g. Claude Desktop) is a client to a locally running server process that implements the MCP spec. Your MCP servers can run literally anything, as long as they speak MCP.

To illustrate the flow of things, they provide this diagram. Claude Desktop makes requests to an MCP server, which fetches the desired data and then provides that data/context back to Claude via MCP.
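To make the three roles concrete, here’s a tiny in-process sketch. None of this is the real SDK: the class names, the read_note tool, and the fake notes store are all made up for illustration, and a real client and server would talk over a transport instead of a direct method call.

```typescript
// Hypothetical message shapes, much simpler than real MCP messages.
type ToolRequest = { tool: string; args: Record<string, unknown> };
type ToolResponse = { text: string };

// Server: wraps an external system (here, a fake in-memory notes store).
class SketchServer {
  private notes = new Map<string, string>([["todo", "write blog post"]]);

  handle(request: ToolRequest): ToolResponse {
    if (request.tool === "read_note") {
      const key = String(request.args.key);
      return { text: this.notes.get(key) ?? "no note named " + key };
    }
    return { text: "unknown tool: " + request.tool };
  }
}

// Client: owns the connection to one server and relays messages.
class SketchClient {
  private server: SketchServer;

  constructor(server: SketchServer) {
    this.server = server;
  }

  callTool(tool: string, args: Record<string, unknown>): ToolResponse {
    return this.server.handle({ tool, args });
  }
}

// Host: the LLM app. It decides when to call a tool and adds the result to the model's context.
class SketchHost {
  context: string[] = [];
  private client: SketchClient;

  constructor(client: SketchClient) {
    this.client = client;
  }

  useTool(tool: string, args: Record<string, unknown>): void {
    this.context.push(this.client.callTool(tool, args).text);
  }
}

const sketchHost = new SketchHost(new SketchClient(new SketchServer()));
sketchHost.useTool("read_note", { key: "todo" });
console.log(sketchHost.context); // the context now holds "write blog post"
```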

What does the protocol actually look like?

The Model Context Protocol uses JSON-RPC 2.0 as its wire format.

There are four message types: Requests, Notifications, Results, and Errors. I’ve shared each of them below.

I like the idea of a JSON-RPC format, because it’s easy to read from the wire and understand what’s happening in a pretty transparent way (kind of important for good protocols!)

Interestingly, it sort of looks like the Language Server Protocol, which is how VSCode and other IDEs integrate with various programming language servers for things like linting and code completion.

You can find all of this information in the Core architecture section of the docs.

Requests are messages that expect a response.

interface Request {
  method: string;
  params?: { ... };
}

// as JSON RPC
{
  jsonrpc: "2.0",
  id: number | string,
  method: string,
  params?: object
}
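As a concrete instance, here’s a rough sketch of building a tools/call request by hand. The makeRequest helper and the auto-incrementing id counter are my own assumptions for illustration; the SDK normally builds these messages for you.

```typescript
type JsonRpcId = number | string;

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: JsonRpcId;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical helper: every request gets a fresh id so the response can be matched to it.
let nextRequestId = 1;

function makeRequest(
  method: string,
  params?: Record<string, unknown>
): JsonRpcRequest {
  const request: JsonRpcRequest = { jsonrpc: "2.0", id: nextRequestId++, method };
  if (params !== undefined) request.params = params;
  return request;
}

// Ask the server to run the playwright_screenshot tool with one argument.
const screenshotCall = makeRequest("tools/call", {
  name: "playwright_screenshot",
  arguments: { name: "homepage" },
});

console.log(JSON.stringify(screenshotCall, null, 2));
```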

Notifications are one-way messages that don’t expect a response.

interface Notification {
  method: string;
  params?: { ... };
}

// as JSON RPC (no id, because no response is expected)
{
  jsonrpc: "2.0",
  method: string,
  params?: object
}
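On the wire, the thing that separates a notification from a request is the id: a request carries one, a notification does not. Here’s a minimal sketch of that check. The expectsResponse helper is a name I made up; the list_changed method is the one this server emits after taking a screenshot.

```typescript
interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcRequest extends JsonRpcNotification {
  id: number | string;
}

type IncomingMessage = JsonRpcRequest | JsonRpcNotification;

// Hypothetical helper: only messages that carry an id get a response.
function expectsResponse(message: IncomingMessage): message is JsonRpcRequest {
  return "id" in message;
}

const listChanged: IncomingMessage = {
  jsonrpc: "2.0",
  method: "notifications/resources/list_changed",
};

const listResources: IncomingMessage = {
  jsonrpc: "2.0",
  id: 7,
  method: "resources/list",
};

console.log(expectsResponse(listChanged)); // false
console.log(expectsResponse(listResources)); // true
```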

Results are successful (non error) responses to Requests.

interface Result {
  [key: string]: unknown;
}

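// as JSON RPC (the id matches the Request being answered)
{
  jsonrpc: "2.0",
  id: number | string,
  result?: object,
  error?: {
    code: number,
    message: string,
    data?: unknown
  }
}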
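A response is tied back to its request through the id. Here’s a rough sketch of how a client could keep track of in-flight requests. The track and deliver helpers and the pending map are hypothetical names, not SDK functions.

```typescript
type JsonRpcId = number | string;

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: JsonRpcId;
  result?: Record<string, unknown>;
  error?: { code: number; message: string; data?: unknown };
}

type Settle = (response: JsonRpcResponse) => void;

// In-flight requests, keyed by the id they were sent with.
const pending = new Map<JsonRpcId, Settle>();

function track(id: JsonRpcId, onResponse: Settle): void {
  pending.set(id, onResponse);
}

// Returns false when the id does not belong to any in-flight request.
function deliver(response: JsonRpcResponse): boolean {
  const onResponse = pending.get(response.id);
  if (!onResponse) return false;
  pending.delete(response.id);
  onResponse(response);
  return true;
}

const received: string[] = [];
track(1, (response) => {
  received.push(
    response.error
      ? "error: " + response.error.message
      : "ok: " + JSON.stringify(response.result)
  );
});

deliver({ jsonrpc: "2.0", id: 1, result: { tools: [] } });
console.log(received);
```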

Errors are responses to failed Requests. They are carried in the error field of a response.

interface Error {
  code: number;
  message: string;
  data?: unknown;
}

The standard error codes are

// Standard JSON-RPC error codes
enum ErrorCode {
  ParseError = -32700,
  InvalidRequest = -32600,
  MethodNotFound = -32601,
  InvalidParams = -32602,
  InternalError = -32603
}
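Here’s a small sketch of where these codes show up: a dispatcher that answers an unknown method with MethodNotFound. The dispatch function and the methods table are my own illustration, and mapping a thrown exception to InternalError is an assumption about reasonable behavior, not something taken from the SDK.

```typescript
const ErrorCode = {
  ParseError: -32700,
  InvalidRequest: -32600,
  MethodNotFound: -32601,
  InvalidParams: -32602,
  InternalError: -32603,
} as const;

type JsonRpcId = number | string;

interface RpcRequest {
  jsonrpc: "2.0";
  id: JsonRpcId;
  method: string;
  params?: Record<string, unknown>;
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: JsonRpcId;
  result?: Record<string, unknown>;
  error?: { code: number; message: string; data?: unknown };
}

type MethodHandler = (params: Record<string, unknown>) => Record<string, unknown>;

// Hypothetical dispatcher: look up the method, run it, and wrap the outcome in a response.
function dispatch(request: RpcRequest, methods: Map<string, MethodHandler>): RpcResponse {
  const handler = methods.get(request.method);
  if (!handler) {
    return {
      jsonrpc: "2.0",
      id: request.id,
      error: {
        code: ErrorCode.MethodNotFound,
        message: "Method not found: " + request.method,
      },
    };
  }
  try {
    return { jsonrpc: "2.0", id: request.id, result: handler(request.params ?? {}) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return {
      jsonrpc: "2.0",
      id: request.id,
      error: { code: ErrorCode.InternalError, message },
    };
  }
}

const methods = new Map<string, MethodHandler>([["ping", () => ({})]]);

console.log(dispatch({ jsonrpc: "2.0", id: 1, method: "ping" }, methods));
console.log(dispatch({ jsonrpc: "2.0", id: 2, method: "does/not/exist" }, methods));
```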

Writing a Model Context Protocol server with Deno 2

Before walking through the specifics, please note that all of this code lives on GitHub, and I encourage you to clone it and try it! https://github.com/jakedahn/deno2-playwright-mcp-server

Recently I’ve been on a TypeScript and Deno 2 kick. I even shared another exploration of building a CLI tool with Deno a few days ago: Read about the t2x CLI

One of the things I like about Deno is that you can compile your code down to an executable binary. IMO this is an excellent way to distribute shareable/runnable MCP servers. When you share it with someone, they don’t have to worry about configuring a software runtime; they can just execute the file.

To get started, I ran deno init, which creates a deno.json file, and a main.ts file.

I added a few dependencies by running deno add npm:@modelcontextprotocol/sdk npm:playwright jsr:@std/encoding

When possible, I try to use libraries from JSR. Here we’re using the jsr:@std/encoding library for base64 encoding screenshots.
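For a feel of what the base64 step does, here’s a dependency-free sketch. In the repo this job is done by encodeBase64 from @std/encoding; the toBase64 function below is only a stand-in built on the built-in btoa so the snippet runs on its own.

```typescript
// Stand-in for encodeBase64 from @std/encoding: turn raw bytes into a base64 string.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// The first four bytes of every PNG file.
const pngMagic = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);

console.log(toBase64(pngMagic)); // iVBORw==
```

This is why base64-encoded PNG screenshots always start with the characters iVBORw.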

After adding some build tasks for compiling binaries for different operating systems, my deno.json file looks like this:

// deno.json
{
  "tasks": {
    "start": "deno run --watch --allow-net --allow-env --allow-run --allow-sys --allow-read --allow-write main.ts",
    "build-mac": "deno compile --target aarch64-apple-darwin --allow-net --allow-env --allow-run --allow-sys --allow-read --allow-write --output playwright-server main.ts",
    "build-linux-x86_64": "deno compile --target x86_64-unknown-linux-gnu --allow-net --allow-env --allow-run --allow-sys --allow-read --allow-write --output playwright-server main.ts",
    "build-linux-ARM64": "deno compile --target aarch64-unknown-linux-gnu --allow-net --allow-env --allow-run --allow-sys --allow-read --allow-write --output playwright-server main.ts",
    "build-windows-x86_64": "deno compile --target x86_64-pc-windows-msvc --allow-net --allow-env --allow-run --allow-sys --allow-read --allow-write --output playwright-server main.ts"
  },
  "imports": {
    "@modelcontextprotocol/sdk": "npm:@modelcontextprotocol/sdk@^1.0.1",
    "@std/encoding": "jsr:@std/encoding@^1.0.5",
    "playwright": "npm:playwright@^1.49.0"
  }
}

Then in the main.ts file, I defined a bunch of tools and set up handlers for each tool.

Tools are how you expose executable functions from your server to the client. In your code, you define the tool interface, which includes a name, description, and an input schema. Then you implement a handler function for each tool.
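Stripped of the SDK, the idea looks roughly like the sketch below: a table that maps each tool name to its definition and handler. The registerTool, listTools, and callTool helpers and the echo tool are hypothetical; the real server wires this up through the SDK’s request handlers instead.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required?: string[];
  };
}

interface ToolOutcome {
  text: string;
  isError: boolean;
}

type ToolHandler = (args: Record<string, unknown>) => ToolOutcome;

const registry = new Map<string, { definition: ToolDefinition; handler: ToolHandler }>();

function registerTool(definition: ToolDefinition, handler: ToolHandler): void {
  registry.set(definition.name, { definition, handler });
}

// What the client sees when it asks which tools exist.
function listTools(): ToolDefinition[] {
  return Array.from(registry.values()).map((entry) => entry.definition);
}

// Route a call by tool name, rejecting unknown tools and missing required arguments.
function callTool(toolName: string, args: Record<string, unknown>): ToolOutcome {
  const entry = registry.get(toolName);
  if (!entry) return { text: "Unknown tool: " + toolName, isError: true };

  const required = entry.definition.inputSchema.required ?? [];
  const missing = required.filter((key) => !(key in args));
  if (missing.length > 0) {
    return { text: "Missing required arguments: " + missing.join(", "), isError: true };
  }
  return entry.handler(args);
}

// A made-up tool, just to exercise the registry.
registerTool(
  {
    name: "echo",
    description: "Return the message that was passed in",
    inputSchema: {
      type: "object",
      properties: { message: { type: "string", description: "Text to echo" } },
      required: ["message"],
    },
  },
  (args) => ({ text: String(args.message), isError: false })
);

console.log(listTools().map((tool) => tool.name));
console.log(callTool("echo", { message: "hello" }));
```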

Below I’ve included a few examples, but you can find the entire list of tool definitions for the Playwright MCP server over on github: https://github.com/jakedahn/deno2-playwright-mcp-server/blob/main/main.ts#L21-L125

As an example, here are both the playwright_screenshot tool definition and the handler function that routes tool calls to the code that gets run. I’ve implemented most of the real application logic inside the handleToolCall function.

// playwright_screenshot Tool Definition
{
  name: "playwright_screenshot",
  description: "Take a screenshot of the current page or a specific element",
  inputSchema: {
    type: "object",
    properties: {
      name: { type: "string", description: "Name for the screenshot" },
      selector: {
        type: "string",
        description: "CSS selector for element to screenshot",
      },
      width: {
        type: "number",
        description: "Width in pixels (default: 800)",
      },
      height: {
        type: "number",
        description: "Height in pixels (default: 600)",
      },
    },
    required: ["name"],
  },
}

// Server request handler
async function handleToolCall(
  name: string,
  args: Record<string, unknown>
): Promise<{ toolResult: CallToolResult }> {
  const page = await ensureBrowser();

  switch (name) {
    case "playwright_screenshot": {
      const width = (args.width as number) ?? 800;
      const height = (args.height as number) ?? 600;
      await page.setViewportSize({ width, height });

      const screenshot = await (args.selector
        ? page.locator(args.selector as string).screenshot()
        : page.screenshot());

      const base64Screenshot = encodeBase64(screenshot);
      screenshots.set(args.name as string, base64Screenshot);

      server.notification({
        method: "notifications/resources/list_changed",
      });

      return {
        toolResult: {
          content: [
            {
              type: "text",
              text: `Screenshot '${args.name}' taken at ${width}x${height}`,
            } as TextContent,
            {
              type: "image",
              data: base64Screenshot,
              mimeType: "image/png",
            } as ImageContent,
          ],
          isError: false,
        },
      };
    }


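    // Other handlers are implemented behind other case names:
    //   case "playwright_navigate":
    //   case "playwright_click":
    //   case "playwright_fill":
    //   case "playwright_select":
    //   case "playwright_hover":
    //   case "playwright_evaluate":

    // Unknown tool names fall through to an error result,
    // so every code path returns a value.
    default:
      return {
        toolResult: {
          content: [
            { type: "text", text: `Unknown tool: ${name}` } as TextContent,
          ],
          isError: true,
        },
      };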
  }
}
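One thing the handler above glosses over: casting with as number doesn’t actually check anything at runtime, so a string width would sail straight through to Playwright. Here’s a sketch of validating the arguments and applying the 800x600 defaults up front. The parseScreenshotArgs and positiveNumberOr helpers are hypothetical and not part of the repo.

```typescript
interface ScreenshotArgs {
  name: string;
  selector?: string;
  width: number;
  height: number;
}

// Use the fallback when the value is absent; reject anything that is not a positive number.
function positiveNumberOr(value: unknown, fallback: number, label: string): number {
  if (value === undefined) return fallback;
  if (typeof value !== "number" || !isFinite(value) || value <= 0) {
    throw new Error(label + " must be a positive number");
  }
  return value;
}

function parseScreenshotArgs(args: Record<string, unknown>): ScreenshotArgs {
  const rawName = args.name;
  if (typeof rawName !== "string" || rawName.length === 0) {
    throw new Error("name is required and must be a non-empty string");
  }

  const rawSelector = args.selector;
  let selector: string | undefined;
  if (typeof rawSelector === "string") {
    selector = rawSelector;
  } else if (rawSelector !== undefined) {
    throw new Error("selector must be a string");
  }

  return {
    name: rawName,
    selector,
    width: positiveNumberOr(args.width, 800, "width"),
    height: positiveNumberOr(args.height, 600, "height"),
  };
}

console.log(parseScreenshotArgs({ name: "homepage" }));
console.log(parseScreenshotArgs({ name: "hero", selector: "#hero", width: 1280 }));
```

A thrown error here could then be turned into a tool result with isError set to true, so Claude gets a readable message instead of a Playwright stack trace.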

One thing worth noting here is that when your server returns a file or binary data, such as a PNG screenshot from the browser, you expose it through Resources. Resources are a core primitive in MCP; you can read more about them here: https://modelcontextprotocol.io/docs/concepts/resources

You also implement Resource List and Read endpoints where the client (Claude Desktop) can request a specific resource/file/data. Below is an example of what those handlers look like in the Playwright example server.

server.setRequestHandler(ListResourcesRequestSchema, async () => ({
  resources: [
    ...Array.from(screenshots.keys()).map((name) => ({
      uri: `screenshot://${name}`,
      mimeType: "image/png",
      name: `Screenshot: ${name}`,
    })),
  ],
}));

server.setRequestHandler(
  ReadResourceRequestSchema,
  async (request: ReadResourceRequest) => {
    const uri = request.params.uri;
    if (uri.startsWith("screenshot://")) {
      const name = uri.split("://")[1];
      const screenshot = screenshots.get(name);
      if (screenshot) {
        return {
          contents: [
            {
              uri,
              mimeType: "image/png",
              blob: screenshot,
            },
          ],
        };
      }
    }

    throw new Error(`Resource not found: ${uri}`);
  }
);
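Here’s the same list/read logic without the SDK, so you can poke at it in isolation. The screenshotStore map and the listScreenshots and readScreenshot functions are hypothetical stand-ins for the handlers above. One small difference: I use slice instead of split("://")[1], so a screenshot name that itself contains :// still round-trips.

```typescript
const SCREENSHOT_SCHEME = "screenshot://";

// Screenshot name -> base64-encoded PNG data.
const screenshotStore = new Map<string, string>();

function listScreenshots(): { uri: string; mimeType: string; name: string }[] {
  return Array.from(screenshotStore.keys()).map((key) => ({
    uri: SCREENSHOT_SCHEME + key,
    mimeType: "image/png",
    name: "Screenshot: " + key,
  }));
}

function readScreenshot(uri: string): {
  contents: { uri: string; mimeType: string; blob: string }[];
} {
  if (uri.indexOf(SCREENSHOT_SCHEME) === 0) {
    // Everything after the scheme is the screenshot name.
    const key = uri.slice(SCREENSHOT_SCHEME.length);
    const blob = screenshotStore.get(key);
    if (blob !== undefined) {
      return { contents: [{ uri, mimeType: "image/png", blob }] };
    }
  }
  throw new Error("Resource not found: " + uri);
}

screenshotStore.set("homepage", "iVBORw==");

console.log(listScreenshots());
console.log(readScreenshot("screenshot://homepage"));
```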

Gotchas

At the 6:25 timestamp in the video at the top of the page, I actually ran into a challenge with Claude not understanding that it needed to read the text from the page.

Sometimes, getting Claude to call the tool you want it to call can be annoying. The cheap/quick hack to make tool calling more accurate is to dedicate part of your initial prompt to extra information about how to use the tools.

For example, using a prompt like this took me from a 1/3 success rate to 3/3:

Please use the playwright tools for the following requests.

Playwright lets you load web pages, grab text content from the page, and interact with various elements. To answer questions about text content, you'll want to grab the text from the page using the `playwright_evaluate` tool. To navigate to different urls, you'll want to use the `playwright_navigate` tool. Then to click around, use the `playwright_click` tool.

---
Your task:

visit shruggingface.com, and give me a list of all the blog post titles on the homepage

There are probably better ways to do this, like making a dedicated playwright_read_page tool that could be called directly. Prompts (reusable prompt templates in the protocol) may also be a better solution, but I’m le tired.
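If you end up reusing that kind of preamble a lot, you can generate it from a list of hints instead of retyping it. This is just a sketch of the quick hack, not the protocol’s Prompts feature; the ToolHint shape and the buildPrompt function are names I made up.

```typescript
interface ToolHint {
  tool: string;
  useFor: string;
}

// Build a prompt that spells out which tool to reach for, followed by the actual task.
function buildPrompt(hints: ToolHint[], task: string): string {
  const hintLines = hints.map(
    (hint) => "- Use the `" + hint.tool + "` tool to " + hint.useFor + "."
  );
  return [
    "Please use the playwright tools for the following requests.",
    "",
    ...hintLines,
    "",
    "---",
    "Your task:",
    "",
    task,
  ].join("\n");
}

const playwrightHints: ToolHint[] = [
  { tool: "playwright_navigate", useFor: "navigate to different urls" },
  { tool: "playwright_evaluate", useFor: "grab text content from the page" },
  { tool: "playwright_click", useFor: "click around" },
];

console.log(
  buildPrompt(
    playwrightHints,
    "visit shruggingface.com, and give me a list of all the blog post titles on the homepage"
  )
);
```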

Building a Deno binary

Building a Deno binary is easy: you just run deno compile main.ts, along with whatever security permissions you want to allow.

In my deno.json file, I’ve included a few build tasks.

To build the MCP server executable binary on mac, you run deno task build-mac

This will create a file at ./playwright-server
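One caveat with my build tasks: every target writes to the same playwright-server output, so building for a second platform overwrites the first. If you want all of them side by side, a per-target output name does the trick. Here’s a sketch that generates those commands. The compileArgs helper and the dist/ directory are assumptions on my part; the flags and target triples are the ones from deno.json.

```typescript
const permissionFlags = [
  "--allow-net",
  "--allow-env",
  "--allow-run",
  "--allow-sys",
  "--allow-read",
  "--allow-write",
];

const buildTargets = [
  "aarch64-apple-darwin",
  "x86_64-unknown-linux-gnu",
  "aarch64-unknown-linux-gnu",
  "x86_64-pc-windows-msvc",
];

// Build the argument list for one `deno compile` run, with the target baked into the output name.
function compileArgs(target: string): string[] {
  return [
    "compile",
    "--target",
    target,
    ...permissionFlags,
    "--output",
    "dist/playwright-server-" + target, // hypothetical output location
    "main.ts",
  ];
}

buildTargets.forEach((target) => {
  console.log("deno " + compileArgs(target).join(" "));
});
```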

How to configure Claude Desktop to run a local Model Context Protocol server

First, you need to configure the Claude Desktop app to run the MCP server process, since it is Claude Desktop that will be making requests.

To do this on macOS, you need to add the following configuration to the ~/Library/Application\ Support/Claude/claude_desktop_config.json file.

Because we’ve built a binary, you only need to set the command path to point to the binary.

{
  "mcpServers": {
    "playwright": {
      "command": "/path/to/deno2-playwright-mcp-server/playwright-server"
    }
  }
}
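If you’d rather not compile a binary, a server entry can also take an args array (and an env object) next to command, so Claude Desktop can launch the server from source. Here’s a sketch that generates such a config. It assumes deno is on the PATH that Claude Desktop sees, and the main.ts path is a placeholder you’d replace with your own checkout.

```typescript
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface ClaudeDesktopConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Same permissions as the "start" task in deno.json, minus --watch.
const fromSourceConfig: ClaudeDesktopConfig = {
  mcpServers: {
    playwright: {
      command: "deno",
      args: [
        "run",
        "--allow-net",
        "--allow-env",
        "--allow-run",
        "--allow-sys",
        "--allow-read",
        "--allow-write",
        "/path/to/deno2-playwright-mcp-server/main.ts", // placeholder path
      ],
    },
  },
};

// Paste the printed JSON into claude_desktop_config.json.
console.log(JSON.stringify(fromSourceConfig, null, 2));
```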

Once you’ve configured this file, you should be able to start Claude Desktop, and begin surfing the web with your bff Claude! 😎

You’ll know it’s working when you see this hammer button with a number next to it. When you click the button, you get a list of all of the MCP tools that Claude Desktop has access to.

Vibecheck?

Vibes are ✅.

Overall I’m excited about the Model Context Protocol, and the first SDK releases. You only need a small amount of glue code to jailbreak your legacy systems.

It would be amazing if both the AI engineering community and other LLM providers were to adopt MCP as a standard protocol.

The dream is to have a massive public open source library of MCP server implementations, and support across the other major LLM providers like Google and OpenAI.

I’ve shared all of the code from the video and this blog post on GitHub: https://github.com/jakedahn/deno2-playwright-mcp-server. Also, many thanks to the MCP team for sharing excellent examples. This Playwright example was heavily based on their Puppeteer example, so all of the credit goes to them.

I may be biased, but if you are considering using Deno to write an MCP server, it’s probably best to just fork my repo and start making edits. It’s fairly instructive and hackable.

If you have any thoughts/questions/suggestions/improvements, feel free to email me: jake at shruggingface.com

Update: April 19, 2025

It’s been a few months since I posted this, and MCP has taken the AI world by storm! While I had a neat toy implementation of a Playwright MCP server in this post, Microsoft has since released an official Playwright MCP server here: https://github.com/microsoft/playwright-mcp

If you’re planning to actually do things with playwright and MCP, I’d highly recommend not using my code, and instead using the official implementation.