Introducing Scorecard's MCP Server

We're excited to announce the launch of the first remote Model Context Protocol (MCP) server for evaluation. Scorecard's AI evaluation and experimentation tools are now available directly within your favorite AI assistants. Today we're open-sourcing this implementation at https://github.com/scorecard-ai/scorecard-mcp, developed in close collaboration with our partners at Clerk, Stainless, and Cloudflare.

In the spirit of our earlier eval partnership with Anthropic, this MCP server is another step toward empowering organizations to evaluate and improve their AI agents' performance.

Seamless AI Assistant Integration

As AI assistants and specialized tools converge, our MCP implementation enables access to Scorecard's evaluation capabilities through natural language, directly in your workflow. This integration lets you:

  • Run experiments and evaluations directly from your AI assistant
  • Generate synthetic data without context switching
  • Configure and iterate on metrics seamlessly
  • Analyze your agent’s performance while staying in your workflow

Available in claude.ai, Cursor, Windsurf and all MCP-compatible clients, this integration keeps AI builders focused on their work rather than switching between tools.
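
If you'd rather script against the server than use a desktop assistant, the sketch below shows one way to connect with the official MCP TypeScript SDK. The endpoint URL is a placeholder (check the scorecard-mcp README for the real one), and the OAuth handshake that MCP-compatible assistants normally perform for you is omitted for brevity.

  // Minimal MCP client sketch using @modelcontextprotocol/sdk (TypeScript).
  // The URL below is a placeholder, and the OAuth flow is skipped for brevity.
  import { Client } from "@modelcontextprotocol/sdk/client/index.js";
  import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

  const transport = new SSEClientTransport(
    new URL("https://your-scorecard-mcp.example.workers.dev/sse") // placeholder endpoint
  );
  const client = new Client(
    { name: "scorecard-mcp-demo", version: "0.1.0" },
    { capabilities: {} }
  );

  await client.connect(transport);

  // Discover the evaluation tools the server exposes.
  const { tools } = await client.listTools();
  console.log(tools.map((tool) => tool.name));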

Open Source Implementation

Our implementation uses the current MCP specification (2025-03-26), and we plan to incorporate the next draft of the MCP authentication standards via Clerk. This work was made possible by three key partners:

Clerk: Authentication Made Simple

We partnered with Clerk for authentication because their platform tools for AI applications provide a seamless security layer:

  // Token exchange with Clerk. Imports are shown for context: the OAuth helpers
  // come from the oauth4webapi package, the callback type from Cloudflare's
  // workers-oauth-provider, and `env` holds the Worker's Clerk bindings.
  import * as oauth from "oauth4webapi";
  import type { TokenExchangeCallbackOptions } from "@cloudflare/workers-oauth-provider";

  export async function tokenExchangeCallback(options: TokenExchangeCallbackOptions) {
    if (options.grantType === "authorization_code") {
      // For initial auth, maintain the token TTL from Clerk
      return {
        newProps: { ...options.props },
        accessTokenTTL: options.props.tokenSet.accessTokenTTL,
      };
    }
    if (options.grantType === "refresh_token") {
      // For token refresh, we use OIDC discovery to connect with Clerk
      const { as, client, clientAuth } = await getOidcConfig({
        issuer: `https://${env.CLERK_DOMAIN}/`,
        client_id: env.CLERK_CLIENT_ID,
        client_secret: env.CLERK_CLIENT_SECRET,
      });
      const response = await oauth.refreshTokenGrantRequest(
        as, client, clientAuth, options.props.refreshToken
      );
      // Return the refreshed tokens with updated TTL
      return { /* token refresh response */ };
    }
  }

  // OIDC configuration for the Clerk integration
  async function getOidcConfig({ issuer, client_id, client_secret }) {
    // Discover OIDC endpoints from Clerk
    const as = await oauth.discoveryRequest(new URL(issuer))
      .then(response => oauth.processDiscoveryResponse(new URL(issuer), response));
    // Set up client credentials for OAuth exchanges
    const client = { client_id };
    const clientAuth = oauth.ClientSecretPost(client_secret);
    return { as, client, clientAuth };
  }

This implementation delivers enterprise-grade security through OAuth with PKCE, supports persistent sessions, and presents a custom consent screen to users. The full source for the Clerk integration is available in the scorecard-mcp repository.
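
To show where a callback like this plugs in, here is a hedged sketch of registering it with Cloudflare's workers-oauth-provider at the Worker entrypoint. The handler names, routes, and the Clerk consent handler are illustrative assumptions, not our exact setup:

  // Illustrative wiring only: route names and handler modules below are assumptions.
  import OAuthProvider from "@cloudflare/workers-oauth-provider";
  import { tokenExchangeCallback } from "./auth"; // the callback shown above
  import { ScorecardAuthenticatedMCP } from "./mcp"; // MCP Durable Object (sketched later)
  import { clerkHandler } from "./clerk"; // hypothetical login + consent handler via Clerk

  export default new OAuthProvider({
    apiRoute: "/sse",                              // the MCP endpoint clients connect to
    apiHandler: ScorecardAuthenticatedMCP.mount("/sse"),
    defaultHandler: clerkHandler,                  // renders the custom consent screen
    authorizeEndpoint: "/authorize",
    tokenEndpoint: "/token",
    clientRegistrationEndpoint: "/register",
    tokenExchangeCallback,                         // keeps Clerk tokens fresh
  });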

Stainless: From API to MCP in Minutes

Stainless automatically converted our OpenAPI specification directly into MCP-compatible code:

  // Seamlessly connecting to Scorecard's API through the Stainless-generated MCP package
  import { init, server } from "scorecard-ai-mcp/server"; // Stainless-generated package
  import Scorecard from "scorecard-ai";

  // Initialize with the authenticated user's token
  const client = new Scorecard({
    bearerToken: userToken
  });

  // One line to connect the Scorecard client to the MCP server
  init({ server, client });

This dramatically reduced development time and ensured a high-quality experience for our team and our MCP users. Check out our generated package and the Stainless guide to learn more.
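
For local testing, the same generated server can be attached to a standard MCP transport. Here's a minimal sketch, assuming the generated `server` is a standard MCP server object and that your API key lives in an environment variable; the remote deployment serves the same tools over SSE instead:

  // Sketch: run the Stainless-generated Scorecard MCP server locally over stdio.
  // SCORECARD_API_KEY is an assumed environment variable name.
  import { init, server } from "scorecard-ai-mcp/server";
  import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
  import Scorecard from "scorecard-ai";

  const client = new Scorecard({ bearerToken: process.env.SCORECARD_API_KEY! });
  init({ server, client });

  // Expose the generated tools to a local MCP client such as Claude Desktop.
  await server.connect(new StdioServerTransport());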

Cloudflare: Deploy Anywhere

Our MCP server is deployed on Cloudflare Workers, providing global availability with low latency:

  // Deploying globally with Cloudflare Workers
  {
    "name": "scorecard-mcp",
    "main": "src/index.ts",
    "compatibility_date": "2025-04-01",
    "compatibility_flags": ["nodejs_compat"],
    "durable_objects": {
      "bindings": [
        {
          "class_name": "ScorecardAuthenticatedMCP",
          "name": "MCP_OBJECT"
        }
      ]
    },
    "kv_namespaces": [
      {
        "binding": "OAUTH_KV", // For storing OAuth state
        "id": "8df5d44c1acc4feb844b23f6b3e06a1b"
      }
    ],
    "observability": {
      "enabled": true // Real-time monitoring
    }
  }

We’ve been impressed by how Cloudflare has doubled down on the vision for MCP at their recent Demo Day, demonstrating how MCP is "emerging as a new AI interface" for discovering and interacting with services. Cloudflare Workers gives us a convenient KV namespace, immediate scaling, and seamless observability. Special thanks to Dustin Moore for his engineering leadership in connecting these technologies into a cohesive, powerful tool.
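
For readers wondering what the ScorecardAuthenticatedMCP binding above refers to: it is the Durable Object class that holds each authenticated MCP session. Here's a hedged sketch of the general shape using Cloudflare's agents framework; the tool, props shape, and API URL are illustrative assumptions rather than our actual implementation:

  // Illustrative sketch of an MCP agent Durable Object; the tool and types are assumptions.
  import { McpAgent } from "agents/mcp";
  import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
  import { z } from "zod";

  type Props = { accessToken: string }; // set during the Clerk OAuth flow
  // `Env` is the generated Workers environment type for this project.

  export class ScorecardAuthenticatedMCP extends McpAgent<Env, unknown, Props> {
    server = new McpServer({ name: "Scorecard", version: "1.0.0" });

    async init() {
      // Each tool call runs with the authenticated user's token from this.props.
      this.server.tool(
        "list_projects", // hypothetical tool name
        { limit: z.number().optional() },
        async ({ limit }) => {
          const response = await fetch(
            `https://api.example-scorecard.dev/projects?limit=${limit ?? 20}`, // placeholder URL
            { headers: { Authorization: `Bearer ${this.props.accessToken}` } }
          );
          return { content: [{ type: "text", text: await response.text() }] };
        }
      );
    }
  }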

What's Next

We're implementing the MCP draft spec with Clerk and will soon publish our authentication approach. Explore our code at github.com/scorecard-ai/scorecard-mcp and share your feedback to help shape the future of remote MCP.
