
AI Tools
February 25, 2025 — 3 min read
Introducing AgentEval.org
Introducing AgentEval.org: An Open-Source Benchmarking Initiative for AI Agent Evaluation

AI Tools
June 30, 2025 — 3 min read
Scorecard MCP 2.0: 1,000 Lines -> 70
Introducing Scorecard MCP 2.0, built with the new MCP spec.

AI Tools
May 30, 2025 — 3 min read
Introducing Scorecard's MCP Server
We're excited to announce the launch of the first remote Model Context Protocol (MCP) server for evaluation.

Workflows
November 7, 2024 — 4 min read
Simulate, Test, Repeat: The Key to Robust AI System Development
Simulations are transforming the development and testing of AI systems across industries, far beyond just self-driving cars.

LLM Evaluation
September 29, 2024 — 6 min read
5 Must-Have Features for LLM Evaluation Frameworks
Unlock the full potential of Large Language Models (LLMs) with a comprehensive evaluation framework. Discover the 5 must-have features to ensure reliable performance and cost-effectiveness in your LLM applications.