April 2-3, 2026
New York, NY

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for MCP Dev Summit North America to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.


Friday April 3, 2026 3:25pm - 3:50pm EDT
While frontier models achieve impressive scores on benchmarks like MCP Atlas (62.3%) and SWE-bench (62.1%), these metrics don't answer the critical question: "Will this agent work for OUR specific use-case?"

This talk presents a practical framework for building custom agent evaluation systems tailored to your organization's needs. We'll cover the complete lifecycle: data collection and categorization, open-source instrumentation patterns, and production monitoring for long-term performance tracking. You'll learn to construct evaluation datasets reflecting actual workloads, implement testing harnesses mirroring production constraints, and establish monitoring pipelines that catch degradations early.

We'll demonstrate techniques for measuring agent reliability across accuracy, latency, cost, and safety dimensions while accounting for real-world variables: prompt engineering, data quality, MCP tool availability, and model selection. Attendees will leave with actionable strategies to build confidence in production deployments and create feedback loops for continuous improvement.
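
To make the idea of a custom evaluation harness concrete, here is a minimal Python sketch, not the speakers' framework. It assumes a hypothetical agent callable that returns an answer together with a per-call cost, a hypothetical `EvalCase` schema for categorized workload examples, and exact-match scoring. It reports accuracy, mean latency, and total cost per category. Safety checks are omitted, and `stub_agent` with its fixed cost is a placeholder for illustration only.

```python
import time
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Tuple


@dataclass
class EvalCase:
    """One categorized example drawn from a real workload (hypothetical schema)."""
    category: str
    prompt: str
    expected: str


@dataclass
class CaseResult:
    category: str
    correct: bool
    latency_s: float
    cost_usd: float


# Assumed agent interface: prompt in, (answer, cost in USD) out.
Agent = Callable[[str], Tuple[str, float]]


def run_eval(agent: Agent, cases: List[EvalCase]) -> List[CaseResult]:
    """Run every case once, recording correctness, wall-clock latency, and cost."""
    results = []
    for case in cases:
        start = time.perf_counter()
        answer, cost = agent(case.prompt)
        latency = time.perf_counter() - start
        correct = answer.strip() == case.expected  # exact match; swap in your own scorer
        results.append(CaseResult(case.category, correct, latency, cost))
    return results


def summarize(results: List[CaseResult]) -> Dict[str, Dict[str, float]]:
    """Aggregate results per category."""
    by_category: Dict[str, List[CaseResult]] = {}
    for r in results:
        by_category.setdefault(r.category, []).append(r)
    return {
        cat: {
            "accuracy": mean(1.0 if r.correct else 0.0 for r in rs),
            "mean_latency_s": mean(r.latency_s for r in rs),
            "total_cost_usd": sum(r.cost_usd for r in rs),
        }
        for cat, rs in by_category.items()
    }


def stub_agent(prompt: str) -> Tuple[str, float]:
    """Placeholder agent with canned answers and a made-up fixed cost."""
    answers = {"2+2?": "4", "capital of France?": "Paris"}
    return answers.get(prompt, "unknown"), 0.001


cases = [
    EvalCase("math", "2+2?", "4"),
    EvalCase("math", "3+3?", "6"),
    EvalCase("geo", "capital of France?", "Paris"),
]
summary = summarize(run_eval(stub_agent, cases))
```

Grouping results by category is one way to see whether an agent that looks strong overall is weak on a specific slice of your workload. Storing the same summary per run over time would give a simple baseline for spotting degradations.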
Speakers
Gaurav Saxena

Director of Engineering
Gaurav Saxena is an engineering leader in the field of platform and cloud engineering with over 20 years of experience in the software industry. His technical expertise includes Stream-based architectures, Kubernetes, Service Mesh, Software Supply Chain Security, and Observabilit...
Matvey Kukuy

CEO, Archestra.AI
Maintainer: Grafana OnCall, KeepHQ, Archestra.AI.

Ex-Engineering Director at Grafana Labs.
Broadway Ballroom South (6th Floor)
  MCP Best Practices
