Enables LLM agents to capture screenshots and visualize runtime behavior within the Godot game engine for game development benchmarking. It utilizes AppleScript to provide display capture functionality for agents running on macOS.
A benchmark suite for evaluating LLM agents on game development tasks.
Paper: GameDevBench: A Comprehensive Benchmark for Game Development
GameDevBench contains 132 game development tasks to evaluate LLM agents' ability to complete game development problems in the Godot game engine.
Godot 4.x - Download and install from godotengine.org
Ensure godot is available in your PATH, or set the GODOT_EXEC_PATH environment variable
Python 3.10+ - Required for all agents
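The Godot lookup described in the prerequisites can be sketched in Python. This is a minimal sketch, not GameDevBench's own code: `resolve_godot` is a hypothetical helper, and checking GODOT_EXEC_PATH before PATH is an assumed precedence that this README does not state.

```python
import os
import shutil


def resolve_godot(env=None):
    """Return a path to the Godot executable, or None if none is found.

    Hypothetical helper (not part of GameDevBench). Assumption: when
    GODOT_EXEC_PATH is set, it takes precedence over a `godot` binary
    found on PATH.
    """
    env = os.environ if env is None else env
    explicit = env.get("GODOT_EXEC_PATH")
    if explicit:
        return explicit
    # shutil.which searches the directories in the given PATH string and
    # returns None when the string is empty or no match is found.
    return shutil.which("godot", path=env.get("PATH", ""))


if __name__ == "__main__":
    found = resolve_godot()
    print(found if found else "Godot not found: install it or set GODOT_EXEC_PATH")
```

Running a check like this before starting the benchmark surfaces a missing Godot install early, rather than partway through a task run.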
Install the agent(s) you want to use:
Before running the benchmark, unzip the tasks folder:
unzip tasks.zip
Note: The tasks are distributed as a zip file to prevent accidental data leakage.
You can use the built-in plans for claude-code, codex, and gemini-cli, or provide API keys directly. For OpenHands you must provide your own API keys. See .env.example for a complete list of optional environment variables.
uv run python gamedevbench/src/benchmark_runner.py \
--agent AGENT \
--model MODEL \
run --task-list tasks.yaml
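The invocation above can also be assembled from a script, for example when sweeping several agents or models. The following is a minimal sketch: `build_command` is a hypothetical helper, the flag names are copied from this README, and placing the optional flags before the `run` subcommand is an assumption.

```python
import shlex


def build_command(agent, model, task_list="tasks.yaml",
                  enable_mcp=False, use_runtime_video=False,
                  skip_display=False):
    """Assemble the benchmark_runner.py invocation as an argv list.

    Hypothetical helper (not part of GameDevBench). Assumption: optional
    flags are accepted before the `run` subcommand.
    """
    cmd = ["uv", "run", "python", "gamedevbench/src/benchmark_runner.py",
           "--agent", agent, "--model", model]
    if enable_mcp:
        cmd.append("--enable-mcp")
    if use_runtime_video:
        cmd.append("--use-runtime-video")
    if skip_display:
        cmd.append("--skip-display")
    cmd += ["run", "--task-list", task_list]
    return cmd


if __name__ == "__main__":
    # Print the command in shell-quoted form for inspection.
    print(shlex.join(build_command("claude-code", "MODEL", skip_display=True)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root; building an argv list rather than a single string avoids shell-quoting problems with model names or paths.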
claude-code - Anthropic's Claude Code CLI
codex - OpenAI Codex
gemini-cli - Google Gemini CLI
openhands - OpenHands (requires Python 3.12+)

--agent AGENT - Agent to use (required)
--model MODEL - Model name (e.g., claude-sonnet-4.5-20250929)
--enable-mcp - Enable MCP (Model Context Protocol) server for supported agents
--use-runtime-video - Enable runtime video mode
--skip-display - Skip tasks that require display
run --task-list FILE - Run tasks from YAML file (e.g., tasks.yaml)

macOS-only Features:
The MCP server (--enable-mcp) currently only works on macOS
Set the GODOT_SCREENSHOT_DISPLAY environment variable to the correct display number

Benchmark results are saved to the results/ directory with the following information:
@misc{chi2026gamedevbenchevaluatingagenticcapabilities,
title={GameDevBench: Evaluating Agentic Capabilities Through Game Development},
author={Wayne Chi and Yixiong Fang and Arnav Yayavaram and Siddharth Yayavaram and Seth Karten and Qiuhong Anna Wei and Runkun Chen and Alexander Wang and Valerie Chen and Ameet Talwalkar and Chris Donahue},
year={2026},
eprint={2602.11103},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2602.11103},
}
Run in your terminal:
claude mcp add gamedevbench-mcp -- npx