Code Mode: give agents an entire API in 1,000 tokens

Matt Carey
9 min read · intermediate

Overview

The article introduces Code Mode, a technique that gives AI agents access to the entire Cloudflare API at a cost of only 1,000 tokens of context. It describes a new Model Context Protocol (MCP) server built on this approach and explains how the server keeps API access efficient while executing agent-generated code securely.

What You'll Learn

1. How to implement Code Mode for efficient API interaction
2. Why reducing context window usage is crucial for AI agents
3. When to use the Cloudflare MCP server for API access

Key Questions Answered

How does Code Mode reduce token usage for API interactions?
Code Mode lets AI agents write and execute code against a typed SDK, so a single tool call can compose multiple API requests and return only the data the agent needs. Compared with a traditional MCP server, this cuts input tokens by 99.9% and exposes the entire Cloudflare API through just two tools, search() and execute().
What are the main tools provided by the new Cloudflare MCP server?
The new Cloudflare MCP server provides two main tools: search() for querying the OpenAPI spec and execute() for executing JavaScript code against the Cloudflare API. This setup allows agents to discover and interact with thousands of API endpoints efficiently.
What security measures are in place for executing code in Code Mode?
Code Mode runs generated code inside a Dynamic Worker isolate, which is a lightweight V8 sandbox. This environment has no file system access, no environment variables to leak, and external fetches are disabled by default, ensuring secure execution of the code.
How does the Cloudflare MCP server handle OAuth 2.1 compliance?
The Cloudflare MCP server is OAuth 2.1 compliant and uses the Workers OAuth Provider to downscope tokens to the permissions explicitly granted by users. This ensures that agents only have access to the capabilities that users approve.
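
To make the two-tool idea concrete, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the `Operation` and `Client` shapes, the three-entry spec, the fake client, and the signatures of `search` and `execute` are hypothetical and are not taken from the Cloudflare MCP server. It shows only the pattern the answers describe: look up a small slice of the spec on demand, then run agent-written code that composes several calls and returns a compact result.

```typescript
// Hypothetical, heavily simplified stand-ins for the two tools described above.
// None of these names or shapes come from the Cloudflare MCP server itself.
interface Operation {
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string;
  summary: string;
}

// A tiny made-up slice of an OpenAPI-style spec.
const spec: Operation[] = [
  { method: "GET", path: "/zones", summary: "List zones" },
  { method: "GET", path: "/zones/{zone_id}/dns_records", summary: "List DNS records" },
  { method: "POST", path: "/zones/{zone_id}/dns_records", summary: "Create DNS record" },
];

// search(): return only the operations matching a keyword, so the full spec
// never has to enter the model's context.
function search(keyword: string): Operation[] {
  const needle = keyword.toLowerCase();
  return spec.filter(
    (op) =>
      op.path.toLowerCase().includes(needle) ||
      op.summary.toLowerCase().includes(needle),
  );
}

// A fake client backed by in-memory data. A real client would be async and
// would make authenticated HTTP requests.
interface DnsRecord {
  name: string;
  type: string;
}
interface Client {
  listZones(): { id: string; name: string }[];
  listDnsRecords(zoneId: string): DnsRecord[];
}

const fakeClient: Client = {
  listZones: () => [
    { id: "z1", name: "example.com" },
    { id: "z2", name: "example.org" },
  ],
  listDnsRecords: (zoneId) =>
    zoneId === "z1"
      ? [
          { name: "example.com", type: "A" },
          { name: "www.example.com", type: "CNAME" },
        ]
      : [{ name: "example.org", type: "A" }],
};

// execute(): run agent-written code against the client and hand back only
// its return value. This sketch provides no isolation of any kind.
function execute<T>(agentCode: (client: Client) => T): T {
  return agentCode(fakeClient);
}

// Agent-written code: three API calls composed into one compact summary.
const recordCounts = execute((client) =>
  client.listZones().map((zone) => ({
    zone: zone.name,
    records: client.listDnsRecords(zone.id).length,
  })),
);
```

Only `recordCounts`, two small objects, would flow back to the model. The intermediate zone and record listings stay inside the executed code, which is where the token savings come from.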

Key Statistics & Figures

Reduction in input tokens used
99.9%
Measured against a traditional MCP server, which would consume roughly 1.17 million tokens to cover the same API.
Token footprint for Cloudflare API access
1,000 tokens
The footprint is fixed: it covers the entire Cloudflare API and does not grow with the number of endpoints.
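
As a quick sanity check, the two figures above are consistent with each other. The sketch below uses only the numbers quoted in this section; the helper name `reductionPercent` is made up for the example.

```typescript
// Percentage reduction in input tokens when going from `before` to `after`.
function reductionPercent(before: number, after: number): number {
  return (1 - after / before) * 100;
}

// Figures quoted above: about 1.17 million tokens versus 1,000 tokens.
const reduction = reductionPercent(1_170_000, 1_000);
console.log(reduction.toFixed(2) + "%"); // prints "99.91%"
```

The result rounds to the 99.9% headline figure.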

Technologies & Tools

Backend
Cloudflare API
Used for accessing various services like DNS, Zero Trust, Workers, and R2.
Backend
Dynamic Worker
Provides a secure environment for executing JavaScript code generated by agents.
Protocol
Model Context Protocol (MCP)
Standard for AI agents to use external tools effectively.

Key Actionable Insights

1. Implement Code Mode to streamline API interactions and reduce token consumption significantly.
   By using Code Mode, developers can enable AI agents to perform complex tasks with minimal context, making them more efficient and responsive.
2. Utilize the Cloudflare MCP server for comprehensive API access without overwhelming the context window.
   This approach allows developers to manage multiple API endpoints efficiently while maintaining a fixed token footprint, which is essential for scalable AI applications.
3. Leverage the Dynamic Worker isolate for secure code execution in your applications.
   Using a sandboxed environment ensures that your agents can execute code safely, protecting against potential vulnerabilities associated with external fetches and environment leaks.
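
The "disabled by default" stance on external fetches can be pictured as a deny-by-default outbound policy. The sketch below is an illustration of that idea only. It is not how a Dynamic Worker isolate enforces the restriction, and `isOutboundAllowed` and its allowlist parameter are hypothetical names introduced here.

```typescript
// Hypothetical outbound policy: every host is denied unless it has been
// explicitly allowlisted. With the default empty allowlist, nothing passes.
function isOutboundAllowed(
  url: string,
  allowedHosts: ReadonlySet<string> = new Set<string>(),
): boolean {
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return false; // unparseable URLs are rejected outright
  }
  return allowedHosts.has(host);
}

// Default policy: no outbound access at all.
const defaultDecision = isOutboundAllowed("https://example.com/data"); // false

// Opting in to a single host leaves every other destination blocked.
const apiOnly = new Set<string>(["api.cloudflare.com"]);
const apiDecision = isOutboundAllowed("https://api.cloudflare.com/client/v4/zones", apiOnly); // true
const otherDecision = isOutboundAllowed("https://example.com/data", apiOnly); // false
```

Starting from an empty allowlist means a mistake in configuration fails closed rather than open.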

Common Pitfalls

1. Failing to properly manage token permissions when using the Cloudflare MCP server.
   Without careful attention to OAuth scopes, agents may gain excessive permissions, leading to security vulnerabilities. Always ensure that tokens are downscoped to the minimum necessary permissions.
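
Downscoping can be thought of as a set intersection between what a client requests and what the user approved. The following is a minimal sketch of that idea under stated assumptions: it does not use or reproduce the Workers OAuth Provider API, and the function names and scope strings (`dns:read`, `workers:edit`, and so on) are invented for illustration.

```typescript
// Keep only the requested scopes that the user explicitly approved.
function downscope(
  requested: readonly string[],
  approved: readonly string[],
): string[] {
  const approvedSet = new Set(approved);
  return requested.filter((scope) => approvedSet.has(scope));
}

// Check before a call: does the token carry the scope the operation needs?
function canPerform(tokenScopes: readonly string[], required: string): boolean {
  return tokenScopes.includes(required);
}

// Scope names below are made up for this example.
const granted = downscope(
  ["dns:read", "dns:edit", "workers:edit"],
  ["dns:read", "workers:edit"],
);
// granted is ["dns:read", "workers:edit"]; "dns:edit" was requested but never approved.

const mayEditDns = canPerform(granted, "dns:edit"); // false
const mayReadDns = canPerform(granted, "dns:read"); // true
```

Because the result can only shrink relative to the request, an agent never ends up with a capability the user did not approve.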

Related Concepts

Model Context Protocol (MCP)
Dynamic Workers
Cloudflare API
OAuth 2.1 Compliance