Beyond Tool Calling: Why AI Agents Should Write Code to Speak with MCP


Model Context Protocol (MCP) has rapidly become the standard for connecting AI models to external data and systems. However, the traditional method of "tool calling"—where an LLM outputs JSON blobs to trigger functions—is hitting a performance ceiling.


There is a superior approach emerging: converting MCP tools into a TypeScript API and asking the LLM to write executable code.

This shift moves us from chatty, token-heavy interactions to streamlined, agentic code execution. Here is why the future of AI agents lies in code generation, not just function calling.

The Problem with Traditional Tool Calling


In the standard "tool calling" model (often called function calling), the workflow looks like this:

  1. User: "What is the weather in Austin?"

  2. LLM: Generates a structured JSON payload (e.g., {"tool": "get_weather"}).

  3. System: Pauses the LLM, parses the JSON, calls the API, and feeds the result back.

  4. LLM: Reads the result and generates the final answer.


While functional, this approach is fragile. LLMs are trained on synthetic data to understand these specific JSON schemas; it is akin to teaching Shakespeare to speak Mandarin in a month. He could manage it, but it would never be his native tongue. Furthermore, complex tasks require multiple round trips, wasting tokens and increasing latency.
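
To make that cost concrete, here is a sketch of a single round trip. The message shapes below are simplified and hypothetical, not any particular vendor's schema:

TypeScript

// Step 2: the LLM emits a structured tool call instead of prose.
const assistantToolCall = {
    role: "assistant",
    toolCall: { name: "get_weather", arguments: { city: "Austin" } },
};

// Step 3: the host pauses generation, runs the tool, and appends the result.
const toolResult = {
    role: "tool",
    name: "get_weather",
    content: { condition: "Sunny", temperature: "95F" },
};

// Step 4: the LLM reads toolResult and writes the final answer. Chaining
// three tools means three of these pauses, each costing tokens and latency.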


The Solution: Code Mode

LLMs are excellent at writing code. They have been trained on millions of open-source repositories. When an LLM writes code, it draws on a massive, "native" dataset.

By converting MCP tools into a TypeScript API, we allow the agent to interact with tools using the medium it understands best: programming code.


How It Works


  1. Schema Conversion: The system fetches the MCP server's schema (the list of available tools).

  2. Type Generation: This schema is converted into a standard TypeScript interface (e.g., interface WeatherTools { getWeather(city: string): Promise<Result> }); a sketch of this step follows the list.

  3. Code Generation: Instead of outputting JSON, the LLM is prompted to write a TypeScript script that utilizes this interface to solve the user's problem.

  4. Sandboxed Execution: The generated script is executed in a secure environment, performing all necessary logic and API calls in one go.
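
As a concrete illustration of step 2, here is a minimal sketch that turns a deliberately simplified tool schema into TypeScript source. The ToolSchema shape is an assumption for this demo; real MCP servers describe their tools with JSON Schema:

TypeScript

// A simplified, hypothetical tool-schema shape (real MCP uses JSON Schema).
interface ToolSchema {
    name: string;                   // e.g. "getWeather"
    params: Record<string, string>; // parameter name -> TypeScript type
    returns: string;                // e.g. "string"
}

// Render the schema as TypeScript source the LLM can code against.
function schemaToInterface(interfaceName: string, tools: ToolSchema[]): string {
    const methods = tools.map((t) => {
        const params = Object.entries(t.params)
            .map(([name, type]) => `${name}: ${type}`)
            .join(", ");
        return `    ${t.name}(${params}): Promise<${t.returns}>;`;
    });
    return `interface ${interfaceName} {\n${methods.join("\n")}\n}`;
}

// Prints the WeatherTools interface shown in step 2 above.
console.log(schemaToInterface("WeatherTools", [
    { name: "getWeather", params: { city: "string" }, returns: "string" },
]));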


This approach shines when tasks are complex. If an agent needs to fetch data, filter it, and then use that data to call a second API, "Code Mode" allows the LLM to write a single script to handle the logic. No round-trips, no token waste.


Security: The Power of Isolates


Allowing an AI to write and execute arbitrary code sounds dangerous. This is where modern sandboxing architecture comes in.

Instead of heavy Docker containers, this architecture utilizes V8 Isolates. Isolates are lightweight JavaScript runtimes that spin up in milliseconds. They allow the system to create a fresh, disposable environment for every single request.
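
As a rough sketch of the "fresh environment per request" idea, Node's built-in node:vm module (a stand-in here; real V8 isolates are lighter and more strictly confined) can spin up a disposable context per run:

TypeScript

import vm from "node:vm";

// Sketch only: node:vm illustrates the idea but is not a hardened
// security boundary the way production isolate runtimes are.
function runDisposable(code: string, bindings: Record<string, unknown>) {
    // A brand-new context per request: no shared globals, no leftover state.
    const context = vm.createContext({ ...bindings, console });
    return vm.runInContext(code, context);
}

// Each call gets its own throwaway environment.
runDisposable(`console.log("request A sees x =", x)`, { x: 1 });
runDisposable(`console.log("request B sees x =", x)`, { x: 2 });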

The "Binding" Advantage


In this model, the sandbox has no direct internet access. It cannot simply fetch('google.com'). Instead, it communicates with the outside world strictly through Bindings—RPC connections to the MCP servers.

  • No API Keys Leaked: The AI code never sees the API keys. It calls a function like weather.get(), and the binding handles the authentication out-of-band (see the sketch below).

  • Granular Control: You don't need complex network proxies. You simply define which TypeScript functions are available in the sandbox.
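
Here is a minimal sketch of a binding, assuming hypothetical names throughout (weather.get(), api.example.com, and WEATHER_API_KEY are placeholders, not a real API):

TypeScript

// Host side: the binding holds the secret and performs the real call.
const API_KEY = process.env.WEATHER_API_KEY; // hypothetical; never enters the sandbox

const weatherBinding = {
    async get(city: string): Promise<string> {
        // Authentication happens out-of-band, on the host.
        const res = await fetch(
            `https://api.example.com/weather?city=${encodeURIComponent(city)}`,
            { headers: { Authorization: `Bearer ${API_KEY}` } },
        );
        return res.text();
    },
};

// Only this narrow surface is exposed inside the sandbox: generated code
// can call weather.get("Austin") but has no fetch, no env, and no keys.
const sandboxGlobals = { weather: weatherBinding, console };

Because the sandboxed code only ever sees weather.get(), rotating keys, adding rate limits, or logging calls all happen on the host without touching the generated code.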

Conclusion

By treating MCP servers as libraries rather than just JSON endpoints, we unlock the full reasoning and coding potential of modern LLMs. This approach reduces latency, improves accuracy on complex tasks, and provides a stricter security model for enterprise AI deployment.


Try It Yourself: The "Code Mode" Simulation

Below is a complete Node.js project you can run locally or push to a Git repository. It demonstrates the core concept: defining a tool, generating a TypeScript interface for it, and simulating how an execution engine runs that code securely.


Project Structure

Create a folder and set up the following files:

Plaintext

mcp-code-mode/
├── src/
│   ├── engine.ts       # The sandboxed execution engine
│   ├── mcp-server.ts   # A mock MCP server (The Tools)
│   ├── index.ts        # Main entry point
│   └── types.ts        # Type definitions
├── package.json
└── tsconfig.json

1. package.json

JSON

{
  "name": "mcp-code-mode-demo",
  "version": "1.0.0",
  "description": "Demonstration of LLM Code Generation for MCP Tools",
  "main": "dist/index.js",
  "scripts": {
    "start": "ts-node src/index.ts",
    "build": "tsc"
  },
  "dependencies": {
    "ts-node": "^10.9.1",
    "typescript": "^5.0.0"
  },
  "devDependencies": {
    "@types/node": "^20.0.0"
  }
}

2. tsconfig.json

JSON

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "CommonJS",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true
  }
}

3. src/types.ts

TypeScript

// The shape of our Mock MCP Tools
export interface IMcpTools {
    getWeather(city: string): Promise<string>;
    calculateDistance(cityA: string, cityB: string): Promise<number>;
}

4. src/mcp-server.ts

This represents the "External Tool" or MCP Server. In a real scenario, this would be a remote connection.

TypeScript

import { IMcpTools } from './types';

export const McpServer: IMcpTools = {
    async getWeather(city: string): Promise<string> {
        console.log(`[MCP Server] Fetching weather for ${city}...`);
        // Mock data
        const weathers: Record<string, string> = {
            "Austin": "Sunny, 95F",
            "London": "Rainy, 60F",
            "Tokyo": "Clear, 75F"
        };
        return weathers[city] || "Unknown";
    },

    async calculateDistance(cityA: string, cityB: string): Promise<number> {
        console.log(`[MCP Server] Calculating distance between ${cityA} and ${cityB}...`);
        return Math.floor(Math.random() * 5000) + 1000; // Mock distance
    }
};

5. src/engine.ts

This simulates the "Sandboxed Environment." It takes the code string (which an LLM would generate) and executes it with access only to the MCP tools.

TypeScript

import vm from 'node:vm'; // Node's built-in VM module, used to simulate a sandbox
import { McpServer } from './mcp-server';

export async function executeAgentCode(userCode: string) {
    console.log("--- Starting Sandbox Execution ---");

    // We create a context (sandbox) where the code runs.
    // The code only has access to 'console' and our 'tools'.
    // It does NOT have access to 'process', 'fs', or the internet.
    // (node:vm is a demonstration stand-in, not a hardened security boundary.)
    const sandbox = {
        console: console,
        tools: McpServer, // Injecting the MCP tools as a global object
    };
    const context = vm.createContext(sandbox);

    try {
        // Wrap user code in an async IIFE to allow await
        const wrappedCode = `
            (async () => {
                ${userCode}
            })();
        `;

        // Execute the code inside the sandboxed context and await its promise
        await vm.runInContext(wrappedCode, context);
        
    } catch (error) {
        console.error("Runtime Error in Sandbox:", error);
    }
    console.log("--- Sandbox Execution Finished ---");
}

6. src/index.ts

This acts as the Agent. It defines the inputs and simulates the LLM's output.

TypeScript

import { executeAgentCode } from './engine';

// 1. This is the Context we provide to the LLM
const systemPrompt = `
You have access to a global object called 'tools'.
The interface is:
interface IMcpTools {
    getWeather(city: string): Promise<string>;
    calculateDistance(cityA: string, cityB: string): Promise<number>;
}
Write a script to check the weather in Austin and London, and find the distance between them.
`;

// 2. This is the OUTPUT we simulate receiving from the LLM.
// In a real app, you would call OpenAI/Anthropic here.
// Notice: The LLM writes LOGIC, not just a JSON object.
const simulatedLlmOutput = `
    const weatherAustin = await tools.getWeather("Austin");
    console.log("Austin Weather:", weatherAustin);

    const weatherLondon = await tools.getWeather("London");
    console.log("London Weather:", weatherLondon);

    if (weatherAustin.includes("Sunny")) {
        console.log("It's a good day in Austin, checking travel distance...");
        const distance = await tools.calculateDistance("Austin", "London");
        console.log("Distance to travel:", distance, "miles");
    } else {
        console.log("Staying home.");
    }
`;

// 3. Run the "Code Mode"
async function main() {
    console.log("Agent received prompt:", systemPrompt);
    console.log("\nAgent generated code:\n", simulatedLlmOutput);
    console.log("\nExecuting in Sandbox...\n");
    
    await executeAgentCode(simulatedLlmOutput);
}

main();

How to Run

  1. Install the dependencies: npm install

  2. Run the simulation: npm start


You will see the system execute the logic dynamically. This demonstrates how an agent can chain logic (if/else), use variables, and call tools multiple times in a single pass without needing a "Manager" loop to process JSON responses.
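
If everything is wired up correctly, the tail of the output looks roughly like this (the distance is random on each run):

Plaintext

--- Starting Sandbox Execution ---
[MCP Server] Fetching weather for Austin...
Austin Weather: Sunny, 95F
[MCP Server] Fetching weather for London...
London Weather: Rainy, 60F
It's a good day in Austin, checking travel distance...
[MCP Server] Calculating distance between Austin and London...
Distance to travel: 3412 miles
--- Sandbox Execution Finished ---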
