AI Integration Overview
Language-agnostic concepts: agents, tools, MCP, communication protocols, and how to choose the right framework for integrating AI into your applications.
Integrating AI into an application goes far beyond making an API call to a model. It means designing how the model interacts with your domain, what tools it has available, how it communicates with external systems, and how you maintain control and traceability over everything it does.
This section covers the foundational, language-agnostic concepts you need to understand before choosing a framework. Concrete implementations are in the specific pages for TypeScript, PHP, and Python.
Mental model: from LLM to application
A language model on its own is a function from text to text. To be useful inside a real application, you need:
Application
├── Orchestration layer – decides what to do with the model's response
├── Integration layer – connects the model with your systems
│   ├── Tools / Functions – actions the model can invoke
│   ├── Context – relevant information the model receives
│   └── Memory – persistence of information across turns
└── Observation layer – traces, metrics, evaluation
The combination of these layers is what an AI framework provides out of the box. Without one, you build them manually.
Universal primitives
Regardless of language or framework, every serious AI integration revolves around the same primitives:
1. Tool / Function calling
The model's ability to invoke functions in your code. The model decides when and with what arguments to call each tool; your code executes them.
LLM → "I need to look up order 42"
   ↓
tool: get_order(id: 42)
   ↓
App → executes the function, returns result
   ↓
LLM → "Order 42 is in 'shipped' status"
Each tool is described with:
- Name: unique identifier
- Description: natural language text explaining what it does (critical: the model uses this to decide whether to call it)
- Parameter schema: typically JSON Schema or a typed equivalent
- Handler: the actual function that executes the action
{
  "name": "get_order",
  "description": "Retrieves an order by its ID including status, items, and shipping info",
  "parameters": {
    "type": "object",
    "properties": {
      "id": { "type": "integer", "description": "The order ID" }
    },
    "required": ["id"]
  }
}
The description is the contract
A tool description is not documentation: it's the instruction the model reads to decide whether to use it and how. An ambiguous or incomplete description produces incorrect calls or unnecessary invocations. Treat it with the same care as a system prompt.
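The loop above can be sketched in a few lines. This is a minimal illustration with the LLM stubbed out: `fake_llm`, `get_order`, and the message shapes are illustrative, not any real SDK's API.

```python
# The app owns the loop: the model requests a tool, the app executes it and
# feeds the result back until the model produces a final answer.
TOOLS = {
    "get_order": lambda args: {"id": args["id"], "status": "shipped"},
}

def fake_llm(messages):
    """Stand-in for a real model call: first asks for a tool, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_order", "arguments": {"id": 42}}}
    return {"content": "Order 42 is in 'shipped' status"}

def run(user_message):
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = fake_llm(messages)
        if "tool_call" not in reply:
            return reply["content"]                      # final answer
        call = reply["tool_call"]
        result = TOOLS[call["name"]](call["arguments"])  # app executes the tool
        messages.append({"role": "tool", "content": result})

print(run("What is the status of order 42?"))  # Order 42 is in 'shipped' status
```

Note that the model never executes anything itself; it only emits a request, and your code decides whether and how to honor it.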
2. Structured output
Forcing the model to produce responses in a specific format (JSON, XML, etc.) rather than free text. Essential for integrating responses into business workflows.
Prompt: "Extract the order data from this email..."
Unstructured output: "The order is number 42, the customer is..."
Structured output: { "order_id": 42, "customer": "...", "items": [...] }
Modern frameworks expose this directly from your type schema (Zod in TS, Pydantic in Python, etc.), without needing to parse text manually.
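A stdlib-only sketch of the idea: in practice a framework derives the JSON schema from your Zod/Pydantic model and validates the response for you, but the essence is parsing the model's JSON into a typed object that fails loudly on bad output.

```python
import json
from dataclasses import dataclass

@dataclass
class OrderExtraction:
    order_id: int
    customer: str
    items: list

def parse_order(raw_json: str) -> OrderExtraction:
    data = json.loads(raw_json)        # fails loudly on malformed output
    return OrderExtraction(**data)     # fails loudly on missing/extra fields

raw = '{"order_id": 42, "customer": "Ada", "items": ["keyboard"]}'  # model output
order = parse_order(raw)
print(order.order_id)  # 42
```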
3. Streaming
LLM responses are generated token by token. Streaming lets you show text to the user as it's generated rather than waiting for the full response; it dramatically improves perceived speed.
Without streaming: user waits 8 seconds → receives complete text
With streaming:    user sees text appear progressively from the first token
In REST APIs, streaming typically uses Server-Sent Events (SSE) or chunked transfer encoding. Frameworks abstract this with iterators, streams, or reactive hooks.
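The consumer side usually looks like iterating a stream of chunks. A sketch with a fake token generator standing in for the SDK's stream:

```python
def fake_token_stream():
    """Stand-in for an SDK stream fed by SSE / chunked HTTP."""
    for token in ["Order ", "42 ", "is ", "shipped."]:
        yield token

def render_stream(stream):
    full = ""
    for token in stream:
        full += token
        # in a web app you would flush `token` to the client here
    return full

print(render_stream(fake_token_stream()))  # Order 42 is shipped.
```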
4. Memory and context
An LLM is stateless: it doesn't remember previous conversations. "Memory" is simply context that your application manages and sends with each call:
| Type | Description | Storage |
|---|---|---|
| In-context | Message history within the current context window | In-memory message array |
| Short-term | Summaries or fragments of recent conversations | Redis, database |
| Long-term | Persistent information about the user or entity | Vector DB, relational database |
| Episodic | Results and learnings from past executions | Database + embeddings |
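In-context memory, the simplest row of the table, is just the message list your app maintains, trimmed when it outgrows the context window. A sketch where the word-count "tokenizer" and the budget are toy stand-ins for real tokenization:

```python
def trim_history(messages, max_tokens=20):
    def cost(msg):
        return len(msg["content"].split())  # crude stand-in for real token counting
    trimmed = list(messages)
    # always keep the system prompt (index 0); drop the oldest turns after it
    while len(trimmed) > 2 and sum(cost(m) for m in trimmed) > max_tokens:
        trimmed.pop(1)
    return trimmed

history = [
    {"role": "system", "content": "You are a support assistant"},
    {"role": "user", "content": "My order 41 never arrived and I want a refund now"},
    {"role": "assistant", "content": "I am sorry to hear that, let me check order 41"},
    {"role": "user", "content": "Also check order 42"},
]
print(len(trim_history(history, max_tokens=20)))
```

Short-term and long-term memory follow the same principle, except the trimmed-out turns get summarized or embedded and stored externally instead of discarded.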
5. Agent
An agent is an LLM operating in a loop: observe, decide an action (tool call or response), execute, observe the result, repeat. See AI Agents for the full architecture.
What frameworks add over a manual loop:
- Agent state management across steps
- Error handling and retries
- Tracing of each decision
- Context injection at each iteration
- Types and interfaces for tools and responses
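A manual loop with two of those guardrails (a step limit and per-tool error handling) might look like this sketch, where `decide` stands in for the LLM:

```python
def run_agent(decide, tools, task, max_steps=5):
    state = {"task": task, "observations": []}
    for _ in range(max_steps):
        action = decide(state)               # LLM chooses a tool or finishes
        if action["type"] == "final":
            return action["answer"]
        try:
            result = tools[action["tool"]](action["args"])
        except Exception as exc:             # feed failures back instead of crashing
            result = {"error": str(exc)}
        state["observations"].append(result)
    raise RuntimeError("agent exceeded max_steps")

# Toy policy: look the order up once, then answer from the observation.
def decide(state):
    if not state["observations"]:
        return {"type": "tool", "tool": "get_order", "args": {"id": 42}}
    return {"type": "final", "answer": state["observations"][0]["status"]}

tools = {"get_order": lambda args: {"id": args["id"], "status": "shipped"}}
print(run_agent(decide, tools, "check order 42"))  # shipped
```

A framework replaces the hand-rolled `state`, retries, and step accounting with typed, traced equivalents.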
MCP: Model Context Protocol
MCP is an open protocol (originally from Anthropic, now under AAIF) that standardizes how agents connect with external tools. It is, essentially, the "USB-C of agent integrations".
Without MCP: Agent A ↔ Tool (Agent A's own format)
             Agent B ↔ Tool (Agent B's own format)   ← same tool, two implementations
With MCP:    Agent A ↔ MCP ↔ Tool
             Agent B ↔ MCP ↔ Tool                    ← same tool, one protocol
How MCP works
MCP defines a client-server protocol:
Application/Agent (MCP Client)
    ↕ JSON-RPC 2.0 over stdio / HTTP+SSE
MCP Server (exposes tools, resources, prompts)
    ↕
External system (database, API, filesystem, etc.)
An MCP Server exposes three types of capabilities:
| Capability | Description | Example |
|---|---|---|
| Tools | Functions the model can invoke | execute_sql, create_issue |
| Resources | Data the model can read | Files, DB records, URLs |
| Prompts | Reusable prompt templates | Analysis templates, summaries |
Implementing an MCP Server (language-agnostic pseudocode)
MCP Server "database-server":
tools:
- name: "query_database"
description: "Execute a read-only SQL query"
parameters:
sql: string (required)
limit: integer (optional, default: 100)
handler:
validate sql is SELECT only
execute against read replica
return results as JSON
resources:
- uri: "db://schema"
description: "Current database schema"
handler:
return INFORMATION_SCHEMA tables as markdown
transport: stdio // or HTTP+SSE for remote servers
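What actually travels over the wire is JSON-RPC 2.0; tool invocation uses the `tools/call` method. The sketch below dispatches one such request to the hypothetical "query_database" tool from the pseudocode above; note the result shape is simplified (real MCP servers wrap results in a `content` array).

```python
import json

def handle_request(raw, tools):
    req = json.loads(raw)
    if req["method"] == "tools/call":
        name = req["params"]["name"]
        args = req["params"].get("arguments", {})
        return {"jsonrpc": "2.0", "id": req["id"], "result": tools[name](args)}
    # standard JSON-RPC error for unknown methods
    return {"jsonrpc": "2.0", "id": req["id"],
            "error": {"code": -32601, "message": "Method not found"}}

tools = {"query_database": lambda args: {"rows": [], "sql": args["sql"]}}
request = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/call",
                      "params": {"name": "query_database",
                                 "arguments": {"sql": "SELECT 1"}}})
print(handle_request(request, tools)["result"]["sql"])  # SELECT 1
```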
MCP adoption
As of early 2026, there are over 10,000 public MCP servers. If you expose capabilities for agents, implementing MCP means any agent (Claude, GPT-4, Gemini, any framework) can use your tools without framework-specific integration code.
MCP vs direct tools
| Direct tools | MCP | |
|---|---|---|
| Integration | Custom code in the same process | Separate process, standard protocol |
| Reusability | One implementation per framework | One implementation, any framework |
| Security | Direct runtime access | Isolated process, controlled attack surface |
| Overhead | None (in-process) | IPC or HTTP (typically <5ms) |
| Best for | App-specific tools | Shared tools, infrastructure, third-parties |
Agent-to-agent communication protocols
When a system has multiple agents that need to coordinate:
A2A Protocol (Agent-to-Agent)
Google's proposal (2025) for communication between agents across different systems. Where MCP solves the agent↔tool link, A2A solves agent↔agent:
Orchestrator Agent ↔ A2A ↔ Specialist Agent A
                   ↔ A2A ↔ Specialist Agent B
Each agent publishes an Agent Card (JSON) describing its capabilities, allowing the orchestrator to discover and delegate dynamically.
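An Agent Card might look roughly like this; field names here are illustrative of the idea, so consult the A2A specification for the exact schema:

```json
{
  "name": "researcher-agent",
  "description": "Finds and summarizes sources on a given topic",
  "url": "https://agents.example.com/researcher",
  "skills": [
    { "id": "web_research", "description": "Search and synthesize web sources" }
  ]
}
```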
Direct handoff
For simpler systems, handoff can simply be a tool call where the result includes the complete state for the next agent:
orchestrator_tool: delegate_to_researcher(task, context)
   → researcher runs
   → returns { findings, sources, confidence }
orchestrator uses findings to call: delegate_to_writer(task, findings)
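As plain code, a direct handoff is just function composition: delegation is a tool call whose return value carries the state the next agent needs. Both "agents" in this sketch are stubs.

```python
def delegate_to_researcher(task, context):
    # stub: a real implementation would run a research agent here
    return {"findings": ["fact A", "fact B"], "sources": ["doc-1"], "confidence": 0.8}

def delegate_to_writer(task, findings):
    # stub: a real implementation would run a writing agent here
    return f"Report on '{task}' built from {len(findings)} findings"

def orchestrate(task):
    research = delegate_to_researcher(task, context={})
    return delegate_to_writer(task, research["findings"])

print(orchestrate("Q3 sales"))  # Report on 'Q3 sales' built from 2 findings
```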
Observability: agent tracing
Agents are hard to debug because errors chain together. You need traceability at the level of each decision, not just the final input/output.
The minimum unit of tracing is a span:
Span: process_user_request (root)
├── Span: llm_call (model=claude-3-5-sonnet, tokens=1847)
│     input: [messages array]
│     output: tool_call: search_orders(query="...")
├── Span: tool_execution (tool=search_orders)
│     input: { query: "..." }
│     output: [3 orders]
├── Span: llm_call (model=claude-3-5-sonnet, tokens=2103)
│     input: [messages + tool result]
│     output: final response
└── metadata: total_tokens=3950, duration=4.2s, cost=$0.012
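A minimal sketch of span collection; real systems use OpenTelemetry or an LLM-specific platform, but the core shape is the same, namely timed and nested records:

```python
import time
from contextlib import contextmanager

TRACE, _stack = [], []

@contextmanager
def span(name, **metadata):
    record = {"name": name, "metadata": metadata, "children": []}
    # attach to the current parent, or to the trace root
    (_stack[-1]["children"] if _stack else TRACE).append(record)
    _stack.append(record)
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_s"] = time.perf_counter() - start
        _stack.pop()

with span("process_user_request"):
    with span("llm_call", model="claude-3-5-sonnet"):
        pass
    with span("tool_execution", tool="search_orders"):
        pass

print([child["name"] for child in TRACE[0]["children"]])
# ['llm_call', 'tool_execution']
```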
Relevant standards
| Standard | Purpose |
|---|---|
| OpenTelemetry | Distributed traces, metrics, and logs. Natively supported by most frameworks. |
| LangSmith | LLM-specific platform for tracing, evals, and datasets. Integrates with LangChain/LangGraph but also works as an agnostic SDK. |
| Braintrust | LangSmith alternative, more focused on evals. |
How to choose a framework
The decision depends primarily on the project's language, but there are nuances:
What type of application are you building?
│
├── API/backend with AI responses (streaming, structured output)
│     ├── TypeScript → AI SDK (Vercel)
│     ├── PHP → Symfony AI or Neuron AI
│     └── Python → Pydantic AI
│
├── Complex agents with workflows and memory
│     ├── TypeScript → Mastra
│     ├── PHP → Neuron AI
│     └── Python → LangGraph
│
├── RAG + semantic search
│     ├── TypeScript → AI SDK + vector store
│     └── Python → LangChain / LlamaIndex
│
└── Observability / evals on existing LLM
      └── LangSmith (framework-agnostic)
Avoid over-engineering
You don't need an agent framework to add AI to your application. If you only need structured output or a streaming call, use the provider's official SDK directly. Agent frameworks add value when you have tool loops, multi-step workflows, or persistent memory.
Integration patterns by use case
Case 1: Data enrichment
The model processes entities from your domain and adds structured information.
Input: Order { id, description, items }
LLM: classifies category, extracts intent, suggests tags
Output: Order + { category, intent, tags, confidence }
Framework: Direct SDK with structured output. No tools, no agent.
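The flow above, with the model call stubbed out: the order goes in, a structured classification comes back, and the app merges the two. `classify_order` stands in for a single structured-output LLM call.

```python
def classify_order(order):
    # stub for one structured-output LLM call; values are illustrative
    return {"category": "electronics", "intent": "purchase",
            "tags": ["gadget"], "confidence": 0.92}

def enrich(order):
    # merge the original entity with the model's structured additions
    return {**order, **classify_order(order)}

order = {"id": 42, "description": "wireless keyboard", "items": ["kb-01"]}
print(enrich(order)["category"])  # electronics
```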
Case 2: Conversational interface over data
The user asks questions in natural language about your system.
User: "How many pending orders are there over €500?"
LLM: calls tool count_orders(status="pending", min_amount=500)
App: executes query, returns 12
LLM: "There are 12 pending orders over 500 euros"
Framework: SDK with tools or lightweight framework (AI SDK, Pydantic AI).
Case 3: Process automation
The agent completes a multi-step task without human intervention.
Task: "Review open PRs, comment on those with type errors, and close ones with no activity for over 30 days"
Agent: list_prs() β analyse_each() β comment() / close()
Framework: Full agent framework (Mastra, LangGraph, Neuron AI).
Case 4: Structured content generation
The model generates documents, reports, or code from context.
Input: project data, metrics, history
Output: executive report in Markdown / JSON
Framework: Direct SDK with structured output + prompt engineering.
Concrete implementations for each ecosystem are covered in the dedicated pages for TypeScript, PHP, and Python.