Core concepts and patterns for building AI agents — with prose explanations and code examples.
How coding agents automate the entire LLM fine-tuning workflow from GPU selection to model deployment using natural language instructions.
Beyond simple retrieve-then-generate: intelligent agents that decide when, what, and how to retrieve, then critique and correct their own retrieval.
How AI agents improve over time without retraining: token-space learning from successful trajectories, Reflexion self-critique, and self-evolving architectures.
Patterns and frameworks for coordinating multiple specialized AI agents including supervisor, peer-to-peer, debate, and mixture of experts.
A filesystem-based approach to tool management that achieves 98% token savings by loading tool definitions on-demand rather than sending all tools on every request.
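The pattern above can be sketched in a few lines — a minimal, hypothetical registry that keeps full tool schemas on disk and reads one only when the agent actually names it, instead of shipping every definition on every request:

```python
import json
import tempfile
from pathlib import Path

class LazyToolRegistry:
    """Load tool definitions from disk only when a tool is requested,
    rather than sending every definition on every model call."""

    def __init__(self, tool_dir):
        self.tool_dir = Path(tool_dir)

    def list_tools(self):
        # Enumerating names is cheap; full schemas stay on disk.
        return sorted(p.stem for p in self.tool_dir.glob("*.json"))

    def load(self, name):
        # The full definition is read on demand.
        return json.loads((self.tool_dir / f"{name}.json").read_text())

# Demo: write two illustrative tool definitions, then load just one.
tmp = Path(tempfile.mkdtemp())
(tmp / "search.json").write_text(json.dumps(
    {"name": "search", "description": "Web search", "parameters": {}}))
(tmp / "deploy.json").write_text(json.dumps(
    {"name": "deploy", "description": "Deploy a service", "parameters": {}}))

registry = LazyToolRegistry(tmp)
print(registry.list_tools())                   # ['deploy', 'search']
print(registry.load("search")["description"])  # Web search
```

The prompt then carries only the short name list; a definition enters the context window only in the turn that uses it.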
How agents use code execution to filter retrieved web content before it enters the context window, improving accuracy and reducing token costs.
How agents can execute tool calls inside a sandboxed code environment to reduce round-trip latency and token overhead in multi-step workflows.
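A minimal sketch of that idea, with stand-in tool functions (the tool names here are hypothetical): instead of one model round-trip per tool call, the agent emits a short program that chains several calls, and the harness executes it once in a restricted namespace.

```python
def get_user(user_id):          # stand-in for a real API call
    return {"id": user_id, "email": f"user{user_id}@example.com"}

def send_email(address, body):  # stand-in for a real side effect
    return f"sent to {address}"

# Only the whitelisted tools are visible inside the sandbox.
SANDBOX = {"get_user": get_user, "send_email": send_email, "__builtins__": {}}

# A multi-step workflow the model might emit as code:
# six tool calls, one round trip instead of six.
agent_code = """
results = []
for uid in (1, 2, 3):
    user = get_user(uid)
    results.append(send_email(user["email"], "hello"))
"""
scope = dict(SANDBOX)
exec(agent_code, scope)
print(scope["results"])
```

A real implementation would run this in an actual sandbox (container, subprocess, or restricted interpreter); the stripped `__builtins__` here only gestures at that isolation.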
How to architect autonomous AI agents that find, validate, and triage security vulnerabilities in real codebases using sandboxed tool access and multi-stage reasoning.
How to design multi-agent research systems using a planner that generates dynamic parallel tasks and an observer that maintains global context across all agents.
How AI agents are becoming first-class participants on the internet — browsing autonomously, transacting on behalf of users, and communicating with other agents through emerging protocols and standards.
The two-tier architecture for one-person engineering teams: an AI orchestrator with business context managing a fleet of specialized coding agents.
How to architect a production deep-research multi-agent system using a planner, parallel task workers, and a context-aware observer—with structured output and progressive content retrieval.
How to use lifecycle hooks to inspect, modify, and gate agent behavior at precise points during execution — from tool calls to session boundaries.
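The hook pattern can be sketched as follows — a toy agent loop with two illustrative lifecycle points (`pre_tool`, `post_tool`; the event names are assumptions, not any specific framework's API), where registered callbacks can inspect or gate a call without touching the loop itself:

```python
class HookedAgent:
    """Run registered callbacks at fixed lifecycle points, so behavior
    can be observed or gated without modifying the agent loop."""

    def __init__(self):
        self.hooks = {"pre_tool": [], "post_tool": []}

    def on(self, event, fn):
        self.hooks[event].append(fn)

    def call_tool(self, name, args, impl):
        for fn in self.hooks["pre_tool"]:
            fn(name, args)            # a hook may raise to block the call
        result = impl(**args)
        for fn in self.hooks["post_tool"]:
            fn(name, result)
        return result

log = []
agent = HookedAgent()
agent.on("pre_tool", lambda name, args: log.append(("pre", name)))
agent.on("post_tool", lambda name, result: log.append(("post", name)))
total = agent.call_tool("add", {"a": 2, "b": 3}, lambda a, b: a + b)
print(total)  # 5
print(log)    # [('pre', 'add'), ('post', 'add')]
```

Real frameworks expose many more boundaries (session start/end, message receipt, model completion), but the shape is the same: named events, ordered callbacks, and the option to veto.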
How separating environment learning from task execution lets agents replace O(N) step-by-step reasoning with O(1) program synthesis over a persistent state-machine graph.
How integrating Theory of Mind and BDI-style belief structures into multi-agent LLM architectures enables agents to reason about each other's mental states and coordinate more reliably.
How to model multi-step LLM agent pipelines as noisy processes and apply progressive denoising—uncertainty sensing, compute regulation, and root-cause correction—to build more reliable workflows.
How to protect shared agent memory from poisoning attacks using Bayesian trust models, local-first storage, and adaptive ranking.
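One common way to realize a Bayesian trust model (a sketch under the assumption of a Beta-Bernoulli update, not necessarily the article's exact formulation): each memory contributor gets a trust score that rises with verified-good writes and falls with detected-bad ones, and low-trust entries rank lower at retrieval time.

```python
class TrustModel:
    """Beta-Bernoulli trust per memory contributor."""

    def __init__(self):
        self.alpha, self.beta = 1.0, 1.0   # uniform prior

    def update(self, good):
        if good:
            self.alpha += 1                 # verified-good write
        else:
            self.beta += 1                  # detected-bad write

    @property
    def score(self):
        # Posterior mean probability that this source writes good memories.
        return self.alpha / (self.alpha + self.beta)

peer = TrustModel()
for outcome in (True, True, False, True):  # 3 good writes, 1 bad
    peer.update(outcome)
print(round(peer.score, 2))  # 0.67
```

Adaptive ranking then becomes a matter of weighting retrieval scores by each entry's source trust, so a poisoned contributor's influence decays as its bad writes are detected.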
How iterative failure analysis and Pareto-frontier selection can automatically grow and prune an agent's skill library without human curation.
How to architect multi-agent systems with a dedicated Safety Oracle that enforces explicit risk policies independent of the decision-making agent.
How to architect multi-agent systems with role-based tool isolation, governed supervisor-worker hierarchies, and composable stateful skill graphs for reliable, auditable task execution.
How meta-reinforcement learning frameworks teach LLM agents to balance trying new strategies against exploiting what already works across multiple interaction episodes.
How coding agents drift away from explicit system-prompt constraints over time, why value conflicts accelerate that drift, and what engineers can do about it.
How autonomous LLM agent populations develop spontaneous role specialization, communication norms, and coordination patterns without centralized orchestration.
How errors propagate through LLM-based multi-agent pipelines, the vulnerability classes that amplify them, and governance patterns engineers can use to contain the damage.
How a supervisor-worker hierarchy combined with stateful skill graphs and human-in-the-loop checkpoints produces agents that are both flexible and trustworthy in production.
How language models systematically evaluate their own outputs as safer and more correct than identical outputs from users—and what this means for agent self-monitoring.
How sensitive information compounds across agent hops in sequential LLM pipelines, and what engineers can do to measure and control it.
How to expose formal PDDL planning operations as LLM tool calls through MCP, giving agents a structured, verifiable planning substrate for complex multi-step tasks.
A practical guide to choosing between hierarchical, adversarial, and collaborative multi-agent LLM topologies, with engineering tradeoffs drawn from diagnostic accuracy benchmarks.
How making retrieval quality assessment an explicit agent action—rather than an implicit assumption—improves multi-hop reasoning and enables process-level reward shaping for RAG agents.
How to intercept and verify agent actions before they execute, reducing harmful outputs without blocking the agent's operational loop.
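A minimal sketch of that gate (the policy and tool names are illustrative): every proposed action passes a verifier before it runs, and a blocked action returns a structured refusal the agent can react to, rather than halting the loop.

```python
BLOCKED_PATTERNS = ("rm -rf", "DROP TABLE")  # illustrative policy

def verify(action):
    """Return (allowed, reason) for a proposed action."""
    if action["tool"] == "shell":
        for pattern in BLOCKED_PATTERNS:
            if pattern in action["args"]["cmd"]:
                return False, f"policy violation: '{pattern}'"
    return True, "ok"

def execute(action, tools):
    allowed, reason = verify(action)
    if not allowed:
        # Feed the refusal back into the loop instead of crashing it.
        return {"status": "blocked", "reason": reason}
    return {"status": "ok", "result": tools[action["tool"]](action["args"])}

tools = {"shell": lambda args: f"ran: {args['cmd']}"}
print(execute({"tool": "shell", "args": {"cmd": "ls /tmp"}}, tools))
print(execute({"tool": "shell", "args": {"cmd": "rm -rf /"}}, tools))
```

In production the verifier is usually richer than substring matching (risk scoring, a second model, human escalation), but the contract is the same: verify, then execute or refuse with a reason.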
How to build AI agents that persist their memory, move across machines, and maintain context regardless of where they execute — covering state serialization, remote sandboxing, and human-in-the-loop approval patterns.
How performance degrades even within supported context limits, and practical strategies to detect, measure, and mitigate these failure modes.
The discipline of optimizing what enters the context window — a skill as important as prompt engineering for practitioners building reliable agents.
Reduce inference costs by 90% and time-to-first-token by 80% by reusing computed attention states across requests with identical prefixes.
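A toy model of why prefix reuse pays off (word counts stand in for tokens; the numbers are illustrative, not the 90%/80% figures above): requests that share a prompt prefix pay for that prefix once, and only their unique suffixes afterwards.

```python
import hashlib

class PrefixCache:
    """Toy model of KV-cache reuse across requests with a shared prefix."""

    def __init__(self):
        self.cache = {}
        self.tokens_computed = 0

    def run(self, system_prompt, user_msg):
        key = hashlib.sha256(system_prompt.encode()).hexdigest()
        if key not in self.cache:
            # Only the first request pays for the shared prefix.
            self.cache[key] = True
            self.tokens_computed += len(system_prompt.split())
        # Every request still pays for its unique suffix.
        self.tokens_computed += len(user_msg.split())

cache = PrefixCache()
prefix = "system prompt " * 100   # a large shared prefix (200 "tokens")
for msg in ("query one", "query two", "query three"):
    cache.run(prefix, msg)
print(cache.tokens_computed)      # 200 + 3*2 = 206, not 3*202 = 606
```

The practical consequence is an ordering rule: put stable content (system prompt, tool definitions) first and variable content last, so consecutive requests share the longest possible prefix.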
Measuring agent performance across component accuracy, task completion, trajectory quality, and system-level metrics with benchmarks and LLM-as-judge.
A look at Context-Bench, Letta's benchmark for measuring how well language models perform context engineering tasks including filesystem traversal and dynamic skill loading.
How to apply automated hierarchical clustering and LLM-driven summarization to production agent traces to surface failure modes, usage patterns, and behavioral trends without manual review.
How to instrument, trace, and evaluate AI agents running in production, where non-deterministic behavior and infinite input spaces make traditional APM tools insufficient.
A framework of twelve metrics across four dimensions — consistency, robustness, predictability, and safety — for evaluating how AI agents actually behave in production.
How to design layered evaluation strategies for long-horizon AI agents using single-step interrupts, full-turn assertions, and multi-turn simulations.
How the Communication-Reasoning Gap exposes a critical failure mode in multi-agent LLM systems—and what engineers can do about it.
How to apply a Summarize–Identify–Report pipeline with specialized sub-agents to compress, diagnose, and act on agentic execution traces at scale.
How to design evaluation suites, run benchmarks, and tune trigger descriptions to keep agent skills working correctly as models and workflows evolve.
A practical framework for isolating whether your memory-augmented agent is failing at retrieval or at using what it retrieves — and what to do about it.
How Social Perception-Driven Data Generation creates more realistic and challenging benchmarks for agentic systems by grounding tasks in actual user needs.
How to apply behavioral fingerprinting and statistical decision procedures to catch workflow regressions in AI agents without burning your token budget.
How to frame agent selection as a structured recommendation problem, and what a rigorous benchmark for that task looks like.
How to treat agent selection as a recommendation problem, and what engineers need to know to build systems that route tasks to the right LLM agent automatically.
Reasoning models show much weaker control over their chains of thought than over their final outputs—undermining the assumption that CoT traces reliably reflect what a model is doing.
How to build document-level factuality verification agents for deep research outputs, and why benchmark labels need to be explicitly revisable.
How graph-based environment evolution frameworks let you build agent benchmarks that stay challenging and realistic as the underlying world changes.
How to evaluate entire agentic systems—framework, model, and orchestration together—rather than treating model choice as the only variable that matters.
How agents maintain context, learn from past interactions, and build persistent knowledge across sessions using layered memory architectures.
Reasoning plus Acting — the foundational loop that enables AI agents to think through problems and take targeted action in the world.
The bridge between language models and real-world actions, enabling agents to query APIs, execute code, and interact with external systems.
How formalizing LLM calls as typed semantic transformations with algebraic composition operators produces more predictable, debuggable, and maintainable agentic data workflows.
A practical guide to designing agents that return typed, validated structured data using provider-native and tool-calling strategies.
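Whatever strategy produces the JSON, the validation layer looks roughly like this — a stdlib-only sketch (the `TicketTriage` shape is a made-up example) that fails loudly on malformed model output instead of passing it downstream:

```python
import json
from dataclasses import dataclass

@dataclass
class TicketTriage:
    severity: str
    component: str
    summary: str

ALLOWED_SEVERITIES = {"low", "medium", "high"}
REQUIRED_FIELDS = {"severity", "component", "summary"}

def parse_agent_output(raw):
    """Validate a model's JSON reply against the expected shape."""
    data = json.loads(raw)                     # raises on non-JSON
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"bad severity: {data['severity']!r}")
    return TicketTriage(**{k: data[k] for k in REQUIRED_FIELDS})

reply = '{"severity": "high", "component": "auth", "summary": "login loop"}'
ticket = parse_agent_output(reply)
print(ticket.severity)  # high
```

Libraries like Pydantic replace the hand-rolled checks with declarative models, and on validation failure the error message can be fed back to the model for a retry.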
How to build low-overhead jailbreak and harmful-content detectors by repurposing the internal activations of your existing model instead of running a separate classifier.
A practical guide to the streaming modes available in agent graph frameworks, covering state updates, LLM token streams, tool lifecycle events, and subgraph outputs.
How a dedicated runtime infrastructure layer can observe, reason over, and intervene in agent behavior to optimize latency, token efficiency, reliability, and safety without touching the model or application code.
How to design a batteries-included agent harness that bundles planning, file I/O, sub-agent delegation, and context management into a reusable, composable substrate.
How to build a memory admission layer that uses rule-based feature extraction and LLM utility scoring to decide which observations are worth storing—improving recall precision while cutting latency.
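The two-stage gate can be sketched as follows — a hypothetical admission function where cheap rules run first and the (expensive) LLM utility score is only consulted for candidates that survive; `llm_score` stands in for a real model call:

```python
def admit(observation, llm_score):
    """Two-stage memory admission: rule filters, then model scoring."""
    # Stage 1: rule-based features — cheap, runs on every observation.
    if len(observation.split()) < 3:
        return False                  # too short to be useful
    if observation.startswith("DEBUG"):
        return False                  # known-noise category
    # Stage 2: LLM utility score, only for surviving candidates.
    return llm_score(observation) >= 0.5

# Stand-in scorer: pretends mentions of deadlines are high-utility.
fake_score = lambda text: 0.9 if "deadline" in text else 0.1

print(admit("ok", fake_score))                                   # False (rule)
print(admit("user mentioned a deadline of May 3", fake_score))   # True
print(admit("the weather is nice today here", fake_score))       # False (score)
```

Because most observations die at stage 1, the model is called rarely — which is where the latency savings come from, while the score keeps low-utility text out of the store.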
How a unified skill ontology and open repository changes the way agents discover, evaluate, and compose capabilities at scale.
How the LLM Delegate Protocol makes agent identity, trust, and provenance first-class primitives in multi-agent communication.
Google's open protocol enabling AI agents to discover, communicate, and collaborate across organizational boundaries using standardized task exchange.
An official MCP extension enabling tools to return interactive UI components — dashboards, forms, and visualizations — that render directly in conversations.
An open standard from Anthropic that defines how AI agents connect to external tools, data sources, and services through a composable server architecture.
An open industry protocol enabling AI agents to shop across any participating merchant using unified APIs for checkout, identity linking, and order management.
Defense in depth for AI agents: input validation, output filtering, tool sandboxing, guardian agents, and OWASP LLM security risks.