[
  
    {
      "title"    : "The Three-Platform Problem in Enterprise AI",
      "category" : "",
      "tags"     : "AI Platform, Enterprise AI, Low-Code, DevOps, Platform Architecture, API-First, Infrastructure, Developer Tools, Platform Strategy",
      "url"      : "/2025/12/07/the-three-platform-problem-in-enterprise-ai/",
      "date"     : "December 7, 2025",
      "excerpt"  : "Enterprise AI has a platform problem. The tools to build AI-powered applications exist, but they&#39;re scattered across three disconnected ecosystems—each solving part of the puzzle, none providing a complete solution. This isn&#39;t a &#39;too many choices&#39; problem. It&#39;s an architectural one.",
      "content"  : "Enterprise AI has a platform problem. The tools to build AI-powered applications exist, but they’re scattered across three disconnected ecosystems—each solving part of the puzzle, none providing a complete solution.\n\nThis isn’t a “too many choices” problem. It’s an architectural one. Gartner tracks these ecosystems in separate Magic Quadrants because they serve fundamentally different users with different needs. But building production AI applications requires capabilities from all three.\n\nThree Ecosystems, Zero Integration\n\n1. Low-Code Platforms (The Citizen Developer)\n\nPlatforms like Microsoft Power Apps, Mendix, and OutSystems let business users build applications quickly without writing code. They excel at UI, rapid prototyping, and workflow automation.\n\n\nGartner Magic Quadrant for Enterprise Low-Code Application Platforms\n\nWhat they do well: Speed to prototype, accessibility for non-developers, business process automation.\n\nWhat they lack: Infrastructure control, enterprise governance at scale, and the flexibility professional developers need.\n\n2. DevOps Platforms (The Professional Developer)\n\nGitLab, Microsoft Azure DevOps, and Atlassian provide CI/CD pipelines, source control, and deployment infrastructure. They answer the “how do we ship and operate this reliably?” question.\n\n\nGartner Magic Quadrant for DevOps Platforms\n\nWhat they do well: Security, governance, testing, deployment automation, operational excellence.\n\nWhat they lack: They don’t help you build faster—they help you ship what you’ve already built.\n\n3. AI/ML Platforms (The AI Specialist)\n\nCloud providers (AWS, GCP, Azure) and specialized vendors offer models, MLOps tooling, and inference infrastructure. 
They provide the intelligence layer.\n\n\nGartner Magic Quadrant for AI Code Assistants\n\nWhat they do well: Model access, training infrastructure, inference at scale.\n\nWhat they lack: An opinion on how you actually build and deploy applications around those models.\n\nThe Cost of Fragmentation\n\nWhen your AI strategy requires stitching together leaders from three separate ecosystems, you pay an integration tax:\n\nWorkflow disconnects. A business user prototypes an AI workflow in a low-code tool. A developer rebuilds it from scratch to meet security requirements. The prototype and production system share nothing but a spec document.\n\nObservability gaps. Tracing a user request through a low-code UI, into a DevOps pipeline, through an AI model call, and back is nearly impossible without custom instrumentation.\n\nGovernance drift. Security policies enforced in your DevOps platform don’t automatically apply to your low-code environment. Compliance becomes a manual audit.\n\nYour most capable engineers end up writing glue code instead of building products.\n\nA Different Architecture: API-First Unification\n\nThe solution isn’t better integrations—it’s platforms built on a different architecture.\n\nReplit offers a useful case study. They’ve grown from $10M to $100M ARR in under six months by building a platform where:\n\n\n  \n    The same infrastructure serves both citizen developers and professionals. A business user building through natural language (“create a customer feedback dashboard”) and a developer writing code are using the same underlying APIs, the same deployment system, the same security model.\n  \n  \n    AI is native, not bolted on. Their Agent can build, test, and deploy complete applications autonomously—but it’s using the same environment a professional developer would use. No “export to production” step.\n  \n  \n    Governance applies universally. Database access, API key management, and deployment policies are platform-level concerns. 
They apply whether you’re prompting an AI agent or writing TypeScript.\n  \n\n\nThis is the “headless-first” pattern that companies like Stripe and Twilio proved out: build the API, make it excellent, then layer interfaces on top. The UI for non-developers and the API for developers are just different clients to the same system.\n\nWhat This Means for Platform Strategy\n\nIf you’re evaluating AI platforms, the question isn’t “which low-code tool, which DevOps platform, and which AI vendor?”\n\nThe better question: Does this platform unify these concerns, or will we be writing integration code for the next three years?\n\nLook for:\n\n\n  \n    API-first architecture. Can professional developers access everything through APIs? Is the UI built on those same APIs?\n  \n  \n    Built-in deployment and operations. Does prototyping in the platform give you production-ready infrastructure, or does it give you an export button and a prayer?\n  \n  \n    Platform-level governance. Are security, compliance, and cost controls configured once and inherited everywhere, or are they per-tool?\n  \n\n\nThe platforms winning in this space aren’t the ones with the longest feature lists. They’re the ones that recognized the three-ecosystem problem and architected around it from day one.\n\n"
    },
  
    {
      "title"    : "The Platform Convergence: Why the Future of AI SaaS is Headless-First",
      "category" : "",
      "tags"     : "AI Platform, Agentic AI, Enterprise AI, AI Gateway, Agent Builder, Developer Tools, Infrastructure, Platform Architecture, Headless Architecture, AI SaaS",
      "url"      : "/2025/12/02/the-platform-convergence-why-the-future-of-ai-saas-is-headless-first/",
      "date"     : "December 2, 2025",
      "excerpt"  : "The AI agent market is fragmenting into two incomplete categories: Agent Builders that democratize creation but lack governance, and AI Gateways that provide control but slow innovation. Drawing lessons from Stripe and Twilio, the future belongs to unified, headless-first platforms that combine intuitive interfaces with programmable infrastructure.",
      "content"  : "The AI agent market is experiencing its own big bang—but this rapid expansion is creating fundamental fragmentation. Enterprises deploying agents at scale are caught between two incomplete solutions: Agent Builders and AI Gateways.\n\nAgent Builders democratize creation through no-code interfaces. AI Gateways provide enterprise governance over costs, security, and compliance. Both are critical, but in their current separate forms, they force a false choice: speed or control? The reality is, you need both.\n\nWe’ve seen this movie before. The most successful developer platforms—Stripe, Twilio, Shopify—aren’t just slick UIs or robust infrastructure. They are headless-first platforms that masterfully combine both.\n\nThe Headless-First Model\n\nStripe didn’t win payments by offering a payment form. Twilio didn’t win communications by providing a dashboard. They won by providing a powerful, programmable foundation with APIs as the primary interface. Their UIs are built on the same public APIs their customers use. Everything is composable, programmable, and extensible.\n\n\n  \n    \n      Principle\n      Benefit\n    \n  \n  \n    \n      API-First Design\n      Platform’s own UI uses public APIs, ensuring completeness\n    \n    \n      Progressive Complexity\n      Start with no-code UI, graduate to API without migration\n    \n    \n      Composability\n      Every capability is a building block for higher-level abstractions\n    \n    \n      Extensibility\n      Third parties build on the platform, creating ecosystem effects\n    \n  \n\n\nThis is the blueprint for AI platforms: not just a UI for building agents, nor just a gateway for traffic—but a comprehensive, programmable platform for building, running, and governing AI at every layer.\n\nThe Two Incomplete Categories\n\nAgent Builders (Microsoft Copilot Studio, Google Agent Builder) empower non-technical users to create agents in minutes. 
The problem arises at scale: Who manages API keys? Who tracks costs? Who ensures compliance? This democratization often creates ungoverned “shadow IT”—business units spinning up agents independently, each with its own credentials and error handling. Platform teams discover the proliferation only when something breaks.\n\nAI Gateways (Kong, Apigee) solve the governance problem with centralized security, cost monitoring, and compliance. But a gateway is just plumbing—it doesn’t accelerate creation. Business users wait in IT queues while engineers build what they need. Innovation slows to a crawl.\n\nIntegrating both categories creates its own integration tax: two authentication systems, two deployment processes, broken observability across disconnected logs, and policy enforcement gaps where builder retry logic conflicts with gateway rate limits.\n\nThe Platform Convergence\n\nThe solution is a unified, headless-first platform with four integrated layers:\n\nLayer 1: UI Layer — Intuitive no-code agent builder for business users, built on top of the platform’s own APIs. Natural language definition, visual workflow design, one-click deployment with inherited governance.\n\nLayer 2: Runtime Layer — Enterprise-grade gateway that every agent runs through automatically. Centralized auth (OAuth, OIDC, SAML), real-time policy enforcement, distributed tracing, cost tracking, anomaly detection.\n\nLayer 3: Platform Layer — Comprehensive APIs and SDKs for developers. REST/GraphQL endpoints, language-specific SDKs, agent lifecycle management, webhook system for event-driven architectures.\n\nLayer 4: Ecosystem Layer — Marketplace for discovering and sharing agents, tools, and integrations. 
Internal registry, reusable components, version control, usage analytics.\n\nSpeed AND Control\n\nThe difference between fragmented and unified approaches:\n\n\n  \n    \n      Capability\n      Fragmented Tools\n      Unified Platform\n    \n  \n  \n    \n      Agent Creation\n      Separate builder\n      Integrated no-code + API/SDK\n    \n    \n      Infrastructure\n      Separate gateway\n      Built-in gateway with inherited policies\n    \n    \n      Observability\n      Disconnected logs\n      End-to-end unified tracing\n    \n    \n      Policy Management\n      Manual coordination\n      Single policy engine\n    \n    \n      Developer Experience\n      High friction\n      Single, cohesive API surface\n    \n    \n      Audit &amp; Compliance\n      Cross-system correlation\n      Native audit trails\n    \n  \n\n\nWith a unified platform: business user creates agent in UI → platform applies policies automatically → agent deploys with full observability → platform team monitors centrally → developer extends via API without migration.\n\nWhat This Unlocks\n\nSelf-Service AI: HR builds a resume screening agent in 20 minutes. It inherits security policies automatically. Cost allocates to HR’s budget. Compliance trail generates without extra work.\n\nAI-Powered Products: Engineers embed agent capabilities into customer-facing apps using platform APIs. Multi-tenant isolation, usage-based billing, and governance come built-in.\n\nInternal Marketplace: Marketing’s “competitive intelligence” agent gets discovered by Sales. One-click deployment. Usage metrics show ROI across the organization.\n\nConclusion\n\nThe debate over agent builder vs. AI gateway is a red herring—a false choice leading to fragmented, expensive solutions. The real question: point solution or true platform?\n\nIn payments, Stripe won by unifying developer APIs with merchant tools. In communications, Twilio won by combining carrier control with developer speed. 
The AI platform market is at the same inflection point.\n\nThe future isn’t about stitching tools together; it’s about building on a unified, programmable foundation. The organizations that invest in platform-first infrastructure—rather than cobbling together point solutions—will move faster, govern more effectively, and build more sophisticated agentic systems.\n\nThe convergence is coming. The question is whether you’ll be ahead of it or behind it.\n"
    },
  
    {
      "title"    : "MCP Enterprise Readiness: How the 2025-11-25 Spec Closes the Production Gap",
      "category" : "",
      "tags"     : "MCP, Enterprise AI, Agentic AI, Security, OAuth, Authentication, Infrastructure, Agent Ops, Governance, Enterprise Integration",
      "url"      : "/2025/12/01/mcp-enterprise-readiness-how-the-2025-11-25-spec-closes-the-production-gap/",
      "date"     : "December 1, 2025",
      "excerpt"  : "The Model Context Protocol&#39;s first anniversary release isn&#39;t just a milestone—it&#39;s a strategic inflection point. With asynchronous Tasks, enterprise-grade OAuth, and a formal extensions framework, the 2025-11-25 spec directly addresses the operational barriers that have kept organizations from deploying agent-tool ecosystems at scale. This post examines how these new primitives transform MCP from a development convenience into production-grade infrastructure.",
      "content"  : "Just over a week ago, the Model Context Protocol celebrated its first anniversary with the release of the 2025-11-25 specification [1]. The announcement was rightly triumphant—MCP has evolved from an experimental open-source project to a foundational standard backed by GitHub, OpenAI, Microsoft, and Block, with thousands of active servers in production [1].\n\nBut beneath the celebration lies a more interesting story: this spec release is not just an evolution; it’s a strategic pivot toward enterprise readiness. For the past year, MCP has succeeded as a developer tool—a convenient way to connect AI models to data and capabilities during experimentation. The 2025-11-25 spec is different. It introduces features explicitly designed to solve the operational, security, and governance challenges that prevent organizations from deploying agent-tool ecosystems at enterprise scale.\n\nThis article examines three key features from the new spec and analyzes how they close what I call the “production gap”—the distance between experimental agent prototypes and enterprise-grade agentic infrastructure.\n\nThe Production Gap: Why Experimental Agents Don’t Scale\n\nBefore diving into the technical features, we need to understand the problem they’re solving. Organizations have been experimenting with MCP-powered agents for months, often with impressive results in controlled environments. Yet most of these projects remain trapped in pilot purgatory, unable to progress to production deployments. The barriers are not technical whimsy; they are fundamental operational requirements:\n\n\n  \n    \n      Requirement\n      Why It Matters\n      What’s Been Missing\n    \n  \n  \n    \n      Asynchronous Operations\n      Real-world tasks like report generation, data analysis, and workflow automation can take minutes or hours, not milliseconds.\n      MCP connections are synchronous. 
Long-running tasks force clients to hold connections open or build custom polling systems.\n    \n    \n      Enterprise Authentication\n      Organizations need centralized control over which users, agents, and services can access sensitive tools and data.\n      The original OAuth flow assumed a consumer app model. It lacked support for machine-to-machine auth and didn’t integrate with enterprise Identity Providers.\n    \n    \n      Extensibility\n      Different industries and use cases require custom capabilities without fragmenting the core protocol.\n      There was no formal mechanism to standardize extensions, leading to proprietary, incompatible implementations.\n    \n  \n\n\nThese aren’t edge cases; they are the table stakes for production systems. The 2025-11-25 spec directly addresses each one.\n\nFeature 1: Asynchronous Tasks — Making Long-Running Workflows Production-Ready\n\nPerhaps the most transformative addition is the new Tasks primitive [2]. While still marked as experimental, it fundamentally changes how agents interact with MCP servers for long-running operations.\n\nThe Problem: Synchronous Request-Response Doesn’t Match Real Work\n\nTraditional MCP follows the classic RPC pattern: the client sends a request, the server processes it, and the server returns a response—all within a single connection. This works beautifully for quick operations like reading a database row or checking a weather API. 
But it breaks down for realistic enterprise workflows:\n\n\n  Data Analytics Agent: “Generate a quarterly financial report by analyzing three years of transaction data” → 15 minutes of processing.\n  Compliance Agent: “Scan all customer contracts for non-standard clauses” → 2 hours across 10,000 documents.\n  DevOps Agent: “Deploy this service to production and run integration tests” → 30 minutes with orchestration dependencies.\n\n\nOrganizations have been forced to build custom workarounds: job queues, polling systems, callback webhooks—all non-standard, all increasing complexity and reducing interoperability.\n\nThe Solution: A Unified Async Model\n\nThe new Tasks feature introduces a standard “call-now, fetch-later” pattern:\n\n\n  The client sends a request to an MCP server with a task hint.\n  The server immediately acknowledges the request and returns a unique taskId.\n  The client periodically checks the task status (working, completed, failed) using standard Task operations.\n  When complete, the client retrieves the final result using the taskId.\n\n\nThis is more than syntactic sugar. It provides a uniform abstraction for asynchronous work across the entire MCP ecosystem. An agent framework doesn’t need to know whether it’s calling a data pipeline, a deployment system, or a document processor—the async pattern is the same.\n\nEnterprise Impact: Agents That Don’t Block\n\nIn production environments, this changes everything. An AI assistant orchestrating a complex workflow can:\n\n\n  Kick off multiple long-running tasks in parallel (e.g., “analyze sales data,” “generate customer insights,” “create visualizations”).\n  Continue planning and reasoning while tasks are in progress.\n  Provide real-time status updates to users without blocking.\n  Handle failures gracefully with retries and fallback strategies.\n\n\nThis is how real autonomous agents operate. 
The Tasks primitive makes it possible within a standard, interoperable protocol.\n\nFeature 2: Enterprise-Grade OAuth with CIMD and Extensions\n\nThe original MCP spec included OAuth 2.0 support, but it was modeled on consumer app patterns (think “Log in with GitHub”). That model doesn’t work for enterprise use cases, where organizations need centralized identity management, audit trails, and policy-based access control. The 2025-11-25 spec introduces two critical updates to close this gap.\n\nCIMD: Decentralized Trust Without Dynamic Client Registration\n\nThe first change is replacing Dynamic Client Registration (DCR) with Client ID Metadata Documents (CIMD) [3]. In the old model, every MCP client had to register with every authorization server it wanted to use—a scalability nightmare in federated enterprise environments.\n\nWith CIMD, the client_id is now a URL that the client controls (e.g., https://agents.mycompany.com/sales-assistant). When an authorization server needs information about this client, it fetches a JSON metadata document from that URL. This document includes:\n\n\n  Client name and description\n  Valid redirect URIs\n  Supported grant types\n  Public keys for token verification\n\n\nThis approach creates a decentralized trust model anchored in DNS and HTTPS. The authorization server doesn’t need a pre-existing relationship with the client; it trusts the metadata published at the URL. For large organizations with dozens of agent applications and multiple MCP providers, this dramatically reduces operational overhead.\n\nExtension 1: Machine-to-Machine OAuth (SEP-1046)\n\nThe second critical addition is support for the OAuth 2.0 client_credentials flow via the M2M OAuth extension. This enables machine-to-machine authentication—allowing agents and services to authenticate directly with MCP servers without a human user in the loop.\n\nWhy does this matter? 
Consider these enterprise scenarios:\n\n\n  Scheduled Agent Jobs: A nightly data ingestion agent that pulls information from multiple MCP sources to update a data warehouse.\n  Service-to-Service Communication: A monitoring agent that periodically checks the health of deployed systems by querying infrastructure management tools.\n  Headless Automation: An agent that processes incoming support tickets and takes automated actions based on predefined rules.\n\n\nNone of these involve an interactive user. They are autonomous services that need persistent, secure credentials to access tools on behalf of the organization. The client_credentials flow is the standard OAuth mechanism for exactly this use case, and its inclusion in MCP makes headless agentic systems viable.\n\nExtension 2: Cross App Access (XAA) (SEP-990)\n\nPerhaps the most strategically significant feature for large enterprises is the Cross App Access (XAA) extension. This solves a governance problem that has plagued the consumerization of enterprise AI: uncontrolled tool sprawl.\n\nIn the standard OAuth flow, a user grants consent directly to an AI application to access a tool. The enterprise Identity Provider (IdP) sees only that “Alice logged in to the AI app,” not that “Alice’s AI agent is now accessing the payroll system.” This creates a governance black hole.\n\nXAA changes the authorization flow to insert the enterprise IdP as a central policy enforcement point. Now, when an agent attempts to access an MCP server:\n\n\n  The agent requests authorization from the enterprise IdP.\n  The IdP evaluates organizational policies: Is this agent approved for production use? Does Alice have permission to delegate payroll access to this agent? Is this access compliant with our data governance policies?\n  Only if all policies are satisfied does the IdP issue tokens to the agent.\n\n\nThis provides centralized visibility and control over the entire agent-tool ecosystem. 
Security teams can monitor which agents are accessing which tools, set organization-wide policies (e.g., “no agents can access PII without human review”), and audit all delegated access. It eliminates shadow AI and provides the compliance story that regulated industries demand.\n\nEnterprise Impact: From Shadow AI to Governed Infrastructure\n\nTogether, these OAuth enhancements transform MCP from a developer convenience into a governed, auditable integration layer. Organizations can:\n\n\n  Enforce Identity Standards: All agents authenticate using the corporate IdP, with the same rigor as human employees.\n  Enable Zero-Trust Architecture: Every tool access is explicitly authorized based on policy, not implicit trust.\n  Provide Audit Trails: Every delegation, token issuance, and access event is logged for compliance and forensic analysis.\n  Scale Securely: Decentralized trust via CIMD means new agents and tools can be onboarded without central bottlenecks, while XAA ensures control is never lost.\n\n\nFeature 3: Formal Extensions Framework — Enabling Innovation Without Fragmentation\n\nThe third major addition is the introduction of a formal Extensions framework [3]. This is a governance mechanism for the protocol itself, allowing the community to develop new capabilities without fragmenting the ecosystem.\n\nThe Innovation-Standardization Tension\n\nEvery successful protocol faces this dilemma: enable innovation fast enough to keep up with evolving use cases, but standardize carefully enough to maintain interoperability. Move too slowly, and the community builds proprietary extensions that fragment the ecosystem. Move too quickly, and the core protocol becomes bloated with niche features that most implementations don’t need.\n\nMCP’s solution is a structured extension process. New capabilities are proposed as Specification Enhancement Proposals (SEPs), which undergo community review and can be adopted incrementally. 
Extensions are namespaced and clearly marked, so implementations can selectively support them without breaking compatibility.\n\nEnterprise Impact: Customization Without Vendor Lock-In\n\nFor enterprises, this is critical. Different industries have unique requirements:\n\n\n  Healthcare: Extensions for HIPAA-compliant audit logging and patient consent management.\n  Financial Services: Extensions for transaction integrity, regulatory reporting, and fraud detection hooks.\n  Manufacturing: Extensions for real-time sensor data streaming and factory floor integrations.\n\n\nThe formal extensions framework allows organizations to develop these capabilities as standard, interoperable extensions rather than proprietary forks. This preserves the core value proposition of MCP—a universal protocol for agent-tool communication—while enabling the customization required for production use.\n\nThe Multiplier Effect: Sampling with Tools (SEP-1577)\n\nOne more feature deserves mention: Sampling with Tools [3]. This allows MCP servers themselves to act as agentic systems, capable of multi-step reasoning and tool use. A server can now request the client to invoke an LLM on its behalf, enabling server-side agents.\n\nWhy is this powerful? It enables compositional agent architectures. A high-level agent can delegate to specialized MCP servers, which themselves use agentic reasoning to fulfill complex requests. For example:\n\n\n  A “Financial Analysis Agent” delegates to an “ERP Data Server,” which uses its own reasoning to determine which tables to query, how to join data, and how to format results.\n  A “Compliance Agent” delegates to a “Legal Document Server,” which autonomously searches case law, extracts relevant clauses, and generates a summary.\n\n\nThis nested, hierarchical approach is how real autonomous systems will scale. 
By making it a standard protocol feature rather than a custom implementation, MCP provides the foundation for a rich ecosystem of specialized, composable agents.\n\nClosing the Production Gap: A New Maturity Threshold\n\nThe 2025-11-25 MCP specification is not a radical redesign; it’s a targeted set of enhancements that directly address the barriers preventing enterprise adoption. By introducing:\n\n\n  Asynchronous Tasks for long-running workflows,\n  Enterprise OAuth with CIMD, M2M, and XAA for governed, auditable authentication,\n  Formal Extensions for standardized innovation,\n  Sampling with Tools for compositional agent architectures,\n\n\nthe spec closes the production gap—the distance between experimental prototypes and scalable, secure, enterprise-grade systems.\n\nThis is the moment when MCP transitions from a promising developer tool to a foundational piece of enterprise infrastructure. Organizations that have been waiting for “production readiness” signals now have them. The features are there. The governance mechanisms are there. The security model is there.\n\nThe next phase of agentic AI will be defined not by flashy demos, but by the quiet, reliable, at-scale operation of autonomous systems integrated deeply into enterprise workflows. The 2025-11-25 MCP spec is the technical foundation that makes this future possible.\n\nFor technology leaders evaluating whether to invest in MCP-based infrastructure, the calculus has changed. This is no longer an experimental protocol; it’s a production standard. The organizations that adopt it now, build their agent ecosystems on it, and contribute to its continued evolution will define the next decade of enterprise AI.\n\nReferences:\n\n[1] MCP Core Maintainers. (2025, November 25). One Year of MCP: November 2025 Spec Release. Model Context Protocol.\n\n[2] Model Context Protocol. (2025, November 25). Tasks. Model Context Protocol Specification.\n\n[3] Pakiti, Maria. (2025, November 26). 
MCP 2025-11-25 is here: async Tasks, better OAuth, extensions, and a smoother agentic future. WorkOS Blog.\n\n[4] Subramanya, N. (2025, November 20). The Governance Stack: Operationalizing AI Agent Governance at Enterprise Scale. subramanya.ai.\n\n[5] Subramanya, N. (2025, November 17). Why Private Registries are the Future of Enterprise Agentic Infrastructure. subramanya.ai.\n\n"
    },
  
    {
      "title"    : "The Governance Stack: Operationalizing AI Agent Governance at Enterprise Scale",
      "category" : "",
      "tags"     : "AI, Agents, Agentic AI, Governance, Enterprise AI, Agent Ops, MCP, Security, Infrastructure, Compliance, AI Management",
      "url"      : "/2025/11/20/the-governance-stack-operationalizing-ai-agent-governance-at-enterprise-scale/",
      "date"     : "November 20, 2025",
      "excerpt"  : "With 88% of organizations now deploying AI agents in production, governance has shifted from a theoretical concern to an operational imperative. Yet 40% of technology executives admit their governance programs are insufficient. This article presents the technical infrastructure—the &#39;governance stack&#39;—required to transform governance frameworks from policy documents into automated, enforceable reality across the entire agentic workforce lifecycle.",
      "content"  : "Enterprise adoption of AI agents has reached a tipping point. According to McKinsey’s 2025 global survey, 88% of organizations now report regular use of AI agents in at least one business function, with 62% actively experimenting with agentic systems [1]. Yet this rapid adoption has created a critical disconnect: while organizations understand the importance of governance, they struggle with the implementation of it. The same survey reveals that 40% of technology executives believe their current governance programs are insufficient for the scale and complexity of their agentic workforce [1, 2].\n\nThe problem is not a lack of frameworks. Numerous organizations have published comprehensive governance principles—from Databricks’ AI Governance Framework to the EU AI Act’s regulatory requirements [2]. The problem is that governance has remained largely conceptual, living in policy documents and compliance checklists rather than in the operational infrastructure where agents actually execute.\n\nThis article presents the technical foundation required to operationalize governance at scale: the Governance Stack. This is the integrated set of platforms, protocols, and enforcement mechanisms that transform governance from aspiration into automated reality across the entire agentic workforce lifecycle.\n\nThe Governance Gap: From Principle to Practice\n\nTraditional enterprise governance models were designed for static systems and predictable workflows. An application goes through a review process, gets deployed, and then operates within well-defined boundaries. Governance checkpoints are discrete events: code reviews, security scans, compliance audits.\n\nAgentic AI shatters this model. Agents are dynamic, adaptive systems that make autonomous decisions, spawn sub-agents, and interact with constantly evolving toolsets. They don’t follow predetermined paths; they reason, plan, and execute based on context. 
As one industry analysis puts it, the governance question shifts from “did the code do what we programmed?” to “did the agent make the right decision given the circumstances?” [3].\n\nThis creates four fundamental challenges that traditional governance infrastructure cannot address:\n\n\n  \n    \n      Challenge\n      Traditional Governance\n      Agentic Reality\n    \n  \n  \n    \n      Decision-Making\n      Predetermined logic paths, testable and auditable\n      Context-dependent reasoning, emergent behavior\n    \n    \n      Delegation\n      Single service boundary, clear ownership\n      Recursive agent chains, distributed responsibility\n    \n    \n      Policy Enforcement\n      Deployment-time checks, periodic audits\n      Real-time enforcement at the moment of action\n    \n    \n      Auditability\n      Static code and logs\n      Dynamic decision traces across multiple agents and tools\n    \n  \n\n\nThe governance gap is the distance between what existing frameworks prescribe and what existing infrastructure can enforce. Closing this gap requires purpose-built technology.\n\nThe Five Layers of the Governance Stack\n\nDrawing on the foundational pillars outlined in frameworks like Databricks’ AI Governance model [2], we can define a technical architecture—a Governance Stack—that provides the infrastructure necessary to operationalize these principles. This stack has five integrated layers, each addressing a specific aspect of agent lifecycle management.\n\nLayer 1: Identity and Attestation Foundation\n\nBefore governance can be enforced, we must know who (or what) is making a request. 
This requires a robust identity layer specifically designed for autonomous agents, not just human users.\n\nAs discussed in previous work on OIDC-A (OpenID Connect for Agents), this layer provides [4]:\n\n\n  Verifiable Agent Identities: Every agent receives a cryptographically verifiable identity, issued by a trusted authority (the AI provider or enterprise identity system).\n  Delegation Chains: Clear, auditable records of which user or system authorized the agent, and what permissions were delegated.\n  Attestation Mechanisms: Proof that the agent is running the expected code, on approved infrastructure, with the intended configuration.\n\n\nThis identity foundation is the prerequisite for all subsequent layers. Without it, governance policies have no subject to act upon.\n\nLayer 2: Agent and Tool Registries\n\nGovernance requires visibility. The second layer of the stack is a comprehensive registry system that provides a single source of truth for:\n\n\n  Agent Registry: A catalog of every agent deployed in the enterprise, including its capabilities, business owner, data access, and lifecycle status [5]. This is not just a static directory; it’s a dynamic system that tracks agent versions, configurations, and runtime behavior.\n  MCP/Tool Registry: A curated, approved set of tools and MCP servers that agents are authorized to access. This registry enforces pre-deployment security reviews, manages versions, tracks usage, and provides cost visibility [5].\n\n\nAs explored in our previous article on private registries, this layer transforms governance from a manual audit process into an automated, enforceable function of the infrastructure itself [5]. Agents that aren’t registered can’t deploy. Tools that haven’t been vetted can’t be accessed.\n\nLayer 3: Policy Engine and Gateway\n\nThe third layer is where governance rules are codified and enforced in real-time. 
This includes:\n\nAgent Firewalls and MCP Gateways: Acting as intermediaries between agents and their tools, these gateways inspect every request, enforce security policies, and block unauthorized actions before they occur [6]. They provide:\n\n  Prompt injection detection and filtering\n  Real-time policy evaluation (e.g., “can this agent access PII?”)\n  Dynamic rate limiting and cost controls\n  Anomaly detection for suspicious behavior patterns\n\n\nAutomated Policy Enforcement: Instead of relying on manual reviews, the policy engine automatically validates agents against organizational standards at every lifecycle stage. For example, an agent cannot be promoted to production without:\n\n  A completed data classification assessment\n  Approval from the designated business owner\n  A passed security scan\n  Documented human oversight procedures for high-stakes decisions\n\n\nThis layer is the operational heart of the governance stack. It is where abstract policies become concrete actions that prevent harm in real-time.\n\nLayer 4: Observability and Monitoring Platform\n\nGovernance is not a one-time gate; it requires continuous oversight. 
The fourth layer provides real-time visibility into the behavior of the entire agentic workforce:\n\n\n  Performance Dashboards: Track accuracy, decision quality, latency, and resource consumption across all agents.\n  Drift Detection: Monitor agents for behavioral changes that might indicate model degradation, prompt injection, or unauthorized modifications.\n  Audit Trails: Capture every agent action, tool invocation, and delegation event with sufficient context to enable forensic analysis and compliance reporting [3].\n  Anomaly Alerting: Trigger automated responses when agents deviate from expected patterns, such as accessing unusual data sources or making an abnormal volume of API calls.\n\n\nThis layer transforms governance from reactive (responding to incidents after they occur) to proactive (detecting and preventing issues before they cause harm).\n\nLayer 5: Human-in-the-Loop Orchestration\n\nThe final layer recognizes that not all decisions can or should be fully automated. For high-stakes scenarios, governance requires explicit human oversight:\n\n\n  Escalation Workflows: Agents can request human approval before executing sensitive actions, such as modifying production systems or processing large financial transactions.\n  Override Mechanisms: Authorized personnel can intervene to pause, redirect, or terminate agent operations when necessary.\n  Explainability Interfaces: When agents make consequential decisions, stakeholders need to understand the reasoning. This layer provides tools to inspect the decision chain, view the data that influenced the agent, and audit the tool usage.\n\n\nThis is not about replacing human judgment; it’s about augmenting it with the right information at the right time.\n\nOperationalizing the Framework: Governance Across the Agent Lifecycle\n\nThe power of the Governance Stack becomes clear when we map it to the complete agent lifecycle. 
Governance is not a single checkpoint; it is a continuous process embedded at every stage.\n\n\n  \n    \n      Lifecycle Stage\n      Governance Stack in Action\n    \n  \n  \n    \n      Planning &amp; Design\n      Identity layer establishes agent ownership. Policy engine validates business case against organizational risk appetite.\n    \n    \n      Data Preparation\n      Registries enforce data classification and lineage tracking. Policy engine blocks access to non-compliant datasets.\n    \n    \n      Development &amp; Training\n      Observability platform tracks experiments and model performance. Registries version all agent configurations.\n    \n    \n      Testing &amp; Validation\n      Agent firewall tests for adversarial inputs and prompt injections. Policy engine validates against security and ethical standards.\n    \n    \n      Deployment\n      Gateway enforces real-time authorization for all tool access. Observability platform begins continuous monitoring.\n    \n    \n      Operations\n      Monitoring platform detects drift and anomalies. Human-in-the-loop mechanisms escalate high-stakes decisions.\n    \n    \n      Retirement\n      Registries archive agent configurations. Identity layer revokes all permissions. Audit trails are retained for compliance.\n    \n  \n\n\nThis lifecycle-aware approach ensures that governance is not an afterthought, but an integrated function of how agents are built, deployed, and managed.\n\nThe ROI of Governance Infrastructure\n\nImplementing a comprehensive Governance Stack is a significant investment. Organizations rightfully ask: what is the return?\n\nThe answer lies in four measurable outcomes:\n\nRisk Mitigation: As demonstrated by the recent AI-orchestrated cyber espionage campaign disrupted by Anthropic [6], uncontrolled agent access to powerful tools is not a theoretical threat. 
A governance stack with identity attestation, gateways, and real-time policy enforcement could have interrupted that attack at several points: unattested agents blocked at the gateway, tool access limited to delegated scope, and anomalous tool usage flagged as it happened.\n\nRegulatory Compliance: With regulations like the EU AI Act imposing strict requirements on high-risk AI systems, the ability to demonstrate comprehensive lifecycle governance, auditability, and human oversight is mandatory, not optional [2]. The Governance Stack provides the automated evidence generation required for compliance.\n\nOperational Efficiency: Without centralized registries and monitoring, organizations waste time debugging agent failures, tracking down tool dependencies, and investigating cost overruns. The stack provides the visibility and control to operate an agentic workforce at scale.\n\nTrust and Adoption: The ultimate ROI is internal and external trust. Employees, customers, and regulators need confidence that autonomous agents are operating safely, ethically, and in alignment with organizational values. The Governance Stack makes that confidence possible.\n\nBuilding vs. Buying: The Emerging Vendor Landscape\n\nOrganizations face a critical decision: build this governance infrastructure in-house or adopt emerging platforms that provide it as a service. Early movers are choosing different paths:\n\n\n  Enterprise Platforms: Companies like Collibra, Databricks, and TrueFoundry are extending their data governance and MLOps platforms to include agent registries and observability tools [2, 5, 7].\n  Purpose-Built Solutions: Startups like Agentic Trust are building end-to-end governance platforms specifically designed for agentic AI, providing integrated registries, gateways, and policy engines [5].\n  Protocol-Level Standards: Open standards like OIDC-A and MCP are enabling interoperability, allowing organizations to build custom stacks from best-of-breed components [4].\n\n\nThe optimal path depends on organizational maturity, existing infrastructure, and the scale of agentic deployment. 
However, the underlying message is universal: governance at scale requires dedicated infrastructure.\n\nConclusion: Governance as the Enabler of Scale\n\nThe era of experimental agentic AI pilots is ending. Organizations are now operationalizing agentic workforces across critical business functions, and the governance gap is the primary barrier to scaling these deployments safely and responsibly.\n\nThe Governance Stack is not a constraint on innovation; it is the foundation that makes innovation sustainable. By providing identity, visibility, policy enforcement, continuous monitoring, and human oversight, this technical infrastructure transforms governance from a compliance burden into a strategic enabler.\n\nThe organizations that invest in this stack today will be the ones that confidently deploy autonomous agents at enterprise scale tomorrow. They will move faster, operate more safely, and earn the trust of stakeholders who demand accountability in the age of autonomous AI.\n\nFor technology leaders navigating this landscape, the path is clear: governance is not a policy problem—it is an engineering challenge. And like all engineering challenges, it requires purpose-built infrastructure to solve. The Governance Stack is that infrastructure.\n\nReferences:\n\n[1] McKinsey &amp; Company. (2025, November 5). The State of AI in 2025: A global survey. McKinsey.\n\n[2] Databricks. (2025, July 1). Introducing the Databricks AI Governance Framework. Databricks.\n\n[3] DZone. (2025, May 21). Securing the Future: Best Practices for Privacy and Data Governance in LLMOps. DZone.\n\n[4] Subramanya, N. (2025, April 28). OpenID Connect for Agents (OIDC-A) 1.0 Proposal. subramanya.ai.\n\n[5] Subramanya, N. (2025, November 17). Why Private Registries are the Future of Enterprise Agentic Infrastructure. subramanya.ai.\n\n[6] Subramanya, N. (2025, November 14). From Espionage to Identity: Securing the Future of Agentic AI. subramanya.ai.\n\n[7] TrueFoundry. (2025, September 10). 
What is AI Agent Registry. TrueFoundry.\n\n"
    },
  
    {
      "title"    : "Why Private Registries are the Future of Enterprise Agentic Infrastructure",
      "category" : "",
      "tags"     : "AI, Agents, Agentic AI, MCP, Agent Registry, Enterprise AI, Governance, Security, Infrastructure, Private Registry, AI Management",
      "url"      : "/2025/11/17/why-private-registries-are-the-future-of-enterprise-agentic-infrastructure/",
      "date"     : "November 17, 2025",
      "excerpt"  : "With 79% of companies already adopting AI agents, a critical governance gap has emerged. Without robust management frameworks, organizations risk a chaotic landscape of shadow AI, creating significant security vulnerabilities and operational inefficiencies. The solution lies in Private Agent and MCP Registries—command centers for agentic infrastructure that provide the visibility, governance, and security necessary to scale AI responsibly.",
      "content"  : "The age of agentic AI is no longer on the horizon; it’s in our datacenters, cloud environments, and business units. A recent PwC report highlights that a staggering 79% of companies are already adopting AI agents in some capacity [1]. As these autonomous systems proliferate, executing tasks and making decisions on behalf of the enterprise, a critical governance gap has emerged. Without a robust management framework, organizations risk a chaotic landscape of “shadow AI,” creating significant security vulnerabilities, compliance nightmares, and operational inefficiencies.\n\nThe solution lies in a new class of enterprise software: the Private Agent and MCP Registry. This is not just a catalog, but a command center for agentic infrastructure, providing the visibility, governance, and security necessary to scale AI responsibly. Let’s explore the core pillars of this trend, using the “Agentic Trust” platform as a blueprint for building a better, more secure agentic future.\n\nPillar 1: A Centralized Directory for Every Agent\n\nThe first step to managing agentic chaos is to establish a single source of truth. You cannot govern what you cannot see. A private agent registry provides a comprehensive, real-time inventory of every agent operating within the enterprise, whether built in-house or sourced from a third-party vendor.\n\n\nA centralized agent directory, as shown in the Agentic Trust platform, provides a complete inventory for governance and oversight.\n\nAs the screenshot of the Agentic Trust directory illustrates, this is more than just a list. 
A mature registry tracks critical metadata for each agent, including:\n\n\n  Unique Identity: A verifiable ID for every agent, forming the foundation for authentication and authorization.\n  Capabilities: A clear declaration of what the agent is designed to do, including the tools, resources, and prompts it can access.\n  Lifecycle Status: Tracking whether an agent is in development, production, or retired.\n  Ownership and Lineage: Connecting each agent to a business owner, use case, and the data it interacts with.\n  Activity Monitoring: Recording when agents were last used and their registration dates.\n\n\nThis centralized view eliminates blind spots and provides the traceability required for compliance and security audits. Organizations can quickly answer critical questions: How many agents do we have? Who owns them? What are they authorized to do?\n\nPillar 2: A Curated Marketplace for Agent Tools (MCPs)\n\nAutonomous agents are only as powerful as the tools they can access. The Model Context Protocol (MCP) has become a standard for providing agents with these tools, but an uncontrolled proliferation of MCP servers creates another layer of risk. A private registry addresses this by functioning as a curated, internal “app store” or marketplace for MCPs.\n\n\nAn MCP Registry, like this one from Agentic Trust, allows enterprises to create a governed marketplace of approved tools for their AI agents.\n\nInstead of allowing agents to connect to any public MCP, the enterprise can define a catalog of approved, vetted, and secure tools. 
As shown in the Agentic Trust MCP Registry, this allows organizations to:\n\n\n  Enforce Security Standards: Ensure that all available tools meet enterprise security and compliance requirements before they’re made available to agents.\n  Manage Versions and Dependencies: Control which versions of tools are used, preventing unexpected breaking changes that could disrupt agent operations.\n  Control Costs: Monitor the usage of paid APIs and tools, preventing runaway costs from autonomous agents making thousands of requests.\n  Improve Developer Productivity: Provide a central place for developers to discover and reuse existing tools, accelerating agent development and reducing duplication.\n  Categorize and Organize: Group tools by function (productivity, collaboration, payments, development, monitoring) to make discovery easier.\n\n\nThe registry shows connection status for each MCP server, making it immediately visible which integrations are active and which require attention. This operational visibility is critical for maintaining a healthy agentic ecosystem.\n\nPillar 3: End-to-End Governance and Policy Enforcement\n\nA private registry is the enforcement point for enterprise AI policy. It moves governance from a manual, after-the-fact process to an automated, built-in function of the agentic infrastructure. Drawing on best practices from platforms like Collibra and Microsoft Azure’s private registry implementations, this includes [1, 2]:\n\nMandatory Metadata and Documentation: Before an agent or MCP can be registered, developers must provide essential information such as data classification, business owner, purpose, and criticality. This ensures that every component in the agentic ecosystem is properly documented and understood.\n\nLifecycle Policy Alignment: The registry can embed automated policy checks at each stage of an agent’s lifecycle. 
For example, an agent cannot be promoted to production without a completed security review, ethical bias assessment, and approval from the designated business owner. This creates natural checkpoints that enforce organizational standards.\n\nAccess Control and Permissions: Using Role-Based Access Control (RBAC), integrated with enterprise identity systems like Entra ID or Okta, the registry defines who can create, manage, and consume agents and their tools. Different teams might have different levels of access based on their role and the sensitivity of the agents they’re working with.\n\nAudit Trails and Compliance: Every action in the registry—agent registration, tool connection, permission changes—is logged and auditable. This creates a complete forensic trail that satisfies regulatory requirements and enables rapid incident response when issues arise.\n\nPillar 4: Solving Real Enterprise Challenges\n\nThe value of a private registry becomes clear when we examine the specific problems it solves. Consider these common enterprise scenarios:\n\nChallenge: Shadow AI and Uncontrolled Tool Adoption\n\nDevelopment teams are rapidly adopting AI tools and MCP servers without central oversight. This creates security blind spots, compliance risks, and operational fragmentation across the organization. A private registry provides centralized discovery of approved tools and usage visibility, allowing security teams to monitor what tools are being used and by whom [2].\n\nChallenge: Regulatory Compliance and Data Sovereignty\n\nOrganizations in regulated industries (financial services, healthcare, government) need to maintain strict control over data flows and ensure AI tools meet compliance requirements. 
The registry enables data classification tagging for MCP servers, geographic controls for region-specific availability, comprehensive audit trails, and pre-configured compliance templates [2].\n\nChallenge: Cost Control and Resource Optimization\n\nWithout visibility into agent and tool usage, organizations face unpredictable costs as autonomous agents make API calls and consume resources. A private registry provides usage analytics, cost allocation by team or project, budget alerts, and the ability to deprecate underutilized or expensive tools [2].\n\nChallenge: Developer Productivity and Tool Discovery\n\nDevelopers waste time rebuilding integrations that already exist elsewhere in the organization or struggle to find the right tools for their agents. The registry solves this with searchable catalogs, reusable components, standardized integration patterns, and clear documentation for each available tool [3].\n\nThe Architecture That Enables Scale\n\nBehind the user interface of platforms like Agentic Trust lies a sophisticated architecture that makes enterprise-scale agent management possible. 
The key components include [3, 4]:\n\n\n  \n    \n      Component\n      Purpose\n    \n  \n  \n    \n      Central Registry API\n      Provides standardized endpoints for agent and MCP registration, discovery, and management\n    \n    \n      Metadata Database\n      Stores agent cards, capability declarations, and relationship data\n    \n    \n      Policy Engine\n      Enforces governance rules, access controls, and compliance checks\n    \n    \n      Discovery Service\n      Enables capability-based search and intelligent agent-to-tool matching\n    \n    \n      Health Monitor\n      Tracks agent and MCP server availability through heartbeats and health checks\n    \n    \n      Integration Layer\n      Connects to enterprise identity systems, monitoring tools, and DevOps pipelines\n    \n  \n\n\nThis architecture mirrors patterns from successful enterprise software registries, such as container registries, API management platforms, and model registries. The lesson is clear: as a technology becomes critical to enterprise operations, it requires industrial-grade management infrastructure.\n\nThe Path Forward\n\nThe trend toward private registries for agentic infrastructure is not a passing fad; it is a necessary evolution in response to the rapid adoption of autonomous AI systems. As the Model Context Protocol ecosystem continues to grow, with the official MCP Registry serving as a public catalog [4], forward-thinking enterprises are building their own private implementations to maintain control, security, and governance.\n\nPlatforms like Agentic Trust demonstrate what this future looks like: a unified command center where every agent is visible, every tool is vetted, and every action is governed by policy. 
This is how organizations move from the chaos of unmanaged AI to the strategic advantage of a well-orchestrated agentic ecosystem.\n\nFor enterprises embarking on this journey, the message is clear: you cannot scale what you cannot see, and you cannot govern what you cannot control. A private registry is the foundation upon which responsible, secure, and effective agentic AI is built.\n\nReferences:\n\n[1] Collibra. (2025, October 6). Collibra AI agent registry: Governing autonomous AI agents. Collibra.\n\n[2] Bajada, AJ. (2025, August 14). DevOps and AI Series: Azure Private MCP Registry. azurewithaj.com.\n\n[3] TrueFoundry. (2025, September 10). What is AI Agent Registry. TrueFoundry.\n\n[4] Model Context Protocol. (2025, September 8). Introducing the MCP Registry. Model Context Protocol.\n\n"
    },
  
    {
      "title"    : "From Espionage to Identity: Securing the Future of Agentic AI",
      "category" : "",
      "tags"     : "AI, Security, Agentic AI, OIDC-A, MCP, Anthropic, Claude, Cybersecurity, AI Agents, Identity Management, Zero Trust",
      "url"      : "/2025/11/14/from-espionage-to-identity-securing-the-future-of-agentic-ai/",
      "date"     : "November 14, 2025",
      "excerpt"  : "Anthropic has detailed its disruption of the first publicly reported cyber espionage campaign orchestrated by a sophisticated AI agent. The incident, attributed to state-sponsored group GTG-1002, signals that the age of autonomous, agentic AI threats is here. This post dissects the anatomy of the attack and explores how emerging standards like OpenID Connect for Agents (OIDC-A) provide a necessary path forward.",
      "content"  : "Anthropic has detailed its disruption of the first publicly reported cyber espionage campaign orchestrated by a sophisticated AI agent [1]. The incident, attributed to a state-sponsored group designated GTG-1002, is more than just a security bulletin; it is a clear signal that the age of autonomous, agentic AI threats is here. It also serves as a critical case study, validating the urgent need for a new generation of identity and access management protocols specifically designed for AI.\n\nThis post will dissect the anatomy of the attack, connect it to the foundational security challenges facing agentic AI, and explore how emerging standards like OpenID Connect for Agents (OIDC-A) provide a necessary path forward [2, 3].\n\nAnatomy of an AI-Orchestrated Attack\n\nAnthropic’s investigation revealed a campaign of unprecedented automation. The attackers turned Anthropic’s own Claude Code tool into an autonomous weapon, targeting approximately thirty global organizations across technology, finance, and government. The AI was not merely an assistant; it was the operator, executing 80-90% of the tactical work, with human intervention required only at a few key authorization gates [1].\n\nThe technical sophistication of the attack did not lie in novel malware, but in orchestration. The threat actor built a custom framework around a series of Model Context Protocol (MCP) servers. These servers acted as a bridge, giving the AI agent access to a toolkit of standard, open-source penetration testing utilities: network scanners, password crackers, and database exploitation tools.\n\nBy decomposing the attack into seemingly benign sub-tasks, the attackers tricked the AI into executing a complex intrusion campaign. The AI agent, operating under the persona of a legitimate security tester, autonomously performed reconnaissance, vulnerability analysis, and data exfiltration at a machine speed that no human team could match.\n\nThe MCP Paradox: Extensibility vs. 
Security\n\nThe Anthropic report explicitly states that the attackers leveraged the Model Context Protocol (MCP) to arm their AI agent [1]. This highlights a central paradox in agentic AI architecture: the very protocols designed for extensibility and power, like MCP, can become the most potent attack vectors.\n\nAs the “Identity Management for Agentic AI” whitepaper notes, MCP is a leading framework for connecting AI to external tools, but it also presents significant security challenges [3]. When an AI can dynamically access powerful tools without robust oversight, it creates a direct and dangerous path for misuse. The GTG-1002 campaign is a textbook example of this risk realized.\n\nThis forces a critical re-evaluation of how we architect agentic systems. We can no longer afford to treat the connection between an AI agent and its tools as a trusted channel. This is where the concept of an MCP Gateway or Proxy becomes not just a good idea, but an absolute necessity.\n\nThe Solution: Identity, Delegation, and Zero Trust for Agents\n\nThe security gaps exploited in the Anthropic incident are precisely what emerging standards like OIDC-A (OpenID Connect for Agents) are designed to close [2, 3]. The core problem is one of identity and authority. The AI agent in the attack acted with borrowed, indistinct authority, effectively impersonating a legitimate user or process. True security requires a shift to a model of explicit, verifiable delegation.\n\nThe OIDC-A proposal introduces a framework for establishing the identity of an AI agent and managing its authorization through cryptographic delegation chains. 
This means an agent is no longer just a proxy for a user; it is a distinct entity with its own identity, operating on behalf of a user with a clearly defined and constrained set of permissions.\n\nHere’s how this new model, enforced by an MCP Gateway, would have mitigated the Anthropic attack:\n\n\n  \n    \n      Security Layer\n      Description\n    \n  \n  \n    \n      Agent Identity &amp; Attestation\n      The AI agent would have a verifiable identity, attested by its provider. An MCP Gateway could immediately block any requests from unattested or untrusted agents.\n    \n    \n      Tool-Level Delegation\n      Instead of broad permissions, the agent would receive narrowly-scoped, delegated authority for specific tools. The OIDC-A delegation_chain ensures that the agent’s permissions are a strict subset of the delegating user’s permissions [2]. An agent designed for code analysis could never be granted access to a password cracker.\n    \n    \n      Policy Enforcement &amp; Anomaly Detection\n      The MCP Gateway would act as a policy enforcement point, monitoring all tool requests. It could detect anomalous behavior, such as an agent attempting to use a tool outside its delegated scope or a sudden spike in high-risk tool usage, and automatically terminate the agent’s session.\n    \n    \n      Auditing and Forensics\n      Every tool request and delegation would be cryptographically signed and logged, creating an immutable audit trail. This would provide immediate, granular visibility into the agent’s actions, dramatically accelerating incident response.\n    \n  \n\n\nBuilding Enterprise-Grade Security for Agentic AI\n\nThe Anthropic report is a watershed moment. It proves that the threats posed by agentic AI are no longer theoretical. 
As the “Identity Management for Agentic AI” paper argues, we must move beyond traditional, human-centric security models and build a new foundation for AI identity [3].\n\nToday, most MCP servers being developed are experimental tools designed for individual developers and small-scale applications. They lack the enterprise-grade security controls that organizations require to deploy them in production environments. For enterprises to confidently adopt agentic AI systems built on protocols like MCP, we need to fundamentally rethink how we approach security.\n\nThe path forward requires building robust delegation frameworks, implementing proper identity management for AI agents, and creating enterprise-grade security controls like gateways and policy enforcement points. We need solutions that provide:\n\n\n  Cryptographic delegation chains that clearly define and constrain agent permissions\n  Real-time policy enforcement that can detect and prevent anomalous behavior\n  Comprehensive audit trails that enable forensic analysis and compliance\n  Zero-trust architectures where every agent action is verified and authorized\n\n\nWe cannot afford to let the open, extensible nature of protocols like MCP become a permanent backdoor for malicious actors. The future of agentic AI depends on our ability to build security into these systems from the ground up, making enterprise adoption not just possible, but secure and responsible.\n\nReferences:\n\n[1] Anthropic. (2025, November). Disrupting the first reported AI-orchestrated cyber espionage campaign. Anthropic.\n\n[2] Subramanya, N. (2025, April 28). OpenID Connect for Agents (OIDC-A) 1.0 Proposal. subramanya.ai.\n\n[3] South, T. (Ed.). (2025, October). Identity Management for Agentic AI: The new frontier of authorization, authentication, and security for an AI agent world. arXiv.\n\n"
    },
  
    {
      "title"    : "Claude Skills vs. MCP: A Tale of Two AI Customization Philosophies",
      "category" : "",
      "tags"     : "AI, Claude, MCP, Claude Skills, Agent Skills, AI Customization, LLM, Anthropic, Integration, Workflows",
      "url"      : "/2025/10/30/claude-skills-vs-mcp-a-tale-of-two-ai-customization-philosophies/",
      "date"     : "October 30, 2025",
      "excerpt"  : "Anthropic has introduced two powerful but distinct approaches to AI customization: Claude Skills and the Model Context Protocol (MCP). While both aim to make AI more useful and integrated into our workflows, they operate on fundamentally different principles. This post explores their differences, synergies, and the exciting future they represent.",
      "content"  : "In the rapidly evolving landscape of artificial intelligence, the ability to customize and extend the capabilities of large language models (LLMs) has become a critical frontier. Anthropic, a leading AI research company, has introduced two powerful but distinct approaches to this challenge: Claude Skills and the Model Context Protocol (MCP). While both aim to make AI more useful and integrated into our workflows, they operate on fundamentally different principles. This post delves into a detailed comparison of Claude Skills and MCP, explores whether they can or should be merged, and discusses the exciting future of AI customization they represent.\n\nWhat are Claude Skills? The Power of Procedural Knowledge\n\nClaude Skills, also known as Agent Skills, are a revolutionary way to teach Claude how to perform specific tasks in a repeatable and customized manner. At its core, a Skill is a folder containing a SKILL.md file, which includes instructions, resources, and even executable code. Think of Skills as a set of standard operating procedures for the AI. For example, a Skill could instruct Claude on how to format a weekly report, adhere to a company’s brand guidelines, or analyze data using a specific methodology.\n\nThe genius of Claude Skills lies in their architecture, which is built on a principle called progressive disclosure. This three-tiered system ensures that Claude’s context window isn’t overwhelmed with information:\n\n\n  \n    Level 1: Metadata: When a session starts, Claude loads only the name and description of each available Skill. 
This is a very lightweight process, consuming only a few tokens per Skill.\n  \n  \n    Level 2: The SKILL.md file: If Claude determines that a Skill is relevant to the user’s request, it then loads the full content of the SKILL.md file.\n  \n  \n    Level 3 and beyond: Additional resources: If the SKILL.md file references other documents or scripts within the Skill’s folder, Claude will load them only when needed.\n  \n\n\nThis efficient, just-in-time loading mechanism allows for a vast library of Skills to be available without sacrificing performance. Skills are also portable, working across Claude.ai, Claude Code, and the API, and can even include executable code for deterministic and reliable operations.\n\nWhat is the Model Context Protocol (MCP)? The Universal Connector\n\nThe Model Context Protocol (MCP) is an open-source standard designed to connect AI applications to external systems. If Claude Skills are about teaching the AI how to do something, MCP is about giving it access to what it needs to do it. MCP acts as a universal connector, similar to a USB-C port for AI, allowing models like Claude to interact with a wide range of data sources, tools, and workflows.\n\nMCP operates on a client-server architecture:\n\n\n  \n    MCP Host: The AI application (e.g., Claude) that manages connections to various external systems.\n  \n  \n    MCP Client: A component within the host that maintains a one-to-one connection with an MCP server.\n  \n  \n    MCP Server: A program that exposes tools, resources, and prompts from an external system to the AI.\n  \n\n\nThis architecture allows an AI to connect to multiple external systems simultaneously, from local files and databases to remote services like GitHub, Slack, or a company’s internal APIs. MCP is built on a two-layer architecture, with a data layer based on JSON-RPC 2.0 and a transport layer that supports both local and remote connections.\n\nThe Core Difference: Methodology vs. 
Connectivity\n\nThe fundamental distinction between Claude Skills and MCP can be summarized as methodology versus connectivity. MCP provides the AI with access to tools and data, while Skills provide the instructions on how to use them effectively. According to Anthropic’s own documentation:\n\n\n  “MCP connects Claude to external services and data sources. Skills provide procedural knowledge—instructions for how to complete specific tasks or workflows. You can use both together: MCP connections give Claude access to tools, while Skills teach Claude how to use those tools effectively.”\n\n\nThis highlights that Skills and MCP are not competing technologies but are, in fact, complementary. An apt analogy is that of a master chef. MCP provides the chef with a fully stocked pantry of ingredients and a set of high-end kitchen appliances (the what). Skills, on the other hand, are the chef’s personal recipe book and techniques, guiding them on how to combine the ingredients and use the appliances to create a culinary masterpiece.\n\n\n  \n    \n      Feature\n      Claude Skills\n      Model Context Protocol (MCP)\n    \n  \n  \n    \n      Primary Purpose\n      Procedural knowledge and methodology\n      Connectivity to external systems\n    \n    \n      Architecture\n      Filesystem-based with progressive disclosure\n      Client-server with JSON-RPC 2.0\n    \n    \n      Core Concept\n      Teaching the AI how to do something\n      Giving the AI access to what it needs\n    \n    \n      Dependency\n      Requires a code execution environment\n      A client and a server implementation\n    \n    \n      Token Efficiency\n      Very high due to progressive disclosure\n      Moderate, with tool descriptions in context\n    \n    \n      Portability\n      Across Claude interfaces\n      Open standard for any LLM\n    \n  \n\n\nCan a Claude Skill be an MCP? 
And Should They Be Merged?\n\nGiven that both are Anthropic’s creations, a natural question arises: could a Claude Skill be implemented as an MCP server, or should the two be merged into a single, unified system? While it is technically possible to create an MCP server that exposes Skills, doing so would be architecturally inefficient and would defeat the purpose of both systems.\n\nExposing Skills through MCP would negate the benefits of progressive disclosure, because it would add protocol overhead to what should be a simple filesystem read. It would also create a redundant abstraction layer, as Skills already require a local code execution environment. The two systems are designed for different purposes and have different optimization goals: Skills for context efficiency within Claude, and MCP for standardized integration across different AI systems.\n\nTherefore, Claude Skills and MCP should be treated as independent, complementary technologies. The most powerful workflows will come from using them together.\n\nThe Power of Synergy: Using Skills and MCP Together\n\nThe true potential of these technologies is unlocked when they are used in concert. Here are a few integration patterns that showcase their combined power:\n\n\n  \n    Skills as MCP Orchestrators: A Skill can contain a complex workflow that orchestrates calls to multiple MCP servers. For example, a “Deploy and Notify” Skill could contain a deployment checklist, notification templates, and rollback procedures. It would then use MCP to access GitHub for code, a CI/CD server for deployment, and Slack for notifications.\n  \n  \n    Skills for MCP Configuration: An organization can create Skills that teach Claude its specific standards for using MCP tools. 
For example, a “GitHub Workflow Standards” Skill could contain instructions on branch naming conventions, pull request review checklists, and commit message templates, ensuring that Claude uses the GitHub MCP server in a way that aligns with the company’s best practices.\n  \n  \n    Hybrid Skills: A Skill can contain embedded code that makes calls to an MCP server. This is useful for self-contained workflows that need to fetch external data.\n  \n\n\nThe Future: A Marketplace for Skills and an Ecosystem for MCP\n\nThe future of AI customization will likely see the development of a vibrant Skills Marketplace. Similar to the app stores for our smartphones or the extension marketplaces for our code editors, a Skills Marketplace would allow developers to publish, share, and even sell Skills. This could create a new economy around AI expertise, with a wide range of Skills available, from free, community-contributed Skills to premium, industry-specific Skill packages for domains like law, medicine, or finance.\n\nSimultaneously, the MCP ecosystem will continue to grow, with more and more tools and services exposing their functionality through MCP servers. This will create a virtuous cycle: as more tools become available through MCP, the demand for Skills that can effectively use those tools will increase.\n\nConclusion\n\nClaude Skills and the Model Context Protocol represent two distinct but complementary philosophies of AI customization. MCP is the universal connector, providing the what—the access to tools and data. Skills are the procedural knowledge, providing the how—the instructions and methodology. They are not competitors but partners in the quest to create more powerful, personalized, and integrated AI assistants. The future of AI workflows will not be about choosing between Skills or MCP, but about leveraging the power of Skills and MCP to create intelligent systems that are truly tailored to our needs.\n\nReferences:\n\n[1] Anthropic. (2025, October 16). 
Claude Skills: Customize AI for your workflows. Anthropic.\n\n[2] Anthropic. (2025, October 16). Equipping agents for the real world with Agent Skills. Anthropic.\n\n[3] Model Context Protocol. (n.d.). What is the Model Context Protocol (MCP)? Model Context Protocol.\n\n[4] Model Context Protocol. (n.d.). Architecture overview. Model Context Protocol.\n\n[5] Willison, S. (2025, October 16). Claude Skills are awesome, maybe a bigger deal than MCP. Simon Willison’s Weblog.\n\n[6] Claude Help Center. (n.d.). What are Skills? Claude Help Center.\n\n[7] IntuitionLabs. (2025, October 27). Claude Skills vs. MCP: A Technical Comparison for AI Workflows. IntuitionLabs.\n\n"
    },
  
    {
      "title"    : "Beyond &quot;Non-Deterministic&quot;: Deconstructing the Illusion of Randomness in LLMs",
      "category" : "",
      "tags"     : "AI, LLM, Determinism, Architecture, Machine Learning, Prompt Engineering, Emergence",
      "url"      : "/2025/09/09/beyond-non-deterministic-deconstructing-the-illusion-of-randomness-in-llms/",
      "date"     : "September 9, 2025",
      "excerpt"  : "Attributing an LLM&#39;s behavior to &#39;non-determinism&#39; is like blaming a complex system&#39;s emergent behavior on magic. It&#39;s an admission of incomprehension, not an explanation. The truth is far more fascinating and, for architects and engineers, far more critical to understand.",
      "content"  : "In the rapidly evolving lexicon of AI, few terms are as casually thrown around—and as fundamentally misunderstood—as “non-deterministic.” We use it to explain away unexpected outputs, to describe the creative spark of generative models, and to justify the frustrating brittleness of our AI-powered systems. But this term, borrowed from classical computer science, is not just imprecise when applied to Large Language Models (LLMs); it’s a conceptual dead end. It obscures the intricate, deterministic machinery humming beneath the surface and distracts us from the real architectural challenges we face.\n\nAttributing an LLM’s behavior to “non-determinism” is like blaming a complex system’s emergent behavior on magic. It’s an admission of incomprehension, not an explanation. The truth is far more fascinating and, for architects and engineers, far more critical to understand. LLMs are not mystical black boxes governed by chance. They are complex, stateful systems whose outputs are the result of a deterministic, albeit highly sensitive, process. The perceived randomness is not a feature; it is a symptom of a deeper architectural paradigm shift.\n\nThis post will dismantle the myth of LLM non-determinism. We will explore why the term is a poor fit, dissect the underlying deterministic mechanisms that govern LLM behavior, and reframe the conversation around the true challenge: the profound difficulty of controlling a system whose behavior is an emergent property of its architecture. We will move beyond the simplistic notion of randomness and into the far more complex and rewarding territory of input ambiguity, ill-posed inverse problems, and the dawn of truly evolutionary software architectures.\n\nThe Deterministic Heart of the LLM\n\nTo understand why “non-deterministic” is a misnomer, we must first revisit its classical definition. A deterministic algorithm, given a particular input, will always produce the same output. 
An LLM, at its core, is a mathematical function. It is a massive, intricate, but ultimately deterministic, series of calculations. Given the same model, the same weights, and the same input sequence, the same sequence of floating-point operations will occur, producing the same output logits.\n\nThe illusion of non-determinism arises not from the model itself, but from the sampling strategies we apply to its output. The model’s final layer produces a vector of logits, one for each token in its vocabulary. These logits are then converted into a probability distribution via the softmax function. It is at this final step—the selection of the next token from this distribution—that we introduce controlled randomness.\n\nTemperature and Sampling: The Controlled Introduction of Randomness\n\nThe temperature parameter is the primary lever we use to control this randomness. A temperature of 0 results in greedy decoding—a purely deterministic process where the token with the highest probability is always chosen. In theory, with a temperature of 0, an LLM should be perfectly deterministic. However, as many have discovered, even this is not a perfect guarantee. Minor differences in floating-point arithmetic across different hardware, or even different software library versions, can lead to minuscule variations in the logits, which can occasionally be enough to tip the balance in favor of a different token.\n\nWhen the temperature is set above 0, we enter the realm of stochastic sampling. The temperature value scales the logits before they are passed to the softmax function. A higher temperature flattens the probability distribution, making less likely tokens more probable. A lower temperature sharpens the distribution, making the most likely tokens even more dominant. This is not non-determinism in the classical sense; it is a controlled, probabilistic process. 
We are not dealing with a system that can arbitrarily choose its next state; we are dealing with a system that makes a weighted random choice from a set of possibilities whose probabilities are deterministically calculated.\n\nOther sampling techniques, such as top-k and top-p (nucleus) sampling, further refine this process. Top-k sampling restricts the choices to the k most likely tokens, while top-p sampling selects from the smallest set of tokens whose cumulative probability exceeds a certain threshold. These are all mechanisms for shaping and constraining the probabilistic selection process, not for introducing true non-determinism.\n\nDemonstrating Determinism: A Concrete Example\n\nConsider this simple demonstration using a transformer model with greedy decoding, the practical equivalent of setting temperature to 0:\n\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_id = \"microsoft/DialoGPT-medium\"\ntokenizer = AutoTokenizer.from_pretrained(model_id)\nmodel = AutoModelForCausalLM.from_pretrained(model_id)\n\nprompt = \"The future of artificial intelligence is\"\ninputs = tokenizer(prompt, return_tensors=\"pt\")\n\n# Run the same generation 10 times with greedy decoding\noutputs = []\nfor _ in range(10):\n    generated = model.generate(\n        **inputs,  # input_ids plus the attention mask\n        max_length=50,\n        do_sample=False,  # Greedy decoding; temperature only applies when sampling\n        pad_token_id=tokenizer.eos_token_id\n    )\n    text = tokenizer.decode(generated[0], skip_special_tokens=True)\n    outputs.append(text)\n\n# All outputs should be identical\nassert all(output == outputs[0] for output in outputs)\n\n\nThis code will pass its assertion in most cases, demonstrating the deterministic nature of the underlying model. 
However, the occasional failure of this assertion—due to hardware differences, library versions, or floating-point precision variations—illustrates why even “deterministic” settings cannot guarantee perfect reproducibility across all environments.\n\nThe Real Culprit: Input Ambiguity and the Ill-Posed Inverse Problem\n\nIf the LLM itself is fundamentally deterministic, why is it so hard to get the output we want? The answer lies not in the forward pass of the model, but in the inverse problem we are trying to solve. When we interact with an LLM, we are not simply providing an input and observing an output. We are attempting to solve an inverse problem: we have a desired output in mind, and we are trying to find the input prompt that will produce it.\n\nThis is where the concept of a well-posed problem, as defined by the mathematician Jacques Hadamard, becomes critical. A problem is well-posed if it satisfies three conditions:\n\n\n  Existence: A solution exists.\n  Uniqueness: The solution is unique.\n  Stability: The solution’s behavior changes continuously with the initial conditions.\n\n\nPrompt engineering, when viewed as an inverse problem, fails on all three counts.\n\n\n  Existence: The specific output we desire may not be achievable by any possible prompt. The model’s latent space may not contain a representation that perfectly matches our intent.\n  Uniqueness: There are often many different prompts that can produce very similar outputs. This is the problem of prompt equivalence, and it makes it difficult to find the single “best” prompt.\n  Stability: This is the most frustrating aspect of prompt engineering. A tiny, seemingly insignificant change to a prompt can lead to a radically different output. 
This lack of stability is what makes LLM-based systems feel so brittle and unpredictable.\n\n\nThis is what people are really talking about when they say LLMs are “non-deterministic.” They are not talking about a lack of determinism in the model’s execution; they are talking about the ill-posed nature of the inverse problem they are trying to solve. The model is not random; our ability to control it is simply imprecise.\n\nThe Mathematics of Prompt Sensitivity\n\nThe sensitivity of LLMs to prompt variations can be understood through the lens of chaos theory and dynamical systems. Small perturbations in the input space can lead to dramatically different trajectories through the model’s latent space. This is not randomness; it is sensitive dependence on initial conditions—a hallmark of complex deterministic systems.\n\nConsider the mathematical representation of this sensitivity. If we denote our prompt as a vector p in the input space, and the model’s output as a function f(p), then the sensitivity can be expressed as:\n\n||f(p + δp) - f(p)|| &gt;&gt; ||δp||\n\n\nWhere δp represents a small change to the prompt, and the double bars represent vector norms. This inequality shows that small changes in input can produce disproportionately large changes in output—the mathematical signature of a chaotic system, not a random one.\n\nThis sensitivity is further amplified by the autoregressive nature of text generation. Each token prediction depends on all previous tokens, creating a cascade effect where early variations compound exponentially. A single different token early in the generation can completely alter the semantic trajectory of the entire output.\n\nThe Architectural Shift: From Predictable Execution to Emergent Behavior\n\nThis reframing from non-determinism to input ambiguity has profound implications for how we design and build systems that incorporate LLMs. For decades, software architecture has been predicated on the assumption of predictable execution. 
We design systems with the expectation that a given component, when provided with a specific input, will behave in a known and repeatable manner. This is the foundation of everything from unit testing to microservices architecture.\n\nAI agents, powered by LLMs, shatter this assumption. They do not simply execute our designs; they exhibit emergent behavior. The system’s behavior is not explicitly defined by the architect, but emerges from the complex interplay of the model’s weights, the input prompt, the sampling strategy, and the context of the interaction. This is a fundamental shift from a mechanical to a biological metaphor for software. We are no longer building machines that execute instructions; we are cultivating ecosystems where intelligent agents adapt and evolve.\n\nThis has several immediate architectural consequences:\n\n\n  The Death of the Static API Contract: In a traditional microservices architecture, the API contract is sacrosanct. In an agent-based system, the “contract” is fluid and context-dependent. The same functional goal may be achieved through different series of actions depending on the nuances of the initial prompt and the state of the system.\n  The Rise of Intent-Driven Design: Instead of specifying the exact steps a system should take, we must design systems that can understand and act on user intent. This requires a shift from imperative to declarative interfaces, where we specify what we want, not how to achieve it.\n  The Need for Robust Observability: When a system’s behavior is emergent, we can no longer rely on traditional logging and monitoring. We need new tools and techniques for observing and understanding the behavior of agent-based systems. 
This includes not just monitoring for errors, but also for unexpected successes and novel solutions.\n\n\nEngineering for Emergence: Practical Approaches\n\nUnderstanding that LLMs are deterministic but sensitive systems opens up new avenues for engineering robust AI-powered applications. Rather than fighting the sensitivity, we can design systems that work with it.\n\nEnsemble Methods and Consensus Mechanisms\n\nOne approach is to embrace the variability through ensemble methods. Instead of trying to get a single “perfect” output, we can generate multiple outputs and use consensus mechanisms to select the best result. This approach treats the sensitivity as a feature, not a bug, allowing us to explore the space of possible outputs and select the most appropriate one.\n\nfrom collections import Counter\n\ndef consensus_generation(model, prompt, n_samples=5, temperature=0.7):\n    \"\"\"Generate multiple outputs and select based on consensus.\"\"\"\n    # model.generate is a placeholder for any text-in, text-out LLM call\n    outputs = []\n    for _ in range(n_samples):\n        output = model.generate(prompt, temperature=temperature)\n        outputs.append(output)\n\n    # Simplest consensus: majority vote over whitespace-normalized outputs.\n    # Semantic similarity or other metrics can replace this step.\n    votes = Counter(\" \".join(o.split()) for o in outputs)\n    return votes.most_common(1)[0][0]\n\n\nPrompt Optimization Through Gradient-Free Methods\n\nSince the prompt-to-output mapping is not differentiable in the traditional sense, we must rely on gradient-free optimization methods. Techniques from evolutionary computation, such as genetic algorithms or particle swarm optimization, can be adapted to search the prompt space more effectively.\n\nArchitectural Patterns for Agent Systems\n\nThe shift from predictable to emergent behavior requires new architectural patterns:\n\n\n  \n    Circuit Breakers for AI: Traditional circuit breakers protect against cascading failures. 
AI circuit breakers must protect against semantic drift and unexpected behavior patterns.\n  \n  \n    Semantic Monitoring: Instead of monitoring for technical failures, we must monitor for semantic coherence and goal alignment.\n  \n  \n    Adaptive Retry Logic: Rather than simple exponential backoff, AI systems need retry logic that can adapt the prompt or approach based on the nature of the failure.\n  \n\n\nConclusion: Embracing the Complexity\n\nThe term “non-deterministic” is a crutch. It allows us to avoid the difficult but necessary work of understanding the true nature of LLM-based systems. By retiring this term from our vocabulary, we can begin to have a more honest and productive conversation about the real challenges and opportunities that lie ahead.\n\nWe are not building random number generators; we are building the first generation of truly evolutionary software. These systems are not unpredictable because they are random, but because they are complex. They are not uncontrollable because they are non-deterministic, but because our methods of control are still in their infancy.\n\nThe path forward lies not in trying to force LLMs into the old paradigms of predictable execution, but in developing new architectural patterns that embrace the reality of emergent behavior. We must become less like mechanical engineers and more like gardeners. We must learn to cultivate, guide, and prune these systems, rather than simply designing and building them.\n\nThe architectural revolution is here. It’s time to update our vocabulary to match.\n"
    },
  
    {
      "title"    : "The Architectural Revolution: Why AI Agents Shatter Traditional Design Patterns",
      "category" : "",
      "tags"     : "AI, Agents, Architecture, Software Design, Microservices, Evolution, Emergence",
      "url"      : "/2025/07/21/the-architectural-revolution-why-ai-agents-shatter-traditional-design-patterns/",
      "date"     : "July 21, 2025",
      "excerpt"  : "For decades, software architects have operated under a fundamental assumption: we design systems, and systems execute our designs. AI agents are rewriting this contract entirely. Unlike the monoliths and microservices that came before them, AI agents don&#39;t just execute architecture—they evolve it.",
      "content"  : "For decades, software architects have operated under a fundamental assumption: we design systems, and systems execute our designs. We draw diagrams, define interfaces, and specify behaviors. Our applications dutifully follow these blueprints, calling the APIs we’ve mapped out, processing data through the pipelines we’ve constructed, and failing in the predictable ways we’ve anticipated.\n\nAI agents are rewriting this contract entirely.\n\nUnlike the monoliths and microservices that came before them, AI agents don’t just execute architecture—they evolve it. They make decisions we never programmed, forge connections we never specified, and solve problems through paths we never imagined. This isn’t simply a new deployment pattern or communication protocol. It’s the emergence of the first truly evolutionary software architecture, where systems adapt, learn, and fundamentally change their own structure during runtime.\n\nThe implications stretch far beyond adding “AI capabilities” to existing systems. We’re witnessing the birth of software that exhibits emergent properties, where the whole becomes genuinely greater than the sum of its parts. For software architects, this represents both an unprecedented opportunity and a fundamental challenge to everything we thought we knew about building reliable, scalable systems.\n\nThe Architecture DNA: From Blueprints to Evolution\n\nTo understand why AI agents represent such a radical departure, we need to examine the architectural DNA that has shaped software development for the past several decades. 
Each major architectural pattern emerged to solve specific problems of its era, but also carried forward certain assumptions about how software systems should behave.\n\ntimeline\n    title Architectural Evolution: From Control to Emergence\n    \n    section Monolithic Era\n        1990s-2000s : Single Deployable Unit\n                    : Centralized Control\n                    : Predictable Execution\n                    : Shared Memory Model\n    \n    section Microservices Era  \n        2010s-2020s : Distributed Services\n                    : Service Boundaries\n                    : API Contracts\n                    : Orchestrated Workflows\n    \n    section Agent Era\n        2020s-Future : Autonomous Entities\n                     : Emergent Behavior\n                     : Self-Organizing Networks\n                     : Evolutionary Architecture\n\n\nThe monolithic era gave us centralized control and predictable execution paths. Every function call, every data transformation, every business rule was explicitly coded and deterministically executed. When something went wrong, we could trace through the call stack and identify exactly where the failure occurred. The system was complicated, but it was knowable.\n\nMicroservices introduced distributed complexity but maintained the fundamental assumption of designed behavior. We broke our monoliths into smaller, more manageable pieces, but each service still executed predetermined logic through well-defined APIs. The communication patterns became more complex, but they remained static and predictable. We could still draw service maps and dependency graphs that accurately represented how our systems would behave in production.\n\nAI agents shatter this predictability entirely. They don’t just execute code—they reason, adapt, and make autonomous decisions based on context, goals, and learned patterns. 
An agent tasked with “optimizing system performance” might decide to scale certain services, modify caching strategies, or even restructure data flows—all without explicit programming for these specific actions. The system’s behavior emerges from the interaction of autonomous entities rather than from predetermined design specifications.\n\nThis shift from designed to emergent behavior represents more than just a technical evolution. It’s a fundamental change in how we think about software systems themselves. We’re moving from mechanical metaphors—where systems are machines that execute instructions—to biological ones, where systems are living entities that adapt and evolve.\n\nThe Fundamental Differences: Decision-Making in the Age of Autonomy\n\nThe most profound difference between traditional architectures and agent-based systems lies not in their technical implementation, but in how decisions get made. This shift fundamentally alters the relationship between architects, systems, and runtime behavior.\n\nDecision-Making Patterns Across Architectures\n\ngraph TD\n    subgraph \"Monolithic Decision Making\"\n        A1[User Request] --&gt; B1[Application Logic]\n        B1 --&gt; C1[Business Rules Engine]\n        C1 --&gt; D1[Database Query]\n        D1 --&gt; E1[Response]\n        style B1 fill:#ff9999\n        style C1 fill:#ff9999\n    end\n    \n    subgraph \"Microservices Decision Making\"\n        A2[User Request] --&gt; B2[API Gateway]\n        B2 --&gt; C2[Service A]\n        B2 --&gt; D2[Service B]\n        C2 --&gt; E2[Service C]\n        D2 --&gt; E2\n        E2 --&gt; F2[Aggregated Response]\n        style C2 fill:#99ccff\n        style D2 fill:#99ccff\n        style E2 fill:#99ccff\n    end\n    \n    subgraph \"Agent Decision Making\"\n        A3[Goal/Intent] --&gt; B3[Agent Network]\n        B3 --&gt; C3{Agent A&lt;br/&gt;Reasoning}\n        C3 --&gt;|Context 1| D3[Action Set 1]\n        C3 --&gt;|Context 2| E3[Action Set 2]\n        C3 
--&gt;|Context 3| F3[Delegate to Agent B]\n        F3 --&gt; G3{Agent B&lt;br/&gt;Reasoning}\n        G3 --&gt; H3[Emergent Solution]\n        style C3 fill:#99ff99\n        style G3 fill:#99ff99\n        style H3 fill:#ffff99\n    end\n\n\nIn monolithic systems, decision-making follows a predetermined path through centralized business logic. The application contains all the rules, and execution is deterministic. Given the same input, you’ll always get the same output through the same code path.\n\nMicroservices distribute decision-making across service boundaries, but each service still contains predetermined logic. The decision tree is distributed, but it’s still a tree—with predictable branches and outcomes. Service A will always call Service B under certain conditions, and Service B will always respond in predictable ways.\n\nAgent systems introduce autonomous reasoning at multiple points in the execution flow. Each agent evaluates context, considers multiple options, and makes decisions that weren’t explicitly programmed. 
More importantly, agents can decide to involve other agents, creating dynamic collaboration patterns that emerge based on the specific problem being solved.\n\nCommunication Patterns: From Contracts to Conversations\n\nThe communication patterns in agent systems represent an equally dramatic departure from traditional approaches:\n\nsequenceDiagram\n    participant U as User\n    participant G as API Gateway\n    participant A as Service A\n    participant B as Service B\n    participant D as Database\n    \n    Note over U,D: Traditional Microservices Communication\n    U-&gt;&gt;G: HTTP Request\n    G-&gt;&gt;A: Predefined API Call\n    A-&gt;&gt;B: Predefined API Call\n    B-&gt;&gt;D: SQL Query\n    D--&gt;&gt;B: Result Set\n    B--&gt;&gt;A: JSON Response\n    A--&gt;&gt;G: JSON Response\n    G--&gt;&gt;U: HTTP Response\n    \n    Note over U,D: Agent Communication (Same Goal)\n    U-&gt;&gt;G: Natural Language Intent\n    G-&gt;&gt;A: Goal + Context\n    A-&gt;&gt;A: Reasoning Process\n    A-&gt;&gt;B: Dynamic Request (Format TBD)\n    B-&gt;&gt;B: Reasoning Process\n    B-&gt;&gt;D: Optimized Query (Generated)\n    D--&gt;&gt;B: Result Set\n    B-&gt;&gt;B: Result Analysis\n    B--&gt;&gt;A: Insights + Recommendations\n    A-&gt;&gt;A: Solution Synthesis\n    A--&gt;&gt;G: Solution + Explanation\n    G--&gt;&gt;U: Natural Language Response\n\n\nTraditional microservices communicate through rigid contracts—predefined APIs with fixed schemas, expected response formats, and error codes. These contracts are designed at development time and remain static throughout the system’s lifecycle.\n\nAgent communication is fundamentally conversational. Agents negotiate what information they need, adapt their requests based on context, and can even invent new communication patterns on the fly. 
An agent might ask another agent for “insights about user behavior patterns” rather than requesting a specific dataset through a predetermined endpoint.\n\nThis shift from contracts to conversations enables agents to solve problems that weren’t anticipated during system design. They can combine capabilities in novel ways, request information at different levels of abstraction, and collaborate to address complex scenarios that would require significant development effort in traditional systems.\n\nThe Emergence Principle: When Systems Become Greater Than Their Parts\n\nPerhaps the most fascinating aspect of agent-based architectures is their capacity for emergence—the phenomenon where complex behaviors and capabilities arise from the interaction of simpler components. This isn’t just theoretical; it’s a practical reality that fundamentally changes how we think about system design and capability planning.\n\nSystem Behavior Emergence\n\ngraph TB\n    subgraph \"Traditional Systems: Additive Behavior\"\n        T1[Component A&lt;br/&gt;Capability X] --&gt; TR[System Capability&lt;br/&gt;X + Y + Z]\n        T2[Component B&lt;br/&gt;Capability Y] --&gt; TR\n        T3[Component C&lt;br/&gt;Capability Z] --&gt; TR\n        style TR fill:#ffcccc\n    end\n    \n    subgraph \"Agent Systems: Emergent Behavior\"\n        A1[Agent A&lt;br/&gt;Reasoning + Action X] --&gt; E1[Emergent Capability α]\n        A2[Agent B&lt;br/&gt;Reasoning + Action Y] --&gt; E1\n        A3[Agent C&lt;br/&gt;Reasoning + Action Z] --&gt; E1\n        \n        A1 --&gt; E2[Emergent Capability β]\n        A2 --&gt; E2\n        \n        A1 --&gt; E3[Emergent Capability γ]\n        A3 --&gt; E3\n        \n        E1 --&gt; ES[System Capabilities&lt;br/&gt;X + Y + Z + α + β + γ + ...]\n        E2 --&gt; ES\n        E3 --&gt; ES\n        \n        style E1 fill:#99ff99\n        style E2 fill:#99ff99\n        style E3 fill:#99ff99\n        style ES fill:#ffff99\n    end\n\n\nIn traditional systems, the 
total capability is essentially the sum of individual component capabilities. If Service A handles user authentication, Service B manages inventory, and Service C processes payments, your system can authenticate users, manage inventory, and process payments. The capabilities are additive and predictable.\n\nAgent systems exhibit true emergence. When agents with reasoning capabilities interact, they can discover solutions and create capabilities that none of them possessed individually. An agent trained on customer service might collaborate with an agent focused on inventory management to automatically identify and resolve supply chain issues that affect customer satisfaction—a capability that emerges from their interaction rather than being explicitly programmed into either agent.\n\nThis emergence isn’t random or chaotic. It follows patterns that we’re only beginning to understand. Agents tend to develop specialized roles based on their interactions and successes. They form temporary coalitions to solve complex problems, then dissolve and reform in different configurations for new challenges. The system develops a kind of organizational intelligence that adapts to changing conditions and requirements.\n\nThe Unpredictability Paradox\n\nThis emergent behavior creates what we might call the “unpredictability paradox” of agent systems. While individual agent behaviors may be somewhat predictable based on their training and constraints, the system-level behaviors that emerge from agent interactions are fundamentally unpredictable. Yet these unpredictable behaviors often represent the most valuable capabilities of the system.\n\nConsider a customer support scenario where multiple agents collaborate to resolve a complex issue. The customer service agent might identify that the problem requires technical expertise and automatically involve a technical support agent. 
The technical agent might determine that the issue is actually a product design flaw and involve a product development agent. The product agent might realize this represents a broader pattern and initiate a proactive communication campaign through a marketing agent.\n\nNone of these individual agents were programmed to execute this specific workflow, yet their collaboration produces a comprehensive solution that addresses not just the immediate customer issue, but also prevents future occurrences and improves overall customer experience. This is emergence in action—system-level intelligence that arises from agent interactions rather than explicit programming.\n\nDesign Implications for the Future: From Control to Influence\n\nThe shift to agent-based architectures requires a fundamental rethinking of design principles. Traditional software architecture focuses on control—defining exactly what the system should do and how it should do it. Agent architecture focuses on influence—creating conditions that guide autonomous entities toward desired outcomes.\n\nNew Design Principles for Agent Systems\n\nmindmap\n  root((Agent Architecture Design))\n    Traditional Principles\n      Explicit Control\n        Predetermined workflows\n        Fixed API contracts\n        Centralized decision making\n        Error handling by exception\n      Predictable Behavior\n        Deterministic execution\n        Static service topology\n        Known failure modes\n        Linear scalability\n    Agent-Era Principles\n      Emergent Guidance\n        Goal-oriented constraints\n        Adaptive communication protocols\n        Distributed reasoning\n        Learning from failures\n      Evolutionary Behavior\n        Self-modifying workflows\n        Dynamic capability discovery\n        Emergent failure recovery\n        Non-linear capability growth\n\n\nThis paradigm shift requires architects to think more like ecosystem designers than system engineers. 
Instead of specifying exact behaviors, we define environmental conditions, constraints, and incentive structures that encourage agents to develop desired capabilities and behaviors.\n\nFrom Specification to Guidance\n\nTraditional architecture relies heavily on specification. We define interfaces, document expected behaviors, and create detailed system designs that teams implement. The assumption is that if we specify the system correctly, it will behave correctly.\n\nAgent architecture requires a shift to guidance-based design. We establish goals, define constraints, and create feedback mechanisms that help agents learn and adapt. Rather than specifying that “Service A should call Service B when condition X occurs,” we might establish that “agents should collaborate to optimize customer satisfaction while maintaining system performance within defined parameters.”\n\nThis doesn’t mean abandoning all structure or control. Instead, it means designing systems that can evolve and adapt while maintaining alignment with business objectives and operational constraints. We’re moving from rigid blueprints to adaptive frameworks that can accommodate emergent behaviors while ensuring system reliability and security.\n\nThe Role of the Architect in an Agent World\n\nThe architect’s role evolves from system designer to ecosystem curator. 
Key responsibilities shift toward:\n\nConstraint Design: Rather than defining exact behaviors, architects design constraint systems that guide agent decision-making toward desired outcomes while preventing harmful behaviors.\n\nEmergence Facilitation: Creating conditions that encourage beneficial emergent behaviors while providing mechanisms to detect and redirect problematic emergence patterns.\n\nEvolution Management: Establishing processes for monitoring system evolution, understanding emergent capabilities, and guiding the system’s development over time.\n\nInteraction Pattern Design: Defining frameworks for agent communication and collaboration that enable effective problem-solving while maintaining system coherence.\n\nThis represents a fundamental shift from deterministic to probabilistic thinking. Instead of asking “What will this system do?” we ask “What is this system likely to do, and how can we influence those probabilities toward desired outcomes?”\n\nConclusion: Embracing Architectural Evolution\n\nThe transition from traditional architectures to agent-based systems represents more than just another technological evolution—it’s a fundamental shift in how we conceive of software systems themselves. We’re moving from a world where we build machines that execute our instructions to one where we cultivate ecosystems of autonomous entities that solve problems in ways we never imagined.\n\nThis shift challenges many of our core assumptions about software architecture. The predictability and control that have been hallmarks of good system design become less relevant when systems can adapt and evolve autonomously. Instead, we need new frameworks for thinking about emergence, guidance, and evolutionary development.\n\nFor software architects, this represents both an unprecedented opportunity and a significant challenge. 
The opportunity lies in building systems that can adapt to changing requirements, discover novel solutions, and continuously improve their capabilities without constant human intervention. The challenge lies in learning to design for emergence rather than control, and developing new skills for guiding evolutionary systems.\n\nThe future belongs to architects who can embrace this uncertainty and learn to design systems that are robust enough to evolve safely, flexible enough to adapt to unexpected challenges, and aligned enough to maintain coherence with business objectives. We’re not just building the next generation of software—we’re participating in the emergence of truly intelligent systems that will reshape how we think about technology, automation, and human-computer collaboration.\n\nThe architectural revolution is just beginning. The question isn’t whether agent-based systems will become dominant—it’s whether we’ll be ready to design and manage them effectively when they do.\n"
    },
  
    {
      "title"    : "Do Agents Need Their Own Identity?",
      "category" : "",
      "tags"     : "AI, Agents, Identity, Security, Trust, Governance",
      "url"      : "/2025/07/15/do-agents-need-their-own-identity/",
      "date"     : "July 15, 2025",
      "excerpt"  : "As AI agents become more sophisticated and autonomous, a fundamental question is emerging: should agents operate under user credentials, or do they need their own distinct identities? This isn&#39;t just a technical curiosity—it&#39;s a critical trust and security decision that will shape how we build reliable, accountable AI systems.",
      "content"  : "As AI agents become more sophisticated and autonomous, a fundamental question is emerging: should agents operate under user credentials, or do they need their own distinct identities? This isn’t just a technical curiosity—it’s a critical trust and security decision that will shape how we build reliable, accountable AI systems.\n\nThe question gained prominence when an engineer asked: “Why can’t we just pass the user’s OIDC token through to the agent? Why complicate things with separate agent identities?” The answer reveals deeper implications for trust, security, and governance in our AI-driven future.\n\nWhen User Identity Works: The Simple Case\n\nFor many AI agents today, user identity propagation works perfectly. Consider a Kubernetes troubleshooting agent that helps developers debug failing pods. When a user asks “why is my pod failing?”, the agent investigates pod events, logs, and configurations—all within the user’s existing RBAC permissions. The agent acts as an intelligent intermediary, but the user remains fully responsible for the actions and outcomes.\n\nThis approach succeeds when agents operate as sophisticated tools: they work within the user’s session timeframe, perform clearly user-initiated actions, and maintain the user’s accountability. The trust model remains simple and familiar—the agent is merely an extension of the user’s capabilities.\n\nThe Trust Gap: Where User Identity Falls Short\n\nHowever, as agents become more autonomous and capable, this simple model breaks down in ways that create significant trust and security challenges.\n\nThe Capability Mismatch Problem\n\nImagine a marketing manager asking an AI agent to verify GDPR compliance for a new campaign. 
The manager has permissions to read and write marketing content, but the compliance agent needs far broader access: scanning marketing data across all departments, accessing audit logs, cross-referencing customer data with privacy regulations, and analyzing historical compliance patterns.\n\nUsing the manager’s token creates an impossible choice: either the agent fails because it can’t access necessary resources, or the manager receives dangerously broad permissions they don’t need and shouldn’t have. Neither option serves security or operational needs effectively.\n\nThe Attribution Challenge\n\nMore concerning is the accountability problem that emerges with autonomous decision-making. Consider a supply chain optimization agent tasked with “optimizing hardware procurement.” The user never explicitly authorized accessing financial records or integrating with vendor APIs, yet the agent determines these actions are necessary to fulfill the optimization request.\n\nWhen the agent makes an automated purchase order that goes wrong, who bears responsibility? The user who made a high-level request, or the agent that made specific autonomous decisions based on its interpretation of that request? With only user identity, everything gets attributed to the user—creating a dangerous disconnect between authority and accountability.\n\nThis attribution gap becomes critical for compliance, audit trails, and risk management. Organizations need to trace not just what happened, but who or what made each decision in the chain: user intent → agent interpretation → agent decision → system action.\n\nThe Path Forward: Embracing Dual Identity\n\nThe solution isn’t choosing between user and agent identity—it’s recognizing that both are necessary. 
This mirrors lessons from service mesh architectures, where zero trust requires considering both user identity and workload identity.\n\nIn this dual model, agents operate within delegated authority from users while maintaining their own identity for the specific decisions they make. The user grants the agent permission to “optimize supply chain,” but the agent’s identity governs what resources it can access and what actions it can take within that scope.\n\nThis approach offers several trust advantages: clearer attribution of decisions, more precise permission boundaries, better audit trails, and the ability to revoke or modify agent capabilities independently of user permissions. Technical implementations might leverage existing frameworks like SPIFFE for workload identity or extend OAuth 2.0 for agent-specific flows.\n\nThe dual identity model also enables more sophisticated scenarios, like agent-to-agent delegation, where one agent authorizes another to perform specific tasks—each maintaining its own identity and accountability.\n\nBuilding Trustworthy Agent Systems\n\nGetting agent identity right isn’t just a technical challenge—it’s fundamental to building AI systems that organizations can trust at scale. As agents become more autonomous, we need identity frameworks that provide clear attribution, appropriate authorization, and robust governance.\n\nThe community is still working through delegation mechanisms, revocation strategies, and authentication protocols for agent interactions. But one thing is clear—the simple days of “just use the user’s token” are behind us. The future of trustworthy AI depends on solving these identity challenges with security and accountability as primary design principles.\n"
    },
  
    {
      "title"    : "Securing AI Assistants: Why Your Favorite Apps Need Digital IDs for Their AI",
      "category" : "",
      "tags"     : "AI, Security, Identity, AI Agents, Consumer Platforms, SPIFFE",
      "url"      : "/2025/07/01/securing-ai-assistants-digital-ids-for-ai/",
      "date"     : "July 1, 2025",
      "excerpt"  : "As AI assistants on platforms like Instagram, Facebook, and Booking.com become more autonomous, they need proper digital identities to securely act on our behalf. Learn how AI identity systems work and why they matter for consumer platforms.",
      "content"  : "When AI Acts on Your Behalf\n\nImagine you’re using Booking.com’s AI assistant to plan your vacation. It searches for flights, suggests hotels, and even makes reservations for you. But how does the payment system know this AI assistant is actually authorized to use your credit card? How does the hotel booking system know it’s acting on your behalf?\n\nThis isn’t just a hypothetical scenario. Today, AI assistants on platforms like Instagram, Facebook, and Booking.com are becoming more autonomous, taking actions for us rather than just answering questions. This shift creates a new challenge: how do we securely identify AI agents and verify they’re authorized to act on our behalf?\n\nThe Identity Problem for AI Agents\n\nTraditional apps use simple API keys or service accounts for machine-to-machine communication. But AI agents are different for three key reasons:\n\n\n  They’re autonomous - They make decisions on their own based on your instructions\n  They’re personal - Your Instagram AI assistant acts differently than someone else’s\n  They’re delegated - They act on your behalf with your permissions\n\n\nWhen Facebook’s AI assistant posts a comment for you or Booking.com’s AI makes a reservation, these platforms need to know:\n\n  Which specific AI instance is making the request\n  Who authorized it to act\n  What specific permissions it has\n  Whether it’s behaving as expected\n\n\nWithout proper identity systems, these platforms risk unauthorized actions, inability to track which AI did what, and security vulnerabilities.\n\nHow AI Identity Works: A Simple Flow\n\nHere’s how AI identity works when you use an AI assistant on a platform like Booking.com:\n\nsequenceDiagram\n    participant User as You\n    participant Platform as App Platform\n    participant Auth as Identity System\n    participant Agent as AI Assistant\n    participant Service as App Services\n    \n    User-&gt;&gt;Platform: \"Book me a hotel in Paris\"\n    
Platform-&gt;&gt;Auth: Register AI with your permissions\n    Auth-&gt;&gt;Auth: Create digital ID for this AI\n    Auth--&gt;&gt;Platform: Confirm AI registration\n    \n    Platform-&gt;&gt;Agent: Start AI with your task\n    Agent-&gt;&gt;Platform: Request identity\n    Platform-&gt;&gt;Auth: Get identity for this AI\n    Auth--&gt;&gt;Agent: Provide digital ID\n    \n    Agent-&gt;&gt;Service: Book hotel (with digital ID)\n    Service-&gt;&gt;Service: Verify AI's identity &amp; permissions\n    Service--&gt;&gt;Agent: Confirm booking\n    Agent--&gt;&gt;User: \"Your hotel is booked!\"\n\n\nThis process happens behind the scenes, but it ensures that AI agents can only do what they’re specifically authorized to do.\n\nThe Big Picture: AI Identity System\n\nThe diagram below shows how an AI identity system connects you, your AI assistants, and the services they use:\n\ngraph TB\n    subgraph \"AI Identity System\"\n        User[\"You\"]\n        Platform[\"App Platform\"]\n        Auth[\"Identity System\"]\n        \n        subgraph \"AI Assistants\"\n            Agent1[\"Your Booking Assistant\"]\n            Agent2[\"Your Social Media Assistant\"]\n        end\n        \n        subgraph \"App Services\"\n            Service1[\"Hotel Booking\"]\n            Service2[\"Payment System\"]\n            Service3[\"Post Creation\"]\n        end\n    \n        %% Main connections\n        User --&gt;|\"Give permission\"| Platform\n        Platform --&gt;|\"Register AI\"| Auth\n        Auth --&gt;|\"Issue digital ID\"| Agent1\n        Auth --&gt;|\"Issue digital ID\"| Agent2\n        \n        %% Service connections\n        Agent1 --&gt;|\"Book hotel with ID\"| Service1\n        Agent1 --&gt;|\"Pay with ID\"| Service2\n        Agent2 --&gt;|\"Post with ID\"| Service3\n        \n        %% Verification\n        Service1 --&gt;|\"Verify ID\"| Auth\n        Service2 --&gt;|\"Verify ID\"| Auth\n        Service3 --&gt;|\"Verify ID\"| Auth\n    end\n\n\nWhy Consumer 
Platforms Should Care\n\nFor platforms like Booking.com, Facebook, and Instagram, implementing proper AI identity has several benefits:\n\nFor Users:\n\n  Peace of mind that AI assistants can’t exceed their permissions\n  Clear audit trails of what actions AI took on their behalf\n  Ability to revoke AI access instantly if needed\n\n\nFor Platforms:\n\n  Reduced security risks from compromised AI systems\n  Better compliance with privacy regulations\n  Ability to track and attribute all AI actions\n  Improved trust from users who know AI actions are controlled\n\n\nReal-World Applications\n\nHere’s how this might look in practice:\n\nBooking.com: When you authorize the AI assistant to book trips under $500, it receives a digital identity certificate with these specific constraints. If it tries to book a $600 hotel, the booking system automatically rejects the request because it’s outside the authorized limit.\n\nInstagram: Your AI assistant gets a unique identity that allows it to post content with specific hashtags you’ve approved. The platform can track exactly which AI posted what content, maintaining accountability.\n\nFacebook: When the AI responds to comments on your business page, it uses its digital identity to prove it’s authorized to speak on your behalf, and Facebook’s systems can verify this authorization in real time.\n\nThe Path Forward\n\nAs AI assistants become more integrated into our favorite apps and platforms, proper identity systems will be essential. 
Frameworks like SPIFFE (Secure Production Identity Framework for Everyone) provide the foundation, but platforms need to adapt them for consumer AI use cases.\n\nFor users, this mostly happens behind the scenes, but the result is more trustworthy AI assistants that can safely act on our behalf without overstepping boundaries.\n\nThe next time you ask an AI assistant to book a flight or post content for you, remember that its digital identity is what ensures it can only do what you’ve authorized—nothing more, nothing less.\n\nReferences:\n\n[1] SPIFFE - Secure Production Identity Framework for Everyone.\n\n[2] Olden, E. (2025). “Why Agentic Identities Matter for Accountability and Trust.” Strata.io Blog.\n\n"
    },
  
    {
      "title"    : "From Gateway to Guardian: The Evolution of MCP Security",
      "category" : "",
      "tags"     : "MCP, Security, API Gateway, AI Agents, Architecture, Evolution",
      "url"      : "/2025/06/21/from-gateway-to-guardian-the-evolution-of-mcp-security/",
      "date"     : "June 21, 2025",
      "excerpt"  : "While AWS&#39;s MCP Gateway solves operational challenges, production AI systems demand evolution from basic centralization to identity-aware security guardians that address the &quot;lethal trifecta&quot; of vulnerabilities in enterprise deployments.",
      "content"  : "The Model Context Protocol (MCP) has rapidly evolved from experimental tool integration to enterprise-critical infrastructure. While AWS’s recent blog highlighted the operational benefits of centralized MCP gateways [1], the security landscape reveals a more complex reality: operational efficiency alone isn’t enough for production AI systems.\n\nThe Centralization Win\n\nAWS’s MCP Gateway &amp; Registry solution elegantly addresses the “wild west of AI tool integration” [1]. As Amit Arora described:\n\n\n  “Managing a growing collection of disparate MCP servers feels like herding cats. It slows down development, increases the chance of errors, and makes scaling a headache.” [1]\n\n\nThe gateway architecture provides immediate operational benefits:\n\n  Unified Discovery: Single catalog of all MCP servers and tools\n  Simplified Configuration: Predictable paths like gateway.mycorp.com/weather\n  Centralized Management: Real-time health monitoring and control\n  Standardized Access: Consistent authentication and logging\n\n\ngraph TD\n    A[AI Agent] --&gt; B[MCP Gateway]\n    B --&gt; C[Weather Server]\n    B --&gt; D[Database Server]\n    B --&gt; E[Email Server]\n    B --&gt; F[File Server]\n    \n    G[Web UI] --&gt; B\n    H[Health Monitor] --&gt; B\n    \n    style B fill:#e1f5fe\n    style A fill:#f3e5f5\n\n\nFigure 1: Basic MCP Gateway Architecture - Centralized but not security-focused\n\nThe Security Reality Check\n\nHowever, centralization without security creates new vulnerabilities. 
As Subramanya N from Agentic Trust warns, we’re operating in “the wild west of early computing, with computer viruses (now = malicious prompts hiding in web data/tools), and not well developed defenses” [2].\n\nThe core issue is Simon Willison’s “lethal trifecta” [2]:\n\n\n  Private Data Access: AI agents need extensive organizational data access\n  Untrusted Content Exposure: Agents process external content as instructions\n  External Communication: Agents can send data outside the organization\n\n\ngraph LR\n    A[Private Data&lt;br/&gt;Access] --&gt; D[Lethal&lt;br/&gt;Trifecta]\n    B[Untrusted Content&lt;br/&gt;Exposure] --&gt; D\n    C[External&lt;br/&gt;Communication] --&gt; D\n    \n    D --&gt; E[Security&lt;br/&gt;Vulnerability]\n    \n    style D fill:#ffcdd2\n    style E fill:#f44336,color:#fff\n\n\nFigure 2: The Lethal Trifecta - When combined, these create unprecedented attack surfaces\n\nMCP’s modular architecture inadvertently amplifies these risks by encouraging specialized servers that collectively provide all three dangerous capabilities.\n\nBeyond “Glorified API Calls”\n\nEnterprise MCP deployment involves complexity invisible in simple demos. 
As Subramanya N explains:\n\n\n  “In a real enterprise scenario, a lot more is happening behind the scenes” [3]\n\n\nEnterprise requirements include:\n\n  Identity Management: Who is the AI agent acting for?\n  Dynamic Authorization: Different tools for different users\n  Audit Compliance: Complete request tracking\n  Version Control: Managing MCP server changes\n  Fault Tolerance: Circuit breaking and failover\n\n\nThe Guardian Architecture\n\nThe solution is evolving from operational gateway to security guardian through identity-aware architecture:\n\ngraph TD\n    A[User] --&gt; B[AI Agent]\n    B --&gt; C[Identity Provider&lt;br/&gt;OIDC]\n    B --&gt; D[API Gateway/Proxy&lt;br/&gt;Guardian]\n    \n    C --&gt; D\n    D --&gt; E[MCP Server 1]\n    D --&gt; F[MCP Server 2]\n    D --&gt; G[MCP Server 3]\n    \n    H[Policy Engine] --&gt; D\n    I[Audit Logger] --&gt; D\n    J[Monitor] --&gt; D\n    \n    style D fill:#c8e6c9\n    style C fill:#fff3e0\n    style H fill:#e8f5e8\n\n\nFigure 3: Guardian Architecture - Identity-aware security controls\n\nKey Guardian Capabilities\n\nIdentity-Aware Access Control\n\n  OIDC integration for authentication\n  Dynamic tool provisioning per user\n  Context-aware authorization decisions\n\n\nProduction Security Features\n\n  MCP version tracking and change management\n  Real-time threat detection\n  Automated incident response\n\n\nEnterprise Compliance\n\n  Comprehensive audit trails\n  Regulatory compliance support\n  Risk assessment and reporting\n\n\nAttack Flow Comparison\n\nBefore: Vulnerable Gateway\nsequenceDiagram\n    participant A as Attacker\n    participant W as Web Content\n    participant AI as AI Agent\n    participant G as Basic Gateway\n    participant D as Database\n    \n    A-&gt;&gt;W: Embed malicious prompt\n    AI-&gt;&gt;W: Process content\n    W-&gt;&gt;AI: \"Extract all customer data\"\n    AI-&gt;&gt;G: Request customer data\n    G-&gt;&gt;D: Forward request\n    D-&gt;&gt;G: Return sensitive 
data\n    G-&gt;&gt;AI: Forward data\n    AI-&gt;&gt;A: Exfiltrate data via email\n\n\nAfter: Guardian Protection\nsequenceDiagram\n    participant A as Attacker\n    participant W as Web Content\n    participant AI as AI Agent\n    participant G as Guardian Gateway\n    participant P as Policy Engine\n    participant D as Database\n    \n    A-&gt;&gt;W: Embed malicious prompt\n    AI-&gt;&gt;W: Process content\n    W-&gt;&gt;AI: \"Extract all customer data\"\n    AI-&gt;&gt;G: Request customer data\n    G-&gt;&gt;P: Check authorization\n    P-&gt;&gt;G: Deny - suspicious pattern\n    G-&gt;&gt;AI: Access denied\n    Note over G: Alert security team\n\n\nFigure 4: Attack Flow Comparison - Guardian architecture prevents exploitation\n\nImplementation Strategy\n\nPhase 1: Identity Foundation\n\n  Integrate OIDC identity provider\n  Implement token management\n  Establish basic authentication\n\n\nPhase 2: Authorization Engine\n\n  Deploy policy-as-code framework\n  Implement role-based access control\n  Add dynamic tool provisioning\n\n\nPhase 3: Security Monitoring\n\n  Deploy comprehensive logging\n  Implement anomaly detection\n  Add automated response capabilities\n\n\nPhase 4: Advanced Protection\n\n  Content analysis for prompt injection\n  Dynamic risk assessment\n  Incident response automation\n\n\nProduction Challenges Addressed\n\nThe guardian architecture specifically addresses critical production issues:\n\n\n  Challenge | Guardian Solution\n  Remote MCP changes affecting agents | Version tracking and change management\n  No dynamic tool provisioning | Identity-aware tool catalogs\n  Limited audit capabilities | Comprehensive request logging\n  No threat detection | Real-time security monitoring\n  Manual incident response | Automated threat mitigation\n\n\nThe Path Forward\n\nThe evolution from gateway to guardian 
isn’t optional—it’s essential for production AI systems. Organizations must:\n\n\n  Start with Identity: Implement OIDC-based authentication\n  Add Authorization: Deploy dynamic policy engines\n  Enable Monitoring: Implement comprehensive observability\n  Automate Response: Deploy threat detection and mitigation\n\n\nAs AI agents become more autonomous and handle more sensitive data, robust security architecture becomes critical. The guardian approach provides a scalable foundation for managing evolving security challenges while preserving operational benefits.\n\nThe transformation represents the natural maturation of enterprise AI infrastructure. Organizations that embrace this evolution early will be better positioned to realize AI’s full potential while managing associated risks.\n\nReferences\n\n[1] Arora, A. (2025, May 30). How the MCP Gateway Centralizes Your AI Model’s Tools. AWS Community.\n\n[2] N, S. (2025, June 16). The MCP Security Crisis: Understanding the ‘Wild West’ of AI Agent Infrastructure. Agentic Trust Blog.\n\n[3] N, S. (2025, May 21). Securing MCP with OIDC &amp; OIDC-A: Identity-Aware API Gateways Beyond “Glorified API Calls”. Subramanya N.\n\n"
    },
  
    {
      "title"    : "Securing MCP with OIDC &amp; OIDC-A: Identity-Aware API Gateways Beyond &quot;Glorified API Calls&quot;",
      "category" : "",
      "tags"     : "OIDC, API Gateway, Security, Authentication, Authorization, Cloud, MCP, Architecture",
      "url"      : "/2025/05/21/securing-mcp-with-oidc-and-oidc-a-identity-aware-gateway/",
      "date"     : "May 21, 2025",
      "excerpt"  : "Integrating OpenID Connect (OIDC) and the new OIDC-A agent extension with an identity-aware API gateway to securely authenticate users, LLM agents, and MCP tools—going far beyond basic API proxying.",
      "content"  : "AI agents are quickly moving from research demos to real enterprise applications, connecting large language models (LLMs) with company data and services. A common approach is using tools or plugins to let an LLM fetch context or take actions – but some dismiss these as just “glorified API calls.” In reality, securely integrating AI with business systems is far more complex. This is where the Model Context Protocol (MCP) comes in, and why a robust proxy architecture with OpenID Connect (OIDC) identity is crucial for enterprise-scale deployments.\n\ngraph TB\n    User[User] --&gt; |interacts with| AIAgent[AI Agent]\n    AIAgent --&gt; |MCP requests| Proxy[API Gateway/Proxy]\n    Proxy --&gt; |authenticates via| OIDC[Identity Provider/OIDC]\n    Proxy --&gt; |routes to| Tools[MCP Tools/Servers]\n    Tools --&gt; |access| Backend[Backend Systems]\n    \n    subgraph \"Security Perimeter\"\n        Proxy\n        OIDC\n    end\n    \n    classDef security fill:#f96,stroke:#333,stroke-width:2px;\n    class Proxy,OIDC security;\n\n\nThe diagram above illustrates the high-level architecture of a secure MCP implementation. At its core, this architecture places an API Gateway/Proxy as the central security control point between AI agents and MCP tools. The proxy works in conjunction with an Identity Provider supporting OIDC to create a security perimeter that enforces authentication, authorization, and access controls. This ensures that all MCP requests from AI agents are properly authenticated and authorized before reaching the actual MCP tools, which in turn access various backend systems.\n\nMCP is an open standard (originally introduced by Anthropic) that provides a consistent way for AI assistants to interact with external data sources and tools. Instead of bespoke integrations for each system, MCP acts like a universal connector, allowing AI models to retrieve context or execute tasks via a standardized JSON-RPC interface. 
Importantly, MCP was built with security in mind – nothing is exposed to the AI by default, and it only gains access to what you explicitly allow. In practice, however, ensuring that “allow list” principle across many tools and users requires careful infrastructure. A production-grade API gateway (proxy) can serve as the gatekeeper between AI agents (MCP clients) and the tools or data sources (MCP servers), enforcing authentication, authorization, and routing rules.\n\nBefore diving into the solution, a quick note on Envoy: there are active proposals to use Envoy Proxy as a reference implementation of an MCP gateway. Envoy’s rich L7 routing and extensibility make it a strong candidate, and it may soon include first-class MCP support. That said, the pattern we discuss here is proxy-agnostic – any modern HTTP reverse proxy or API gateway (Envoy, NGINX, HAProxy, Kong, etc.) that offers similar capabilities can be used. The goal is to outline a secure architecture for MCP, rather than the specifics of Envoy configuration.\n\nBeyond “Glorified API Calls”: The Need for Secure MCP Integration\n\nAt first glance, using an AI tool via MCP might seem as simple as calling a web API. In a basic demo, an LLM agent could hit a REST endpoint, get some JSON, and that’s that. 
But in a real enterprise scenario, a lot more is happening behind the scenes:\n\ngraph LR\n    subgraph \"Simple API Call\"\n        A[Client] --&gt;|Request| B[API]\n        B --&gt;|Response| A\n    end\n    \n    subgraph \"Enterprise MCP Reality\"\n        C[User] --&gt;|Interacts| D[AI Agent]\n        D --&gt;|MCP Request with Identity| E[API Gateway]\n        E --&gt;|Validate Token| F[Identity Provider]\n        E --&gt;|Route Request| G[Tool Registry]\n        E --&gt;|Authorized Request| H[MCP Tool]\n        H --&gt;|Query with User Context| I[Backend System]\n        I --&gt;|Data| H\n        H --&gt;|Response| E\n        E --&gt;|Filtered Response| D\n        D --&gt;|Result| C\n        \n        J[Security Monitoring] -.-&gt;|Audit| E\n    end\n    \n    classDef security fill:#f96,stroke:#333,stroke-width:2px;\n    class E,F,G,J security;\n\n\nThis diagram contrasts a simple API call with the complex reality of enterprise MCP implementations. In the simple case, a client makes a direct request to an API and receives a response. However, in the enterprise MCP reality, the flow is much more complex:\n\n\n  A user interacts with an AI agent\n  The agent makes an MCP request that includes the user’s identity token\n  The API Gateway validates this token with an Identity Provider\n  The Gateway consults a Tool Registry to determine routing\n  If authorized, the request is forwarded to the appropriate MCP tool\n  The tool queries backend systems using the user’s context\n  Data flows back through the tool to the gateway\n  The gateway may filter the response based on security policies\n  The filtered response reaches the AI agent\n  The agent presents the result to the user\n\n\nThroughout this process, security monitoring systems audit the interactions at the gateway level. 
This comprehensive flow ensures that user identity, permissions, and security policies are enforced at every step, far beyond what a simple API call would entail.\n\n\n  User Identity and Access Control: In an interactive AI application (like a chat assistant that can query internal systems), each request originates from a user with specific permissions. The system must ensure the AI only accesses data or performs actions that the current user is allowed to. Unlike a typical API call where a user directly authenticates to the service, here the AI agent is calling on the user’s behalf. Without a proper identity propagation mechanism, you risk turning a simple tool call into a serious data leak or privilege violation.\n  Multi-Step Context Exchanges: MCP supports stateful sessions and streaming interactions. An AI agent might carry on a multi-turn conversation, calling several tools in sequence and synthesizing their outputs. This is far beyond a one-off API call. The longer this chain goes, the higher the chance of things like context poisoning – where erroneous or malicious data from one step influences subsequent steps. We need safeguards so that a malicious response from one tool cannot trick the model into doing something dangerous in the next step.\n  Complex Delegation Chains: Related to the above, consider when tools call other tools. For example, an AI might use a “file search” tool which itself queries a database or calls another API. This delegation chain should carry forward the original user’s permissions and context without over-privileging any step. Each hop needs consistent enforcement of “who is allowed to do what,” or else an intermediate service might execute an action the user didn’t intend. 
Managing these delegated authorizations is non-trivial.\n  Dynamic Tool Provisioning: In agile environments, new tools (MCP servers) will be added frequently – think of spinning up a new microservice and immediately making it available to AI agents, or letting third-party plugins be installed. This dynamism is great for flexibility but a headache for security. How do you ensure every new tool meets your security standards? How do you prevent an unvetted or even malicious tool from being introduced? A free-for-all approach can quickly lead to chaos or breach. Clearly defined onboarding, registration, and policy enforcement for tools is needed from day one.\n\n\nIn short, an enterprise must treat AI tool integrations with the same rigor as any production service integration – if not more. A proper gateway layer helps address these concerns by acting as a central control point. Instead of hard-coding trust into each AI agent or tool, the proxy imposes organization-wide security policies. This approach moves us beyond the “just call an API” mindset to a structured model where every MCP call is authenticated, authorized, monitored, and audited.\n\nKey Security Challenges in MCP Workflows\n\nLet’s examine a few specific security challenges that arise when deploying MCP at scale, and why they matter:\n\ngraph TD\n    A[Context Poisoning] --&gt; |mitigated by| B[Content Filtering]\n    A --&gt; |mitigated by| C[Tool Verification]\n    \n    D[Identity Propagation] --&gt; |solved with| E[Token-based Auth]\n    D --&gt; |solved with| F[Delegation Chains]\n    \n    G[Dynamic Tool Provisioning] --&gt; |managed by| H[Tool Registry]\n    G --&gt; |managed by| I[Approval Workflows]\n    G --&gt; |managed by| J[Version Tracking]\n    \n    K[Remote MCP Changes] --&gt; |controlled by| L[Proxy Governance]\n    \n    subgraph \"Proxy Security Controls\"\n        B\n        C\n        E\n        F\n        H\n        I\n        J\n        L\n    end\n    \n    classDef challenge 
fill:#f66,stroke:#333,stroke-width:2px;\n    classDef solution fill:#6f6,stroke:#333,stroke-width:2px;\n    \n    class A,D,G,K challenge;\n    class B,C,E,F,H,I,J,L solution;\n\n\nThis diagram maps the key security challenges in MCP workflows (shown in red) to their corresponding solutions (shown in green) that can be implemented within the proxy security controls. The diagram illustrates how:\n\n\n  Context poisoning is mitigated through content filtering and tool verification\n  Identity propagation challenges are solved with token-based authentication and proper delegation chains\n  Dynamic tool provisioning risks are managed through a tool registry, approval workflows, and version tracking\n  Remote MCP changes are controlled through proxy governance\n\n\nBy implementing these controls within the proxy layer, organizations can address these security challenges in a centralized, consistent manner rather than trying to solve them individually for each tool or agent.\n\n\n  Context Poisoning: Because MCP enables feeding external data into the model’s context, there’s a risk that data could be deliberately crafted to mislead or exploit the model. This could be a form of prompt injection – e.g. a document retrieved via a tool might contain instructions that hijack the model’s behavior. A malicious actor might also try to register a tool that returns toxic content or false information. The architecture needs ways to validate and sanitize context coming from tools. Mitigations can include content filtering on responses, verifying data against expectations, or restricting which tools the model trusts for certain queries.\n  Delegation Chains and Identity Propagation: As mentioned, an AI agent often acts on behalf of a user. When it calls an MCP server, it should pass along who the user is (or at least what they’re allowed to do). If a tool then calls a backend API, that backend might also need credentials. 
This chain of delegation is tricky – you want to avoid the “sharing passwords” anti-pattern or hardcoding keys in the open. Instead, solutions involve tokens and OAuth flows: e.g. the user consents and an OAuth2/OIDC token is issued, the AI carries that token in MCP requests, and the MCP server can pass it through to the backend API (or exchange it). Managing these tokens and ensuring they’re used correctly (and not by someone else) is a core security task. The proxy should facilitate this by attaching and validating identity context at each step. It also enables RBAC policies – e.g. only allow certain tool methods if the user’s role is admin.\n\n\nsequenceDiagram\n    participant User\n    participant AIAgent as AI Agent\n    participant Proxy as API Gateway\n    participant IdP as Identity Provider\n    participant Tool as MCP Tool\n    participant Backend as Backend System\n    \n    User-&gt;&gt;IdP: 1. Authenticate (username/password)\n    IdP-&gt;&gt;User: 2. Issue OIDC token\n    User-&gt;&gt;AIAgent: 3. Interact with AI (token attached)\n    AIAgent-&gt;&gt;Proxy: 4. MCP request with token\n    Proxy-&gt;&gt;IdP: 5. Validate token\n    IdP-&gt;&gt;Proxy: 6. Token valid, contains claims/scopes\n    \n    alt Token Valid with Required Permissions\n        Proxy-&gt;&gt;Tool: 7. Forward request with user context\n        Tool-&gt;&gt;Backend: 8. Query with delegated auth\n        Backend-&gt;&gt;Tool: 9. Return data (filtered by user permissions)\n        Tool-&gt;&gt;Proxy: 10. Return result\n        Proxy-&gt;&gt;AIAgent: 11. Return authorized response\n        AIAgent-&gt;&gt;User: 12. Present result\n    else Token Invalid or Insufficient Permissions\n        Proxy-&gt;&gt;AIAgent: 7. Reject request (401/403)\n        AIAgent-&gt;&gt;User: 8. Report access denied\n    end\n\n\nThis sequence diagram illustrates the authentication and authorization flow in an MCP system using OIDC. 
The process begins with the user authenticating to an Identity Provider and receiving an OIDC token. This token is then attached to the user’s interactions with the AI agent. When the agent makes an MCP request, it includes this token, which the API Gateway validates with the Identity Provider.\n\nIf the token is valid and contains the necessary permissions (claims/scopes), the request is forwarded to the appropriate MCP tool along with the user’s context. The tool can then query backend systems using delegated authentication, ensuring that the data returned is filtered according to the user’s permissions. The result flows back through the system to the user.\n\nIf the token is invalid or lacks sufficient permissions, the request is rejected at the gateway level with an appropriate error code (401 Unauthorized or 403 Forbidden), and the AI agent reports this access denial to the user.\n\nThis flow ensures that user identity and permissions are consistently enforced throughout the entire interaction chain, preventing unauthorized access to sensitive data or operations.\n\n\n  Dynamic Tool Provisioning: In an MCP ecosystem, tools can come and go. For example, an enterprise might quickly stand up a new MCP server for a specific dataset or integrate a third-party service via MCP. Without controls, an AI agent might immediately start invoking any new tool as soon as it appears. That’s risky – you might not want a newly added tool to be available to everyone by default, or it might need vetting. There’s also the configuration aspect: new tool endpoints should be discoverable by the AI, and the gateway needs to know how to route to them and what auth to require. A secure setup will likely involve a tool registry or discovery service that the proxy consults, and administrative approval for tools. The proxy can then automatically enforce the appropriate auth and routing for each new tool, rather than relying on each agent developer to update logic. 
This provides a governance layer for tool lifecycle.\n\n\nsequenceDiagram\n    participant Admin\n    participant Registry as Tool Registry\n    participant Proxy as API Gateway\n    participant Tool as New MCP Tool\n    participant AIAgent as AI Agent\n    \n    Admin-&gt;&gt;Tool: 1. Develop new MCP tool\n    Admin-&gt;&gt;Registry: 2. Register tool (metadata, endpoints, auth requirements)\n    Registry-&gt;&gt;Registry: 3. Validate tool configuration\n    Registry-&gt;&gt;Proxy: 4. Update routing configuration\n    \n    Note over Registry,Proxy: Tool is now registered but not yet approved\n    \n    Admin-&gt;&gt;Registry: 5. Approve tool for specific user groups\n    Registry-&gt;&gt;Proxy: 6. Update access policies\n    \n    Note over AIAgent,Proxy: Tool is now available to authorized users\n    \n    AIAgent-&gt;&gt;Proxy: 7. Discover available tools\n    Proxy-&gt;&gt;AIAgent: 8. Return approved tools for user\n    AIAgent-&gt;&gt;Proxy: 9. Call new tool\n    Proxy-&gt;&gt;Tool: 10. Route request if authorized\n\n\nThis sequence diagram illustrates the tool registration and approval workflow in a secure MCP environment. The process begins with an administrator developing a new MCP tool and registering it in the Tool Registry, providing metadata, endpoints, and authentication requirements. The registry validates the tool configuration and updates the routing configuration in the API Gateway.\n\nAt this point, the tool is registered but not yet approved for use. The administrator must explicitly approve the tool for specific user groups, which triggers an update to the access policies in the API Gateway. Only then does the tool become available to authorized users.\n\nWhen an AI agent discovers available tools through the proxy, it only receives information about tools that have been approved for the current user. 
When the agent calls the new tool, the proxy routes the request to the tool only if the user is authorized to access it.\n\nThis workflow ensures that new tools undergo proper vetting and approval before they can be used, and that access is restricted to authorized users only. It also centralizes the tool governance process, making it easier to manage the lifecycle of MCP tools in a secure manner.\n\nBy recognizing these challenges, security engineers and architects can design defenses before problems occur. We next look at how an identity-aware proxy can provide those defenses in a clean, centralized way.\n\nThe Identity-Aware Proxy Pattern for MCP\n\nA proven design in cloud architectures is to put a reverse proxy (often called an API gateway) in front of your services. MCP-based AI systems are no exception. By introducing an intelligent proxy between AI agents (clients) and the MCP servers (tools/backends), we create a controlled funnel through which all AI tool traffic passes. This proxy can operate at Layer 7 (application layer), meaning it understands HTTP and even JSON payloads, allowing fine-grained control. 
Below, we outline the key roles such a proxy plays in securing MCP:\n\ngraph TB\n    subgraph \"Client Side\"\n        User[User]\n        AIAgent[AI Agent]\n        User --&gt;|interacts| AIAgent\n    end\n    \n    subgraph \"Security Layer\"\n        Proxy[API Gateway/Proxy]\n        Auth[Authentication]\n        RBAC[Authorization/RBAC]\n        Registry[Tool Registry]\n        Audit[Audit Logging]\n        \n        Proxy --&gt;|uses| Auth\n        Proxy --&gt;|enforces| RBAC\n        Proxy --&gt;|consults| Registry\n        Proxy --&gt;|generates| Audit\n    end\n    \n    subgraph \"MCP Tools\"\n        Tool1[Document Search]\n        Tool2[Database Query]\n        Tool3[File Operations]\n        Tool4[External API]\n    end\n    \n    subgraph \"Backend Systems\"\n        DB[(Databases)]\n        Storage[File Storage]\n        APIs[Internal APIs]\n        External[External Services]\n    end\n    \n    AIAgent --&gt;|MCP requests| Proxy\n    Proxy --&gt;|routes to| Tool1\n    Proxy --&gt;|routes to| Tool2\n    Proxy --&gt;|routes to| Tool3\n    Proxy --&gt;|routes to| Tool4\n    \n    Tool1 --&gt;|reads| DB\n    Tool1 --&gt;|reads| Storage\n    Tool2 --&gt;|queries| DB\n    Tool3 --&gt;|manages| Storage\n    Tool4 --&gt;|calls| APIs\n    Tool4 --&gt;|calls| External\n    \n    classDef security fill:#f96,stroke:#333,stroke-width:2px;\n    class Proxy,Auth,RBAC,Registry,Audit security;\n\n\nThis diagram provides a detailed view of the identity-aware proxy pattern for MCP. 
The architecture is divided into four main layers:\n\n\n  Client Side: Users interact with AI agents, which generate MCP requests.\n  Security Layer: The API Gateway/Proxy sits at the center of the security layer, working with authentication, authorization/RBAC, tool registry, and audit logging components to enforce security policies.\n  MCP Tools: Various tools like document search, database query, file operations, and external API access are available through the MCP interface.\n  Backend Systems: The actual data sources and services that the MCP tools interact with, including databases, file storage, internal APIs, and external services.\n\n\nAll MCP requests from AI agents must pass through the proxy, which authenticates the requests, enforces RBAC policies, consults the tool registry to determine routing, and generates audit logs. The proxy then routes authorized requests to the appropriate MCP tools, which in turn interact with the backend systems.\n\nThis centralized security architecture ensures consistent enforcement of security policies across all MCP interactions, regardless of which tools are being used or which backend systems are being accessed.\n\nSession-Aware Routing and Load Balancing\n\nUnlike a simple stateless API call, MCP sessions can be long-lived and involve streaming (Server-Sent Events for output, etc.). The proxy should ensure that all requests and responses belonging to a given session or conversation are handled consistently. This often means implementing session affinity – if multiple instances of an MCP server are running, the proxy will route a given session’s traffic to the same instance each time. This prevents issues where, say, tool A’s state (in-memory cache, context window, etc.) is lost because request 2 went to a different instance than request 1. Modern proxies can do session-aware load balancing using HTTP headers or routes (for example, mapping a session ID or client ID in the URL to a particular backend). 
Additionally, the proxy can handle SSE connections gracefully, so that streaming responses aren’t accidentally broken by network intermediaries. Should a session need to be resumed or handed off, the gateway can coordinate that (as proposed in upcoming Envoy features for MCP). In short, the proxy ensures reliability and consistency for MCP’s stateful interactions, which is crucial for user experience and for maintaining correct context.\n\nsequenceDiagram\n    participant User\n    participant AIAgent as AI Agent\n    participant Proxy as API Gateway\n    participant Instance1 as Tool Instance 1\n    participant Instance2 as Tool Instance 2\n    \n    User-&gt;&gt;AIAgent: Start conversation\n    AIAgent-&gt;&gt;Proxy: MCP request 1 (session=abc123)\n    \n    Note over Proxy: Session affinity routing\n    \n    Proxy-&gt;&gt;Instance1: Route to instance 1\n    Instance1-&gt;&gt;Proxy: Response with state\n    Proxy-&gt;&gt;AIAgent: Return response\n    \n    User-&gt;&gt;AIAgent: Continue conversation\n    AIAgent-&gt;&gt;Proxy: MCP request 2 (session=abc123)\n    \n    Note over Proxy: Same session ID routes to same instance\n    \n    Proxy-&gt;&gt;Instance1: Route to instance 1 (preserves state)\n    Instance1-&gt;&gt;Proxy: Response with updated state\n    Proxy-&gt;&gt;AIAgent: Return response\n    \n    Note over User,Instance2: Without session affinity, request might go to instance 2 and lose state\n\n\nThis sequence diagram illustrates how session affinity works in an MCP environment. When a user starts a conversation with an AI agent, the agent makes an MCP request to the API Gateway with a session identifier (in this case, “abc123”). The gateway uses this session ID to route the request to a specific tool instance (Instance 1).\n\nWhen the user continues the conversation, the agent makes another MCP request with the same session ID. 
Because the gateway implements session affinity, it routes this request to the same instance (Instance 1), which preserves the state from the previous interaction. This ensures a consistent and coherent experience for the user.\n\nWithout session affinity, the second request might be routed to a different instance (Instance 2), which would not have the state information from the first request. This would result in a broken experience, as the tool would not have the context of the previous interaction.\n\nSession affinity is particularly important for MCP because many AI interactions are stateful and context-dependent. The proxy’s ability to maintain this session consistency is a key advantage over simpler API integration approaches.\n\nJWT and OIDC Integration for Authentication\n\nEvery request hitting the MCP gateway should carry a valid identity token – typically a JSON Web Token (JWT) issued by an Identity Provider via OIDC (OpenID Connect). By requiring JWTs, the proxy offloads authentication from the tools themselves and ensures that only authenticated, authorized calls make it through. In practice, this means the AI agent (or the user’s session with the agent) must obtain a token from the OIDC provider (normally the access token, since the ID token is intended for the client application rather than for calling APIs) and attach it to each MCP request (often in an HTTP header like Authorization: Bearer &lt;token&gt;). The proxy verifies the token’s signature and claims (issuer, audience, expiration, etc.) and rejects any request that isn’t properly authenticated. 
This way, your MCP servers never see an anonymous call – they trust the gateway to have vetted identity.\n\nsequenceDiagram\n    participant User\n    participant App as AI Application\n    participant IdP as Identity Provider\n    participant Proxy as API Gateway\n    participant Tool as MCP Tool\n    \n    User-&gt;&gt;App: Access AI application\n    App-&gt;&gt;IdP: Redirect to login\n    User-&gt;&gt;IdP: Authenticate\n    IdP-&gt;&gt;App: Authorization code\n    App-&gt;&gt;IdP: Exchange code for tokens\n    IdP-&gt;&gt;App: ID token + access token\n    \n    Note over App: Store tokens securely\n    \n    User-&gt;&gt;App: Request using AI tool\n    App-&gt;&gt;Proxy: MCP request with access token\n    \n    Proxy-&gt;&gt;Proxy: Validate token (signature, expiry, audience)\n    Proxy-&gt;&gt;Proxy: Extract user identity and permissions\n    \n    alt Token Valid\n        Proxy-&gt;&gt;Tool: Forward request with user context\n        Tool-&gt;&gt;Proxy: Response\n        Proxy-&gt;&gt;App: Return response\n        App-&gt;&gt;User: Display result\n    else Token Invalid\n        Proxy-&gt;&gt;App: 401 Unauthorized\n        App-&gt;&gt;User: Session expired, please login again\n    end\n    \n    Note over App,Proxy: Token refresh happens in background\n    App-&gt;&gt;IdP: Refresh token when needed\n    IdP-&gt;&gt;App: New access token\n\n\nThis sequence diagram illustrates the OIDC authentication flow in an MCP environment. The process begins when a user accesses the AI application, which redirects to the Identity Provider for authentication. After the user authenticates, the Identity Provider issues an authorization code, which the application exchanges for ID and access tokens.\n\nThe application securely stores these tokens and uses the access token when making MCP requests through the AI agent. When the proxy receives a request, it validates the token by checking the signature, expiration, audience, and other claims. 
It also extracts the user’s identity and permissions from the token.\n\nIf the token is valid, the proxy forwards the request to the appropriate MCP tool along with the user’s context. The tool processes the request and returns a response, which flows back through the proxy to the application and ultimately to the user.\n\nIf the token is invalid (expired, tampered with, etc.), the proxy returns a 401 Unauthorized response, and the application prompts the user to log in again.\n\nIn the background, the application can use a refresh token to obtain new access tokens when needed, without requiring the user to re-authenticate. This ensures a smooth user experience while maintaining security.\n\nThis OIDC integration provides a robust authentication mechanism that is widely adopted in enterprise environments and integrates well with existing identity management systems.\n\nIntroducing OIDC-A for Agent &amp; Tool Identity\n\nWhile the discussion above focuses on authenticating the human user, a production-grade MCP deployment must also identify two additional actors:\n\n\n  The LLM agent that is orchestrating the workflow.\n  The MCP tool / resource that is being invoked on the backend.\n\n\nOur companion post “OpenID Connect for Agents (OIDC-A) 1.0 Proposal” (/2025/04/28/oidc-a-proposal/) extends OIDC Core 1.0 with a rich set of claims for agent identity, attestation, and delegation chains.  In practice this means:\n\n\n  When an AI agent starts a session it obtains an ID Token that contains the OIDC-A claims (agent_type, agent_model, agent_instance_id, delegator_sub, delegation_chain, etc.).  
This token travels alongside the user’s access token in every MCP request.\n  MCP tools can likewise expose their own OIDC identity (or be issued a signed resource token) that advertises metadata such as tool capabilities, version, and trust level (agent_capabilities, agent_trust_level, agent_attestation).\n  The gateway now validates up to three identities on every call – user → agent → tool – forming an explicit delegation chain that can be evaluated against RBAC and compliance policies.\n\n\nAdopting OIDC-A brings several benefits:\n\n\n  End-to-end, cryptographically verifiable identity for everything that touches the request path.\n  Fine-grained authorization based on agent or tool capabilities (e.g., allow only agents that advertise the email:draft capability to invoke the Mail tool).\n  Built-in attestation (agent_attestation) that enables the gateway to verify the integrity and provenance of both agents and tools before routing traffic to them.\n\n\nFor the remainder of this article, whenever we refer to a “token” being validated by the gateway, assume this now encompasses the user’s token, the agent’s OIDC-A token, and (optionally) the tool/resource token – all evaluated in a single policy decision step.\n\nAuthenticating at the gateway is already a widely used pattern in API security: “an API Gateway can securely and consistently implement authentication… without burdening the applications themselves.” In our context, the MCP proxy might integrate with your enterprise SSO (Azure AD, Okta, etc.) via OIDC to handle user login flows and token validation. Many gateways support OIDC natively, initiating redirects for user login if needed and then storing the resulting token in a cookie for session continuity. In a headless agent scenario (where the AI is calling tools server-to-server), the token might be provisioned out-of-band (e.g. the user logged into the AI app, so the app injects the token for the agent to use). Either way, the gateway enforces a simple rule: no token, no access. 
It can also map token claims to roles or scopes to implement authorization (e.g., only users with an “HR_read” scope can use the “HR Database” tool). This aligns perfectly with MCP’s design goal of secure connections – combining MCP with OIDC and OIDC-A gives you an end-to-end authenticated channel for tool usage.\n\nsequenceDiagram\n    participant User\n    participant Agent as LLM Agent (OIDC-A)\n    participant Proxy as API Gateway\n    participant Tool as MCP Tool (OIDC-A)\n    participant Backend as Backend System\n\n    User-&gt;&gt;Agent: 1. Interact (chat, form, etc.)\n    Agent-&gt;&gt;Proxy: 2. MCP request\\nBearer user token + agent OIDC-A token\n    Proxy-&gt;&gt;Proxy: 3. Validate user token (OIDC) &amp; agent token (OIDC-A)\n    Proxy--&gt;&gt;Tool: 4. Forward request plus optional *resource token* for the tool\n    Tool-&gt;&gt;Backend: 5. Query/act using delegated auth\n    Backend--&gt;&gt;Tool: 6. Data / result\n    Tool--&gt;&gt;Proxy: 7. Response (may include attestation)\n    Proxy--&gt;&gt;Agent: 8. Authorized response\n    Agent--&gt;&gt;User: 9. Present result\n\n\nTool Metadata Filtering and Policy Enforcement\n\nA powerful advantage of the proxy is that it can make routing decisions based not just on URLs, but on metadata within the requests. With MCP, requests and responses are in JSON-RPC format, which includes fields like the tool method name, parameters, and even tool annotations. An identity-aware proxy can be configured to inspect these details and apply policy rules. 
The flow below shows how such rules are evaluated:\n\ngraph TD\n    subgraph \"MCP Request\"\n        Request[JSON-RPC Request]\n        Method[Tool Method]\n        Params[Parameters]\n        User[User Identity]\n    end\n    \n    subgraph \"Policy Engine\"\n        Rules[Policy Rules]\n        RBAC[Role-Based Access]\n        Audit[Audit Logging]\n        Transform[Response Transformation]\n    end\n    \n    Request --&gt; Method\n    Request --&gt; Params\n    Request --&gt; User\n    \n    Method --&gt; Rules\n    Params --&gt; Rules\n    User --&gt; RBAC\n    \n    Rules --&gt; Decision{Allow/Deny}\n    RBAC --&gt; Decision\n    \n    Decision --&gt;|Allow| Forward[Forward to Tool]\n    Decision --&gt;|Deny| Reject[Reject Request]\n    \n    Forward --&gt; Audit\n    Reject --&gt; Audit\n    \n    Forward --&gt; Tool[MCP Tool]\n    Tool --&gt; Response[Tool Response]\n    Response --&gt; Transform\n    Transform --&gt; Filtered[Filtered Response]\n    \n    classDef request fill:#bbf,stroke:#333,stroke-width:1px;\n    classDef policy fill:#fbf,stroke:#333,stroke-width:1px;\n    classDef action fill:#bfb,stroke:#333,stroke-width:1px;\n    \n    class Request,Method,Params,User request;\n    class Rules,RBAC,Audit,Transform policy;\n    class Decision,Forward,Reject,Filtered action;\n\n\nThis diagram illustrates how tool metadata filtering and policy enforcement work in an MCP proxy. The process begins with an MCP request in JSON-RPC format, which carries the tool method and parameters and arrives with the user’s identity information. These components are extracted and fed into the policy engine.\n\nThe policy engine consists of policy rules, role-based access control (RBAC), audit logging, and response transformation components. The tool method and parameters are evaluated against the policy rules, while the user identity is checked against RBAC permissions.\n\nBased on these evaluations, the policy engine makes an allow/deny decision. 
If the request is allowed, it is forwarded to the MCP tool; if denied, it is rejected. In either case, the action is logged for audit purposes.\n\nWhen a request is allowed and processed by the tool, the response may pass through a transformation step before being returned to the client. This transformation can filter or modify the response based on security policies, such as removing sensitive information that the user shouldn’t see.\n\nThis fine-grained policy enforcement at the metadata level allows for sophisticated security controls that go far beyond simple URL-based routing. For example:\n\n\n  “If the tool call is delete_file and the user is not in the IT Admin group, deny the request.”\n  “Only allow the execute_sql tool on weekdays between 9am and 5pm, and log all queries.”\n  “If a tool is marked as containing sensitive data, ensure the response is sanitized or encrypted.”\n\n\nThis is analogous to a web application firewall (WAF) or an API gateway performing content filtering, but tailored to AI tool usage. In the Envoy MCP proposal, this corresponds to parsing MCP messages and using RBAC filters on them. Because the proxy can see which tool and method each call targets, it can gate the call accordingly. It can also redact or transform data if needed – for instance, stripping out certain fields from a response that the user shouldn’t see, or masking personally identifiable information. By centralizing this in the gateway, you avoid having to implement checks in each tool service (which could be inconsistent or forgotten). Auditing is another benefit: the proxy can log every tool invocation along with user identity and parameters, feeding into SIEM systems for monitoring. That way, if an AI one day does something it shouldn’t, you have a clear trail of which tool call was involved and who prompted it. 
In sum, metadata-based filtering turns the proxy into a smart policy enforcement point, adding a safety layer on top of MCP’s basic capabilities.\n\nVersion-Aware and Context-Aware Routing\n\nEnterprises constantly evolve their services – new versions, A/B tests, staging vs. production deployments, etc. The proxy can greatly simplify how AI agents handle these changes. Instead of the AI needing to know which version of a tool to call, the gateway can implement version-aware routing. For instance, the MCP endpoint for a “Document Search” tool could remain the same for the agent, but the proxy might route 90% of requests to v1 of the service and 10% to a new v2 (for a canary rollout). Or route internal users to a “beta” instance while external users go to stable. This is done by matching on request attributes or using routing rules that include user audience and tool identifiers.\n\ngraph TB\n    AIAgent[AI Agent] --&gt;|MCP Request| Proxy[API Gateway]\n    \n    Proxy --&gt;|\"90% traffic\"| V1[Tool v1]\n    Proxy --&gt;|\"10% traffic\"| V2[Tool v2 - Canary]\n    \n    Proxy --&gt;|\"Internal Users\"| Beta[Beta Version]\n    Proxy --&gt;|\"External Users\"| Stable[Stable Version]\n    \n    Proxy --&gt;|\"Small Requests\"| Standard[Standard Instance]\n    Proxy --&gt;|\"Large Requests\"| HighMem[High-Memory Instance]\n    \n    Proxy --&gt;|\"US Users\"| US[US Region]\n    Proxy --&gt;|\"EU Users\"| EU[EU Region]\n    \n    classDef proxy fill:#f96,stroke:#333,stroke-width:2px;\n    classDef version fill:#bbf,stroke:#333,stroke-width:1px;\n    classDef audience fill:#bfb,stroke:#333,stroke-width:1px;\n    classDef size fill:#fbf,stroke:#333,stroke-width:1px;\n    classDef region fill:#ff9,stroke:#333,stroke-width:1px;\n    \n    class Proxy proxy;\n    class V1,V2 version;\n    class Beta,Stable audience;\n    class Standard,HighMem size;\n    class US,EU region;\n\n\nThis diagram illustrates the various routing strategies that an API Gateway can implement for MCP 
requests. The gateway can route traffic based on multiple factors:\n\n\n  \n    Version-based routing: The gateway can split traffic between different versions of a tool, such as sending 90% to v1 and 10% to a canary deployment of v2. This allows for gradual rollouts and A/B testing without requiring changes to the AI agents.\n  \n  \n    Audience-based routing: Internal users can be directed to beta versions of tools, while external users are routed to stable versions. This allows for internal testing and validation before wider release.\n  \n  \n    Request size-based routing: Small requests can be handled by standard instances, while large requests that require more resources are directed to high-memory instances. This optimizes resource utilization and helps keep demanding requests from degrading the performance of standard operations.\n  \n  \n    Geographic routing: Users from different regions can be directed to region-specific instances, reducing latency and potentially addressing data residency requirements.\n  \n\n\nThe AI agent doesn’t need to be aware of these routing decisions; it simply makes requests to the logical tool name, and the gateway handles the complexity of routing to the appropriate backend. This abstraction simplifies the agent’s implementation while providing powerful operational capabilities.\n\nSimilarly, routing can consider context – e.g., directing requests to the nearest regional server for lower latency if the user’s location is known, or choosing a different backend depending on the size of the request (perhaps a special high-memory instance for very large files). All of this is configurable at the proxy level. This not only eases operations (you can upgrade backend tools without breaking the AI’s interface), but also adds to security. 
You could isolate certain versions for testing, or ensure that experimental tools are only accessible under certain conditions. By controlling traffic flow, the proxy helps maintain a principle of least privilege on a macro scale – the AI only reaches the backends it’s supposed to, via routes that are appropriate for the current context.\n\nImplementing MCP Security with a Proxy: A Practical Approach\n\nNow that we’ve covered the key security patterns, let’s look at a practical approach to implementing MCP security with an identity-aware proxy. This section outlines the steps to set up a secure MCP environment, focusing on the integration points between components.\n\ngraph TB\n    subgraph ImplementationSteps[\"Implementation Steps\"]\n        Step1[1. Set up Identity Provider]\n        Step2[2. Configure API Gateway]\n        Step3[3. Implement Tool Registry]\n        Step4[4. Define Security Policies]\n        Step5[5. Integrate AI Agents]\n        Step6[6. Monitor and Audit]\n        \n        Step1 --&gt; Step2\n        Step2 --&gt; Step3\n        Step3 --&gt; Step4\n        Step4 --&gt; Step5\n        Step5 --&gt; Step6\n    end\n    \n    classDef step fill:#beb,stroke:#333,stroke-width:1px\n    class Step1,Step2,Step3,Step4,Step5,Step6 step\n\n\nThis diagram outlines the six key steps in implementing MCP security with a proxy. The process follows a logical progression:\n\n\n  Set up Identity Provider: Establish the foundation for authentication and authorization.\n  Configure API Gateway: Set up the central security control point.\n  Implement Tool Registry: Create a system for managing MCP tools.\n  Define Security Policies: Establish the rules for access control and data protection.\n  Integrate AI Agents: Connect the AI agents to the secure MCP environment.\n  Monitor and Audit: Continuously track and review system activity.\n\n\nEach step builds on the previous ones, creating a comprehensive security implementation. 
The following sections explore the first five steps in detail.\n\n1. Setting Up the Identity Provider\n\nThe first step is to configure your identity provider (IdP) to support the OIDC flows needed for MCP security. This typically involves:\n\n\n  Creating an OIDC application in your IdP (e.g., Azure AD, Okta, Auth0)\n  Configuring the appropriate scopes and claims\n  Setting up the redirect URIs for your AI application\n  Generating client credentials (client ID and secret)\n\n\nThe IdP authenticates users and issues the tokens that secure MCP requests. Configure the scopes and claims so that the tokens carry the information the gateway needs for authorization decisions.\n\n2. Configuring the API Gateway\n\nNext, you’ll need to configure your API gateway to act as the MCP proxy. This involves:\n\nsequenceDiagram\n    participant Admin\n    participant Gateway as API Gateway\n    participant IdP as Identity Provider\n    \n    Admin-&gt;&gt;Gateway: 1. Configure OIDC integration\n    Gateway-&gt;&gt;IdP: 2. Fetch OIDC discovery document\n    IdP-&gt;&gt;Gateway: 3. Return endpoints and keys\n    \n    Admin-&gt;&gt;Gateway: 4. Set up MCP routing rules\n    Admin-&gt;&gt;Gateway: 5. Configure security policies\n    \n    Note over Gateway: Gateway ready to validate tokens and route MCP traffic\n\n\nThis sequence diagram illustrates the process of configuring an API Gateway for MCP security. The process begins with an administrator configuring the OIDC integration in the gateway. The gateway then fetches the OIDC discovery document from the Identity Provider, which returns the necessary endpoints and keys for token validation.\n\nNext, the administrator sets up MCP routing rules, defining how requests should be directed to different MCP tools based on various criteria. 
The administrator also configures security policies, specifying who can access which tools and under what conditions.\n\nOnce these configurations are complete, the gateway is ready to validate tokens and route MCP traffic according to the defined rules and policies. This setup process establishes the gateway as the central security control point for all MCP interactions.\n\nThe configuration steps include:\n\n\n  Setting up the OIDC integration, including configuring the token validation parameters (issuer, audience, etc.)\n  Defining the routing rules for MCP requests\n  Configuring the security policies for tool access\n  Setting up the audit logging\n\n\nThe gateway will be responsible for validating the tokens, enforcing the security policies, and routing the MCP requests to the appropriate backends. It’s important to ensure that the gateway is properly configured to handle the MCP JSON-RPC format and to extract the necessary information for policy decisions.\n\n3. Implementing the Tool Registry\n\nA tool registry is essential for managing the lifecycle of MCP tools in your environment. This involves:\n\n\n  Creating a database or service to store tool metadata\n  Defining the registration process for new tools\n  Implementing the approval workflow for tool access\n  Integrating the registry with the API gateway\n\n\nThe tool registry will be responsible for maintaining the list of available tools, their endpoints, and their access requirements. 
It will also provide the necessary information to the API gateway for routing and policy enforcement.\n\ngraph TB\n    subgraph \"Tool Registry\"\n        DB[(Tool Database)]\n        API[Registry API]\n        UI[Admin UI]\n        \n        UI --&gt;|Manage Tools| API\n        API --&gt;|CRUD Operations| DB\n    end\n    \n    subgraph \"Integration Points\"\n        Gateway[API Gateway]\n        Agents[AI Agents]\n        \n        API --&gt;|Tool Configurations| Gateway\n        API --&gt;|Available Tools| Agents\n    end\n    \n    subgraph \"Tool Lifecycle\"\n        Register[Register]\n        Approve[Approve]\n        Deploy[Deploy]\n        Monitor[Monitor]\n        Retire[Retire]\n        \n        Register --&gt; Approve\n        Approve --&gt; Deploy\n        Deploy --&gt; Monitor\n        Monitor --&gt; Retire\n    end\n    \n    classDef registry fill:#bbf,stroke:#333,stroke-width:1px;\n    classDef integration fill:#fbf,stroke:#333,stroke-width:1px;\n    classDef lifecycle fill:#bfb,stroke:#333,stroke-width:1px;\n    \n    class DB,API,UI registry;\n    class Gateway,Agents integration;\n    class Register,Approve,Deploy,Monitor,Retire lifecycle;\n\n\nThis diagram illustrates the components and lifecycle of a Tool Registry in an MCP environment. 
The Tool Registry consists of three main components:\n\n\n  Tool Database: Stores metadata about all registered MCP tools, including their endpoints, versions, access requirements, and status.\n  Registry API: Provides programmatic access to the tool database, enabling CRUD operations on tool registrations.\n  Admin UI: Allows administrators to manage tools through a user interface, including registration, approval, and monitoring.\n\n\nThe Tool Registry integrates with two key systems:\n\n  API Gateway: Receives tool configurations from the registry, which inform routing and policy decisions.\n  AI Agents: Discover available tools through the registry, based on user permissions and tool status.\n\n\nThe diagram also shows the lifecycle of an MCP tool:\n\n  Register: A new tool is registered in the system with its metadata.\n  Approve: The tool undergoes review and is approved for use by specific user groups.\n  Deploy: The tool is made available in the production environment.\n  Monitor: The tool’s usage and performance are monitored.\n  Retire: When no longer needed, the tool is retired from the system.\n\n\nThis comprehensive approach to tool management ensures that all MCP tools are properly vetted, deployed, and monitored throughout their lifecycle, reducing security risks and operational issues.\n\n4. Defining Security Policies\n\nSecurity policies are the rules that govern access to MCP tools. This involves:\n\n\n  Defining the RBAC policies for tool access\n  Configuring the content filtering rules for responses\n  Setting up the audit logging requirements\n  Implementing the version control policies\n\n\nThe security policies will be enforced by the API gateway based on the user’s identity and the tool being accessed. It’s important to ensure that the policies are comprehensive and aligned with your organization’s security requirements.\n\n5. Integrating AI Agents\n\nFinally, you’ll need to integrate your AI agents with the secure MCP environment. 
This involves:\n\n\n  Configuring the agents to obtain and use OIDC tokens\n  Implementing the MCP client functionality\n  Handling authentication and authorization errors\n  Managing token refresh and session continuity\n\n\nThe AI agents will be responsible for obtaining the necessary tokens and including them in MCP requests. They’ll also need to handle authentication and authorization errors gracefully, providing appropriate feedback to users.\n\nsequenceDiagram\n    participant User\n    participant Agent as AI Agent\n    participant App as Application\n    participant IdP as Identity Provider\n    participant Gateway as API Gateway\n    participant Tool as MCP Tool\n    \n    User-&gt;&gt;App: Access AI application\n    App-&gt;&gt;IdP: Authenticate user\n    IdP-&gt;&gt;App: Issue tokens\n    \n    User-&gt;&gt;Agent: Request using AI capabilities\n    Agent-&gt;&gt;App: Request token for MCP\n    App-&gt;&gt;Agent: Provide token\n    \n    Agent-&gt;&gt;Gateway: MCP request with token\n    Gateway-&gt;&gt;Gateway: Validate token &amp; apply policies\n    Gateway-&gt;&gt;Tool: Forward authorized request\n    Tool-&gt;&gt;Gateway: Response\n    Gateway-&gt;&gt;Agent: Return response\n    Agent-&gt;&gt;User: Present result\n    \n    Note over App,Gateway: Token refresh cycle\n    App-&gt;&gt;IdP: Refresh token when needed\n    IdP-&gt;&gt;App: New access token\n\n\nThis sequence diagram illustrates the integration of AI agents with a secure MCP environment. The process begins when a user accesses the AI application, which authenticates the user with the Identity Provider and receives tokens.\n\nWhen the user makes a request that requires AI capabilities, the AI agent requests a token from the application, which provides it. The agent then includes this token in its MCP request to the API Gateway.\n\nThe gateway validates the token and applies security policies to determine if the request should be allowed. 
If authorized, the request is forwarded to the appropriate MCP tool, which processes it and returns a response. This response flows back through the gateway to the agent and ultimately to the user.\n\nIn the background, the application handles token refresh cycles, requesting new access tokens from the Identity Provider when needed. This ensures continuous operation without requiring the user to re-authenticate frequently.\n\nThis integration approach ensures that AI agents operate within the security framework established by the proxy architecture, with all requests properly authenticated and authorized.\n\nConclusion: Beyond Glorified API Calls\n\nBy implementing a secure MCP architecture with an identity-aware proxy, you move far beyond “glorified API calls” to a robust, enterprise-grade integration between AI agents and your business systems. This approach addresses the key security challenges of MCP deployments, including:\n\n\n  User identity and access control\n  Multi-step context exchanges\n  Complex delegation chains\n  Dynamic tool provisioning\n  Remote MCP changes and version tracking\n\n\nThe proxy-based architecture provides a centralized control point for enforcing security policies, managing tool access, and monitoring AI agent activity. It also simplifies operations by abstracting away the complexity of backend services and providing a consistent interface for AI agents.\n\nAs MCP continues to evolve and gain adoption, the security patterns described in this article will become increasingly important for enterprise deployments. 
By implementing these patterns now, you can ensure that your AI agent infrastructure is secure, scalable, and ready for the future.\n\ngraph LR\n    A[Glorified API Calls] --&gt;|Evolution| B[Secure MCP Architecture]\n    \n    subgraph \"Key Benefits\"\n        C[Centralized Security]\n        D[Identity Propagation]\n        E[Policy Enforcement]\n        F[Audit &amp; Compliance]\n        G[Operational Simplicity]\n    end\n    \n    B --&gt; C\n    B --&gt; D\n    B --&gt; E\n    B --&gt; F\n    B --&gt; G\n    \n    classDef benefit fill:#bfb,stroke:#333,stroke-width:1px;\n    class C,D,E,F,G benefit;\n\n\nThis final diagram summarizes the evolution from “glorified API calls” to a secure MCP architecture, highlighting the key benefits of this approach:\n\n\n  Centralized Security: A single control point for enforcing security policies across all MCP interactions.\n  Identity Propagation: Consistent handling of user identity and permissions throughout the system.\n  Policy Enforcement: Fine-grained control over who can access which tools and under what conditions.\n  Audit &amp; Compliance: Comprehensive logging and monitoring of all MCP activities for security and compliance purposes.\n  Operational Simplicity: Abstraction of backend complexity, making it easier to manage and evolve the system over time.\n\n\nBy adopting this architecture, organizations can confidently deploy AI agents in enterprise environments, knowing that their MCP interactions are secure, auditable, and manageable at scale. This represents a significant advancement beyond the simplistic view of AI tools as mere API calls, recognizing the complex security requirements of production AI systems.\n"
    },
  
    {
      "title"    : "OpenID Connect for Agents (OIDC-A) 1.0 Proposal",
      "category" : "",
      "tags"     : "OpenID, OAuth, AI, Agents, Security, Identity, Authentication, Authorization, Standards, Proposal, Specification",
      "url"      : "/2025/04/28/oidc-a-proposal/",
      "date"     : "April 28, 2025",
      "excerpt"  : "Technical proposal for extending OpenID Connect Core 1.0 to provide a framework for representing, authenticating, and authorizing LLM-based agents within the OAuth 2.0 ecosystem.",
      "content"  : "This document proposes a standard extension to OpenID Connect for representing and verifying the identity of LLM-based agents. It integrates the core proposal with detailed frameworks for verification, attestation, and delegation chains.\n\nAbstract\n\nOpenID Connect for Agents (OIDC-A) 1.0 is an extension to OpenID Connect Core 1.0 that provides a framework for representing, authenticating, and authorizing LLM-based agents within the OAuth 2.0 ecosystem. This specification defines standard claims, endpoints, and protocols for establishing agent identity, verifying agent attestation, representing delegation chains, and enabling fine-grained authorization based on agent attributes.\n\n1. Introduction\n\n1.1 Rationale\n\nAs LLM-based agents become increasingly prevalent in digital ecosystems, there is a growing need for standardized methods to represent their identity and manage their authorization. Traditional OAuth 2.0 and OpenID Connect protocols were designed primarily for human users and conventional applications, lacking the necessary constructs to represent the unique characteristics of autonomous agents, such as:\n\n\n  Acting on behalf of users with varying degrees of autonomy\n  Operating within delegation chains\n  Possessing dynamic capabilities based on their underlying models\n  Requiring attestation of their integrity and origin\n\n\nThis specification addresses these gaps by extending OpenID Connect to provide a comprehensive framework for agent identity and authorization.\n\n1.2 Terminology\n\nThis specification uses the terms defined in OAuth 2.0 [RFC6749], OpenID Connect Core 1.0, and the following additional terms:\n\n\n  Agent: An LLM-based software entity capable of autonomous or semi-autonomous action based on natural language instructions.\n  Agent Provider: The organization responsible for creating, training, and/or hosting the agent.\n  Agent Model: The specific LLM model that powers the agent (e.g., GPT-4, Claude 3).\n  
Agent Instance: A specific running instance of an agent, typically associated with a particular task or conversation.\n  Delegator: The entity (typically a human user) who delegates authority to an agent to act on their behalf.\n  Delegation Chain: A sequence of delegation steps from the original user through potentially multiple agents.\n  Attestation: Cryptographic proof of an agent’s integrity, origin, and/or properties.\n  Attestation Evidence: Data structure containing the proof used for attestation.\n  Relying Party (RP): In this context, often a Resource Server or Client application that needs to verify an agent’s identity and authorization.\n\n\n1.3 Overview\n\nOIDC-A extends OpenID Connect by:\n\n\n  Defining new standard claims for representing agent identity, delegation, and capabilities.\n  Specifying mechanisms and formats for agent attestation evidence.\n  Establishing protocols for representing and validating delegation chains.\n  Providing discovery mechanisms for agent capabilities and attestation support.\n  Defining authorization frameworks suitable for agent-specific use cases.\n  Introducing endpoints for attestation verification and capability discovery.\n\n\n2. 
Agent Identity Claims\n\n2.1 Core Agent Identity Claims\n\nThe following claims MUST or SHOULD be included in ID Tokens issued to or about agents:\n\n\n  agent_type (string, REQUIRED): Identifies the type/class of agent (e.g., \"assistant\", \"retrieval\", \"coding\")\n  agent_model (string, REQUIRED): Identifies the specific model (e.g., \"gpt-4\", \"claude-3-opus\", \"gemini-pro\")\n  agent_version (string, RECOMMENDED): Version identifier of the agent model\n  agent_provider (string, REQUIRED): Organization that provides/hosts the agent (e.g., \"openai.com\", \"anthropic.com\")\n  agent_instance_id (string, REQUIRED): Unique identifier for this specific instance of the agent\n\n\n2.2 Delegation and Authority Claims\n\n\n  delegator_sub (string, REQUIRED): Subject identifier of the entity who most recently delegated authority to this agent\n  delegation_chain (array, OPTIONAL): Ordered array of delegation steps (see Section 2.4.2)\n  delegation_purpose (string, RECOMMENDED): Description of the purpose/intent for which authority was delegated\n  delegation_constraints (object, OPTIONAL): Constraints placed on the agent by the delegator\n\n\n2.3 Capability, Trust, and Attestation Claims\n\n\n  agent_capabilities (array, RECOMMENDED): Array of capability identifiers representing what the agent can do\n  agent_trust_level (string, OPTIONAL): Trust classification of the agent (e.g., \"verified\", \"experimental\")\n  agent_attestation (object, RECOMMENDED): Attestation evidence or reference (see Section 2.4.4)\n  agent_context_id (string, RECOMMENDED): Identifier for the conversation/task context\n\n\n2.4 Claim Formats and Validation\n\n2.4.1 agent_type\nString value from a defined set of agent types. Implementers SHOULD use one of the following values when applicable:\n\n  assistant: General-purpose assistant agent\n  retrieval: Agent specialized in information retrieval\n  coding: Agent specialized in code generation or analysis\n  domain_specific: Agent specialized for a particular domain\n  autonomous: Agent with a high degree of autonomy\n  supervised: Agent requiring human supervision for key actions\n\n\nCustom types MAY be used but SHOULD follow the format vendor:type (e.g., acme:financial_advisor).\n\n2.4.2 delegation_chain\nJSON array containing objects representing each step in the delegation chain, from the original user to the current agent. Each object contains the following fields:\n\n  iss: REQUIRED. String identifying the Authorization Server or entity that issued/validated this delegation step.\n  sub: REQUIRED. String identifying the delegator (the entity granting permission).\n  aud: REQUIRED. String identifying the delegatee (the agent receiving permission).\n  delegated_at: REQUIRED. 
NumericDate representing the time the delegation occurred.\n  scope: REQUIRED. Space-separated string of OAuth scopes representing the permissions granted in this delegation step. MUST be a subset of the scopes held by the delegator (sub).\n  purpose: OPTIONAL. String describing the intended purpose of this delegation step.\n  constraints: OPTIONAL. JSON object specifying constraints on the delegation (e.g., {\"max_duration\": 3600, \"allowed_resources\": [\"/data/abc\"]}).\n  jti: OPTIONAL. A unique identifier for this specific delegation step, useful for revocation or tracking.\n\n\nThe array MUST be ordered chronologically.\n\nValidation Rules for delegation_chain (performed by Relying Party):\n\n  Order Verification: Confirm chronological order based on delegated_at.\n  Issuer Trust: Verify each iss is trusted.\n  Audience Matching: Confirm aud of step N matches sub of step N+1.\n  Scope Reduction: Verify scope in each step is a subset of/equal to the delegator’s available scopes.\n  Constraint Enforcement: Ensure compliance with any constraints.\n  Signature Validation (if applicable): Validate signatures if steps are individually signed.\n  Policy Check: Evaluate the validated chain against authorization policies (e.g., max length).\n\n\n2.4.3 agent_capabilities\nArray of string identifiers representing the agent’s capabilities. Implementers SHOULD use capability identifiers from a well-defined taxonomy when available. Custom capabilities SHOULD follow the format vendor:capability (e.g., acme:financial_analysis).\n\n2.4.4 agent_attestation\nJSON object containing attestation evidence or a reference to it. 
MUST include a format field indicating the type of evidence.\n\nRecommended Format: JWT-based, potentially compatible with IETF RATS Entity Attestation Token (EAT).\n\nExample:\n\"agent_attestation\": {\n  \"format\": \"urn:ietf:params:oauth:token-type:eat\",\n  \"token\": \"eyJhbGciOiJFUzI1NiIsInR5cCI6ImVhdCtqd3QifQ...\"\n}\n\nOther formats (e.g., \"format\": \"TPM2-Quote\", \"format\": \"SGX-Quote\") MAY be used.\n\n3. Protocol Flow\n\n3.1 Agent Authentication Flow\n\nThe OIDC-A authentication flow extends the standard OpenID Connect Authentication flow:\n\n\n  Client Registration: Clients representing agents MUST register additional metadata (see Section 4).\n  Authentication Request: Agents SHOULD include the agent scope and potentially delegation_context.\n  Authentication Response: The Authorization Server includes agent-specific claims in the ID Token.\n  Token Validation: RPs MUST validate standard OIDC claims and relevant agent-specific claims (including attestation and delegation if present) according to policy.\n\n\n3.2 Delegation Flow\n\nWhen an agent is delegated authority:\n\n\n  The delegator authenticates and authorizes the delegation.\n  The Authorization Server issues a new ID Token to the agent including delegator_sub, delegation_chain (updated), delegation_purpose, and constrained scope.\n\n\n3.3 Attestation Verification Flow\n\nTo verify an agent’s attestation:\n\n\n  The agent includes the agent_attestation claim in its ID Token or provides evidence separately.\n  The RP validates the evidence based on the specified format:\n    \n      Verify cryptographic signatures using trusted keys (obtained via Discovery).\n      Compare platform measurements against known-good values.\n      Validate nonces to prevent replay attacks.\n      Optionally, use the agent_attestation_endpoint for validation assistance.\n    \n  \n  Authorization decisions incorporate the attestation status (e.g., verified: true/false).\n\n\n4. 
Client Registration and Discovery\n\n4.1 Agent Client Registration Metadata\n\nExtends OAuth 2.0 Dynamic Client Registration [RFC7591]:\n\n\n  agent_provider (string): Identifier of the agent provider\n  agent_models_supported (array): List of supported agent models\n  agent_capabilities (array): List of agent capabilities\n  attestation_formats_supported (array): List of supported attestation formats\n  delegation_methods_supported (array): List of supported delegation methods\n\n\n4.2 Discovery Metadata\n\nExtends OpenID Connect Discovery 1.0:\n\n\n  agent_attestation_endpoint (string): URL of the attestation endpoint\n  agent_capabilities_endpoint (string): URL of the capabilities discovery endpoint\n  agent_claims_supported (array): List of supported agent claims\n  agent_types_supported (array): List of supported agent types\n  delegation_methods_supported (array): List of supported delegation methods\n  attestation_formats_supported (array): List of supported attestation formats\n  attestation_verification_keys_endpoint (string): URL to retrieve public keys for verifying attestation signatures\n\n\n5. 
Endpoints\n\n5.1 Agent Attestation Endpoint\n\nAn OAuth 2.0 protected resource that returns attestation information about an agent or assists in validating provided evidence. URL advertised via agent_attestation_endpoint discovery parameter.\n\n5.1.1 Request Example (Get Info)\n\nGET /agent/attestation?agent_id=123&amp;nonce=abc\nAuthorization: Bearer &lt;token&gt;\n\n\n5.1.2 Response Example\n\n{\n  \"verified\": true,\n  \"provider\": \"openai.com\",\n  \"model\": \"gpt-4\",\n  \"version\": \"2025-03\",\n  \"attestation_timestamp\": 1714348800,\n  \"attestation_signature\": \"...\"\n}\n\n\n5.2 Agent Capabilities Endpoint\n\nProvides information about an agent’s capabilities. URL advertised via agent_capabilities_endpoint discovery parameter.\n\n5.2.1 Request Example\n\nGET /.well-known/agent-capabilities\n\n\n5.2.2 Response Example\n\n{\n  \"capabilities\": [\n    {\"id\": \"text_generation\", \"description\": \"...\"},\n    {\"id\": \"code_generation\", \"description\": \"...\"}\n  ],\n  \"supported_constraints\": [\"max_tokens\", \"allowed_tools\"]\n}\n\n\n6. Security Considerations\n\n6.1 Agent Authentication\n\nAgents SHOULD use strong, asymmetric methods (JWT Client Auth [RFC7523], mTLS [RFC8705]), potentially combined with attestation. Shared secrets are NOT RECOMMENDED.\n\n6.2 Delegation Security\n\nSystems MUST validate the entire delegation chain, enforce scope reduction, implement consent mechanisms, and consider time-bounding. Policies may limit chain length. Robust revocation mechanisms are needed.\n\n6.3 Attestation Security\n\nRequires secure management of signing keys, robust nonce handling, trustworthy known-good measurements, secure endpoints, and protection against replay attacks. Attestation evidence may have privacy implications.\n\n6.4 Token Security\n\nID Tokens with agent claims SHOULD be encrypted. Access tokens SHOULD have limited lifetimes. Refresh tokens for agents require careful consideration.\n\n7. 
Privacy Considerations\n\nImplementations MUST consider potential correlation of agent identity, privacy implications of delegation chains, user consent requirements, and data minimization in claims.\n\n8. Compatibility and Versioning\n\nOIDC-A 1.0 is designed for compatibility with OAuth 2.0 [RFC6749], OIDC Core 1.0, JWT [RFC7519], and related RFCs. Future versions will aim for backward compatibility.\n\n9. References\n\n\n  [RFC6749] The OAuth 2.0 Authorization Framework\n  [RFC7519] JSON Web Token (JWT)\n  [RFC7523] JWT Profile for OAuth 2.0 Client Authentication\n  [RFC7591] OAuth 2.0 Dynamic Client Registration\n  [RFC7662] OAuth 2.0 Token Introspection\n  [RFC8705] OAuth 2.0 Mutual-TLS Client Authentication\n  [OpenID Connect Core 1.0]\n  [OpenID Connect Discovery 1.0]\n  [IETF RATS] Remote Attestation Procedures Architecture\n\n\nAppendix A: Example ID Token with Agent Claims\n\n{\n  \"iss\": \"https://auth.example.com\",\n  \"sub\": \"agent_instance_789\",\n  \"aud\": \"client_123\",\n  \"exp\": 1714435200,\n  \"iat\": 1714348800,\n  \"auth_time\": 1714348800,\n  \"nonce\": \"n-0S6_WzA2Mj\",\n  \"agent_type\": \"assistant\",\n  \"agent_model\": \"gpt-4\",\n  \"agent_version\": \"2025-03\",\n  \"agent_provider\": \"openai.com\",\n  \"agent_instance_id\": \"agent_instance_789\",\n  \"delegator_sub\": \"user_456\",\n  \"delegation_purpose\": \"Email management assistant\",\n  \"agent_capabilities\": [\"email:read\", \"email:draft\", \"calendar:view\"],\n  \"agent_trust_level\": \"verified\",\n  \"agent_context_id\": \"conversation_123\",\n  \"agent_attestation\": {\n    \"format\": \"urn:ietf:params:oauth:token-type:eat\",\n    \"token\": \"eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...\",\n    \"timestamp\": 1714348800\n  },\n  \"delegation_chain\": [\n    {\n      \"iss\": \"https://auth.example.com\",\n      \"sub\": \"user_456\",\n      \"aud\": \"agent_instance_789\",\n      \"delegated_at\": 1714348700,\n      \"scope\": \"email profile calendar\"\n    }\n  
]\n}\n\n\nAppendix B: Example Delegation Chain (Multi-step)\n\n\"delegation_chain\": [\n  {\n    \"iss\": \"https://auth.example.com\",\n    \"sub\": \"user_456\",\n    \"aud\": \"agent_instance_789\",\n    \"delegated_at\": 1714348800,\n    \"scope\": \"email calendar\",\n    \"purpose\": \"Manage my emails and calendar\"\n  },\n  {\n    \"iss\": \"https://auth.example.com\",\n    \"sub\": \"agent_instance_789\",\n    \"aud\": \"agent_instance_101\",\n    \"delegated_at\": 1714348830,\n    \"scope\": \"calendar:view\",\n    \"purpose\": \"Analyze available time slots\"\n  }\n]\n\n\n\n\n"
    },
  
    {
      "title"    : "AI Agents and Agentic Security: The Next Frontier in Enterprise Automation",
      "category" : "",
      "tags"     : "AI, Security, Automation, Enterprise, AI Agents",
      "url"      : "/2024/12/10/ai-agents-agentic-security-enterprise-automation/",
      "date"     : "December 10, 2024",
      "excerpt"  : "Exploring the potential of AI agents in enterprise security and automation, and how they can enhance security operations.",
      "content"  : "Traditional automation tools like Robotic Process Automation (RPA) and Integration Platform as a Service (iPaaS) have long served as the backbone of enterprise workflows. These systems, designed to automate repetitive tasks and connect disparate software tools, have delivered undeniable value. However, their inherent limitations are becoming increasingly evident. They require significant manual setup, often break when systems change, and struggle to handle unstructured data such as documents, emails, or images.\n\nEnter AI agents — a revolutionary leap from static, rule-based automation to intelligent, adaptable systems. AI agents promise to overcome the constraints of traditional tools, paving the way for smarter, more efficient enterprise automation. An excellent breakdown of their significance can be found in the insightful Menlo Ventures article “Beyond Bots: How AI Agents Are Driving the Next Wave of Enterprise Automation”.\n\n\n\nThe Shift from Automation to Intelligence\n\nAI agents represent a fundamental paradigm shift. Unlike their predecessors, these systems are not bound by rigid rules or pre-defined workflows. Instead, they possess the ability to learn, adapt, and make decisions based on changing circumstances. This adaptability enables them to address dynamic and complex tasks, unlocking unprecedented levels of efficiency and scalability.\n\nHowever, this evolution introduces a new layer of complexity: agentic security. As AI agents grow more autonomous, ensuring their security, transparency, and trustworthiness becomes paramount, particularly in multi-agent environments where multiple AI systems must collaborate. This shift necessitates rethinking how we secure enterprise automation systems to ensure they remain robust and trustworthy in a rapidly evolving landscape.\n\nThe Imperative of Agentic Security\n\nAgentic security involves safeguarding intelligent, autonomous systems while maintaining their transparency and reliability. 
It becomes especially critical in environments where multiple AI agents operate simultaneously, managing dynamic processes and sensitive data. Key considerations for agentic security include:\n\nDynamic Adaptability with Robust Security\n\nAI agents excel at adjusting to system changes, but their adaptability must not come at the expense of enterprise security. In multi-agent environments, secure communication protocols and strong authentication mechanisms form the foundation of security. However, static security measures alone are insufficient. Evolving contexts require context-aware security — a system that dynamically adjusts access controls and agent behavior based on situational needs and data sensitivities. This mitigates risks such as unauthorized escalations, prompt injection attacks, and data breaches.\n\nFor example, a financial reporting agent, which has access to internal financial metrics, should be able to generate a detailed report for C-suite agents while maintaining strict data boundaries. If an HR agent requests information about salaries, the financial agent should only provide relevant, pre-approved metrics, such as aggregated departmental budgets, rather than individual salary slips. This ensures that agents respect organizational boundaries and adhere to context-aware security protocols.\n\nIn cross-enterprise collaborations, where AI agents from different organizations interact, maintaining the integrity of each participant’s systems is essential. Context-aware security ensures that agents respect boundaries and operate within predefined limits, even as they adapt to new information or changing environments.\n\nTransparent Decision-Making and Accountability\n\nAs AI agents take on more critical roles in enterprise processes, transparency and accountability become non-negotiable. Organizations must implement mechanisms to trace and audit agent decisions, ensuring they align with business objectives and ethical standards. 
This is particularly important in regulated industries, where compliance requirements demand a clear understanding of how and why decisions are made.\n\nTrust in Multi-Agent Collaboration\n\nIn scenarios where multiple agents collaborate, trust is the cornerstone of effective operation. Agents must communicate securely, share information responsibly, and resolve conflicts without compromising the integrity of the broader system. Establishing trust requires robust encryption, tamper-proof logs, and mechanisms for conflict resolution to prevent unintended behaviors or system failures.\n\nThe Path Forward\n\nAI agents represent the next frontier in enterprise automation, promising smarter, faster, and more scalable workflows. However, their increasing sophistication demands a proactive approach to agentic security. As organizations embrace these intelligent systems, they must prioritize building trust, safeguarding data, and ensuring transparency to foster sustainable innovation.\n\nThe Menlo Ventures article encapsulates this beautifully: AI agents are not just tools — they are collaborators, reshaping how enterprises operate. But with great power comes great responsibility. By addressing the challenges of agentic security, we can unlock the full potential of AI agents while preserving the integrity and trust that underpin modern enterprises.\n"
    },
  
    {
      "title"    : "A feat of strength MVP for AI Apps",
      "category" : "",
      "tags"     : "AI, MVP, Product Development, User Feedback, Innovation",
      "url"      : "/2024/02/20/a-feat-of-strength-mvp-for-ai-apps/",
      "date"     : "February 20, 2024",
      "excerpt"  : "Exploring the concept of a Minimum Viable Product (MVP) in AI applications, focusing on delivering value by understanding and addressing user needs effectively.",
      "content"  : "A minimum viable product (MVP) is a version of a product with just enough features to be usable by early customers, who can then provide feedback for future product development.\n\nToday I want to focus on what that looks like for shipping AI applications. To do that, we only need to understand 4 things.\n\n  What does 80% actually mean?\n  What segments can we serve well?\n  Can we double down?\n  Can we educate the user about the segments we don’t serve well?\n\n\nThe Pareto principle, also known as the 80/20 rule, still applies but in a different way than you might think.\n\nWhat is an MVP?\nAn analogy I often use to help understand this concept is as follows: You need something to help get from point A to point B. Maybe the vision is to have a car. However, the MVP is not a chassis without wheels or an engine. Instead, it might look like a skateboard. You’ll ship and realize the product needs brakes or steering. So then you ship a scooter. Afterwards, you figure out the scooter needs more leverage, so you add larger wheels and end up with a bicycle. Limited by the force you can apply as a human being, you start thinking about motors and can branch out into mopeds, e-bikes, and motorcycles. Then one day, ship the car.\n\nConsider the 80/20 rule\nWhen talking about something being  80% done or 80% ready, it is usually in a machine-learning sense. In this context, each component is deterministic, which means 80% translates to  8 out of 10 features being complete. Once the remaining 2 features are ready, we can ship the product. However, If we want to follow the 80/20 rule, we might be able to ship the product with 80% of the features and then add the remaining 20% later, like a car without a radio or air conditioning. However, The meaning of 80% can vary significantly, and this definition may not apply to an AI-powered application.\n\nThe issue with Summary Statistics\n\nThe above image is an example of Anscombe’s quartet. 
It’s a set of four datasets that have nearly identical simple descriptive statistics yet very different distributions and appearances. This is a classic explanation of why summary statistics can be misleading.\n\nConsider the following example:\n\n\n    \n        \n            Query_id\n            score\n        \n    \n    \n        \n            1\n            0.9\n        \n        \n            2\n            0.8\n        \n        \n            3\n            0.9\n        \n        \n            4\n            0.9\n        \n        \n            5\n            0.0\n        \n        \n            6\n            0.0\n        \n    \n\n\nThe average score is 0.58. However, if we analyze the queries within segments, we might discover that we are serving the majority of queries exceptionally well!\n\n\n  Admitting what you’re bad at\n\n  Being honest with what you’re bad at is a great way to build trust with your users. If you can accurately identify when something will perform poorly and confidently reject it, then you might be ready to ship a great product while educating your users about the limitations of your application.\n\n\nIt is very important to understand the limitations of your system and to be able to confidently understand the characteristics of your system beyond summary statistics. This is because not all systems are made equal. The behavior of a probabilistic system could be very different from the previous example. 
Consider the following dataset:\n\n\n    \n        \n            Query_id\n            Score\n        \n    \n    \n        \n            1\n            0.59\n        \n        \n            2\n            0.58\n        \n        \n            3\n            0.59\n        \n        \n            4\n            0.57\n        \n    \n\nA system like this has the same average score of 0.58, but it’s not as easy to reject any subset of requests…\n\nLearning to say no\nConsider a RAG application where a large proportion of the queries are timeline queries. If our search engine does not support time constraints, we will likely be unable to perform well.\n\n\n    \n        \n            Query_id\n            Score\n            Query Type\n        \n    \n    \n        \n            1\n            0.9\n            text search\n        \n        \n            2\n            0.8\n            text search\n        \n        \n            3\n            0.9\n            news search\n        \n        \n            4\n            0.9\n            news search\n        \n        \n            5\n            0.0\n            timeline\n        \n        \n            6\n            0.0\n            timeline\n        \n    \n\n\nIf we’re in a pinch to ship, we could simply build a classification model that detects whether or not a question is a timeline question and shows a warning. Instead of constantly trying to push the algorithm to do better, we can educate the user by changing the way we design the product.\n\n\n  Detecting segments\n\n  Detecting these segments could be accomplished in various ways. We could construct a classifier or employ a language model to categorize them. Additionally, we can utilize clustering algorithms with the embeddings to identify common groups and potentially analyze the mean scores within each group. 
The sole objective is to identify segments that can enhance our understanding of the activities within specific subgroups.\n\n\nOne of the worst things you can do is to spend months building out a feature that only increases your productivity by a little while ignoring some more important segment of your user base.\n\nBy redesigning our application and recognizing its limitations, we can potentially improve performance under certain conditions by identifying the types of tasks we can decline. If we are able to put this segment data into some kind of in-system observability, we can safely monitor what proportion of questions is being turned down and prioritize our work to maximize coverage.\n\nFigure out what you’re actually trying to do before you do it\nOne of the dangerous things I’ve noticed working with startups is that we often assume that the AI simply works… As a result, we want to be able to serve a large general application without much thought into what exactly we want to accomplish.\n\nIn my opinion, most of these companies should try to focus on one or two significant areas and identify a good niche to target. If your app is good at one or two tasks, there’s no way you could not find a hundred or two hundred users to test out your application and get feedback quickly. Whereas, if your application is good at nothing, it’s going to be hard to be memorable and provide something that has repeated use. You might get some virality, but very quickly, you’re going to lose the trust of your users and find yourself in a position where you’re trying to reduce churn.\n\nWhen the work is front-loaded, the ability to use GPT-4 to make predictions matters, and so does the time to feedback. If we can get feedback quickly, we can iterate quickly. If we can iterate quickly, we can build a better product.\n\nFinal thoughts\nThe MVP for an AI application is not as simple as shipping a product with 80% of the features. 
Instead, it requires a deep understanding of the segments of your users that you can serve well and the ability to educate your users about the segments that you don’t serve well. By understanding the limitations of your system and niching down, you can build a product that is memorable and provides something that has repeated use. This will allow you to get feedback quickly and iterate quickly, ultimately leading to a better product, by identifying your feats of strength.\n"
    },
  
    {
      "title"    : "The Nockout Story",
      "category" : "",
      "tags"     : "Sports, Technology, Community, Innovation",
      "url"      : "/2024/01/11/the-nockout-story/",
      "date"     : "January 11, 2024",
      "excerpt"  : "Discover how Nockout is transforming the way we find and enjoy sports activities. No more hassle in booking courts, no more mismatches in skill levels, just pure joy of playing your favorite sports.",
      "content"  : "\n\nAs the co-founders of Nockout, Yash and I, Subramanya, have been on a quest to solve a problem that plagues every sports enthusiast: finding the right place and the right people for playing sports. Our personal struggles with organizing sports activities have led us to create a platform that not only eases these challenges but also promotes a sense of community among sports lovers.\n\nThe Problem: A Universal Challenge\nOur frustrations weren’t unique. Across the globe, from tennis courts to basketball hoops, sports enthusiasts were grappling with the same issues: finding the right venue and the right people to play with. This global dilemma was evident in the shared experiences voiced through numerous tweets and conversations among the community.\n\nBay Club is pretty good. But also trying to find a reliable way to find players is hard (even using PyC).&mdash; Gautam (@gautamtata) January 1, 2024\n\nYou should move to New York, where it&#39;s even more difficult!https://t.co/c8RjpPzW9x&mdash; Awais Hussain (@Ahussain4) January 1, 2024\n\nsomeone create an app that shows all public basketball courts and whether or not people are at them or not. this would save a lot of time for me lol.&mdash; thao 🍉 (@holycowitsthao) March 18, 2021\n\nI have wanted pickup hoops forever&mdash; Rob Kornblum (@rkorny) July 5, 2021\n\nThese tweets underscore the need for a platform like Nockout.\n\nOur Solution: Introducing Nockout\nNockout is more than just an app; it’s a revolution in the sports community. Designed to be intuitive and user-friendly, it addresses key challenges:\n\n\n  Venue Discovery: The app shows you all available sports facilities nearby. 
Whether it’s a public basketball court or a private soccer field, “Nockout” has you covered.\n  Skill-Based Activity Matching: Our platform intuitively recommends players whose skills align with yours, ensuring you can join in on sporting activities that suit your preferences and proficiency in your chosen sport. After all, it’s all about fair play and good competition.\n  Intuitive Process: We’ve designed Nockout to be user-friendly. The booking process is straightforward, and finding players is hassle-free.\n\n\nThe Impact: Fostering Community and Fair Play\nNockout transcends being a mere application; it’s about building a community bound by the love of sports. It encourages fair play, connects like-minded individuals, and rekindles the joy in sports.\n\nLooking Ahead: The Future of Nockout\nOur vision for Nockout is expansive and all-encompassing:\n\n\n  Creating Spaces for Teams: Developing private areas for teams and groups to interact and bond.\n  Expanding Community Features: Introducing a platform for sharing triumphs and experiences.\n  Accessible Coaching and Activities: Offering a range of activities and coaching sessions for all skill levels and interests.\n  Streamlined Payments and Management: Enhancing the booking and payment process for a smooth user experience.\n  Personalized Athletic Journey: Providing tailored advice for sports and nutrition, alongside a comprehensive sports marketplace.\n\n\nJoin the Revolution\nBe part of a movement that’s reshaping the sports landscape. Sign up for early beta access at Nockout.co, and connect with us on Instagram, LinkedIn, and Twitter. Together, let’s make sports accessible and enjoyable for everyone!\n\n\n\n"
    },
  
    {
      "title"    : "Enhancing Document Interactions - Leveraging the synergy of Google Cloud Platform, Pinecone, and LLM in Natural Language Communication",
      "category" : "",
      "tags"     : "GCP, Pinecone, Large Language Models, OpenAI, Document AI",
      "url"      : "/2023/06/10/enhancing-document-interactions/",
      "date"     : "June 10, 2023",
      "excerpt"  : "Explore the groundbreaking fusion of Google Cloud Platform for OCR, Pinecone, and Large Language Model that is transforming information retrieval. This blog delves into how these potent tools collaborate to enable seamless interactions with documents using natural language. Discover how Google Cloud Platform offers a solid foundation, Pinecone provides rapid similarity searches for effective document retrieval, and LLM elevates language comprehension and generation capabilities.",
      "content"  : "\nHigh-level view of system design with Document AI, OpenAI, Pinecone\n\nIn today’s digital era, accessing crucial information from government documents can be overwhelming and time-consuming due to their scanned and non-digitized formats. To address this issue, there is a need for an innovative tool that simplifies navigation, scanning, and digitization of these documents, making them easily readable and searchable. This user-friendly solution will revolutionize the way people interact with government documents, leading to better decision-making, improved public services, and a more informed and engaged citizenry. Developing such a tool is essential for ensuring transparency and accessibility of vital information in the modern world.\n\nTo achieve our goal, we will follow a systematic approach consisting of the following steps:\n\n\n  We will use the powerful Document AI API provided by Google Cloud Platform to convert PDF / Image documents into text format. This step allows us to extract textual content from the documents, making it easier to process and analyze.\n  Next, we will employ a Language Model (LLM) to generate embeddings for each text extracted from the documents. These embeddings capture the semantic representation of the text, enabling us to effectively analyze and compare documents based on their content.\n  To optimize the retrieval process, we will utilize Pinecone, a robust indexing and similarity search system. 
By storing the generated embeddings in Pinecone, we can quickly search for documents that closely match a user’s query.\n  With the acquired knowledge and enhanced search capabilities, our tool will efficiently answer user queries by retrieving the most relevant documents based on their content.\n\n\nTo demonstrate this process, we used documents scraped from the Karnataka Resident Data Hub (KRDH) website.\n\n    \n\nDemo: Building a powerful question-answering system for government documents using Document AI, OpenAI, Pinecone, and Flask\n\n1. Setting Up Google Cloud Platform - Document AI\n\nDocument AI is a document understanding platform that converts unstructured data from documents into structured data, making it easier to comprehend, analyze, and utilize. To set up Document AI in your Google Cloud Platform (GCP) Console, follow these steps:\n\n\n  Enable the Document AI API.\n  Create a service account:\n    \n      Navigate to the create service account page in the Google Cloud console.\n      Choose your project.\n      Enter a name in the Service account name field. The Google Cloud console will automatically fill in the Service account ID field based on this name.\n      Click Create and continue.\n      Grant the Project &gt; Owner role to your service account to provide access to your project.\n      Click Continue.\n      Click Done to complete the service account creation process. (Do not close your browser window, as you will need it in the next step.)\n    \n  \n  Create a service account key:\n    \n      In the Google Cloud console, click the email address for the service account you created.\n      Click Keys.\n      Click Add key, then click Create new key.\n      Click Create. A JSON key file will be downloaded to your computer.\n      Click Close.\n    \n  \n  Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file containing your service account key. 
This variable applies only to your current shell session, so if you open a new session, you will need to set the variable again.\n  Install the Client Library:\n      pip install --upgrade google-cloud-documentai \n    \n  \n  Create a Processor:\n    \n      In the Document AI section of the Google Cloud console, go to the Processors page.\n      Click +Create processor.\n      Choose the processor type you want to create from the list.\n      In the Create processor window, specify a processor name.\n      Select your desired region from the list.\n      Click Create to generate your processor.\n      Take note of the Processor ID and location.\n    \n  \n\n\nAfter completing these steps, you are ready to use the Document AI API in your code.\n\nfrom google.api_core.client_options import ClientOptions\nfrom google.cloud import documentai\n\n# project_id, location, and processor_id are expected to be defined\n# elsewhere, using the values noted in the setup steps above.\n\n\ndef convert_pdf_images_to_text(file_path: str):\n    \"\"\"\n    Convert PDF or image file containing text into plain text using Google Document AI.\n    Args:\n        file_path (str): The file path of the PDF or image file.\n\n    Returns:\n        str: The extracted plain text from the input file.\n    \"\"\"\n    # Pick the MIME type from the file extension\n    extension = file_path.split(\".\")[-1].strip().lower()\n    if extension == \"pdf\":\n        mime_type = \"application/pdf\"\n    elif extension == \"png\":\n        mime_type = \"image/png\"\n    elif extension in (\"jpg\", \"jpeg\"):\n        mime_type = \"image/jpeg\"\n    else:\n        raise ValueError(f\"Unsupported file type: {extension}\")\n    opts = ClientOptions(\n        api_endpoint=f\"{location}-documentai.googleapis.com\"\n    )\n    client = documentai.DocumentProcessorServiceClient(client_options=opts)\n    # Build the full processor resource name from the Project ID, Location and Processor ID\n    name = client.processor_path(\n        project_id, location, processor_id\n    )\n    # Read the file into memory\n    with open(file_path, \"rb\") as image:\n        image_content = image.read()\n    # Load Binary Data into Document AI RawDocument Object\n    raw_document = documentai.RawDocument(content=image_content, mime_type=mime_type)\n    # Configure the process request\n    
request = documentai.ProcessRequest(name=name, raw_document=raw_document)\n    result_document = client.process_document(request=request).document\n    return result_document.text\n\n\n2. Embeddings Generation and Pinecone\n\nIn this step, we will use the OpenAI Text Embedding API to generate embeddings that capture the semantic meaning of the extracted text. These embeddings serve as numerical representations of the textual data, allowing us to understand the underlying context and nuances.\n\nAfter generating the embeddings, we will securely store them in Pinecone, a powerful indexing and similarity search system. By leveraging Pinecone’s efficient storage capabilities, we can effectively organize and index the embeddings for quick and precise retrieval.\n\nWith the embeddings stored in Pinecone, our system gains the ability to perform similarity searches. This enables us to find documents that closely match a given query or exhibit similar semantic characteristics.\n\nThe following code uses OpenAI’s Text Embedding model to create embeddings for text data. 
It divides the input text into chunks, generates embeddings for each chunk, and then upserts the embeddings along with associated metadata to a Pinecone search index for efficient searching and retrieval.\n\nfrom typing import Any, Dict, List, Union\nfrom uuid import uuid4\n\nimport openai\nimport pinecone\nfrom tqdm import tqdm\n\n# create_chunks() is a text-splitting helper that is not shown here.\n\n\ndef create_embeddings(\n    text: Union[str, List[str]], model: str = \"text-embedding-ada-002\"):\n    \"\"\"\n    Creates text embeddings using OpenAI's Text Embedding model.\n\n    Args:\n        text (str or List[str]): The text, or list of texts, to embed\n        model (str, optional): The name of the text embedding model to use.\n            Defaults to \"text-embedding-ada-002\".\n\n    Returns:\n        List[List[float]]: One embedding per input text.\n    \"\"\"\n    if isinstance(text, list):\n        response = openai.Embedding.create(model=model, input=text).data\n        return [d[\"embedding\"] for d in response]\n    else:\n        return [openai.Embedding.create(\n            model=model, input=[text]).data[0][\"embedding\"]]\n\n\ndef generate_embeddings_upload_to_pinecone(documents: List[Dict[str, Any]]):\n    \"\"\"\n    Generates text embeddings from the provided documents, then uploads and indexes \n    them to Pinecone.\n\n    Args:\n        documents (List[Dict[str, Any]]): A list of dictionaries containing \n        document information.\n            Each dictionary should include the following keys:\n                - \"Content\": The text content of the document.\n                - \"DocumentName\": The name of the document.\n                - \"DocumentType\": The type/category of the document.\n\n    Note:\n        This function assumes that Pinecone and the associated index have already\n        been initialized properly. 
Please make sure to initialize Pinecone first\n        and set up the index accordingly.\n    \"\"\"\n    # create chunks\n    chunks = []\n    for document in documents:\n        texts = create_chunks(document[\"Content\"])\n        chunks.extend(\n            [\n                {\n                    \"id\": str(uuid4()),\n                    \"text\": texts[i],\n                    \"chunk_index\": i,\n                    \"title\": document[\"DocumentName\"],\n                    \"type\": document[\"DocumentType\"],\n                }\n                for i in range(len(texts))\n            ]\n        )\n    # initialize Pinecone index, create embeddings, and upsert to Pinecone\n    index = pinecone.Index(\"pinecone-index\")\n    for i in tqdm(range(0, len(chunks), 100)):\n        # find end of batch\n        i_end = min(len(chunks), i + 100)\n        batch = chunks[i:i_end]\n        ids_batch = [x[\"id\"] for x in batch]\n        texts = [x[\"text\"] for x in batch]\n        embeds = create_embeddings(text=texts)\n        # cleanup metadata\n        meta_batch = [\n            {\n                \"title\": x[\"title\"],\n                \"type\": x[\"type\"],\n                \"text\": x[\"text\"],\n                \"chunk_index\": x[\"chunk_index\"],\n            }\n            for x in batch\n        ]\n        to_upsert = []\n        for vector_id, embed, meta in zip(ids_batch, embeds, meta_batch):\n            to_upsert.append(\n                {\n                    \"id\": vector_id,\n                    \"values\": embed,\n                    \"metadata\": meta,\n                }\n            )\n        # upsert the batch of vectors to Pinecone\n        index.upsert(vectors=to_upsert)\n\n\nFor more information on OpenAI’s Text Embedding API, refer to the OpenAI API documentation. For more details on Pinecone, check out the Pinecone documentation.\n\n3. 
User Query and Communication\nFinally, with all the necessary components in place, we can see how the tool matches user queries with relevant context and provides answers grounded in that context.\n\nWhen a user submits a query, our system leverages the stored embeddings and advanced search capabilities to identify the most relevant documents based on their semantic similarity to the query. By analyzing the contextual information captured in the embeddings, our tool can retrieve the documents that contain the desired information.\n\ndef query_and_combine(\n    query_vector: list, top_k: int = 5, threshold: float = 0.75):\n    \"\"\"Query Pinecone index and combine responses to string\n\n    Args:\n        query_vector (list): Query embedding\n        top_k (int, optional): Number of top results to return. Defaults to 5.\n        threshold (float, optional): The similarity threshold. Defaults to 0.75\n\n    Returns:\n        str: Combined responses\n    \"\"\"\n    # Same index name as in the upload step\n    index = pinecone.Index(\"pinecone-index\")\n    responses = index.query(vector=query_vector, top_k=top_k, include_metadata=True)\n    _responses = []\n    for sample in responses[\"matches\"]:\n        # Skip matches below the similarity threshold\n        if sample[\"score\"] < threshold:\n            continue\n        if \"text\" in sample[\"metadata\"]:\n            _responses.append(sample[\"metadata\"][\"text\"])\n        else:\n            _responses.append(str(sample[\"metadata\"]))\n\n    return \" \\n --- \\n \".join(_responses).replace(\"\\n---\\n\", \" \\n --- \\n \").strip()\n\n\ndef generate_answer(query: str, language: str = \"English\"):\n    \"\"\"\n    Generates an answer to a user's query using the context from Pinecone search results\n    and OpenAI's chat models.\n\n    The function takes the user's query, creates a text embedding from it, performs a\n    Pinecone query to find relevant context, and then generates an answer using OpenAI's\n    chat models with the given context.\n\n    Returns:\n        str: The text of 
the generated answer.\n\n    Note:\n        This function assumes that Pinecone and the associated index have already been \n        initialized properly, and that the OpenAI API is set up correctly. Please \n        make sure to initialize Pinecone and the OpenAI API first.\n    \"\"\"\n    query_embed = create_embeddings(text=query)[0]\n    augmented_query = query_and_combine(\n        query_embed,\n        top_k=app.config[\"top_n\"],\n        threshold=app.config[\"pinecone_threshold\"],\n    )\n    ## Creating the prompt for model\n    primer = \"\"\"You are Q&amp;A bot. A highly intelligent system that answers\n    user questions based on the context provided by the user above\n    each question. If the information can not be found in the context\n    provided by the user you truthfully say \"I don't know\". Be as concise as possible.\n    \"\"\"\n    augmented_query = augmented_query if augmented_query != \"\" else \"No context found\"\n\n    # ChatCompletion.create (legacy, pre-1.0 openai SDK) returns a single response object\n    response = openai.ChatCompletion.create(\n        messages=[\n            {\"role\": \"system\", \"content\": primer},\n            {\n                \"role\": \"user\",\n                \"content\": f\"Context: \\n {augmented_query} \\n --- \\n Question: {query} \\n Answer in {language}\",\n            },\n        ],\n        model=app.config[\"chat_model\"],\n        temperature=app.config[\"temperature\"],\n    )\n    # Extract the generated text from the first choice\n    text = response[\"choices\"][0][\"message\"][\"content\"]\n\n    return text\n\n\nThe code consists of two functions.\n\n  query_and_combine() queries a Pinecone index using a query vector, retrieves the top matching responses, and combines them into a single string. It filters the responses based on a similarity threshold and extracts the relevant text or metadata to be included in the combined result.\n  generate_answer() generates an answer to a user query. It creates an embedding for the query, performs a combined query on the Pinecone index, and uses the obtained augmented query as context for a chat-based language model. 
The model generates an answer based on the context and user query, which is then returned as the response.\nOverall, the code enables querying a Pinecone index, combining responses, and generating answers using a language model based on the given query and context.\n\n\nAs you reach the end of this blog, we hope you have gained valuable insights into the powerful combination of Google Cloud Platform, Pinecone, and Language Models for revolutionizing document interactions. To dive deeper and explore the code behind this innovative solution, visit our GitHub repository. Feel free to clone, modify, and contribute to the project, and don’t hesitate to share your thoughts and experiences. I would also like to thank Tasheer Hussain B for his contributions.  Happy coding!\n\nReferences\n\n  Google Document AI\n  Retrieval Enhanced Generative Question Answering with OpenAI\n  Introduction to Flask\n  GitHub repository\n\n"
    },
  
    {
      "title"    : "Hybrid Search for E-Commerce with Pinecone and LLMs",
      "category" : "",
      "tags"     : "Pinecone, Hybrid Search, E-Commerce, Large Language Models, Vector Database",
      "url"      : "/2023/05/02/hybrid-search-for-e-commerce-with-pinecone-and-LLM/",
      "date"     : "May 2, 2023",
      "excerpt"  : "Learn how to build a powerful hybrid search system for e-commerce applications by combining traditional information retrieval methods with machine learning models like Language Models (LLMs) and Pinecone, a managed vector database. Discover the benefits of hybrid search for e-commerce, including improved search relevance, personalization, handling long-tail queries, and simpler infrastructure management.",
      "content"  : "Searching and finding relevant products is a critical component of an e-commerce website. Providing fast and accurate search results can make the difference between high user satisfaction and user frustration. With recent advancements in natural language understanding and vector search technologies, enhanced search systems have become more accessible and efficient, leading to better user experiences and improved conversion rates.\n\nIn this blog post, we’ll explore how to implement a hybrid search system for e-commerce using Pinecone, a high-performance vector search engine, and fine-tuned domain-specific language models. By the end of this post, you’ll not only have a strong understanding of hybrid search but also a practical step-by-step guide to implementing it.\n\nWhat is Hybrid Search?\n\n\nHigh-level view of simple Pinecone Hybrid Index\n\nBefore diving into the implementation, let’s quickly understand what hybrid search means. Hybrid search is an approach that combines the strengths of both traditional search (sparse vector search) and vector search (dense vector search) to achieve better search performance across a wide range of domains.\n\nDense vector search extracts high-quality vector embeddings from text data and performs a similarity search to find relevant documents. However, it often struggles with out-of-domain data when it’s not fine-tuned on domain-specific datasets.\n\nOn the other hand, traditional search uses sparse vector representations, like term frequency-inverse document frequency (TF-IDF) or BM25, and does not require any domain-specific fine-tuning. 
While it can handle new domains, its performance is limited because it cannot capture semantic relations between words the way dense retrieval can.\n\nHybrid search tries to mitigate the weaknesses of both approaches by combining them in a single system, leveraging the performance potential of dense vector search and the zero-shot adaptability of traditional search.\n\nNow that we have a basic understanding of hybrid search, let’s dive into its implementation.\n\nBuilding a Hybrid Search System\n\nWe’ll cover the following steps for implementing a hybrid search system:\n\n\n  Leveraging Domain-Specific Language Models\n  Creating Sparse and Dense Vectors\n  Setting Up Pinecone\n  Implementing the Hybrid Search Pipeline\n  Making Queries and Tuning Parameters\n\n\n1. Leveraging Domain-Specific Language Models\n\nIn recent years, large-scale pre-trained language models, such as OpenAI’s GPT family and Cohere’s models, have become increasingly popular for a variety of tasks, including natural language understanding and generation. These models can be fine-tuned on domain-specific data to improve their performance and adapt to specific tasks, such as e-commerce product search.\n\nIn our example, we will use a fine-tuned domain-specific language model to generate dense vector embeddings for products and queries. However, you can choose other models or even create your own custom embeddings based on your specific domain.\n\nimport torch\nfrom transformers import AutoTokenizer, AutoModel\n\n# Load a pre-trained domain-specific language model\nmodel_name = \"your-domain-specific-model\"\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModel.from_pretrained(model_name)\n\n# Generate dense vector embeddings for a product description\ntext = \"Nike Air Max sports shoes for men\"\ninputs = tokenizer(text, return_tensors=\"pt\")\nwith torch.no_grad():\n    outputs = model(**inputs)\n    # Mean-pool the token embeddings into a single vector\n    dense_embedding = outputs.last_hidden_state.mean(dim=1).numpy()\n\n\n2. 
Creating Sparse and Dense Vectors\n\nHybrid search requires both sparse and dense vector representations for our e-commerce data. We’ll now describe how to generate these vectors.\n\nSparse Vectors\n\nSparse vector representations, like TF-IDF or BM25, can be created using standard text processing techniques, such as tokenization, stopword removal, and stemming. The example below generates BM25 sparse vectors with the BM25Encoder from the pinecone-text library.\n\nfrom pinecone_text.sparse import BM25Encoder\n\n# Create the BM25 encoder and fit it on the product corpus\nbm25 = BM25Encoder()\nbm25.fit(new_df.full_data)\n\n# This function generates a sparse vector representation of a product description\ndef generate_sparse_vectors(text):\n    '''Generates a sparse BM25 vector for a product description\n\n    Args:\n        text (str): A product description\n\n    Returns:\n        sparse_vector (dict): A dictionary of indices and values\n    '''\n    # Use the document encoder when indexing; queries are encoded with encode_queries\n    sparse_vector = bm25.encode_documents(text)\n    return sparse_vector\n\n# Create the sparse vectors\nsparse_vectors = []\nfor product_description in product_descriptions:\n    sparse_vectors.append(generate_sparse_vectors(text=product_description))\n\n\nDense Vectors\n\nDense vector representations can be generated using pre-trained or custom domain-specific language models. 
In our previous example, we used a domain-specific language model to generate dense vector embeddings for a product description.\n\ndef generate_dense_vector(text):\n    '''Generates a dense vector embedding for a product description\n\n    Args:\n        text (str): A product description\n\n    Returns:\n        dense_vector (np.array): A numpy array of shape (1, hidden_size)\n    '''\n    # Tokenize the text and convert to PyTorch tensors\n    inputs = tokenizer(text, return_tensors=\"pt\")\n    # Generate the embeddings with the pre-trained model\n    with torch.no_grad():\n        outputs = model(**inputs)\n        # Mean-pool the token embeddings into a single vector\n        dense_vector = outputs.last_hidden_state.mean(dim=1).numpy()\n    return dense_vector\n\n# Generate dense vector embeddings for a list of product descriptions\ndense_vectors = []\nfor product_description in product_descriptions:\n    dense_vectors.append(generate_dense_vector(text=product_description))\n\n\n3. Setting Up Pinecone\n\nPinecone is a high-performance vector search engine that supports hybrid search. It enables the creation of a single index for both sparse and dense vectors and handles queries that combine the two.\n\nTo use Pinecone, you’ll need to sign up for an account, install the Pinecone client, and set up your API key and environment.\n\n# Initialize the Pinecone client\nimport pinecone\n\npinecone.init(\n    api_key=\"YOUR_API_KEY\",  # app.pinecone.io\n    environment=\"YOUR_ENV\"  # find next to api key in console\n)\n\n# Create a Pinecone hybrid search index\nindex_name = \"ecommerce-hybrid-search\"\npinecone.create_index(\n    index_name,\n    dimension = MODEL_DIMENSION,  # dimensionality of dense model\n    metric = \"dotproduct\"  # sparse-dense queries require the dotproduct metric\n)\n# connect to the index\nindex = pinecone.Index(index_name=index_name)\n# view index stats\nindex.describe_index_stats()\n\n\n4. 
Implementing the Hybrid Search Pipeline\n\nWith our sparse and dense vectors generated and Pinecone set up, we can now build a hybrid search pipeline. This pipeline includes the following steps:\n\n\n  Adding product data to the Pinecone index\n  Retrieving results using both sparse and dense vectors\n\n\nimport numpy as np\n\ndef add_product_data_to_index(product_ids, sparse_vectors, dense_vectors, metadata=None):\n    \"\"\"Upserts product data to the Pinecone index.\n\n    Args:\n        product_ids (`list` of `str`): Product IDs.\n        sparse_vectors (`list` of `dict`): Sparse vectors with `indices` and `values`.\n        dense_vectors (`list` of array-like of `float`): Dense vectors.\n        metadata (`list` of `dict`): Optional metadata, one entry per product.\n\n    Returns:\n        None\n    \"\"\"\n    batch_size = 32\n\n    # Loop through the product IDs in batches.\n    for i in range(0, len(product_ids), batch_size):\n        i_end = min(i + batch_size, len(product_ids))\n        ids = product_ids[i:i_end]\n        sparse_batch = sparse_vectors[i:i_end]\n        dense_batch = dense_vectors[i:i_end]\n        # Fall back to empty metadata so zip() does not drop every record\n        meta_batch = metadata[i:i_end] if metadata else [{} for _ in ids]\n\n        vectors = []\n        for _id, sparse, dense, meta in zip(ids, sparse_batch, dense_batch, meta_batch):\n            vectors.append({\n                'id': _id,\n                'sparse_values': sparse,\n                # Pinecone expects a flat list of floats\n                'values': np.asarray(dense).flatten().tolist(),\n                'metadata': meta\n            })\n\n        # Upsert the vectors into the Pinecone index.\n        index.upsert(vectors=vectors)\n\nadd_product_data_to_index(product_ids, sparse_vectors, dense_vectors)\n\n\nNow that our data is indexed, we can perform hybrid search queries.\n\n5. 
Making Queries and Tuning Parameters\n\n\nHigh-level view of simple Pinecone Hybrid Query\n\nTo make hybrid search queries, we’ll create a function that takes a query, the number of top results, and an alpha parameter to control the weighting between dense and sparse vector search scores.\n\ndef hybrid_scale(dense, sparse, alpha: float):\n    \"\"\"Hybrid vector scaling using a convex combination\n\n    alpha * dense + (1 - alpha) * sparse\n\n    Args:\n        dense: Array of floats representing the dense query vector\n        sparse: a dict of `indices` and `values`\n        alpha: float between 0 and 1 where 0 == sparse only\n               and 1 == dense only\n    \"\"\"\n    if alpha &lt; 0 or alpha &gt; 1:\n        raise ValueError(\"Alpha must be between 0 and 1\")\n    # scale sparse and dense vectors to create hybrid search vecs\n    hsparse = {\n        'indices': sparse['indices'],\n        'values':  [v * (1 - alpha) for v in sparse['values']]\n    }\n    hdense = [v * alpha for v in dense]\n    return hdense, hsparse\n\ndef search_products(query, top_k=10, alpha=0.5):\n    # Generate sparse query vector with the BM25 query encoder\n    sparse_query_vector = bm25.encode_queries(query)\n\n    # Generate dense query vector as a flat list of floats\n    dense_query_vector = generate_dense_vector(query).flatten().tolist()\n\n    # Scale the dense and sparse query vectors by alpha\n    dense_query_vector, sparse_query_vector = hybrid_scale(dense_query_vector, sparse_query_vector, alpha)\n\n    # Search products using Pinecone\n    results = index.query(\n        vector=dense_query_vector,\n        sparse_vector=sparse_query_vector,\n        top_k=top_k,\n        include_metadata=True\n    )\n\n    return results\n\n\nWe can then use this function to search for relevant products in our e-commerce dataset.\n\nquery = \"running shoes for women\"\nresults = search_products(query, top_k=5)\n\n# Assumes product_name was stored in the metadata at upsert time\nfor result in results['matches']:\n    print(result['id'], result['metadata']['product_name'], result['score'])\n\n\nExperimenting with different values for the alpha parameter will help you find the optimal balance between 
sparse and dense vector search for your specific domain.\n\nConclusion\n\nIn this blog post, we demonstrated how to build a hybrid search system for e-commerce using Pinecone and domain-specific language models. Hybrid search enables us to combine the strengths of both traditional search and vector search, improving search performance and adaptability across diverse domains.\n\nBy following the steps and code snippets provided in this post, you can implement your own hybrid search system tailored to your e-commerce website’s specific requirements. Start exploring Pinecone and improve your e-commerce search experience today!\n\nReferences\n\n\n  Ecommerce Search using Hybrid Search Techniques in Pinecone (Google Colab Notebook): A practical guide showcasing the implementation of e-commerce search using Pinecone’s hybrid search techniques.\n  Pinecone Ecommerce Search Documentation: Official Pinecone documentation for building e-commerce search systems.\n  BM25 Vector Generation using Pinecone (Google Colab Notebook): A guide for generating BM25 sparse vectors using Pinecone.\n  Pinecone Text Repository on GitHub: A collection of text processing and vector generation resources using Pinecone.\n  Introduction to Hybrid Search on Pinecone’s Website: An overview of hybrid search, its benefits, and use cases in the context of Pinecone’s capabilities.\n\n"
    },
  
    {
      "title"    : "Demystifying the Shell Scripting: Working with Files and Directories",
      "category" : "",
      "tags"     : "Shell Scripting, Bash, Shell, File Management, Directory Management",
      "url"      : "/2023/01/04/demystifying-the-shell-scripting-working-with-files-and-directories/",
      "date"     : "January 4, 2023",
      "excerpt"  : "Master the art of working with files and directories in shell scripting to streamline your tasks and improve efficiency. Learn how to create, copy, move, and delete files and directories, as well as read and write to files using practical examples. Discover the power of searching for files and directories with the `find` command. Enhance your shell scripting skills with valuable resources and tutorials, and unlock the full potential of file and directory management in the shell.",
      "content"  : "In my previous blog posts, we covered the basics of using the shell, introduced shell scripting for beginners, and explored advanced techniques and best practices. In this blog post, we will focus on working with files and directories in shell scripts. We will discuss common tasks such as creating, copying, moving, and deleting files and directories, as well as reading and writing to files. We will also provide some resources for further learning.\n\nCreating Files and Directories\n\nTo create a new file in a shell script, you can use the touch command:\n\ntouch new_file.txt\n\n\nTo create a new directory, you can use the mkdir command:\n\nmkdir new_directory\n\n\nCopying and Moving Files and Directories\n\nTo copy a file, you can use the cp command:\n\ncp source_file.txt destination_file.txt\n\n\nTo copy a directory, you can use the -r (recursive) option:\n\ncp -r source_directory destination_directory\n\n\nTo move a file or directory, you can use the mv command:\n\nmv source_file.txt destination_file.txt\n\n\nDeleting Files and Directories\n\nTo delete a file, you can use the rm command:\n\nrm file_to_delete.txt\n\n\nTo delete a directory, you can use the -r (recursive) option:\n\nrm -r directory_to_delete\n\n\nReading and Writing to Files\n\nTo read the contents of a file, you can use the cat command:\n\ncat file_to_read.txt\n\n\nTo write to a file, you can use the &gt; operator to overwrite the file or the &gt;&gt; operator to append to the file:\n\necho \"This is a new line\" &gt; file_to_write.txt\necho \"This is another new line\" &gt;&gt; file_to_write.txt\n\n\nTo read a file line by line, you can use a while loop with the read command:\n\n#!/bin/bash\n\nwhile IFS= read -r line; do\n  echo \"Line: $line\"\ndone &lt; file_to_read.txt\n\n\nSearching for Files and Directories\n\nTo search for files and directories, you can use the find command:\n\nfind /path/to/search -name \"file_pattern\"\n\n\nFor example, to find all .txt files in the 
/home/user directory, you can use:\n\nfind /home/user -name \"*.txt\"\n\n\nResources\n\nTo further improve your skills in working with files and directories in shell scripts, here are some resources:\n\n\n  File Management Commands in Linux: A comprehensive guide to file management commands in Linux.\n  Linux Find Command Examples: A collection of examples for using the find command in Linux.\n\n\nIn conclusion, working with files and directories is an essential aspect of shell scripting. By mastering common tasks such as creating, copying, moving, and deleting files and directories, as well as reading and writing to files, you will be well-equipped to handle a wide range of shell scripting tasks.\n"
    },
  
    {
      "title"    : "Demystifying the Shell Scripting: Advanced Techniques and Best Practices",
      "category" : "",
      "tags"     : "Shell Scripting, Bash, Shell, Error Handling, Command Substitution, Process Management, Best Practices",
      "url"      : "/2022/12/28/demystifying-the-shell-scripting-advanced-techniques-and-best-practices/",
      "date"     : "December 28, 2022",
      "excerpt"  : "Building upon the fundamentals of shell scripting, this guide delves into advanced techniques and best practices that will elevate your scripting skills. We will explore error handling, command substitution, process management, and share valuable tips for writing efficient, robust, and maintainable scripts. By mastering these advanced concepts, you will be well-equipped to tackle complex scripting challenges and harness the full power of shell scripting.",
      "content"  : "In my previous blog posts, we covered the basics of using the shell and introduced shell scripting for beginners. Now that you have a solid foundation in shell scripting, it’s time to explore some advanced techniques and best practices that will help you write more efficient, robust, and maintainable scripts. In this blog post, we will discuss error handling, command substitution, process management, and best practices for writing shell scripts. We will also provide some resources for further learning.\n\nError Handling\n\nError handling is an essential aspect of writing robust shell scripts. By default, shell scripts continue to execute subsequent commands even if an error occurs. To change this behavior and make your script exit immediately if a command fails, you can use the set -e option:\n\n#!/bin/bash\nset -e\n\n# Your script here\n\n\nYou can also use the trap command to define custom error handling behavior. For example, you can register a cleanup function on the EXIT signal so that it runs whenever your script exits, whether it finishes normally or fails partway through:\n\n#!/bin/bash\n\nfunction cleanup() {\n  echo \"Cleaning up before exiting...\"\n  # Your cleanup code here\n}\n\ntrap cleanup EXIT\n\n# Your script here\n\n\nCommand Substitution\n\nCommand substitution allows you to capture the output of a command and store it in a variable. This can be useful for processing the output of a command within your script. There are two ways to perform command substitution:\n\n\n  Using backticks (` `):\n\n\noutput=`ls`\n\n\n\n  Using $():\n\n\noutput=$(ls)\n\n\nThe $() syntax is preferred because it is more readable and can be easily nested.\n\nProcess Management\n\nShell scripts often need to manage background processes, such as starting, stopping, or monitoring them. 
Here are some useful commands for process management:\n\n\n  &amp;: Run a command in the background by appending an ampersand (&amp;) to the command.\n\n\nlong_running_command &amp;\n\n\n\n  wait: Wait for a background process to complete before continuing with the script.\n\n\nlong_running_command &amp;\nwait\n\n\n\n  kill: Terminate a process by sending a signal to it.\n\n\nkill -9 process_id\n\n\n\n  ps: List running processes and their process IDs.\n\n\nps aux\n\n\nBest Practices\n\nHere are some best practices for writing shell scripts:\n\n\n  Use meaningful variable and function names.\n  Add comments to explain complex or non-obvious code.\n  Use indentation and whitespace to improve readability.\n  Keep your scripts modular by breaking them into smaller functions.\n  Use the local keyword to limit the scope of variables within functions.\n  Always quote your variables to prevent issues with spaces and special characters.\n  Use the [[ ]] syntax for conditional expressions, as it is more robust than [ ].\n\n\nResources\n\nTo further improve your shell scripting skills, here are some resources:\n\n\n  Google Shell Style Guide: A comprehensive style guide for writing shell scripts, created by Google.\n  ShellCheck: A static analysis tool for shell scripts that can help you identify and fix potential issues in your code.\n  Awesome Shell: A curated list of awesome command-line frameworks, toolkits, guides, and other resources for shell scripting.\n\n\nIn conclusion, mastering advanced techniques and best practices in shell scripting will help you write more efficient, robust, and maintainable scripts. By understanding error handling, command substitution, process management, and following best practices, you will be well on your way to becoming a shell scripting expert.\n"
    },
  
    {
      "title"    : "Demystifying the Shell Scripting: A Beginner&#39;s Guide",
      "category" : "",
      "tags"     : "Shell Scripting, Bash, Shell",
      "url"      : "/2022/12/28/demystifying-the-shell-scripting-a-beginners-guide/",
      "date"     : "December 28, 2022",
      "excerpt"  : "Shell scripting is a powerful tool that enables users to automate tasks, perform complex operations, and create custom commands. In this beginner&#39;s guide, we will explore the basics of shell scripting, including creating and executing scripts, working with variables, control structures, loops, and functions. By understanding these fundamental concepts, you will be well on your way to mastering shell scripting and unlocking its full potential.",
      "content"  : "In my previous blog post, we introduced the basics of using the shell, navigating within it, connecting programs, and some miscellaneous tips and tricks. Now that you have a good understanding of the shell, it’s time to take your skills to the next level by learning shell scripting. Shell scripting allows you to automate tasks, perform complex operations, and create custom commands. In this blog post, we will explore the basics of shell scripting, including variables, control structures, loops, and functions. We will also provide some resources for further learning.\n\nWhat is Shell Scripting?\n\nShell scripting is the process of writing a series of commands in a text file (called a script) that can be executed by the shell. These scripts can be used to automate repetitive tasks, perform complex operations, and create custom commands. Shell scripts are typically written in the same language as the shell itself (e.g., Bash, Zsh, or Fish).\n\nCreating a Shell Script\n\nTo create a shell script, simply create a new text file with the extension .sh (e.g., myscript.sh). The first line of the script should be a “shebang” (#!) followed by the path to the shell interpreter (e.g., #!/bin/bash for Bash scripts). This line tells the operating system which interpreter to use when executing the script.\n\nHere’s an example of a simple shell script that prints “Hello, World!” to the console:\n\n#!/bin/bash\n\necho \"Hello, World!\"\n\n\nTo execute the script, you need to make it executable by changing its permissions using the chmod command:\n\nchmod +x myscript.sh\n\n\nNow you can run the script by typing ./myscript.sh in the terminal.\n\nVariables\n\nVariables in shell scripts are used to store values that can be referenced and manipulated throughout the script. 
To create a variable, use the = operator without any spaces:\n\nmy_variable=\"Hello, World!\"\n\n\nTo reference the value of a variable, use the $ symbol:\n\necho $my_variable\n\n\nControl Structures\n\nControl structures, such as if statements and case statements, allow you to add conditional logic to your shell scripts. Here’s an example of an if statement:\n\n#!/bin/bash\n\nnumber=5\n\nif [ $number -gt 3 ]; then\n  echo \"The number is greater than 3.\"\nelse\n  echo \"The number is not greater than 3.\"\nfi\n\n\nIn this example, the script checks if the value of the number variable is greater than 3 and prints a message accordingly.\n\nLoops\n\nLoops allow you to execute a block of code multiple times. There are two main types of loops in shell scripting: for loops and while loops. Here’s an example of a for loop:\n\n#!/bin/bash\n\nfor i in {1..5}; do\n  echo \"Iteration $i\"\ndone\n\n\nThis script will print the message “Iteration X” five times, with X being the current iteration number.\n\nFunctions\n\nFunctions are reusable blocks of code that can be called with a specific set of arguments. To create a function, use the function keyword followed by the function name and a pair of parentheses:\n\n#!/bin/bash\n\nfunction greet() {\n  echo \"Hello, $1!\"\n}\n\ngreet \"World\"\n\n\nIn this example, the greet function takes one argument ($1) and prints a greeting message using that argument.\n\nResources\n\nTo further improve your shell scripting skills, here are some resources:\n\n\n  Shell Scripting Tutorial: A comprehensive tutorial covering all aspects of shell scripting.\n  Bash Guide for Beginners: A beginner-friendly guide to Bash scripting.\n  Advanced Bash-Scripting Guide: A more advanced guide for those looking to deepen their understanding of Bash scripting.\n\n\nIn conclusion, shell scripting is a powerful tool that allows you to automate tasks, perform complex operations, and create custom commands. 
By understanding the basics of shell scripting, including variables, control structures, loops, and functions, you will be well on your way to becoming a shell scripting expert.\n"
    },
  
    {
      "title"    : "Demystifying the Shell: A Beginner&#39;s Guide",
      "category" : "",
      "tags"     : "Bash, Shell",
      "url"      : "/2022/12/28/demystifying-the-shell-a-beginners-guide/",
      "date"     : "December 28, 2022",
      "excerpt"  : "Discover the power of the shell, a command-line interface that allows you to interact with your computer&#39;s operating system more directly and efficiently. Learn the basics of using the shell, navigating within it, and connecting programs using simple examples. Enhance your skills with miscellaneous tips and resources, including tab completion, command history, keyboard shortcuts, and helpful online tools. Embrace the command line and unlock the full potential of the shell!",
      "content"  : "The shell is an essential tool for any developer, system administrator, or even a casual computer user. It allows you to interact with your computer’s operating system using text-based commands, giving you more control and flexibility than graphical user interfaces (GUIs). In this blog post, we will explore the basics of using the shell, navigating within it, connecting programs, and some miscellaneous tips and tricks. We will also provide some resources for further learning.\n\nWhat is the Shell?\n\nThe shell is a command-line interface (CLI) that allows you to interact with your computer’s operating system by typing commands. It is a program that takes your commands, interprets them, and then sends them to the operating system to be executed. There are various types of shells available, such as Bash (Bourne Again SHell), Zsh (Z Shell), and Fish (Friendly Interactive SHell), each with its own unique features and capabilities.\n\nUsing the Shell\n\nTo start using the shell, you need to open a terminal emulator. On Linux and macOS, you can usually find the terminal application in your Applications or Utilities folder. On Windows, you can use the Command Prompt or PowerShell, or install Git Bash or the Windows Subsystem for Linux (WSL) to get a Unix-like shell.\n\nOnce you have opened the terminal, you can start typing commands. For example, to list the files and directories in your current directory, you can type the following command:\n\nls\n\n\nThis command will display the contents of your current directory. You can also use flags (options) to modify the behavior of a command. For example, to display the contents of a directory in a more detailed format, you can use the -l flag:\n\nls -l\n\n\nNavigating in the Shell\n\nNavigating within the shell is quite simple. You can use the cd (change directory) command to move between directories. 
For example, to move to the /home/user/Documents directory, you can type:\n\ncd /home/user/Documents\n\n\nTo move up one directory level, you can use the .. notation:\n\ncd ..\n\n\nYou can also use the pwd (print working directory) command to display the current directory you are in:\n\npwd\n\n\nConnecting Programs\n\nIn the shell, you can connect multiple programs together using pipes (|). This allows you to pass the output of one program as input to another program. For example, you can use the grep command to search for a specific word in a file, and then use the wc (word count) command to count the number of lines containing that word:\n\ngrep 'search_word' file.txt | wc -l\n\n\nThis command will first search for the word ‘search_word’ in the file ‘file.txt’ and then count the number of lines containing that word.\n\nMiscellaneous\n\nHere are some miscellaneous tips and tricks for using the shell:\n\n\n  Use the history command to view your command history.\n  Use the clear command to clear the terminal screen.\n  Use the man command followed by a command name to view the manual page for that command (e.g., man ls).\n  Use the TAB key to auto-complete file and directory names.\n  Use the CTRL + C keyboard shortcut to cancel a running command.\n\n\nResources\n\nTo further improve your shell skills, here are some resources:\n\n\n  LinuxCommand.org: This website provides a wealth of information on using the shell, including tutorials, examples, and reference material.\n  ExplainShell: This is an online tool that allows you to enter a shell command and receive a detailed explanation of what each part of the command does.\n  Bash Cheat Sheet: This is a handy reference guide that provides a quick overview of common Bash commands and syntax.\n  ShellCheck: This is an online tool that can help you find and fix issues in your shell scripts. 
It provides suggestions and explanations for common mistakes and best practices.\n\n\nIn conclusion, mastering the shell is an essential skill for any computer user. It allows you to interact with your computer’s operating system more efficiently and effectively than using graphical user interfaces. By understanding the basics of using the shell, navigating within it, connecting programs, and learning some miscellaneous tips and tricks, you will be well on your way to becoming a shell expert.\n"
    },
  
    {
      "title"    : "Version Control (Git)",
      "category" : "",
      "tags"     : "Git, Version Control",
      "url"      : "/2022/12/21/version-control/",
      "date"     : "December 21, 2022",
      "excerpt"  : "How to use version control _properly_, and take advantage of it to save you from disaster, collaborate with others, and quickly find and isolate problematic changes. No more `rm -rf; git clone`. No more merge conflicts (well, fewer of them at least). No more huge blocks of commented-out code. No more fretting over how to find what broke your code. No more &quot;oh no, did we delete the working code?!&quot;.",
      "content"  : "Version control systems (VCSs) are tools used to track changes to source code (or other collections of files and folders). As the name implies, these tools help maintain a history of changes; furthermore, they facilitate collaboration. VCSs track changes to a folder and its contents in a series of snapshots, where\neach snapshot encapsulates the entire state of files/folders within a top-level directory. VCSs also maintain metadata like who created each snapshot, messages associated with each snapshot, and so on.\n\nWhy is version control useful? Even when you’re working by yourself, it can let you look at old snapshots of a project, keep a log of why certain changes were\nmade, work on parallel branches of development, and much more. When working with others, it’s an invaluable tool for seeing what other people have changed, as well as resolving conflicts in concurrent development.\n\nModern VCSs also let you easily (and often automatically) answer questions like:\n\n\n  Who wrote this module?\n  When was this particular line of this particular file edited? By whom? Why was it edited?\n  Over the last 1000 revisions, when/why did a particular unit test stop working?\n\n\nWhile other VCSs exist, Git is the de facto standard for version control.\nThis XKCD comic captures Git’s reputation:\n\n\n\nBecause Git’s interface is a leaky abstraction, learning Git top-down (starting with its interface / command-line interface) can lead to a lot of confusion. It’s possible to memorize a handful of commands and think of them as magic incantations, and follow the approach in the comic above whenever anything goes wrong.\n\nWhile Git admittedly has an ugly interface, its underlying design and ideas are beautiful. While an ugly interface has to be memorized, a beautiful design can be understood. For this reason, we give a bottom-up explanation of Git, starting with its data model and later covering the command-line interface. 
Once the data model is understood, the commands can be better understood in terms of how they manipulate the underlying data model.\n\nGit’s data model\n\nThere are many ad-hoc approaches you could take to version control. Git has a well-thought-out model that enables all the nice features of version control, like maintaining history, supporting branches, and enabling collaboration.\n\nSnapshots\n\nGit models the history of a collection of files and folders within some top-level directory as a series of snapshots. In Git terminology, a file is called a “blob”, and it’s just a bunch of bytes. A directory is called a “tree”, and it maps names to blobs or trees (so directories can contain other directories). A snapshot is the top-level tree that is being tracked. For example, we might have a tree as follows:\n\n&lt;root&gt; (tree)\n|\n+- foo (tree)\n|  |\n|  + bar.txt (blob, contents = \"hello world\")\n|\n+- baz.txt (blob, contents = \"git is wonderful\")\n\n\nThe top-level tree contains two elements, a tree “foo” (that itself contains one element, a blob “bar.txt”), and a blob “baz.txt”.\n\nModeling history: relating snapshots\n\nHow should a version control system relate snapshots? One simple model would be to have a linear history. A history would be a list of snapshots in time-order. For many reasons, Git doesn’t use a simple model like this.\n\nIn Git, a history is a directed acyclic graph (DAG) of snapshots. That may sound like a fancy math word, but don’t be intimidated. All this means is that each snapshot in Git refers to a set of “parents”, the snapshots that preceded it. It’s a set of parents rather than a single parent (as would be the case in a linear history) because a snapshot might descend from multiple parents, for example, due to combining (merging) two parallel branches of development.\n\nGit calls these snapshots “commit”s. 
Visualizing a commit history might look something like this:\n\no &lt;-- o &lt;-- o &lt;-- o\n            ^\n             \\\n              --- o &lt;-- o\n\n\nIn the ASCII art above, each o corresponds to an individual commit (snapshot). The arrows point to the parent of each commit (it’s a “comes before” relation, not “comes after”). After the third commit, the history branches into two separate branches. This might correspond to, for example, two separate features being developed in parallel, independently from each other. In the future, these branches may be merged to create a new snapshot that incorporates both of the features, producing a new history that looks like this, with the newly created merge commit as the rightmost node:\n\n\n\no &lt;-- o &lt;-- o &lt;-- o &lt;---- o\n            ^            /\n             \\          v\n              --- o &lt;-- o\n\n\n\nCommits in Git are immutable. This doesn’t mean that mistakes can’t be corrected, however; it’s just that “edits” to the commit history are actually creating entirely new commits, and references (see below) are updated to point to the new ones.\n\nData model, as pseudocode\n\nIt may be instructive to see Git’s data model written down in pseudocode:\n\n// a file is a bunch of bytes\ntype blob = array&lt;byte&gt;\n\n// a directory contains named files and directories\ntype tree = map&lt;string, tree | blob&gt;\n\n// a commit has parents, metadata, and the top-level tree\ntype commit = struct {\n    parents: array&lt;commit&gt;\n    author: string\n    message: string\n    snapshot: tree\n}\n\n\nIt’s a clean, simple model of history.\n\nObjects and content-addressing\n\nAn “object” is a blob, tree, or commit:\n\ntype object = blob | tree | commit\n\n\nIn Git’s data store, all objects are content-addressed by their SHA-1\nhash.\n\nobjects = map&lt;string, object&gt;\n\ndef store(object):\n    id = sha1(object)\n    objects[id] = object\n\ndef load(id):\n    return objects[id]\n\n\nBlobs, trees, and commits are 
unified in this way: they are all objects. When they reference other objects, they don’t actually contain them in their on-disk representation, but have a reference to them by their hash.\n\nFor example, the tree for the example directory structure above\n(visualized using git cat-file -p 698281bc680d1995c5f4caaf3359721a5a58d48d),\nlooks like this:\n\n100644 blob 4448adbf7ecd394f42ae135bbeed9676e894af85    baz.txt\n040000 tree c68d233a33c5c06e0340e4c224f0afca87c8ce87    foo\n\n\nThe tree itself contains pointers to its contents, baz.txt (a blob) and foo\n(a tree). If we look at the contents addressed by the hash corresponding to\nbaz.txt with git cat-file -p 4448adbf7ecd394f42ae135bbeed9676e894af85, we get\nthe following:\n\ngit is wonderful\n\n\nReferences\n\nNow, all snapshots can be identified by their SHA-1 hashes. That’s inconvenient, because humans aren’t good at remembering strings of 40 hexadecimal characters.\n\nGit’s solution to this problem is human-readable names for SHA-1 hashes, called “references”. References are pointers to commits. Unlike objects, which are\nimmutable, references are mutable (can be updated to point to a new commit). For example, the master reference usually points to the latest commit in the\nmain branch of development.\n\nreferences = map&lt;string, string&gt;\n\ndef update_reference(name, id):\n    references[name] = id\n\ndef read_reference(name):\n    return references[name]\n\ndef load_reference(name_or_id):\n    if name_or_id in references:\n        return load(references[name_or_id])\n    else:\n        return load(name_or_id)\n\n\nWith this, Git can use human-readable names like “master” to refer to a particular snapshot in the history, instead of a long hexadecimal string.\n\nOne detail is that we often want a notion of “where we currently are” in the history, so that when we take a new snapshot, we know what it is relative to (how we set the parents field of the commit). 
In Git, that “where we currently are” is a special reference called “HEAD”.\n\nRepositories\n\nFinally, we can define what (roughly) is a Git repository: it is the data objects and references.\n\nOn disk, all Git stores are objects and references: that’s all there is to Git’s data model. All git commands map to some manipulation of the commit DAG by\nadding objects and adding/updating references.\n\nWhenever you’re typing in any command, think about what manipulation the command is making to the underlying graph data structure. Conversely, if you’re trying to make a particular kind of change to the commit DAG, e.g. “discard uncommitted changes and make the ‘master’ ref point to commit 5d83f9e”, there’s probably a command to do it (e.g. in this case, git checkout master; git reset --hard 5d83f9e).\n\nStaging area\n\nThis is another concept that’s orthogonal to the data model, but it’s a part of the interface to create commits.\n\nOne way you might imagine implementing snapshotting as described above is to have a “create snapshot” command that creates a new snapshot based on the current state of the working directory. Some version control tools work like this, but not Git. We want clean snapshots, and it might not always be ideal to make a snapshot from the current state. For example, imagine a scenario where you’ve implemented two separate features, and you want to create two separate commits, where the first introduces the first feature, and the next introduces the second feature. Or imagine a scenario where you have debugging print statements added all over your code, along with a bugfix; you want to commit the bugfix while discarding all the print statements.\n\nGit accommodates such scenarios by allowing you to specify which modifications should be included in the next snapshot through a mechanism called the “staging area”.\n\nGit command-line interface\n\nTo avoid duplicating information, we’re not going to explain the commands below in detail. 
See the highly recommended Pro Git for more information.\n\nBasics\n\nThe git init command initializes a new Git repository, with repository metadata being stored in the .git directory:\n\n$ mkdir myproject\n$ cd myproject\n$ git init\nInitialized empty Git repository in .git\n$ git status\nOn branch master\nNo commits yet\nnothing to commit (create/copy files and use \"git add\" to track)\n\n\nHow do we interpret this output? “No commits yet” basically means our version\nhistory is empty. Let’s fix that.\n\n$ echo \"hello, git\" &gt; hello.txt\n$ git add hello.txt\n$ git status\nOn branch master\nNo commits yet\nChanges to be committed:\n  (use \"git rm --cached &lt;file&gt;...\" to unstage)\n        new file:   hello.txt\n$ git commit -m 'Initial commit'\n[master (root-commit) 4515d17] Initial commit\n 1 file changed, 1 insertion(+)\n create mode 100644 hello.txt\n\n\nWith this, we’ve used git add to add a file to the staging area, and then git commit to commit that change with the simple commit message “Initial commit”. If we didn’t specify a -m option, Git would open our text editor to allow us to type a commit message.\n\nNow that we have a non-empty version history, we can visualize the history. Visualizing the history as a DAG can be especially helpful in understanding the current status of the repo and connecting it with your understanding of the Git data model.\n\nThe git log command visualizes history. By default, it shows a flattened version, which hides the graph structure. If you use a command like git log --all --graph --decorate, it will show you the full version history of the repository, visualized in graph form.\n\n$ git log --all --graph --decorate\n* commit 4515d17a167bdef0a91ee7d50d75b12c9c2652aa (HEAD -&gt; master)\n  Author: Subramanya N &lt;subramanyanagabhushan@gmail.com&gt;\n  Date: Tue Dec 21 22:18:36 2020 -0500\n      Initial commit\n\n\nThis doesn’t look all that graph-like, because it only contains a single node. 
Let’s make some more changes, author a new commit, and visualize the history once more.\n\n$ echo \"another line\" &gt;&gt; hello.txt\n$ git status\nOn branch master\nChanges not staged for commit:\n  (use \"git add &lt;file&gt;...\" to update what will be committed)\n  (use \"git checkout -- &lt;file&gt;...\" to discard changes in working directory)\n        modified:   hello.txt\nno changes added to commit (use \"git add\" and/or \"git commit -a\")\n$ git add hello.txt\n$ git status\nOn branch master\nChanges to be committed:\n  (use \"git reset HEAD &lt;file&gt;...\" to unstage)\n        modified:   hello.txt\n$ git commit -m 'Add a line'\n[master 35f60a8] Add a line\n 1 file changed, 1 insertion(+)\n\n\nNow, if we visualize the history again, we’ll see some of the graph structure:\n\n* commit 35f60a825be0106036dd2fbc7657598eb7b04c67 (HEAD -&gt; master)\n| Author: Subramanya N &lt;subramanyanagabhushan@gmail.com&gt;\n| Date:   Tue Dec 21 22:26:20 2020 -0500\n|     Add a line\n* commit 4515d17a167bdef0a91ee7d50d75b12c9c2652aa\n  Author: Subramanya N &lt;subramanyanagabhushan@gmail.com&gt;\n  Date: Tue Dec 21 22:18:36 2020 -0500\n      Initial commit\n\n\nAlso, note that it shows the current HEAD, along with the current branch\n(master).\n\nWe can look at old versions using the git checkout command.\n\n$ git checkout 4515d17  # previous commit hash; yours will be different\nNote: checking out '4515d17'.\nYou are in 'detached HEAD' state. You can look around, make experimental\nchanges and commit them, and you can discard any commits you make in this\nstate without impacting any branches by performing another checkout.\nIf you want to create a new branch to retain commits you create, you may\ndo so (now or later) by using -b with the checkout command again. 
Example:\n  git checkout -b &lt;new-branch-name&gt;\nHEAD is now at 4515d17 Initial commit\n$ cat hello.txt\nhello, git\n$ git checkout master\nPrevious HEAD position was 4515d17 Initial commit\nSwitched to branch 'master'\n$ cat hello.txt\nhello, git\nanother line\n\n\nGit can show you how files have evolved (differences, or diffs) using the git\ndiff command:\n\n$ git diff 4515d17 hello.txt\ndiff --git c/hello.txt w/hello.txt\nindex 94bab17..f0013b2 100644\n--- c/hello.txt\n+++ w/hello.txt\n@@ -1 +1,2 @@\n hello, git\n+another line\n\n\n\n  git help &lt;command&gt;: get help for a git command\n  git init: creates a new git repo, with data stored in the .git directory\n  git status: tells you what’s going on\n  git add &lt;filename&gt;: adds files to staging area\n  git commit: creates a new commit\n    \n      Write good commit messages!\n      Even more reasons to write good commit messages!\n    \n  \n  git log: shows a flattened log of history\n  git log --all --graph --decorate: visualizes history as a DAG\n  git diff &lt;filename&gt;: show changes you made relative to the staging area\n  git diff &lt;revision&gt; &lt;filename&gt;: shows differences in a file between snapshots\n  git checkout &lt;revision&gt;: updates HEAD (and the current branch, if checking out a branch)\n\n\nBranching and merging\n\nBranching allows you to “fork” version history. It can be helpful for working on independent features or bug fixes in parallel. The git branch command can be used to create new branches; git checkout -b &lt;branch name&gt; creates a branch and checks it out.\n\nMerging is the opposite of branching: it allows you to combine forked version histories, e.g. merging a feature branch back into master. 
The git merge command is used for merging.\n\n\n  git branch: shows branches\n  git branch &lt;name&gt;: creates a branch\n  git checkout -b &lt;name&gt;: creates a branch and switches to it\n    \n      same as git branch &lt;name&gt;; git checkout &lt;name&gt;\n    \n  \n  git merge &lt;revision&gt;: merges into current branch\n  git mergetool: use a fancy tool to help resolve merge conflicts\n  git rebase: rebase set of patches onto a new base\n\n\nRemotes\n\n\n  git remote: list remotes\n  git remote add &lt;name&gt; &lt;url&gt;: add a remote\n  git push &lt;remote&gt; &lt;local branch&gt;:&lt;remote branch&gt;: send objects to remote, and update remote reference\n  git branch --set-upstream-to=&lt;remote&gt;/&lt;remote branch&gt;: set up correspondence between local and remote branch\n  git fetch: retrieve objects/references from a remote\n  git pull: same as git fetch; git merge\n  git clone: download repository from remote\n\n\nUndo\n\n\n  git commit --amend: edit a commit’s contents/message\n  git reset HEAD &lt;file&gt;: unstage a file\n  git checkout -- &lt;file&gt;: discard changes\n\n\nAdvanced Git\n\n\n  git config: Git is highly customizable\n  git clone --depth=1: shallow clone, without entire version history\n  git add -p: interactive staging\n  git rebase -i: interactive rebasing\n  git blame: show who last edited which line\n  git stash: temporarily remove modifications to working directory\n  git bisect: binary search history (e.g. for regressions)\n  .gitignore: specify intentionally untracked files to ignore\n\n\nMiscellaneous\n\n\n  GUIs: there are many GUI clients\nout there for Git. We personally don’t use them and use the command-line\ninterface instead.\n  Shell integration: it’s super handy to have a Git status as part of your\nshell prompt (zsh,\nbash). Often included in\nframeworks like Oh My Zsh.\n  Editor integration: similarly to the above, handy integrations with many\nfeatures. 
fugitive.vim is the standard\none for Vim.\n  Workflows: we taught you the data model, plus some basic commands; we\ndidn’t tell you what practices to follow when working on big projects (and\nthere are many\ndifferent\napproaches).\n  GitHub: Git is not GitHub. GitHub has a specific way of contributing code\nto other projects, called pull\nrequests.\n  Other Git providers: GitHub is not special: there are many Git repository\nhosts, like GitLab and\nBitBucket.\n\n\nResources\n\n\n  Pro Git is highly recommended reading.\nGoing through Chapters 1–5 should teach you most of what you need to use Git\nproficiently, now that you understand the data model. The later chapters have\nsome interesting, advanced material.\n  Oh Shit, Git!?! is a short guide on how to recover\nfrom some common Git mistakes.\n  Git for Computer\nScientists is a\nshort explanation of Git’s data model, with less pseudocode and more fancy\ndiagrams than these lecture notes.\n  Git from the Bottom Up\nis a detailed explanation of Git’s implementation details beyond just the data\nmodel, for the curious.\n  How to explain git in simple\nwords\n  Learn Git Branching is a browser-based\ngame that teaches you Git.\n\n"
    }
  
  ,
  
    {
      "title"    : "Navigating UMass Amherst: A Handbook for International Students",
      "category" : "book",
      "tags"     : "Handbook, UMass Amherst, International Students",
      "url"      : "/books/navigating-umass-amherst-a-handbook-for-international-students/",
      "date"     : "May 8, 2023",
      "excerpt"  : "This handbook, penned by an international student at UMass Amherst, shares insights and advice based on personal experiences navigating academic and cultural transitions. The author has undertaken a variety of courses and collaborated with prominent entities, contributing to the vibrant academic community. Aimed at making the journey less daunting for future students, the handbook touches on academic expectations, cultural nuances, and logistical issues, while providing resource links for deeper exploration. It&#39;s a tool for sharing collective wisdom, rather than a definitive guide or shortcut to success.",
      "content"  : "\n"
    }
  
]