Azure Databricks Agent Bricks: Building AI Agents Directly on Your Data Platform

In January 2026, Microsoft announced the general availability of Azure Databricks Agent Bricks—a native capability for creating, deploying, and managing AI agents directly within the Databricks platform. This integration unifies data engineering, machine learning, and agentic AI development in a single environment, enabling data teams to build intelligent agents that have native access to lakehouse data, Delta tables, and Unity Catalog governance. For enterprises already invested in Databricks, Agent Bricks eliminates the need for separate agent infrastructure while leveraging existing data assets.

What is Agent Bricks?

Agent Bricks is Databricks’ native framework for building AI agents that can:

  • Query lakehouse data: Agents have native SQL/Spark access to Delta tables
  • Use Unity Catalog governance: All agent data access follows existing permissions
  • Execute notebooks: Agents can run parameterized notebooks as tools
  • Leverage MLflow models: Call registered models for inference
  • Access vector search: Use Databricks Vector Search for RAG applications
  • Integrate with Feature Store: Real-time feature retrieval for personalization (a hedged sketch of this tool appears at the end of the Getting Started section below)

Unlike standalone agent frameworks, Agent Bricks is data-native—agents operate within the same security, governance, and compute infrastructure as your data pipelines.

Agent Bricks Architecture

graph TB
    subgraph Users ["Users & Applications"]
        App["Application"]
        API["REST API"]
        Chat["Chat Interface"]
    end
    
    subgraph AgentBricks ["Agent Bricks Runtime"]
        Orchestrator["Agent Orchestrator"]
        Planner["Task Planner"]
        Executor["Tool Executor"]
        Memory["Conversation Memory"]
    end
    
    subgraph Tools ["Native Tools"]
        SQLTool["SQL Query Tool"]
        NotebookTool["Notebook Tool"]
        ModelTool["Model Serving Tool"]
        VectorTool["Vector Search Tool"]
        FeatureTool["Feature Store Tool"]
    end
    
    subgraph Databricks ["Databricks Platform"]
        Unity["Unity Catalog"]
        Delta["Delta Lake"]
        MLflow["MLflow Registry"]
        VectorDB["Vector Search"]
        Serving["Model Serving"]
    end
    
    App --> API
    Chat --> API
    API --> Orchestrator
    Orchestrator --> Planner
    Planner --> Executor
    Executor --> Memory
    
    Executor --> SQLTool
    Executor --> NotebookTool
    Executor --> ModelTool
    Executor --> VectorTool
    Executor --> FeatureTool
    
    SQLTool --> Unity
    Unity --> Delta
    NotebookTool --> Delta
    ModelTool --> MLflow
    ModelTool --> Serving
    VectorTool --> VectorDB
    FeatureTool --> Delta
    
    style AgentBricks fill:#E8F5E9,stroke:#2E7D32
    style Tools fill:#E3F2FD,stroke:#1565C0
    style Databricks fill:#FFF3E0,stroke:#EF6C00

Getting Started with Agent Bricks

# In a Databricks notebook
from databricks.agents import Agent, AgentConfig
from databricks.agents.tools import SQLQueryTool, VectorSearchTool, NotebookTool

# Configure the agent
config = AgentConfig(
    name="sales-analyst-agent",
    description="AI agent for sales data analysis and reporting",
    model="databricks-dbrx-instruct",  # or "azure-openai-gpt5"
    
    # System prompt defining agent behavior
    system_prompt="""You are a sales data analyst with access to the company's 
    sales lakehouse. You can query sales data, generate reports, and provide 
    insights. Always cite the specific tables and queries you use.
    
    Available data:
    - sales.orders: Order transactions
    - sales.customers: Customer master data  
    - sales.products: Product catalog
    - sales.forecasts: ML-generated sales forecasts
    """,
    
    # Memory configuration
    memory_config={
        "type": "conversation",
        "max_turns": 20,
        "persist_to_table": "agents.sales_analyst_memory"
    }
)

# Create agent with tools
agent = Agent(config)

# Add SQL query tool with Unity Catalog governance
agent.add_tool(SQLQueryTool(
    name="query_sales_data",
    description="Execute SQL queries against the sales lakehouse",
    catalog="main",
    schema="sales",
    allowed_tables=["orders", "customers", "products", "forecasts"],
    max_rows=10000
))

# Add vector search for semantic queries
agent.add_tool(VectorSearchTool(
    name="search_product_docs",
    description="Search product documentation and specifications",
    index_name="main.sales.product_docs_index",
    num_results=5
))

# Add notebook tool for complex analysis
agent.add_tool(NotebookTool(
    name="run_analysis",
    description="Run parameterized analysis notebooks",
    notebook_paths={
        "cohort_analysis": "/Repos/analytics/cohort_analysis",
        "forecast_refresh": "/Repos/analytics/forecast_model"
    }
))
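
The capability list above also mentioned Feature Store integration, which none of the later examples exercise. A minimal sketch in the same style as the tools above might look like the following; FeatureStoreTool and its parameter names are assumptions on my part, not a documented API:

# Hypothetical: add real-time feature lookup as a tool
# (FeatureStoreTool and its parameters are assumptions,
# following the conventions of the other tools in this post)
from databricks.agents.tools import FeatureStoreTool

agent.add_tool(FeatureStoreTool(
    name="get_customer_features",
    description="Fetch real-time customer features for personalization",
    feature_table="main.sales.customer_features",  # assumed table name
    lookup_key="customer_id"
))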

Running the Agent

# Interactive session
session = agent.create_session(user_id="analyst@company.com")

# First query
response = await session.chat(
    "What were our top 5 products by revenue last quarter?"
)

print(response.answer)
# Output:
# Based on my query of sales.orders joined with sales.products, 
# your top 5 products by Q4 2025 revenue were:
# 
# | Product | Revenue | Units Sold |
# |---------|---------|------------|
# | Enterprise Suite | $2.4M | 1,247 |
# | Pro License | $1.8M | 3,891 |
# | Team Pack | $1.2M | 2,156 |
# | Starter Kit | $890K | 8,934 |
# | Add-on Bundle | $654K | 4,521 |
#
# Query used: [expandable SQL]

print("
Tools used:")
for tool_call in response.tool_calls:
    print(f"  {tool_call.tool}: {tool_call.summary}")
    
# Output:
# Tools used:
#   query_sales_data: Aggregated orders by product for Q4 2025

# Follow-up with context
response = await session.chat(
    "Great. Now run a cohort analysis on Enterprise Suite customers"
)

# Agent uses notebook tool for complex analysis
print(response.answer)
# Output:
# I've run the cohort analysis notebook for Enterprise Suite customers.
# Key findings:
# - 78% 12-month retention rate
# - Average expansion revenue: 34% in year 2
# - Highest churn risk: customers without training sessions
# 
# Full report saved to: /Volumes/main/sales/reports/enterprise_cohort_jan2026.html

Unity Catalog Governance

Agent Bricks inherits Unity Catalog’s security model, ensuring agents only access data their users are authorized to see:

from databricks.agents import AgentPermissions

# Define agent permissions using Unity Catalog
permissions = AgentPermissions(
    # Agent identity for audit logging
    service_principal="agent-sales-analyst-sp",
    
    # Inherit user permissions (recommended)
    permission_mode="user_delegation",  # or "service_principal"
    
    # Additional restrictions
    restrictions={
        "blocked_columns": ["customer_ssn", "credit_card"],
        "row_filters": {
            "sales.orders": "region IN ('US', 'EU')",  # Limit to regions
        },
        "max_query_cost": 100,  # Cost units
    }
)

agent = Agent(config, permissions=permissions)

# When user asks for data they can't access:
response = await session.chat("Show me customer SSN numbers")
# Output: I'm unable to access the customer_ssn column due to 
# data governance policies. I can help with other customer 
# attributes like name, email, and purchase history.

💡 USER DELEGATION MODE

In user_delegation mode, the agent runs queries with the calling user’s permissions. This means different users get different data access through the same agent—no need for per-user agent configurations.
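
To make this concrete, here is a minimal sketch, using the same hypothetical SDK as the examples above, of two users calling one agent and getting permission-scoped results:

# Same agent, two callers; each query runs under the calling user's
# Unity Catalog permissions, so row filters and column masks apply
# per user with no per-user agent configuration.
us_session = agent.create_session(user_id="us.analyst@company.com")
eu_session = agent.create_session(user_id="eu.analyst@company.com")

us_response = await us_session.chat("Total order revenue last month?")  # US-visible rows only
eu_response = await eu_session.chat("Total order revenue last month?")  # EU-visible rows only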

RAG with Databricks Vector Search

from databricks.agents.tools import VectorSearchTool, RAGConfig

# Create RAG-enabled agent
rag_config = RAGConfig(
    vector_index="main.docs.product_documentation_index",
    embedding_model="databricks-bge-large-en",
    chunk_size=512,
    num_results=5,
    
    # Hybrid search combining vector + keyword
    search_mode="hybrid",
    keyword_weight=0.3,
    
    # Re-ranking for precision
    reranker_model="databricks-reranker-v1",
    rerank_top_k=3
)

agent.add_tool(VectorSearchTool(
    name="search_docs",
    description="Search product documentation, knowledge base, and support tickets",
    rag_config=rag_config
))

# Query requiring retrieval
response = await session.chat(
    "What's the recommended configuration for high-availability deployments?"
)

print(response.answer)
# Answer synthesized from retrieved documentation

print("
Sources:")
for source in response.sources:
    print(f"  - {source.title} (relevance: {source.score:.0%})")
    
# Output:
# Sources:
#   - HA Deployment Guide v3.2 (relevance: 94%)
#   - Architecture Best Practices (relevance: 89%)
#   - Customer Success Story: FinCorp HA Setup (relevance: 78%)

Model Serving Integration

from databricks.agents.tools import ModelServingTool

# Add ML model as agent tool
agent.add_tool(ModelServingTool(
    name="predict_churn",
    description="Predict customer churn probability for a given customer ID",
    endpoint_name="churn-prediction-endpoint",
    input_schema={
        "customer_id": "string",
    },
    output_interpretation="""
    Returns churn probability (0-1) and top risk factors.
    High risk: > 0.7, Medium: 0.4-0.7, Low: < 0.4
    """
))

agent.add_tool(ModelServingTool(
    name="forecast_sales",
    description="Generate sales forecast for next N months",
    endpoint_name="sales-forecast-endpoint",
    input_schema={
        "product_id": "string",
        "months_ahead": "integer"
    }
))

# Agent uses ML models in responses
response = await session.chat(
    "Which of our Enterprise Suite customers are at high churn risk?"
)

# Agent:
# 1. Queries customers table for Enterprise Suite
# 2. Calls churn prediction model for each
# 3. Filters to high risk (> 0.7)
# 4. Synthesizes response with actionable insights
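
For contrast, the manual equivalent of the steps the agent chains together would look roughly like this in a Databricks notebook. The MLflow deployments client is a real API; the table names are this post's running examples, and the serving endpoint's request and response shapes are assumptions:

# Manual version of: find Enterprise Suite customers, score churn, filter
import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")

# 1. Find Enterprise Suite customers
customers = spark.sql("""
    SELECT DISTINCT o.customer_id
    FROM main.sales.orders o
    JOIN main.sales.products p ON p.product_id = o.product_id
    WHERE p.name = 'Enterprise Suite'
""").collect()

# 2-3. Score each customer and keep the high-risk ones (> 0.7)
high_risk = []
for row in customers:
    pred = client.predict(
        endpoint="churn-prediction-endpoint",
        inputs={"dataframe_records": [{"customer_id": row.customer_id}]},
    )
    # Assumed response shape: {"predictions": [{"churn_probability": ...}]}
    if pred["predictions"][0]["churn_probability"] > 0.7:
        high_risk.append(row.customer_id)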

Deploying Agents to Production

from databricks.agents import AgentDeployment

# Register agent in Unity Catalog
agent.register(
    catalog="main",
    schema="agents",
    name="sales_analyst_v1"
)

# Deploy as REST endpoint
deployment = AgentDeployment(
    agent_name="main.agents.sales_analyst_v1",
    
    # Compute configuration
    compute={
        "min_instances": 1,
        "max_instances": 10,
        "scale_to_zero": False,
        "instance_type": "Standard_DS3_v2"
    },
    
    # Rate limiting
    rate_limits={
        "requests_per_minute": 100,
        "tokens_per_minute": 50000
    },
    
    # Monitoring
    monitoring={
        "log_requests": True,
        "log_responses": True,  # For debugging; disable in prod
        "metrics_table": "main.agents.sales_analyst_metrics",
        "alerts": {
            "error_rate_threshold": 0.05,
            "latency_p99_threshold_ms": 5000
        }
    }
)

endpoint = await deployment.deploy()
print(f"Agent deployed at: {endpoint.url}")

# Call from an external application ("token" below is a Databricks
# personal access token or Azure AD token authorized for this endpoint)
import requests

response = requests.post(
    f"{endpoint.url}/chat",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "session_id": "user-session-123",
        "message": "What's our revenue trend this quarter?"
    }
)
print(response.json()["answer"])

⚠️ COST MANAGEMENT

Agent Bricks usage is billed per token (LLM) plus compute time for SQL queries and notebook execution. Set query cost limits and token budgets to prevent runaway costs from complex multi-step agent tasks.
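
One way to keep an eye on spend is to query the metrics table configured in the deployment above; the column names here are assumptions about what that table records:

# Daily token and SQL-cost usage from the agent's metrics table
# (total_tokens and query_cost_units are assumed column names)
daily_usage = spark.sql("""
    SELECT date(request_timestamp)  AS day,
           sum(total_tokens)        AS tokens,
           sum(query_cost_units)    AS sql_cost_units
    FROM main.agents.sales_analyst_metrics
    GROUP BY date(request_timestamp)
    ORDER BY day DESC
""")
display(daily_usage)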

Comparison with External Agent Frameworks

| Feature | Agent Bricks | LangChain + Databricks | AutoGen + Databricks |
|---------|--------------|------------------------|----------------------|
| Lakehouse native | ✅ Built-in | ❌ Connector required | ❌ Connector required |
| Unity Catalog governance | ✅ Automatic | ⚠️ Manual configuration | ⚠️ Manual configuration |
| Vector Search integration | ✅ Native | ✅ Via connector | ✅ Via connector |
| Model Serving | ✅ Direct endpoint calls | ✅ Via API | ✅ Via API |
| Notebook execution | ✅ Native tool | ❌ Custom implementation | ❌ Custom implementation |
| Feature Store | ✅ Native tool | ⚠️ Separate SDK | ⚠️ Separate SDK |
| Multi-agent orchestration | ⚠️ Basic | ✅ Advanced | ✅ Advanced |
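
For a sense of what "connector required" means in practice, wiring an external framework like LangChain to Databricks goes through a separate integration package (databricks-langchain is the package name as of this writing; the endpoint name reuses this post's example):

# External-framework route: connect LangChain to a Databricks
# model serving endpoint via the databricks-langchain connector
from databricks_langchain import ChatDatabricks

llm = ChatDatabricks(endpoint="databricks-dbrx-instruct")
print(llm.invoke("Summarize Q4 sales in one sentence.").content)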

Key Takeaways

  • Agent Bricks provides data-native AI agent development within Azure Databricks.
  • Unity Catalog integration ensures agents inherit existing data governance policies automatically.
  • Native tools for SQL, Vector Search, Model Serving, and Notebooks eliminate integration overhead.
  • User delegation allows a single agent to serve multiple users with appropriate data access.
  • Production deployment is streamlined with built-in REST endpoints, monitoring, and rate limiting.

Conclusion

Azure Databricks Agent Bricks represents a fundamental shift toward data-native AI agents. Rather than connecting external agent frameworks to your lakehouse, Agent Bricks embeds agent capabilities directly into the data platform, inheriting all the governance, security, and compute infrastructure already in place. For enterprises with significant Databricks investments, this approach dramatically reduces the complexity of building AI agents that need access to production data—while ensuring those agents respect the same access controls as every other data consumer.
