Learn how to add real-time web search capabilities to your Azure AI agents with Bing Grounding, complete with citations and production-ready code.

#Azure Bing Grounding: Web Search in AI Responses

Your AI agent can answer questions about your data, but what happens when users ask about current events, recent product releases, or breaking news? The model's knowledge cutoff becomes a hard wall. You could build a custom web scraping system, but that's complex, unreliable, and potentially violates terms of service.

Enter Azure Bing Grounding. It's Microsoft's answer to giving AI agents access to real-time web information, complete with automatic citations and source verification. No web scraping, no custom search indexing, no complex infrastructure. Just add a tool to your agent and let it search when needed.

In this guide, we'll explore how to provision Bing Grounding with Terraform, integrate it with Azure AI Foundry projects, and build production-ready agents that cite their sources.

#What is Bing Grounding and Why It Matters

Bing Grounding is an Azure service that allows AI agents to search the web using Bing and incorporate results directly into their responses. Unlike traditional RAG (Retrieval Augmented Generation) systems that search your private data, Bing Grounding searches the public web in real-time.

#Key Benefits

Real-time Information: Agents can access current information beyond their training data cutoff. Market prices, weather, news, documentation updates—anything publicly available on the web.

Automatic Citations: Every piece of information from Bing includes a citation with title and URL. This provides transparency and allows users to verify sources, which is critical for trust and compliance.

Agent-Controlled Search: The model decides when to search based on the user's question. You don't need to detect search intent manually—the agent handles it intelligently.

Cost Optimization: A single Bing Grounding resource can be shared across multiple environments (development, staging, production), reducing costs while maintaining isolation at the project level.

#How It Works

When you enable Bing Grounding on an Azure AI agent:

User asks a question
Agent determines if web search would be helpful
Agent calls Bing Grounding tool automatically
Bing returns search results with URLs
Agent incorporates results into response
Citations are extracted from annotations

The agent decides when to search based on the context. Ask about last week's tech news, and it searches. Ask about your application's business logic, and it uses its base knowledge.

#Provisioning Bing Grounding with Terraform

Bing Grounding requires the Microsoft.Bing/accounts resource type with kind Bing.Grounding. This is different from the older Bing Search API v7 that uses Cognitive Services.

#Setting Up the Resource Group

First, create a dedicated resource group for Bing. This allows you to share the Bing resource across environments while maintaining isolation.

# Resource Group for Bing grounding (on Bing subscription)
# Shared across environments for cost optimization
# Dev creates it, Prod imports it via data source
resource "azurerm_resource_group" "bing_grounding" {
  count    = var.env == "dev" ? 1 : 0
  provider = azurerm.bing
  name     = "shared-bing-rg"
  location = "westeurope" # Must be a real region

  tags = {
    environment = "shared"
    project     = "ai-platform"
    managedBy   = "terraform"
    purpose     = "shared-grounding"
  }
}

# Data source to reference existing Bing RG in prod
data "azurerm_resource_group" "bing_grounding_existing" {
  count    = var.env == "prod" ? 1 : 0
  provider = azurerm.bing
  name     = "shared-bing-rg"
}

# Local to get the resource group ID regardless of environment
locals {
  bing_rg_id = var.env == "dev" ?
    azurerm_resource_group.bing_grounding[0].id :
    data.azurerm_resource_group.bing_grounding_existing[0].id
}

This pattern creates the resource group in development and references it in production, avoiding duplicate costs.

#Creating the Bing Grounding Resource

Use the azapi provider to create the Bing Grounding resource. The standard azurerm provider doesn't support this resource type yet.

# Bing Grounding resource
# Uses Microsoft.Bing/accounts API (not CognitiveServices Bing.Search.v7)
# Only created in dev, shared with prod via data source
resource "azapi_resource" "bing_grounding" {
  count                     = var.env == "dev" ? 1 : 0
  type                      = "Microsoft.Bing/accounts@2020-06-10"
  name                      = "shared-bing-grounding"
  parent_id                 = local.bing_rg_id
  location                  = "global"
  schema_validation_enabled = false

  body = {
    kind = "Bing.Grounding"
    sku = {
      name = "G1"
    }
    properties = {
      statisticsEnabled = false
    }
  }

  tags = {
    environment = var.env
    project     = "ai-platform"
    managedBy   = "terraform"
    purpose     = "shared-grounding"
  }

  response_export_values = ["properties.endpoint"]
}

# Data source to reference existing Bing Grounding in prod
data "azapi_resource" "bing_grounding_existing" {
  count                  = var.env == "prod" ? 1 : 0
  type                   = "Microsoft.Bing/accounts@2020-06-10"
  name                   = "shared-bing-grounding"
  parent_id              = local.bing_rg_id
  response_export_values = ["properties.endpoint"]
}

# Locals to access Bing resources regardless of environment
locals {
  bing_grounding_id = var.env == "dev" ?
    azapi_resource.bing_grounding[0].id :
    data.azapi_resource.bing_grounding_existing[0].id

  bing_grounding_endpoint = var.env == "dev" ?
    azapi_resource.bing_grounding[0].output.properties.endpoint :
    data.azapi_resource.bing_grounding_existing[0].output.properties.endpoint
}

Important Notes:

Location must be "global" for Bing Grounding
SKU G1 is the standard tier for grounding
Only dev environment creates the resource; prod references it

#Retrieving API Keys

Use the listKeys action to retrieve the API keys securely:

# Get Bing API keys using listKeys action
data "azapi_resource_action" "bing_keys" {
  type                   = "Microsoft.Bing/accounts@2020-06-10"
  resource_id            = local.bing_grounding_id
  action                 = "listKeys"
  response_export_values = ["*"]
}

output "bing_grounding_key" {
  description = "Bing grounding API key (sensitive)"
  value       = data.azapi_resource_action.bing_keys.output.key1
  sensitive   = true
}

Store the key in Azure Key Vault or as a secret in your deployment pipeline. Never commit it to source control.

#Connecting Bing to Azure AI Foundry Projects

Once you have the Bing resource provisioned, connect it to your AI Foundry project. This makes Bing Grounding available as a tool for agents in that project.

#Creating the Connection

Use the Microsoft.CognitiveServices/accounts/connections API to create the connection:

# Connection: Bing Grounding to AI Foundry Project
# This enables Bing grounding capabilities for AI agents in the project
# Compatible with GPT-4o, GPT-3.5-turbo, GPT-4, GPT-4-turbo
resource "azapi_resource" "bing_connection" {
  type                      = "Microsoft.CognitiveServices/accounts/connections@2025-06-01"
  name                      = "bing-grounding-connection"
  parent_id                 = azurerm_cognitive_account.ai_foundry.id
  schema_validation_enabled = false

  body = {
    properties = {
      category      = "ApiKey"
      target        = local.bing_grounding_endpoint
      authType      = "ApiKey"
      isSharedToAll = true

      metadata = {
        Location   = "westeurope"
        ApiType    = "Azure"
        ResourceId = local.bing_grounding_id
      }

      credentials = {
        key = data.azapi_resource_action.bing_keys.output.key1
      }
    }
  }

  depends_on = [
    data.azapi_resource_action.bing_keys,
    azapi_resource.ai_foundry_project
  ]
}

Key Properties:

category: Must be "ApiKey" for Bing Grounding
isSharedToAll: Set to true to make available to all agents in the project
credentials.key: The API key retrieved from the Bing resource

#Model Compatibility

Bing Grounding works with most Azure OpenAI models:

✅ gpt-4o
✅ gpt-4-turbo
✅ gpt-4
✅ gpt-3.5-turbo
❌ gpt-4o-mini-2024-07-18 (not compatible)

If you need to use gpt-4o-mini, verify compatibility with the latest Azure AI documentation.

#Building an Agent Service with Bing Grounding

With infrastructure provisioned, let's build a TypeScript service that creates agents with Bing Grounding enabled.

#Service Configuration

First, configure environment variables for authentication and project access:

// Environment variables needed:
// - PROJECT_ENDPOINT: Azure AI Foundry project endpoint
// - MODEL_DEPLOYMENT_NAME: Name of your GPT-4o deployment
// - BING_CONNECTION_ID: Connection ID from Terraform output
// - AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET (local dev)
// - Or use Managed Identity in production

import { AgentsClient, ToolUtility } from '@azure/ai-agents';
import { DefaultAzureCredential } from '@azure/identity';

export class AzureAgentService {
	private client: AgentsClient;
	private modelDeploymentName: string;
	private bingConnectionId?: string;

	constructor() {
		const projectEndpoint = process.env.PROJECT_ENDPOINT;
		this.modelDeploymentName = process.env.MODEL_DEPLOYMENT_NAME;
		this.bingConnectionId = process.env.BING_CONNECTION_ID;

		if (!projectEndpoint || !this.modelDeploymentName) {
			throw new Error('PROJECT_ENDPOINT and MODEL_DEPLOYMENT_NAME environment variables are required');
		}

		// DefaultAzureCredential supports:
		// 1. EnvironmentCredential (local dev)
		// 2. ManagedIdentityCredential (production)
		// 3. AzureCliCredential (fallback)
		const credential = new DefaultAzureCredential();

		this.client = new AgentsClient(projectEndpoint, credential);
	}
}

#Creating Agents with Bing Grounding

Enable Bing Grounding by adding it as a tool during agent creation:

interface AgentConfig {
  name: string;
  instructions?: string;
  enableBingGrounding?: boolean; // Default: true
}

async createAgent(config: AgentConfig) {
  const tools = [];

  // Always add Bing grounding tool unless explicitly disabled
  const shouldEnableBing = config.enableBingGrounding !== false;

  if (shouldEnableBing && this.bingConnectionId) {
    const bingTool = ToolUtility.createBingGroundingTool([
      {
        connectionId: this.bingConnectionId,
        count: 20, // Number of search results (max: 50, default: 5)
      },
    ]);
    tools.push(bingTool.definition);
  }

  const agent = await this.client.createAgent(this.modelDeploymentName, {
    name: config.name,
    instructions: config.instructions,
    tools: tools.length > 0 ? tools : undefined,
    temperature: 0.2, // Low temperature for consistent, factual responses
  });

  return agent;
}

Important Parameters:

count: Number of search results to retrieve (1-50). Higher values provide more context but use more tokens. 20 is a good balance for quality.
temperature: Lower values (0.1-0.3) work better for factual responses with citations.

#Configuring Search Result Count

The count parameter controls how many Bing search results the agent can access. This directly impacts response quality and cost:

Search result count options:

5 results (default) - Simple queries, cost-sensitive (~500 tokens)
10 results - Standard queries, balanced (~1,000 tokens)
20 results - Complex queries, high quality (~2,000 tokens)
50 results (maximum) - Research tasks, comprehensive (~5,000 tokens)

Higher counts improve accuracy for complex queries but increase prompt tokens significantly. Start with 10-20 and adjust based on your use case.

#Sending Messages and Extracting Citations

Once you have an agent with Bing Grounding, you can send messages and extract citations from responses.

#Sending Messages

Create a thread, send a message, and poll for completion:

async sendMessage(
  agentId: string,
  userMessage: string
): Promise<AgentResponse> {
  // Create a thread for this conversation
  const thread = await this.client.threads.create();

  try {
    // Create user message
    await this.client.messages.create(thread.id, 'user', userMessage);

    // Create and poll run
    const runStatus = await this.client.runs.createAndPoll(thread.id, agentId, {
      pollingOptions: {
        intervalInMs: 15000, // 15 seconds to avoid rate limits
      },
    });

    // Check if run failed
    if (runStatus.status === 'failed') {
      const errorMessage = runStatus.lastError?.message || 'Unknown error';
      throw new Error(`Agent run failed: ${errorMessage}`);
    }

    // Get the agent's response with citations
    const response = await this.getLatestAgentMessage(thread.id);

    return {
      content: response.content,
      citations: response.citations,
      metadata: {
        agentId,
        threadId: thread.id,
        runId: runStatus.id,
        tokensUsed: runStatus.usage ? {
          prompt: runStatus.usage.promptTokens,
          completion: runStatus.usage.completionTokens,
          total: runStatus.usage.totalTokens,
        } : undefined,
      },
    };
  } finally {
    // Clean up thread
    await this.client.threads.delete(thread.id);
  }
}

Polling Considerations:

Use a longer polling interval (15+ seconds) to avoid rate limit exhaustion
Azure AI Foundry has rate limits on API calls (e.g., 400 capacity = 40 req/sec)
Adjust based on your deployment's capacity and concurrent operations

#Extracting Citations from Responses

Citations are embedded in message annotations. Extract them using type guards:

import type {
  MessageTextContent,
  ThreadMessage,
  MessageTextUrlCitationAnnotation,
} from '@azure/ai-agents';

private extractMessageContent(message: ThreadMessage): {
  content: string;
  citations: Array<{ title: string; url: string }>;
} {
  const contentParts: string[] = [];
  const citations: Array<{ title: string; url: string }> = [];

  for (const content of message.content) {
    if (isOutputOfType<MessageTextContent>(content, 'text')) {
      contentParts.push(content.text.value);

      // Extract citations from annotations
      if (content.text.annotations) {
        for (const annotation of content.text.annotations) {
          if (
            isOutputOfType<MessageTextUrlCitationAnnotation>(
              annotation,
              'url_citation'
            )
          ) {
            citations.push({
              title: annotation.urlCitation.title || annotation.urlCitation.url,
              url: annotation.urlCitation.url,
            });
          }
        }
      }
    }
  }

  return {
    content: contentParts.join('\n'),
    citations,
  };
}

Citation Structure:

Each citation includes a title and URL
Title defaults to URL if not provided
Citations maintain order of appearance in the response
Duplicate URLs may appear if referenced multiple times

#Example Usage

Here's how to use the service for a one-off query:

const agentService = new AzureAgentService();

// Create an agent with Bing Grounding (enabled by default)
const agent = await agentService.createAgent({
	name: 'research-assistant',
	instructions:
		'You are a helpful research assistant. Use Bing grounding to provide accurate, up-to-date information with citations.',
});

try {
	// Send a query
	const response = await agentService.sendMessage(
		agent.id,
		'What are the latest developments in quantum computing from the past month?',
	);

	console.log('Response:', response.content);
	console.log('Citations:');
	response.citations?.forEach((citation, index) => {
		console.log(`${index + 1}. ${citation.title}`);
		console.log(`   ${citation.url}`);
	});

	console.log('Tokens used:', response.metadata.tokensUsed?.total);
} finally {
	// Clean up
	await agentService.deleteAgent(agent.id);
}

Example Output:

Response: Recent developments in quantum computing include IBM's announcement
of a 1,121-qubit quantum processor, Google's breakthrough in quantum error
correction, and Microsoft's release of Azure Quantum Elements for chemistry
simulations.

Citations:
1. IBM Unveils 1,121-Qubit Quantum Processor
   https://www.ibm.com/quantum/news/1121-qubit-processor
2. Google achieves quantum error correction milestone
   https://blog.google/technology/quantum-ai/quantum-error-correction/
3. Microsoft Azure Quantum Elements for chemistry
   https://azure.microsoft.com/en-us/products/quantum/elements/

Tokens used: 3245

#Cost Considerations and Pricing Tiers

Bing Grounding pricing consists of two components: the Bing resource itself and the tokens consumed during agent runs.

#Bing Resource Pricing

The G1 SKU for Bing Grounding is billed per 1,000 transactions (queries):

G1 Tier: Approximately $7 per 1,000 queries
No minimum commitment: Pay only for what you use
Shared resource: One resource can serve multiple projects and environments

#Token Consumption

Search results add to prompt tokens. Typical token usage:

Typical token usage by search result count:

5 results - ~500 tokens added
10 results - ~1,000 tokens added
20 results - ~2,000 tokens added
50 results - ~5,000 tokens added

With GPT-4o pricing at ~$5 per 1M prompt tokens, 20 search results add roughly $0.01 per query in token costs.

#Total Cost Example

For 10,000 queries per month with 20 search results each:

Bing Grounding: 10,000 queries × ($7 / 1,000) = $70
Token overhead: 10,000 queries × 2,000 tokens × ($5 / 1M) = $100
Total: ~$170/month for grounded search

Compare this to building a custom web search solution with scraping infrastructure, storage, and maintenance. Bing Grounding is significantly cheaper and more reliable.

A single Bing Grounding resource can be shared across multiple environments (dev, staging, prod) while maintaining project-level isolation. This reduces costs without sacrificing security.

The shared Bing Grounding resource sits at the top of your infrastructure hierarchy. You create it once in the development environment, and then each environment's AI Foundry project - whether dev, staging, production, or disaster recovery - connects to that same resource. Each project maintains its own agents, threads, and conversation history, so there's no data leakage between environments. You get the cost savings of a single Bing resource while preserving complete isolation at the application layer.

Use conditional resource creation and data sources:

# Dev creates the resource
resource "azapi_resource" "bing_grounding" {
  count = var.env == "dev" ? 1 : 0
  # ... resource configuration
}

# Other environments reference it
data "azapi_resource" "bing_grounding_existing" {
  count = var.env != "dev" ? 1 : 0
  type  = "Microsoft.Bing/accounts@2020-06-10"
  name  = "shared-bing-grounding"
  # ... rest of configuration
}

# Each environment creates its own connection
resource "azapi_resource" "bing_connection" {
  # All environments create their own connection
  # but reference the same Bing resource
  parent_id = azurerm_cognitive_account.ai_foundry.id
  # ... connection configuration
}

#Security Considerations

Project Isolation: Each AI Foundry project has its own threads, agents, and conversation history. Sharing the Bing resource does not compromise data isolation.

API Key Rotation: When rotating Bing API keys, update connections in all environments. Use a secret management system like Azure Key Vault for centralized key distribution.

Rate Limiting: All environments share the same rate limits on the Bing resource. Monitor usage to avoid throttling in production due to dev/staging activity.

#Cost Savings

Sharing one Bing resource across 4 environments (dev, staging, prod, DR):

Without Sharing: 4 × $7/1,000 queries = $28 per 1,000 queries (if each used separately)
With Sharing: 1 × $7/1,000 queries = $7 per 1,000 queries
Savings: 75% reduction in Bing resource costs

This is one of the most effective cost optimizations for multi-environment deployments.

#Comparing Bing Grounding with Other Approaches

Bing Grounding is one of several ways to provide AI agents with additional context. Understanding the tradeoffs helps you choose the right tool.

#Bing Grounding vs. RAG (Retrieval Augmented Generation)

RAG: Private Data Search

Searches your own documents, databases, knowledge bases
Full control over content and indexing
No internet access required
Ideal for internal company knowledge, proprietary data

Bing Grounding: Public Web Search

Searches the public web via Bing
Access to real-time information and current events
Automatic citations from web sources
Ideal for current events, market data, public documentation

When to Use Both: Use RAG for internal knowledge (product specs, company policies) and Bing Grounding for external context (market trends, competitor analysis, current events).

// Example: Combining RAG and Bing Grounding
const agent = await agentService.createAgent({
	name: 'hybrid-assistant',
	instructions: `You are a business intelligence assistant.

  Use Azure AI Search (RAG) for:
  - Internal sales data
  - Product specifications
  - Company policies

  Use Bing Grounding for:
  - Competitor news and announcements
  - Market trends and analysis
  - Industry regulations and updates`,
	// Enable both tools
	enableBingGrounding: true,
	// RAG would be added here via Azure AI Search connection
});

#Bing Grounding vs. Custom Web Tools

Custom Web Tools (Function Calling)

You write the search/scraping logic
Full control over sources and parsing
Complex to maintain and scale
Legal and ethical considerations for scraping

Bing Grounding

Microsoft handles search infrastructure
Built-in citation extraction
No scraping or legal concerns
Limited customization (no custom search filters)

When to Use Custom Tools: When you need to search specific sites (e.g., internal documentation hosted externally), parse non-standard formats, or have specific data extraction requirements that Bing doesn't support.

#Bing Grounding vs. Web Browser Tools

Some AI frameworks provide browser automation tools (Selenium, Playwright) for agents. These let agents navigate websites interactively.

Web Browser Tools

Full browser automation (click, scroll, fill forms)
Can access content behind authentication
Very slow (seconds to minutes per page)
High token consumption (full page HTML)
Complex error handling