This guide walks you through deploying GraphRAG using Azure services, including Azure OpenAI for language models and Azure Storage for scalable data management.
Why Azure?
Azure provides enterprise-grade features for GraphRAG deployments:
Managed OpenAI - Keyless authentication through managed identities, enterprise SLAs
Scalable storage - Blob Storage and Cosmos DB integration
Security - Managed identities, VNet integration, private endpoints
Compliance - Meet regulatory requirements with Azure’s certifications
Cost management - Detailed billing and budget controls
Prerequisites
Azure subscription
Ensure you have an active Azure subscription with appropriate permissions.
Log in to Azure
Authenticate with your Azure account:
az login
az account set --subscription "Your Subscription Name"
Set up Azure OpenAI
Create Azure OpenAI resource
Create an Azure OpenAI service instance:
az cognitiveservices account create \
--name graphrag-openai \
--resource-group graphrag-resources \
--location eastus \
--kind OpenAI \
--sku S0 \
--custom-domain graphrag-openai
Deploy models
Deploy the required models (chat and embeddings):
# Deploy GPT-4 for chat
az cognitiveservices account deployment create \
--name graphrag-openai \
--resource-group graphrag-resources \
--deployment-name gpt-4-deployment \
--model-name gpt-4 \
--model-version "0613" \
--model-format OpenAI \
--sku-capacity 10 \
--sku-name "Standard"
# Deploy text-embedding-3-small for embeddings
az cognitiveservices account deployment create \
--name graphrag-openai \
--resource-group graphrag-resources \
--deployment-name embedding-deployment \
--model-name text-embedding-3-small \
--model-version "1" \
--model-format OpenAI \
--sku-capacity 10 \
--sku-name "Standard"
Retrieve endpoint and key
Get your Azure OpenAI endpoint and API key:
# Get endpoint
az cognitiveservices account show \
--name graphrag-openai \
--resource-group graphrag-resources \
--query "properties.endpoint" \
--output tsv
# Get API key
az cognitiveservices account keys list \
--name graphrag-openai \
--resource-group graphrag-resources \
--query "key1" \
--output tsv
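The endpoint value returned by the CLI may end with a trailing slash, while the api_base values used later in this guide do not. The sketch below shows one way to normalize it before copying it into settings.yaml. The function name normalize_endpoint is hypothetical and not part of the Azure CLI, and the literal URL stands in for the real command output.

```shell
# Strip any trailing slashes from an endpoint URL.
# normalize_endpoint is a hypothetical helper, not an Azure CLI command.
normalize_endpoint() {
    endpoint=$1
    # Remove one trailing slash per iteration until none remain.
    while [ "${endpoint%/}" != "$endpoint" ]; do
        endpoint=${endpoint%/}
    done
    printf '%s\n' "$endpoint"
}

# Literal value for illustration; in practice pass the output of the
# "az cognitiveservices account show" command above.
normalize_endpoint "https://graphrag-openai.openai.azure.com/"
```

A URL that already has no trailing slash passes through unchanged, so the helper is safe to apply unconditionally.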
Create storage account
Create an Azure Storage account for your data:
az storage account create \
--name graphragstorage \
--resource-group graphrag-resources \
--location eastus \
--sku Standard_LRS \
--kind StorageV2
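Storage account names must be 3 to 24 characters long, use only lowercase letters and digits, and be globally unique, so a name such as graphragstorage may already be taken. The sketch below checks the length and character rules locally before calling Azure. The function name is_valid_storage_name is hypothetical, and the check cannot tell whether a name is still available.

```shell
# is_valid_storage_name is a hypothetical helper for illustration.
# It returns 0 when the name satisfies the length and character rules.
is_valid_storage_name() {
    name=$1
    # Length must be between 3 and 24 characters.
    [ "${#name}" -ge 3 ] && [ "${#name}" -le 24 ] || return 1
    # Reject any character that is not a lowercase letter or a digit.
    # The set is spelled out so the result does not depend on the locale.
    case $name in
        *[!abcdefghijklmnopqrstuvwxyz0123456789]*) return 1 ;;
    esac
    return 0
}

if is_valid_storage_name "graphragstorage"; then
    echo "name format is valid"
fi
```

Global uniqueness can only be confirmed against Azure, for example with az storage account check-name --name graphragstorage.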
Create containers
Create containers for input and output data:
# Get connection string
CONNECTION_STRING=$(az storage account show-connection-string \
--name graphragstorage \
--resource-group graphrag-resources \
--query "connectionString" \
--output tsv)
# Create containers
az storage container create \
--name graphrag-input \
--connection-string "$CONNECTION_STRING"
az storage container create \
--name graphrag-output \
--connection-string "$CONNECTION_STRING"
Update environment variables
Create or update your .env file:
GRAPHRAG_API_KEY=your-azure-openai-api-key
AZURE_STORAGE_CONNECTION_STRING=your-storage-connection-string
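Later commands in this guide read GRAPHRAG_API_KEY from the shell, so it can help to export the values from .env into the current session. The following is a minimal sketch: load_env is a hypothetical helper, and it assumes plain KEY=value lines with no spaces around the equals sign. Values that contain semicolons, such as storage connection strings, need single quotes in the file or the shell will split them.

```shell
# load_env is a hypothetical helper: it exports every variable assigned in a
# dotenv-style file into the current shell session.
load_env() {
    env_file=${1:-.env}
    # Prefix bare file names with ./ so "." does not search PATH for them.
    case $env_file in
        */*) ;;
        *) env_file=./$env_file ;;
    esac
    [ -f "$env_file" ] || { echo "missing $env_file" >&2; return 1; }
    set -a            # auto-export each variable assigned while sourcing
    . "$env_file"
    set +a
}

# Usage (run from the directory that contains .env):
#   load_env
#   echo "$GRAPHRAG_API_KEY"
```

Sourcing executes the file as shell code, so this approach is only appropriate for .env files you control.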
Configure settings.yaml
Update settings.yaml with Azure-specific configuration:
# Azure OpenAI Configuration
completion_models:
  default_completion_model:
    type: chat
    model_provider: azure
    model: gpt-4
    deployment_name: gpt-4-deployment
    api_base: https://graphrag-openai.openai.azure.com
    api_version: 2024-02-15-preview
    api_key: ${GRAPHRAG_API_KEY}
embedding_models:
  default_embedding_model:
    type: embedding
    model_provider: azure
    model: text-embedding-3-small
    deployment_name: embedding-deployment
    api_base: https://graphrag-openai.openai.azure.com
    api_version: 2024-02-15-preview
    api_key: ${GRAPHRAG_API_KEY}
# Azure Blob Storage for Input
input:
  storage:
    type: blob
    connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
    container_name: graphrag-input
  type: text
  file_pattern: .*\.txt$
# Azure Blob Storage for Output
output:
  type: blob
  connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
  container_name: graphrag-output
# Optional: Azure Blob Storage for Cache
cache:
  type: json
  storage:
    type: blob
    connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
    container_name: graphrag-cache
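The configuration above refers to environment variables through ${NAME} placeholders, and a placeholder whose variable is missing tends to surface only as an authentication or connection error at run time. As a rough pre-flight check, the sketch below lists every placeholder in a settings file and reports the ones with no value. The function name check_placeholders is hypothetical; the check only inspects the current shell environment and treats empty values as unset.

```shell
# check_placeholders is a hypothetical helper: it finds every ${NAME}
# reference in the given file and reports names with no value in the shell.
# Returns 0 when all placeholders resolve, 1 otherwise.
check_placeholders() {
    settings_file=$1
    missing=0
    # Extract ${NAME} tokens, strip the "${" and "}" wrappers, and deduplicate.
    for var in $(grep -o '\${[A-Za-z_][A-Za-z0-9_]*}' "$settings_file" | tr -d '${}' | sort -u); do
        # Indirect lookup; var holds only identifier characters.
        eval "value=\${$var-}"
        if [ -z "$value" ]; then
            echo "unset: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_placeholders settings.yaml && graphrag index --root .
```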
Using managed identity (recommended)
For production deployments, use Azure Managed Identity instead of API keys:
Create managed identity
Create a user-assigned managed identity:
az identity create \
--name graphrag-identity \
--resource-group graphrag-resources
Grant permissions
Grant the managed identity access to Azure OpenAI and Azure Storage:
# Get principal ID
PRINCIPAL_ID=$(az identity show \
--name graphrag-identity \
--resource-group graphrag-resources \
--query "principalId" \
--output tsv)
# Assign Cognitive Services User role
az role assignment create \
--role "Cognitive Services User" \
--assignee "$PRINCIPAL_ID" \
--scope "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/graphrag-resources/providers/Microsoft.CognitiveServices/accounts/graphrag-openai"
# Assign Storage Blob Data Contributor role
az role assignment create \
--role "Storage Blob Data Contributor" \
--assignee "$PRINCIPAL_ID" \
--scope "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/graphrag-resources/providers/Microsoft.Storage/storageAccounts/graphragstorage"
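Both --scope values follow the same resource ID pattern, so they can be assembled from their parts instead of editing YOUR_SUBSCRIPTION_ID by hand in each command. The sketch below is one way to do that. The function name resource_scope is hypothetical, and the all-zero subscription ID is a placeholder.

```shell
# resource_scope is a hypothetical helper that assembles an Azure resource ID.
# Arguments: subscription ID, resource group, provider/type path, resource name.
resource_scope() {
    printf '/subscriptions/%s/resourceGroups/%s/providers/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Placeholder subscription ID for illustration; the real value can be read
# with: az account show --query id --output tsv
SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"

OPENAI_SCOPE=$(resource_scope "$SUBSCRIPTION_ID" graphrag-resources \
    Microsoft.CognitiveServices/accounts graphrag-openai)
STORAGE_SCOPE=$(resource_scope "$SUBSCRIPTION_ID" graphrag-resources \
    Microsoft.Storage/storageAccounts graphragstorage)

echo "$OPENAI_SCOPE"
echo "$STORAGE_SCOPE"
```

The resulting variables can then be passed as --scope "$OPENAI_SCOPE" and --scope "$STORAGE_SCOPE" in the role assignment commands.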
Update configuration
Modify settings.yaml to use managed identity:
completion_models:
  default_completion_model:
    type: chat
    model_provider: azure
    model: gpt-4
    deployment_name: gpt-4-deployment
    api_base: https://graphrag-openai.openai.azure.com
    api_version: 2024-02-15-preview
    auth_method: azure_managed_identity # Use managed identity
    # Remove api_key line
Authenticate Azure CLI
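Log in with the managed identity from an Azure resource that has the identity attached, for example with az login --identity. A user-assigned identity additionally needs its client ID passed to the command.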
Deploy to Azure Container Instances
Run GraphRAG in Azure Container Instances for scheduled indexing:
Create Dockerfile
FROM python:3.11-slim
WORKDIR /app
RUN pip install graphrag
COPY settings.yaml .
COPY .env .
CMD ["graphrag", "index", "--root", "/app"]
Build and push image
Build and push to Azure Container Registry:
# Create container registry (admin user enabled so that "az acr credential show" works at deploy time)
az acr create \
--name graphragregistry \
--resource-group graphrag-resources \
--sku Basic \
--admin-enabled true
# Build and push
az acr build \
--registry graphragregistry \
--image graphrag:latest .
Deploy to ACI
Create a container instance:
az container create \
--resource-group graphrag-resources \
--name graphrag-indexer \
--image graphragregistry.azurecr.io/graphrag:latest \
--cpu 2 \
--memory 4 \
--registry-login-server graphragregistry.azurecr.io \
--registry-username "$(az acr credential show --name graphragregistry --query username -o tsv)" \
--registry-password "$(az acr credential show --name graphragregistry --query "passwords[0].value" -o tsv)" \
--environment-variables \
GRAPHRAG_API_KEY="$GRAPHRAG_API_KEY" \
AZURE_STORAGE_CONNECTION_STRING="$CONNECTION_STRING"
Optional: Azure Cosmos DB storage
For enhanced scalability, use Azure Cosmos DB:
Create Cosmos DB account
az cosmosdb create \
--name graphrag-cosmos \
--resource-group graphrag-resources \
--kind GlobalDocumentDB
Configure in settings.yaml
output:
  type: cosmosdb
  connection_string: ${COSMOS_CONNECTION_STRING}
  database_name: graphrag
  container_name: output
Cost optimization
Model selection
Choose cost-effective models:
Use gpt-3.5-turbo instead of gpt-4 for initial testing
Use text-embedding-3-small instead of text-embedding-3-large
Rate limiting
Configure rate limits to control costs:
completion_models:
  default_completion_model:
    rate_limit:
      requests_per_period: 60
      period_in_seconds: 60
Storage tiers
Use appropriate storage tiers:
Hot tier for active data
Cool tier for archival
Configure lifecycle policies
Provisioned throughput
For Azure OpenAI:
Use provisioned throughput for predictable workloads
Monitor and adjust capacity
Monitoring and logging
Enable diagnostics
Enable diagnostic logging for Azure OpenAI:
az monitor diagnostic-settings create \
--name graphrag-diagnostics \
--resource "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/graphrag-resources/providers/Microsoft.CognitiveServices/accounts/graphrag-openai" \
--logs '[{"category": "RequestResponse", "enabled": true}]' \
--metrics '[{"category": "AllMetrics", "enabled": true}]' \
--workspace YOUR_LOG_ANALYTICS_WORKSPACE_ID
Set up alerts
Create alerts for cost and performance:
az monitor metrics alert create \
--name high-token-usage \
--resource-group graphrag-resources \
--scopes "/subscriptions/YOUR_SUBSCRIPTION_ID/resourceGroups/graphrag-resources/providers/Microsoft.CognitiveServices/accounts/graphrag-openai" \
--condition "total ProcessedPromptTokens > 1000000" \
--description "Alert when token usage exceeds threshold"
Security best practices
Use managed identities - Avoid storing credentials; use Azure Managed Identity
Private endpoints - Configure private endpoints for Azure services
Network security - Implement VNet integration and firewall rules
Key rotation - Automate API key rotation using Key Vault
Next steps
Multi-lingual support - Deploy GraphRAG for multiple languages
Enterprise knowledge - Enterprise deployment patterns
Configuration reference - Complete configuration guide
Azure documentation - Azure OpenAI documentation