Retrieval methods

GraphRAG provides multiple retrieval strategies, each optimized for different types of questions. Understanding when to use each method is key to getting the best results from your knowledge graph.

Overview of retrieval methods

Unlike traditional RAG systems that rely solely on semantic similarity, GraphRAG offers four distinct search approaches:

Global search

Best for: Dataset-wide questions requiring holistic understandingUses community summaries to reason about themes, trends, and high-level patterns across the entire corpus.

Local search

Best for: Entity-specific questions requiring detailed informationCombines knowledge graph structure with text chunks to gather comprehensive information about specific entities.

DRIFT search

Best for: Exploratory queries needing both breadth and depthDynamically combines global and local approaches through iterative question refinement.

Basic search

Best for: Simple semantic similarity queriesTraditional top-k vector search over text units when graph structure isn’t needed.

Global search

Global search addresses a critical weakness in baseline RAG: answering questions that require holistic understanding of an entire dataset.

When to use global search

Ideal queries
Why baseline RAG fails

Thematic questions:

“What are the top 5 themes in this data?”
“What are the main trends discussed?”
“Summarize the overall narrative”

Aggregation questions:

“What are the most significant findings?”
“What patterns emerge across the dataset?”
“What are the key takeaways?”

Comparison questions:

“What are the major differences between X and Y?”
“How do various perspectives compare?”
“What are the competing viewpoints?”

How global search works

Select community level

Choose which hierarchy level to use based on desired granularity:

Root level: Fastest, most abstract (2-5 communities)
Mid level: Balanced detail and coverage
Leaf level: Most comprehensive, slowest (hundreds of communities)

# Configuration
context_builder_params = {
    "community_level": 2,  # Choose hierarchy level
}

Map phase

Community reports are processed in parallel:

Chunk reports: Split reports into token-sized chunks
Generate intermediate responses: Each chunk produces a list of points with importance ratings
Rate points: LLM assigns 1-10 importance scores to each point

Reduce phase

Aggregate intermediate responses:

Rank points: Sort all points by importance rating
Filter: Keep only highest-rated points that fit in context window
Synthesize: LLM generates final response from aggregated points

The final response is a comprehensive answer drawing from across the dataset.

Configuration

Core parameters

from graphrag.query.structured_search.global_search import GlobalSearch

search = GlobalSearch(
    model=model,  # LLM for generation
    context_builder=context_builder,  # Community report builder
    
    # Prompts
    map_system_prompt=map_prompt,  # Map phase instructions
    reduce_system_prompt=reduce_prompt,  # Reduce phase instructions
    
    # Response configuration
    response_type="Multiple Paragraphs",  # Or "Multi-Page Report"
    
    # Token budget
    max_data_tokens=8000,  # Context window for community reports
    
    # LLM parameters
    map_llm_params={"temperature": 0.0, "max_tokens": 1000},
    reduce_llm_params={"temperature": 0.0, "max_tokens": 2000},
    
    # Parallelism
    concurrent_coroutines=10,  # Parallel map operations
)

Advanced options

General knowledge integration:

search = GlobalSearch(
    # ...
    allow_general_knowledge=True,  # Include real-world knowledge
    general_knowledge_inclusion_prompt=knowledge_prompt,
)

True: LLM can incorporate external knowledge beyond dataset
False (default): Responses strictly from indexed data

Enabling general knowledge may increase hallucinations but can be useful for context-setting or filling gaps.

Context builder parameters:

context_builder_params = {
    "community_level": 2,  # Hierarchy level to use
    "use_community_summary": True,  # Include community summaries
    "shuffle_data": True,  # Randomize report order (reduces position bias)
    "include_community_rank": True,  # Weight by community importance
}

Performance vs quality trade-offs

Hierarchy level selection:

Level	Communities	Speed	Detail	Cost
Root (top)	2-5	Fastest	Lowest	Lowest
Mid	10-50	Medium	Medium	Medium
Leaf (0)	100-1000	Slowest	Highest	Highest

Token budget:

Lower (4000-8000): Faster, less comprehensive
Higher (12000-16000): Slower, more comprehensive

Concurrent coroutines:

Higher (20-50): Faster map phase, more API load
Lower (5-10): Slower but more stable

Local search

Local search excels at answering questions about specific entities by combining structured graph knowledge with unstructured text.

When to use local search

Ideal queries
Advantages over baseline RAG

Entity-specific questions:

“What are the healing properties of chamomile?”
“Who is Satya Nadella and what is his role?”
“Describe the relationship between X and Y”

Detailed information needs:

“What does the research say about [specific topic]?”
“What are the characteristics of [entity]?”
“How is [entity] connected to [other entities]?”

Multi-hop reasoning:

“How are A and B related through C?”
“What do A’s connections say about B?”

How local search works

Entity extraction

Identify entities relevant to the query:

Embed query: Convert query to vector embedding
Search entity embeddings: Find semantically similar entities
Rank by similarity: Top-k entities become entry points

# Entity extraction via embedding similarity
extracted_entities = entity_extraction(
    query=query,
    entity_descriptions=entity_descriptions,
    embedding_model=embedding_model,
    top_k=top_k_entities,
)

Graph traversal

Fan out from seed entities to gather related information:Connected entities:

Direct neighbors (1-hop)
Optionally: 2-hop neighbors
Ranked by relationship strength and centrality

Relationships:

All edges connected to seed entities
Edges between gathered entities
Ranked by weight and relevance

Community reports:

Reports for communities containing seed entities
Reports for related entities’ communities
Provides thematic context

Text unit retrieval

Gather source text:

Entity-text mappings: Text units mentioning extracted entities
Rank by relevance: Score text units by:
- Entity importance
- Number of relevant entities mentioned
- Semantic similarity to query
Filter by token budget: Keep top-ranked units that fit

Covariate retrieval

If claims are available:

Entity-claim mappings: Claims about extracted entities
Rank by relevance: Score claims by entity importance and claim type
Include in context: Add to structured context

Context assembly

Combine all components into structured context:

# Entities
- Entity 1: [description]
- Entity 2: [description]

# Relationships
- Entity 1 -> Entity 2: [relationship description]

# Community Reports
- Community X: [summary]

# Sources
- Text Unit 1: [text]
- Text Unit 2: [text]

# Claims
- Claim about Entity 1: [description]

Response generation

LLM generates answer using the structured context:

Answers drawn from multiple sources
Maintains provenance (can cite entities, relationships, text units)
Balances graph structure with text details

Configuration

Core parameters

from graphrag.query.structured_search.local_search import LocalSearch

search = LocalSearch(
    model=model,  # LLM for generation
    context_builder=context_builder,  # Mixed context builder
    
    # Prompt
    system_prompt=system_prompt,  # Instructions for response generation
    
    # Response configuration
    response_type="Multiple Paragraphs",
    
    # LLM parameters
    llm_params={"temperature": 0.0, "max_tokens": 2000},
)

Context builder parameters

Fine-tune what gets included:

context_builder_params = {
    # Entity settings
    "text_unit_prop": 0.5,  # Proportion of budget for text units
    "community_prop": 0.25,  # Proportion for community reports
    "conversation_history_max_turns": 5,  # Conversation context
    "top_k_entities": 10,  # Seed entities from query
    
    # Graph traversal
    "include_entity_rank": True,  # Weight by entity importance
    "include_relationship_weight": True,  # Use relationship weights
    "rank_description": True,  # Rank entities by description relevance
    
    # Community context
    "include_community_rank": True,  # Weight community reports
    "use_community_summary": True,  # Include summaries
    
    # Token management
    "max_tokens": 8000,  # Total context budget
}

Ranking and filtering

How candidates are prioritized:Entity ranking:

Embedding similarity to query
Graph centrality (degree, PageRank)
Community membership importance

Relationship ranking:

Connected to high-ranked entities
Relationship weight
Description relevance to query

Text unit ranking:

Contains high-ranked entities
Number of relevant entities
Semantic similarity to query

Community report ranking:

Contains seed entities
Community size and importance
Summary relevance

DRIFT search

DRIFT (Dynamic Reasoning and Inference with Flexible Traversal) combines the breadth of global search with the depth of local search through iterative refinement.

How DRIFT search works

DRIFT search creates a hierarchical exploration tree with three phases: Primer (global), Follow-up (local), and Output (ranked hierarchy)

Primer phase

Start with global community context:

Retrieve top-k community reports: Most relevant to query
Generate initial answer: Broad response addressing the query
Generate follow-up questions: Questions for deeper exploration
Confidence scoring: Rate each follow-up question’s potential

Follow-up phase

Iteratively refine through local search:

Select highest-confidence question: From pending follow-ups
Execute local search: Detailed entity-based search
Generate intermediate answer: Specific response to follow-up
Generate new follow-ups: Further refinement questions
Update confidence: Re-score based on information gain
Repeat: Until budget exhausted or confidence threshold not met

Output hierarchy

Produce structured result:

Question: [Original query]
└── Global Answer: [Broad response]
    ├── Follow-up 1: [Refined question]
    │   └── Local Answer: [Detailed response]
    │       └── Follow-up 1.1: [Further refinement]
    │           └── Local Answer: [More detail]
    └── Follow-up 2: [Alternative angle]
        └── Local Answer: [Detailed response]

Ranked by relevance for user exploration.

When to use DRIFT search

Ideal scenarios
Advantages

Exploratory queries:

“Tell me about [broad topic]”
“What should I know about [domain]?”
“Explain [complex concept]”

Unknown unknowns:

User doesn’t know specific entities to ask about
Investigating unfamiliar dataset
Discovery-oriented exploration

Balanced needs:

Need both overview and details
Want multiple perspectives
Seeking comprehensive understanding

Configuration

from graphrag.query.structured_search.drift_search import DRIFTSearch
from graphrag.config.models.drift_search_config import DRIFTSearchConfig

config = DRIFTSearchConfig(
    # Primer phase
    primer_folds=3,  # How many community report folds to use
    primer_allow_general_knowledge=False,
    
    # Follow-up phase
    follow_up_max_iterations=3,  # Max follow-up depth
    follow_up_confidence_threshold=0.7,  # Min confidence to continue
    
    # Token budgets
    local_search_text_unit_prop=0.5,
    local_search_community_prop=0.25,
    
    # LLM parameters
    temperature=0.0,
    max_tokens=2000,
)

search = DRIFTSearch(
    model=model,
    context_builder=context_builder,
    config=config,
    tokenizer=tokenizer,
)

Key hyperparameters

primer_folds: Number of community report batches

Higher → more comprehensive initial coverage
Lower → faster primer phase

follow_up_max_iterations: Maximum refinement depth

Higher → more detailed exploration
Lower → faster, less thorough

follow_up_confidence_threshold: Minimum confidence to continue

Higher (0.8-0.9) → only high-value follow-ups
Lower (0.5-0.7) → more exploratory

DRIFT automatically balances exploration depth with computational cost using confidence scoring.

Basic search

Traditional top-k vector similarity search over text units.

When to use basic search

Simple fact lookup questions
Queries with direct semantic matches in text
When graph structure doesn’t add value
Baseline comparison for other methods

# Basic search configuration
basic_search_params = {
    "top_k": 5,  # Number of text units to retrieve
    "embedding_model": "text-embedding-3-small",
}

Basic search is included primarily for comparison and simple use cases. Most queries benefit from local or global search.

Choosing the right method

Decision tree
By use case
By characteristics

Is the question about the dataset as a whole?
├─ Yes → Global Search
│   - Themes, trends, top items
│   - Dataset-wide summaries
│   - Comparative analysis across corpus
│
└─ No → Is it about specific entities?
    ├─ Yes → Local Search
    │   - Entity attributes and relationships
    │   - Detailed information needs
    │   - Multi-hop reasoning
    │
    └─ No → Is it exploratory?
        ├─ Yes → DRIFT Search
        │   - Broad topic investigation
        │   - Unknown information needs
        │   - Comprehensive understanding
        │
        └─ No → Basic Search
            - Simple fact lookup
            - Direct semantic match

Use Case	Best Method	Why
”What are the main themes?”	Global	Requires dataset-wide understanding
”What does entity X do?”	Local	Specific entity information
”Tell me about topic Y”	DRIFT	Exploratory, needs breadth and depth
”Find mentions of Z”	Basic	Simple semantic search
”Compare A and B”	Local	Specific entities and relationships
”Summarize the data”	Global	Holistic overview
”How are A and B connected?”	Local	Multi-hop graph traversal
”What’s important here?”	DRIFT	Discovery-oriented

Performance considerations

Speed

Fastest to slowest:

Basic search
Local search
Global search (root level)
Global search (leaf level)
DRIFT search

Cost

LLM token usage:

Basic: Minimal (generation only)
Local: Moderate (one generation call)
Global: High (map-reduce = many calls)
DRIFT: Highest (global + multiple local)

Quality

For appropriate query types:

Global: Excellent for themes
Local: Excellent for entities
DRIFT: Excellent for exploration
Basic: Good for simple facts

Scalability

Large datasets:

Basic: Scales well (vector search)
Local: Scales moderately (graph size)
Global: Depends on hierarchy level
DRIFT: Resource-intensive

Best practices

Start with the right method

Don’t default to one method:

Analyze the query type
Consider information needs
Choose appropriate method
Evaluate results

Tune for your use case

Global: Adjust community level based on detail needs
Local: Tune context proportions for your data
DRIFT: Balance exploration depth with cost
All: Optimize token budgets

Combine methods

Consider hybrid approaches:

Try local first, fall back to global
Use basic search for filtering, then local for details
DRIFT for exploration, local for follow-up

Monitor performance

Track metrics:

Query latency
Token usage and cost
Result quality (user feedback)
Adjust parameters accordingly

Next steps

Get started guide

Set up GraphRAG and run your first queries

Configuration

Configure retrieval parameters for your use case

​Overview of retrieval methods

Global search

Local search

DRIFT search

Basic search

​Global search

​When to use global search

​How global search works

​Configuration

​Local search

​When to use local search

​How local search works

​Configuration

​DRIFT search

​How DRIFT search works

​When to use DRIFT search

​Configuration

​Basic search

​When to use basic search

​Choosing the right method

​Performance considerations

Speed

Cost

Quality

Scalability

​Best practices

​Next steps

Get started guide

Configuration

Overview of retrieval methods

Global search

When to use global search

How global search works

Configuration

Local search

When to use local search

How local search works

Configuration

DRIFT search

How DRIFT search works

When to use DRIFT search

Configuration

Basic search

When to use basic search

Choosing the right method

Performance considerations

Best practices

Next steps