Think-on-Graph: Deep and Responsible Reasoning of Large Language Model with Knowledge Graph
TL;DR Highlight
LLMs traverse Knowledge Graphs step-by-step to reduce hallucinations and improve accuracy.
Who Should Read
Backend/AI developers who want to reduce LLM hallucinations. Especially useful when building domain-knowledge-based QA systems or fact-checking pipelines.
Core Mechanics
- Uses LLM as an agent on a Knowledge Graph (KG), traversing KG nodes step-by-step (beam search style) to answer questions
- Instead of fetching text chunks like traditional RAG, constructs reasoning paths by following entity-relation triples in the KG
- The LLM autonomously decides which relation to follow at each traversal step and self-evaluates whether the answer is sufficient
- Reasoning is traceable via KG paths, making the answer explainable
- Queries external KGs (Freebase, Wikidata, etc.) in real-time, so no model retraining needed for up-to-date knowledge
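The traversal only needs two KG primitives: the relations leaving a node, and the neighbors reachable along a chosen relation. A minimal in-memory sketch of such a client (the `SimpleKGClient` name and the sample triples are illustrative, not from the paper):

```python
# Minimal in-memory KG client exposing the two lookups a ToG-style
# traversal needs: outgoing relations and neighbors along a relation.
class SimpleKGClient:
    def __init__(self, triples):
        self.triples = triples  # list of (subject, relation, object)

    def get_relations(self, node):
        # All distinct relations leaving this node
        return sorted({r for s, r, o in self.triples if s == node})

    def get_neighbors(self, node, relation):
        # All objects reachable from node via the given relation
        return [o for s, r, o in self.triples if s == node and r == relation]

kg = SimpleKGClient([
    ("Iron Man", "made_by", "Marvel"),
    ("Iron Man", "starring", "Robert Downey Jr."),
    ("Robert Downey Jr.", "born_in", "New York City"),
])

print(kg.get_relations("Iron Man"))              # ['made_by', 'starring']
print(kg.get_neighbors("Iron Man", "starring"))  # ['Robert Downey Jr.']
```

In a real deployment the same two-method interface can wrap a SPARQL endpoint or a graph database, so the traversal loop does not need to change.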
Evidence
- ~8-12% accuracy improvement over traditional RAG on KGQA benchmarks (WebQSP, CWQ)
- Up to 15%+ performance improvement over standalone Chain-of-Thought on multi-hop reasoning questions
- Provides explicit reasoning paths (traceability) compared to black-box LLMs
How to Apply
- If you have a domain KG (e.g., medical or legal ontology), extract entities from user queries and build a KG traversal agent with an LLM to create a multi-hop answer pipeline
- Replacing chunk retrieval with KG triple path search in your RAG pipeline improves accuracy, especially for relational questions like 'who has what relationship with whom'
- Present a list of KG relation candidates in the LLM prompt and let it choose which relation to explore — you can connect to Freebase/Wikidata APIs for a quick implementation
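For a quick experiment against Wikidata, one option is to list an entity's outgoing relations with a SPARQL query against the public query service. A hedged sketch (the helper names are illustrative; the query shape uses standard Wikidata conventions, and actually sending the request needs network access, so only the URL is built here):

```python
# Sketch: build a SPARQL query listing a Wikidata entity's outgoing
# direct relations, plus the GET URL for the public query service.
from urllib.parse import urlencode

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

def relation_query(entity_qid, limit=20):
    # Distinct direct properties (wdt:) leaving the entity, with English labels
    return f"""
SELECT DISTINCT ?prop ?propLabel WHERE {{
  wd:{entity_qid} ?p ?o .
  ?prop wikibase:directClaim ?p .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}} LIMIT {limit}"""

def request_url(entity_qid):
    # GET URL for the query service; format=json asks for JSON results
    return WIKIDATA_SPARQL + "?" + urlencode(
        {"query": relation_query(entity_qid), "format": "json"}
    )

# Q42 is Douglas Adams in Wikidata, used here as a known example entity
print(request_url("Q42")[:60])
```

The returned property labels can be pasted directly into the relation-selection prompt, which keeps the LLM choosing among real KG relations instead of inventing them.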
Code Example
```python
# Think-on-Graph core loop (pseudo-code)
def think_on_graph(question, kg_client, llm, max_depth=3, beam_width=3):
    # 1. Extract starting entities from the question
    start_entities = llm.extract_entities(question)
    paths = [(entity, []) for entity in start_entities]  # (current node, path so far)

    for depth in range(max_depth):
        candidates = []
        for current_node, path in paths:
            # 2. Retrieve relation candidates for the current node from the KG
            relations = kg_client.get_relations(current_node)

            # 3. LLM selects the relations worth exploring
            prompt = f"""
Question: {question}
Current exploration node: {current_node}
Path so far: {path}
Available relations: {relations}
From the above relations, select the ones to explore in order to answer the question.
If none are relevant, output 'none'."""
            # assumed to be parsed into a list of relation names ('none' -> empty list)
            selected_relations = llm.call(prompt)

            for rel in selected_relations[:beam_width]:
                for node in kg_client.get_neighbors(current_node, rel):
                    candidates.append((node, path + [(current_node, rel, node)]))

        if not candidates:
            break  # nothing left to expand; fall through to the final answer

        # 4. LLM judges whether the explored paths suffice to answer
        answer_check_prompt = f"""
Question: {question}
Explored paths and entities: {candidates}
Can the question be answered with the current information?
If yes, provide the answer; otherwise, output 'continue'."""
        result = llm.call(answer_check_prompt)
        if result != 'continue':
            return result, candidates  # return answer + supporting paths

        paths = candidates[:beam_width]  # keep only the top-k beam

    # Fallback: force a final answer from whatever was collected
    return llm.call(
        f"Question: {question}\nCollected information: {paths}\nFinal answer:"
    ), paths
```
Terminology
Knowledge Graph (KG): A database that stores entities (people, places, concepts) and their relationships as a node-edge graph. 'Iron Man → made by → Marvel' expresses a fact as a connected link.
Hallucination: When an LLM confidently states something that is not true. The model plausibly recombines patterns from its training data and fabricates non-existent facts.
Multi-hop reasoning: Reasoning that requires multiple steps to reach an answer. E.g., 'What is the population of the Iron Man actor's birthplace?' → Iron Man → actor → birthplace → population.
Beam Search: A search strategy that keeps only the top-k candidates at each step instead of exploring every possibility. Like a navigation app showing only a few optimal routes.
Triple: The minimum knowledge unit in a KG, stored as (subject, relation, object). E.g., (Iron Man, made by, Marvel).
RAG: Retrieval-Augmented Generation. A technique that retrieves external documents to supplement LLM answers when the model's internal knowledge is insufficient.
Explainability: The ability of an AI system to explain why it gave a certain answer. The opposite of a black box: the reasoning process can be verified by humans.