HIN-LLM Enhanced Knowledge Tracing
- The paper introduces HISE-KT, a novel framework combining HIN modeling with LLM-driven meta-path quality assessment for accurate prediction of student performance.
- HISE-KT constructs a multi-relationship heterogeneous network and employs automated meta-path selection alongside student similarity retrieval to enhance interpretability.
- Empirical results show up to +9% AUC improvement on public datasets, alongside clear, evidence-backed explanations; the design mitigates meta-path noise and improves prediction transparency.
HIN-LLM Synergistic Enhanced Knowledge Tracing (HISE-KT) is a knowledge tracing framework integrating heterogeneous information networks (HINs) with LLMs to achieve accurate and evidence-based prediction of student performance, together with interpretable explanations. HISE-KT departs from earlier models prone to meta-path noise and explanation inconsistency by introducing automated meta-path quality assessment, student-similarity retrieval, and structured prompt engineering, each informed by educational psychology and optimized with LLM capabilities. The method demonstrates substantial predictive and interpretative improvement on public knowledge tracing benchmarks (Duan et al., 19 Nov 2025).
1. Multi-Relationship Heterogeneous Information Network Construction
HISE-KT models educational data as a multi-relationship heterogeneous information network (MRHIN) G = (V, E), where V is the set of nodes, partitioned into the following types:
- S: students,
- Q: questions,
- C: knowledge concepts,
- A: student-ability levels (Low/Medium/High),
- D: question-difficulty levels (Low/Medium/High).
The edge-type set E encodes the relationships: student answered question (S–Q), question involves concept (Q–C), question has difficulty level (Q–D), and student has ability level (S–A). Each relation r is associated with a binary adjacency matrix A_r such that A_r[i, j] = 1 iff nodes i and j are linked under r.
This formalization enables joint encoding of interactions, content, skill level, and student ability, forming the substrate for cross-semantic meta-path reasoning (Duan et al., 19 Nov 2025).
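As a minimal sketch, the MRHIN can be stored as one binary edge set per relation type; the toy interaction log, concept/difficulty tags, and ability bins below are illustrative stand-ins, not data from the paper:

```python
from collections import defaultdict

# Toy interaction log plus question/student metadata -- all illustrative.
LOG = [("u1", "q1", 1), ("u1", "q2", 0), ("u2", "q1", 1), ("u2", "q3", 1)]
Q_CONCEPT = {"q1": "c1", "q2": "c1", "q3": "c2"}
Q_DIFFICULTY = {"q1": "Low", "q2": "High", "q3": "Medium"}
U_ABILITY = {"u1": "Medium", "u2": "High"}  # e.g. binned IRT abilities

def build_mrhin(log, q_concept, q_diff, u_ability):
    """Store the MRHIN as one binary edge set per relation type."""
    edges = defaultdict(set)
    for u, q, _ in log:
        edges["answered"].add((u, q))      # S--Q
    for q, c in q_concept.items():
        edges["involves"].add((q, c))      # Q--C
    for q, d in q_diff.items():
        edges["difficulty"].add((q, d))    # Q--D
    for u, a in u_ability.items():
        edges["ability"].add((u, a))       # S--A
    return edges

G = build_mrhin(LOG, Q_CONCEPT, Q_DIFFICULTY, U_ABILITY)
```

Keeping each relation in its own edge set preserves the heterogeneity that meta-path traversal depends on, rather than flattening everything into a single graph.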
2. Meta-Path Specification and Instantiation
A family of meta-path templates defines possible cross-type traversals:
- Basic instances use a single intermediate node type or a short chain, e.g. S–Q–S (students who answered the same question), Q–C–Q (questions sharing a concept), Q–D–Q (questions of equal difficulty), and S–Q–C–Q–S (students connected through questions on a shared concept).
- Composite meta-paths, e.g. traversals that additionally route through ability or difficulty nodes such as S–A–S–Q–C–Q–S, encode more complex semantic traversals.
Instantiated paths conform to both type and edge constraints: each node on a path p = (v_1, …, v_l) must carry the node type prescribed at its position in the template, and every consecutive pair must satisfy (v_i, v_{i+1}) ∈ E. Meta-paths encode both direct (single-hop) and higher-order (multi-hop, cross-domain) relational semantics, supporting nuanced aggregation beyond classic neighbor-based HIN analytics.
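Meta-path instantiation under these constraints can be sketched as typed walk enumeration; the toy node typing and edge set below are assumed purely for illustration:

```python
# Toy typed graph: node -> type label, plus undirected edges.
NODE_TYPE = {"u1": "S", "u2": "S", "q1": "Q", "q2": "Q", "c1": "C"}
EDGES = {("u1", "q1"), ("u2", "q1"), ("u1", "q2"), ("q1", "c1"), ("q2", "c1")}
ADJ = {}
for a, b in EDGES:
    ADJ.setdefault(a, set()).add(b)
    ADJ.setdefault(b, set()).add(a)

def instances(template):
    """Enumerate simple paths whose node types match the template exactly."""
    paths = [[v] for v, t in NODE_TYPE.items() if t == template[0]]
    for t in template[1:]:
        paths = [p + [w] for p in paths for w in ADJ.get(p[-1], ())
                 if NODE_TYPE[w] == t and w not in p]
    return paths

sqs = instances(("S", "Q", "S"))  # students linked via a shared question
```

Each extension step enforces both constraints at once: the candidate node must have the template's next type, and the edge to it must exist in the graph.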
3. LLM-Based Meta-Path Quality Assessment and Selection
HISE-KT employs an LLM to automatically quantify the quality of instantiated meta-paths along four axes:
- Question Centrality (S_QC): encourages paths tightly centered on the target question by penalizing the average shortest-path distance to it.
- Knowledge-Concept Relevance (S_KR): measures the overlap between the questions on the path and the target concept.
- Informativeness (S_IF): rewards distinct node instances on the path, excluding the target student and target question themselves.
- Node-Type Diversity (S_TD): penalizes homogeneity among ability/difficulty subtypes via a level-entropy term.
Each dimension is scored on a bounded scale, and the four scores are summed to a total S(p) = S_QC(p) + S_KR(p) + S_IF(p) + S_TD(p).
For each meta-path template, only the Top-K instantiated paths with maximal S(p) are retained, replacing earlier heuristic, random, or manual selection strategies (Duan et al., 19 Nov 2025).
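Assuming the four sub-scores are already available (in the paper they come from LLM judgments; the numbers below are invented stand-ins), Top-K selection reduces to ranking by the summed total:

```python
# Stand-in quality scores for three candidate paths -- illustrative only.
CANDIDATES = [
    (("u1", "q1", "u2"), {"centrality": 0.9, "relevance": 0.8, "informativeness": 0.7, "diversity": 0.5}),
    (("u1", "q2", "u3"), {"centrality": 0.4, "relevance": 0.3, "informativeness": 0.6, "diversity": 0.9}),
    (("u2", "q1", "u3"), {"centrality": 0.8, "relevance": 0.9, "informativeness": 0.5, "diversity": 0.6}),
]

def total_score(dims):
    """Sum the four per-dimension scores into the total S(p)."""
    return sum(dims[k] for k in ("centrality", "relevance", "informativeness", "diversity"))

def top_k(candidates, k):
    """Keep the k paths with the highest total score."""
    ranked = sorted(candidates, key=lambda ps: total_score(ps[1]), reverse=True)
    return [path for path, _ in ranked[:k]]

best = top_k(CANDIDATES, k=2)  # totals: 2.9, 2.2, 2.8
```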
4. Meta-Path-Aware Student Similarity Retrieval
For a target student u and question q_0, HISE-KT extracts all students that co-occur with u on the retained Top-K meta-paths. Each candidate v is represented by a feature vector x_v comprising: its IRT ability, its per-concept accuracy, the number of questions shared with u, the number of concepts shared with u, and its co-occurrence frequency on the retained paths, the latter down-weighted by a decay constant.
Student similarity is measured by the Mahalanobis distance d_M(u, v) = sqrt((x_u − x_v)^T Σ^{-1} (x_u − x_v)), with the covariance Σ estimated from the candidate population. The Top-S most similar students are selected to yield a context pool containing the full historical trajectory of each selected student (Duan et al., 19 Nov 2025).
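A sketch of the Top-S retrieval, assuming five-dimensional feature vectors laid out as in the text (ability, per-concept accuracy, shared questions, shared concepts, co-occurrence frequency) with made-up values; the sample covariance is regularized so the tiny toy sample stays invertible:

```python
import numpy as np

# Candidate feature vectors -- all numbers are invented for illustration.
TARGET = np.array([0.6, 0.7, 3.0, 2.0, 5.0])
CANDIDATES = [
    ("u2", [0.6, 0.7, 3.0, 2.0, 5.0]),  # identical to the target
    ("u3", [0.1, 0.2, 0.0, 1.0, 1.0]),
    ("u4", [0.5, 0.6, 2.0, 2.0, 4.0]),
]

def top_s(target, candidates, s):
    """Rank candidates by Mahalanobis distance to the target; keep Top-S."""
    X = np.array([x for _, x in candidates], dtype=float)
    # Regularize the population covariance so it is invertible here.
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    inv = np.linalg.inv(cov)
    def dist(x):
        d = np.asarray(x, dtype=float) - target
        return float(np.sqrt(d @ inv @ d))
    return sorted(candidates, key=lambda kv: dist(kv[1]))[:s]

nearest = top_s(TARGET, CANDIDATES, s=2)
```

Unlike plain Euclidean distance, the Mahalanobis form rescales each feature by the population covariance, so no single dimension (e.g. raw co-occurrence counts) dominates the similarity ranking.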
5. Structured Prompt Engineering and Explainable Prediction
HISE-KT leverages a structured prompt which concatenates:
- Target student summary: student ID, IRT ability, and interaction history,
- Target question summary: question ID, concept, difficulty, discrimination, and prior student accuracy,
- Similar-students context: for each of the Top-S students, their ability, their history on the target question's concept, and their accuracy.
The full prompt ends with an instruction:
Based on the above, predict: 1. Will student u_id answer q_0 correctly? (correct/wrong) with probability. 2. Provide a three-sentence analysis citing evidence from the retained meta-paths and the similar-student context.
The LLM produces both a point prediction and an explanation referencing evidence paths and similar students, thereby coupling performance and interpretability. The design enforces zero-shot generality and supports automated, evidence-citing explanations.
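The prompt assembly itself can be sketched as plain string concatenation over the three sections; all field names and example records below are hypothetical:

```python
def build_prompt(student, question, peers):
    """Concatenate the three prompt sections plus the final instruction."""
    lines = [
        f"Target student {student['id']}: ability={student['ability']:.2f}, "
        f"history={student['history']}",
        f"Target question {question['id']}: concept={question['concept']}, "
        f"difficulty={question['difficulty']}, prior accuracy={question['acc']:.2f}",
        "Similar students:",
    ]
    for p in peers:
        lines.append(f"- {p['id']}: ability={p['ability']:.2f}, answers={p['answers']}")
    lines.append(
        "Based on the above, predict: 1. Will the target student answer the "
        "target question correctly? (correct/wrong) with probability. "
        "2. Provide a three-sentence analysis citing the evidence."
    )
    return "\n".join(lines)

prompt = build_prompt(
    {"id": "u1", "ability": 0.62, "history": ["q3:1", "q4:0"]},
    {"id": "q0", "concept": "fractions", "difficulty": "Medium", "acc": 0.55},
    [{"id": "u7", "ability": 0.60, "answers": ["q0:0"]}],
)
```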
6. Complete Model Workflow
The HISE-KT pipeline proceeds as follows:
- Construct the MRHIN from the dataset and IRT calculations.
- For each meta-path template: a. enumerate (or sample) path instances; b. for each path, query the LLM for the four quality scores and sum them into a total; c. retain the Top-K paths by total score.
- Aggregate all students appearing on the retained paths.
- For each candidate student: compute its feature vector and Mahalanobis distance to the target; select the Top-S most similar for context.
- Compile the structured prompt from the target student, target question, and similar-student metadata, then call the LLM for prediction and explanation.
This systematic workflow supports joint optimization of both knowledge-tracing accuracy and interpretability, unifying the strengths of HIN modeling and LLM-based reasoning (Duan et al., 19 Nov 2025).
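The workflow above can be condensed into a control-flow skeleton in which every stage (path enumeration, scoring, similarity retrieval, prompting, LLM call) is an injected stand-in callable; nothing here reflects the paper's actual implementations:

```python
def hise_kt_predict(graph, templates, u, q, *,
                    enumerate_paths, score, similar, prompt_fn, llm, k=5, s=3):
    """Control flow of the pipeline; every stage is an injected callable."""
    retained = []
    for tpl in templates:
        paths = enumerate_paths(graph, tpl, u, q)               # instantiate
        retained += sorted(paths, key=score, reverse=True)[:k]  # Top-K by score
    peers = similar(retained, u)[:s]                            # Top-S students
    return llm(prompt_fn(u, q, peers, retained))                # predict + explain

# Trivial stand-ins so the skeleton runs end to end.
out = hise_kt_predict(
    graph=None, templates=[("S", "Q", "S")], u="u1", q="q0",
    enumerate_paths=lambda g, t, u, q: [("u1", "q0", "u2"), ("u1", "q0", "u3")],
    score=lambda p: len(set(p)),
    similar=lambda paths, u: ["u2", "u3"],
    prompt_fn=lambda u, q, peers, paths: f"predict {u} on {q} given peers {peers}",
    llm=lambda prompt: {"prediction": "correct", "probability": 0.7, "prompt": prompt},
    k=1, s=2,
)
```

Keeping the stages as injected callables makes each one independently swappable, e.g. replacing the LLM scorer with a cheaper heuristic for ablation.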
7. Empirical Performance and Interpretability
HISE-KT was evaluated on four public datasets (Assistment09, Slepemapy, Statics2011, Frcsub). Table 1 displays peak AUC results for HISE-KT (Qwen variant) and leading baselines:
| Dataset | HISE-KT_Qwen AUC | Best Previous Baseline (Method, AUC) |
|---|---|---|
| Assistment09 | 0.8703 | CoKT 0.8211 |
| Slepemapy | 0.9749 | STHKT 0.8574 |
| Statics2011 | 0.8888 | TCL4KT 0.8357 |
| Frcsub | 0.9482 | CoKT 0.9238 |
AUC improvements reach up to +9%. Interpretability, judged via human assessments and path-citation metrics, consistently exceeded all baselines: explanatory outputs explicitly cited relevant meta-paths and student trajectories. For example, on Assistment09 the LLM's explanation referenced both meta-path evidence and similar peers' incorrect answers to motivate its prediction of a wrong answer on the target item (Duan et al., 19 Nov 2025).
8. Context and Comparison to Related Work
SINKT (Fu et al., 2024) also deploys a heterogeneous graph and LLM-based message-passing, but focuses primarily on student-inductive generalization and relies on LLMs for semantic initialization and graph expansion, with predictions realized through machine-learned encoders rather than prompt-driven explanation. In contrast, HISE-KT systematically integrates LLMs for both meta-path instance selection and final explanatory prediction, and introduces automated, fine-grained path scoring and student-similarity context aggregation. This suggests a broader applicability in environments where explainable, evidence-grounded predictions are requisite.
A plausible implication is that the HISE-KT paradigm can be extended to other domains where HINs and LLMs can be co-optimized for both prediction and interpretability, especially in educational recommender and adaptive tutoring systems. Current results represent a significant development in LLM-assisted, interpretable, evidence-backed knowledge tracing (Duan et al., 19 Nov 2025, Fu et al., 2024).