Retrieval Evidence Quality Prediction for RAG Hallucination Detection with LLM-Derived Semantic Features

Haruki Sato

doi:10.69987/JACS.2026.60603

Authors

Haruki Sato, Electrical Engineering and Computer Science, Tohoku University, Sendai, MYG, Japan

Keywords:

retrieval-augmented generation, hallucination detection, evidence quality, semantic support, factual consistency, calibration, RAGTruth, machine learning evaluation

Abstract

Retrieval-augmented generation (RAG) reduces unsupported generation by supplying external evidence, but retrieved passages do not guarantee that the final answer is grounded. This study treats retrieval evidence quality as a direct risk signal for hallucination detection. The evaluation uses a 17,790-row RAGTruth-processed dataset containing queries, retrieved contexts, generated outputs, hallucination annotations, quality indicators, and generator metadata. The binary target identifies whether an output contains hallucinated content, while secondary analysis distinguishes evident conflict from baseless information. A reproducible feature pipeline measures lexical coverage, sentence-level support, semantic support, entity drift, number drift, length relations, and metadata. Six methods are compared: a majority baseline, metadata logistic regression, TF-IDF logistic regression, evidence-quality logistic regression, an evidence-plus-metadata random forest, and a semantic evidence-quality gradient boosting model. On the held-out test split, the proposed model achieves 0.950 accuracy, 0.944 precision, 0.910 recall, 0.927 F1, 0.944 macro-F1, 0.984 AUROC, and 0.979 AUPRC. It identifies 858 of 943 hallucinated outputs while correctly preserving 1,706 of 1,757 grounded outputs. Ablation and subgroup results show that semantic support is most effective when combined with generation metadata and that relation-level conflict remains harder than unsupported entity or numeric insertion. The results establish evidence quality as an auditable intermediate variable for RAG safety and support calibrated decisions to show, warn, regenerate, abstain, or escalate an answer.
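The threshold-based metrics reported above can be cross-checked from the stated counts alone. The following is a minimal sketch, assuming that hallucinated outputs are the positive class and that the held-out test split consists of exactly the 943 hallucinated and 1,757 grounded outputs mentioned in the abstract. The function name `metrics_from_counts` is hypothetical and not part of the study's pipeline. AUROC and AUPRC depend on the model's scores and cannot be recovered from these counts.

```python
def metrics_from_counts(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute threshold-based metrics from binary confusion-matrix counts.

    Assumption: the positive class is "hallucinated", the negative class is "grounded".
    """
    # Positive-class (hallucinated) precision, recall, and F1
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)

    # Negative-class (grounded) F1, needed for the macro average
    neg_precision = tn / (tn + fn)
    neg_recall = tn / (tn + fp)
    neg_f1 = 2 * neg_precision * neg_recall / (neg_precision + neg_recall)

    accuracy = (tp + tn) / (tp + fn + tn + fp)
    macro_f1 = (f1 + neg_f1) / 2

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "macro_f1": macro_f1,
    }


# Counts taken from the abstract: 858 of 943 hallucinated outputs detected,
# 1,706 of 1,757 grounded outputs preserved.
tp, fn = 858, 943 - 858      # detected and missed hallucinations
tn, fp = 1706, 1757 - 1706   # preserved and wrongly flagged grounded outputs

report = metrics_from_counts(tp, fn, tn, fp)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the derived values round to the reported 0.950 accuracy, 0.944 precision, 0.910 recall, 0.927 F1, and 0.944 macro-F1, so the counts and the headline metrics are mutually consistent.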