ConRAG: Contradiction-Aware Retrieval-Augmented Generation under Multi-Source Conflicting Evidence
DOI: https://doi.org/10.69987/JACS.2024.40705

Keywords: retrieval-augmented generation, contradiction detection, natural language inference, evidence structuring, citation evaluation, hallucination robustness

Abstract
Retrieval-augmented generation (RAG) grounds language-model outputs in external evidence, but it often fails when the retrieved material contains genuine disagreements. In multi-source environments, a retriever can return passages that are all relevant yet mutually inconsistent. A standard generator may then merge incompatible evidence into a single narrative, leading to self-contradictions, unstable stance decisions, and citations that are difficult to verify. We propose ConRAG, a contradiction-aware RAG framework that makes conflict explicit and actionable. ConRAG consists of two coordinated stages. The analysis stage (A-stage) tags each retrieved passage with an NLI-style relation to the query (Support, Refute, or Irrelevant), clusters passages into internally consistent evidence groups, and computes a conflict score that quantifies disagreement strength. The generation stage (G-stage) follows a constrained protocol: it first outputs an evidence table, then adjudicates the stance with calibrated uncertainty, and finally generates an answer where every nontrivial sentence is bound to traceable citations.
We define an evaluation suite spanning stance correctness and evidence quality (FEVER, SciFact), citation precision and recall (ALCE), and hallucination robustness (RAGTruth). We implement ConRAG end-to-end and conduct full empirical evaluations on the official splits of these benchmarks. All tables and figures report measured results obtained from actual system runs under a fixed and reproducible evaluation protocol (consistent preprocessing, identical retrieval/generation budgets across methods, and controlled random seeds).
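To make the A-stage concrete, the following is a minimal illustrative sketch of how passages tagged with NLI-style relations could be grouped into evidence groups and scored for disagreement. The conflict formula 2·min(s, r)/(s + r), the function names, and the label set are assumptions for illustration, not the paper's actual definitions.

```python
from collections import Counter, defaultdict

# Hypothetical label set for NLI-style relations to the query.
LABELS = ("Support", "Refute", "Irrelevant")

def group_evidence(tagged_passages):
    """Cluster passages into internally consistent evidence groups.

    This toy version groups purely by relation label; a real system would
    additionally cluster by content within each stance.
    tagged_passages: list of (passage_text, label) pairs.
    """
    groups = defaultdict(list)
    for text, label in tagged_passages:
        groups[label].append(text)
    return dict(groups)

def conflict_score(tagged_passages):
    """Quantify disagreement strength between Support and Refute evidence.

    Assumed formula (not from the paper): 2 * min(s, r) / (s + r) over
    stance-bearing passages, so 0.0 means no conflict and 1.0 means a
    perfectly balanced conflict. Irrelevant passages are ignored.
    """
    counts = Counter(label for _, label in tagged_passages)
    s, r = counts["Support"], counts["Refute"]
    if s + r == 0:
        return 0.0
    return 2 * min(s, r) / (s + r)
```

For example, two supporting passages against one refuting passage yield a conflict score of 2/3, signaling a real but asymmetric disagreement that the G-stage would surface in its evidence table before adjudicating a stance.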
