Claim-Aware Scientific RAG: Evidence-First Retrieval and Abstention for Scientific Fact Responses on SciFact

Authors

  • Jing Chen, Industrial Engineering and Operations Research, UCB, CA, USA
  • Xinzhuo Sun, Computer Science, Cornell Tech, NY, USA
  • Vincent Brown, Information Technology, Illinois Tech, IL, USA

DOI:

https://doi.org/10.69987/JACS.2023.30102

Keywords:

RAG, fact verification, evidence-grounded generation, hallucination reduction, SciFact, BEIR, abstention, reranking, hybrid retrieval

Abstract

Retrieval-augmented generation (RAG) is widely adopted to reduce hallucinations, yet most systems still answer even when retrieval fails, producing fluent but unsupported “scientific facts”. This paper studies a claim-aware scientific RAG design principle: the system is allowed to answer only when it can cite evidence. We conduct full experimental evaluations on the SciFact scientific claim retrieval task using the BEIR-style SciFact split (5,183 abstracts; 809 training claims; 300 test claims). We compare a sparse BM25 retriever, a contrastive dense dual-encoder, and a hybrid retriever using reciprocal rank fusion (RRF), followed by an interaction-based reranker. We then add an evidence layer that extracts candidate citation sentences and scores them with a lightweight verifier, and we enforce an abstention gate that refuses to answer when confidence is low. On the SciFact test set, BM25 achieves nDCG@10=0.662 and Recall@100=0.883. The dense retriever alone underperforms (nDCG@10=0.537), but hybrid RRF improves Recall@100 to 0.923 and a reranker recovers nDCG@10 to 0.659. For evidence extraction, token-level evidence F1 reaches 0.190 when selecting two sentences. Finally, we quantify a refusal–hallucination tradeoff via confidence-based abstention: gating by the top-1 BM25 score reduces the rate of answers without any relevant abstract in the top-10 from 0.193 to 0.047 at 28.3% answer coverage. These results provide a reproducible baseline showing how evidence-first retrieval and calibrated refusal can be combined to control hallucinations in scientific RAG.
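The two mechanisms central to the abstract — reciprocal rank fusion over sparse and dense rankings, and a confidence-based abstention gate on the top-1 retrieval score — can be sketched briefly. This is an illustrative sketch only: the function names, the RRF constant `k=60`, and the gate threshold are assumptions, not values reported in the paper.

```python
# Illustrative sketch of reciprocal rank fusion (RRF) and a
# confidence-based abstention gate, as described in the abstract.
# All names and thresholds here are hypothetical, not the paper's.

def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of doc IDs: score(d) = sum over lists of 1/(k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Higher fused score = better; ties broken arbitrarily.
    return sorted(scores, key=scores.get, reverse=True)

def should_answer(top1_bm25_score, threshold=20.0):
    """Abstention gate: answer only when the top-1 retrieval score is high enough."""
    return top1_bm25_score >= threshold

# Toy rankings from a sparse and a dense retriever.
bm25_rank = ["d3", "d1", "d7"]
dense_rank = ["d1", "d9", "d3"]
fused = rrf_fuse([bm25_rank, dense_rank])  # "d1" ranks first: it appears high in both lists
```

Sweeping the gate threshold traces out the refusal–hallucination tradeoff the abstract quantifies: a stricter threshold lowers the rate of unsupported answers at the cost of answer coverage.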

Published

2023-01-07

How to Cite

Jing Chen, Xinzhuo Sun, & Vincent Brown. (2023). Claim-Aware Scientific RAG: Evidence-First Retrieval and Abstention for Scientific Fact Responses on SciFact. Journal of Advanced Computing Systems, 3(1), 16-30. https://doi.org/10.69987/JACS.2023.30102
