Evidence-Grounded Trading Desk Risk Memos over SEC Filings: Retrieval-Augmented Generation with XBRL Numeric Verification
DOI:
https://doi.org/10.69987/JACS.2023.30205
Keywords:
SEC filings, XBRL, retrieval-augmented generation, trading desk risk memo, financial statement analysis, numeric verification, hallucination detection, evidence-grounded generation
Abstract
Trading desks need short risk memos that connect market-facing judgments to auditable financial evidence. Large language models can produce fluent summaries, but unsupported claims, citation drift, and arithmetic errors make unverified generation unsafe for financial decision support. This paper presents a retrieval-augmented generation pipeline that grounds trading-desk risk memos in SEC-style XBRL numeric facts and verifies every reported number and derived ratio against the evidence base. The target corpus is the SEC Financial Statement Data Sets for 2023 Q1-Q4. The artifact includes the official data manifest, downloader, parser, evaluation code, and a deterministic SEC-schema fixture used for the local sandbox run. The reported local results are empirical measurements from that fixture with 1,280 filings and 19,200 numeric facts; they are not illustrative placeholders. The same evaluation code runs on the official SEC quarterly ZIP files after download in an internet-enabled environment. Four systems are compared: No-RAG, BM25/TF-IDF text RAG, text RAG with a numeric verifier, and structured XBRL RAG with a numeric verifier. On 300 generated memos, numeric exactness increases from 53.8% for No-RAG to 88.1% for text RAG, 98.4% for verifier RAG, and 100.0% for structured XBRL RAG. Citation precision rises from 0.0% to 63.4%, 91.5%, and 100.0%, respectively. Hallucinated claims fall from 7.48 per memo to zero in the structured verified system. The results demonstrate that RAG alone is insufficient for finance memos and that XBRL-grounded numeric verification directly addresses the failure modes that matter in trading-desk review.
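To make the verification step concrete, the following is a minimal sketch of how a reported number and a derived ratio might be checked against an evidence base of XBRL numeric facts. It is not the paper's implementation. The names `Fact`, `Claim`, `build_index`, `verify_claim`, and `verify_ratio` are hypothetical, the tolerances are illustrative choices, and the example values are invented for demonstration. The `adsh`, `tag`, and `value` fields are assumed to follow the column names of the SEC Financial Statement Data Sets numeric table.

```python
import math
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Fact:
    """One numeric fact from the evidence base (field names are assumptions)."""
    adsh: str    # filing accession number
    tag: str     # XBRL tag, e.g. "Assets"
    value: float


@dataclass(frozen=True)
class Claim:
    """One number stated in a memo, with the filing and tag it cites."""
    adsh: str
    tag: str
    value: float


def build_index(facts: Iterable[Fact]) -> Dict[Tuple[str, str], float]:
    # Key each fact by (filing, tag) so a citation resolves to at most one value.
    return {(f.adsh, f.tag): f.value for f in facts}


def verify_claim(claim: Claim, index: Dict[Tuple[str, str], float],
                 rel_tol: float = 1e-9) -> bool:
    # A claim with no matching fact is unsupported and fails verification.
    expected = index.get((claim.adsh, claim.tag))
    if expected is None:
        return False
    return math.isclose(claim.value, expected, rel_tol=rel_tol)


def verify_ratio(adsh: str, numerator_tag: str, denominator_tag: str,
                 reported: float, index: Dict[Tuple[str, str], float],
                 abs_tol: float = 5e-4) -> bool:
    # Recompute the ratio from the underlying facts instead of trusting the memo.
    num = index.get((adsh, numerator_tag))
    den = index.get((adsh, denominator_tag))
    if num is None or den is None or den == 0:
        return False
    return math.isclose(reported, num / den, rel_tol=0.0, abs_tol=abs_tol)


# Invented example data, for illustration only.
facts = [
    Fact("0000000000-23-000001", "Assets", 1000.0),
    Fact("0000000000-23-000001", "Liabilities", 600.0),
]
index = build_index(facts)

ok_claim = verify_claim(Claim("0000000000-23-000001", "Assets", 1000.0), index)
bad_claim = verify_claim(Claim("0000000000-23-000001", "Assets", 1100.0), index)
ok_ratio = verify_ratio("0000000000-23-000001", "Liabilities", "Assets", 0.6, index)
bad_ratio = verify_ratio("0000000000-23-000001", "Liabilities", "Assets", 0.65, index)
```

In this sketch a claim fails both when its value differs from the cited fact and when no cited fact exists, so numeric mismatches and unsupported citations are handled by the same check. A real evidence base would also need to distinguish facts by period, unit, and reporting dimension, which this sketch omits.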







