[1] Mingzhuo Yu and Zan Li, “An Empirical Comparison of Discrete Video Tokenization Schemes for Video Question Answering and Video Captioning”, AIMLR, vol. 6, no. 2, pp. 27–50, Apr. 2025, doi: 10.69987/AIMLR.2025.60203.