Yuhan Li and Mingzhuo Yu. “Benchmarking CUDA, CuPy, and Triton Kernel Optimizations for 3D Point Cloud Segmentation: An Empirical Comparison of Latency, Memory Efficiency, and GPU Utilization”. Journal of Advanced Computing Systems, vol. 6, no. 5, May 2026, pp. 21-30, https://doi.org/10.69987/JACS.2026.60503.