Yuhan Li and Mingzhuo Yu (2026) “Benchmarking CUDA, CuPy, and Triton Kernel Optimizations for 3D Point Cloud Segmentation: An Empirical Comparison of Latency, Memory Efficiency, and GPU Utilization”, Journal of Advanced Computing Systems, 6(5), pp. 21–30. doi:10.69987/JACS.2026.60503.