AI-Driven Mobile UI Pattern Recognition and Design Topic Mining on RICO: Semantic Clustering and Screenshot-Based Topic Classification

Authors

  • Jason Kuhn, Data Science, University of Pittsburgh, PA, USA
  • Yushan Chen, Service Design, Savannah College of Art and Design, GA, USA
  • Evelyn Chan, Computer Engineering, Dartmouth College, NH, USA

DOI:

https://doi.org/10.69987/JACS.2024.40506

Keywords:

mobile UI, design mining, topic modeling, RICO dataset, vision transformer

Abstract

Mobile UI ecosystems contain recurring layout patterns, interaction structures, and visual motifs that collectively form “design topics”. This paper presents a data-driven pipeline that mines design topics from the RICO v0.1 semantic-annotation subset and then evaluates screenshot-based topic classification. Using 66,261 RICO screens (PNG screenshots paired with JSON view hierarchies containing semantic fields such as componentLabel, iconClass, text, bounds, and clickable), we extract a compact semantic feature vector per screen and apply MiniBatch K-Means (K=20) to obtain interpretable topic clusters. These clusters serve as pseudo-labels for downstream visual recognition. We compare three lightweight models that predict the mined topics from UI screenshots alone: (i) a small convolutional neural network (CNN), (ii) a compact vision transformer (ViT), and (iii) a lightweight vision–language model (LightVLM) trained with contrastive alignment between screenshots and semantic feature vectors. Experiments use a stratified subset of 4,782 screens (train/val/test = 3,000/594/1,188; 150/30/60 per topic) with deterministic seed 42. On the held-out test set, the ViT achieves the strongest overall performance (Accuracy = 0.345, Macro-F1 = 0.284, Macro-AUC = 0.820), outperforming the CNN (Accuracy = 0.222, Macro-F1 = 0.138, Macro-AUC = 0.764) and LightVLM (Accuracy = 0.243, Macro-F1 = 0.189, Macro-AUC = 0.782). We provide topic distribution analysis, clustering visualizations, confusion matrices, and embedding plots to characterize common failure modes. Finally, a semantic-only prototype baseline (Macro-F1 = 0.605, Macro-AUC = 0.945) clarifies how strongly the mined topics are grounded in view-hierarchy semantics.
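As a rough illustration of the "compact semantic feature vector per screen" step, the sketch below walks one view-hierarchy JSON tree and computes normalized counts from the fields named in the abstract (componentLabel, iconClass, text, clickable). The abstract does not specify the actual features, so the label vocabulary `COMPONENT_VOCAB`, the helper names `walk` and `screen_features`, the assumption that child nodes live under a `children` key, and the toy screen are all hypothetical.

```python
from collections import Counter

# Hypothetical label vocabulary; the paper's real feature set is not given in the abstract.
COMPONENT_VOCAB = ["Text", "Image", "Icon", "Text Button", "List Item", "Input"]


def walk(node):
    """Yield a node and all of its descendants (assumes a 'children' list per node)."""
    yield node
    for child in node.get("children") or []:
        yield from walk(child)


def screen_features(root):
    """Return a small feature vector of per-screen ratios for one view hierarchy."""
    # Only nodes that carry a semantic label are counted.
    nodes = [n for n in walk(root) if "componentLabel" in n]
    total = max(len(nodes), 1)  # avoid division by zero on empty screens
    counts = Counter(n["componentLabel"] for n in nodes)

    # Share of each component type, in vocabulary order.
    vec = [counts[label] / total for label in COMPONENT_VOCAB]
    # Share of labeled nodes that are clickable, carry an icon class, or carry text.
    vec.append(sum(1 for n in nodes if n.get("clickable")) / total)
    vec.append(sum(1 for n in nodes if n.get("iconClass")) / total)
    vec.append(sum(1 for n in nodes if n.get("text")) / total)
    return vec


# Toy screen, invented for illustration only.
example_screen = {
    "bounds": [0, 0, 1440, 2560],
    "children": [
        {"componentLabel": "Text Button", "text": "Sign in", "clickable": True},
        {"componentLabel": "Icon", "iconClass": "menu", "clickable": True},
        {"componentLabel": "Text", "text": "Welcome", "clickable": False},
        {"componentLabel": "Image", "clickable": False},
    ],
}

features = screen_features(example_screen)
print(features)
```

Vectors of this kind, stacked over all screens, would be the input to a clustering step such as the MiniBatch K-Means (K=20) described above; the resulting cluster indices then act as the pseudo-labels for the screenshot classifiers.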

Author Biography

  • Evelyn Chan, Computer Engineering, Dartmouth College, NH, USA

Published

2024-05-18

How to Cite

Jason Kuhn, Yushan Chen, & Evelyn Chan. (2024). AI-Driven Mobile UI Pattern Recognition and Design Topic Mining on RICO: Semantic Clustering and Screenshot-Based Topic Classification. Journal of Advanced Computing Systems, 4(5), 67-83. https://doi.org/10.69987/JACS.2024.40506
