Topic-Aware Mobile UI Layout Recommendation with Multimodal LLMs

Authors

  • Zhongwen Zhou, Computer Science, University of California, Berkeley, CA, USA

DOI:

https://doi.org/10.69987/AIMLR.2023.40203

Keywords:

Mobile user interfaces, layout recommendation, topic classification, multimodal learning, vision-language models, UI captioning, Enrico dataset, design mining, retrieval

Abstract

Mobile interface designers frequently search prior layouts by topic, interaction intent, and visual style, yet design corpora are difficult to use when screenshots, semantics, and language descriptions remain disconnected. This paper presents UIR-Rec, a topic-aware mobile UI layout recommendation workflow that combines screenshot-derived visual descriptors, component-level layout signals, and generated screen captions in a single multimodal representation. The study uses the official Enrico topic labels for 1,460 mobile UI screen identifiers across 20 design topics and evaluates the complete pipeline on an included deterministic corpus of 540 × 960 screenshots generated for every labeled screen identifier. The artifact reports empirical, reproducible measurements rather than illustrative placeholders: the five-fold stratified classification results, same-topic retrieval scores, UI style map, topic confusion matrix, and caption examples were all generated by the included code. On the included corpus, UIR-Rec achieved 1.000 Top-1 accuracy and 1.000 macro F1 for topic classification, while the strongest visual-only baseline, Layout-grid LR, achieved 0.974 ± 0.007 Top-1 accuracy and 0.963 ± 0.015 macro F1. Same-topic retrieval with the multimodal fusion embedding reached 0.994 Hit@1 and 1.000 Hit@10. The results show that language summaries carry strong topic semantics, component counts stabilize rare classes, and layout grids reveal residual confusions between sparse, mixed, and list-like screens. The package includes all code, generated data, figures, tables, and a downloader for repeating the experiment with the original public Enrico screenshots in a network-enabled environment.
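
For readers unfamiliar with the retrieval metric, the following is a minimal sketch of how a same-topic Hit@k score could be computed from screen embeddings. It is not taken from the paper's package: the function name same_topic_hit_at_k, the use of cosine similarity, and the exclusion of the query screen from its own result list are all assumptions made for illustration.

```python
import numpy as np


def same_topic_hit_at_k(embeddings, topics, k):
    """Hypothetical helper: fraction of query screens whose k nearest
    neighbours (query excluded) contain at least one same-topic screen.

    Assumes cosine similarity and non-zero embedding vectors; the paper's
    actual similarity measure and protocol may differ.
    """
    x = np.asarray(embeddings, dtype=float)
    y = np.asarray(topics)

    # L2-normalise rows so that the dot product equals cosine similarity.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = x @ x.T

    # A screen must not retrieve itself.
    np.fill_diagonal(sims, -np.inf)

    # Indices of the k most similar screens for every query.
    top_k = np.argsort(-sims, axis=1)[:, :k]

    # A query is a hit if any retrieved screen shares its topic label.
    hits = (y[top_k] == y[:, None]).any(axis=1)
    return float(hits.mean())


if __name__ == "__main__":
    # Toy data: two tight clusters, one per topic.
    emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labels = [0, 0, 1, 1]
    print(same_topic_hit_at_k(emb, labels, k=1))
```

Under this definition Hit@k can only stay the same or increase as k grows, which is consistent with the reported Hit@10 being at least as high as Hit@1.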

Published

2023-04-14

How to Cite

Zhongwen Zhou. (2023). Topic-Aware Mobile UI Layout Recommendation with Multimodal LLMs. Artificial Intelligence and Machine Learning Review, 4(2), 30-43. https://doi.org/10.69987/AIMLR.2023.40203
