2022
DOI: 10.48550/arxiv.2209.08655
Preprint
Enabling Conversational Interaction with Mobile UI using Large Language Models

Abstract: Conversational agents show the promise to allow users to interact with mobile devices using language. However, to perform diverse UI tasks with natural language, developers typically need to create separate datasets and models for each specific task, which is expensive and effort-consuming. Recently, pre-trained large language models (LLMs) have been shown capable of generalizing to various downstream tasks when prompted with a handful of examples from the target task. This paper investigates the feasibility o…
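The abstract refers to few-shot prompting: a handful of input/output examples from the target task are concatenated ahead of a new input, and the LLM completes the pattern. A minimal sketch of that prompt-assembly step is below; the screen-summarization task and all example texts are hypothetical illustrations, not taken from the paper.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt.

    Each example is an (input, output) pair; the model is expected to
    continue the pattern after the final "Summary:" line.
    """
    parts = [instruction, ""]
    for ui_text, summary in examples:
        parts.append(f"Screen: {ui_text}")
        parts.append(f"Summary: {summary}")
        parts.append("")
    parts.append(f"Screen: {query}")
    parts.append("Summary:")
    return "\n".join(parts)


prompt = build_few_shot_prompt(
    "Summarize the mobile UI screen in one sentence.",
    [
        ("Buttons: Play, Pause, Skip; Label: Now Playing",
         "A music player screen with playback controls."),
        ("Fields: Email, Password; Button: Log in",
         "A login screen asking for email and password."),
    ],
    "Buttons: Add to cart, Buy now; Label: Product details",
)
print(prompt)
```

The assembled string would be sent to an LLM as-is; only the prompt construction is shown here, since the paper's actual prompt formats are not reproduced on this page.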

Cited by 4 publications (2 citation statements)
References 23 publications (40 reference statements)
“…Annotated datasets of mobile UI screens can serve as a basis for generating accessibility metadata labels for UI components [81], generating screen summaries [74], and performance testing [11]. Additionally, these datasets can be leveraged for semantic embeddings [3,52] in mobile UI to decode into other forms of UIs such as automatic UI generation [21,22], smartphone shortcuts [51], custom UI mashup [40,47], and conversational UIs [53,73].…”
Section: Leveraging Existing UIs for Design
confidence: 99%
“…Artificial Intelligence (AI) has been widely used in the Human-Computer Interaction (HCI) community, with LLMs experiencing a surge of usage in recent years [1, 16, 27, 31-33, 48, 49, 60-62]. LLMs' abilities to understand common knowledge and reason within a given context have been leveraged for interactive code support [60], social computing [47,48] and accessibility support [30]. For example, Visual Caption employed a fine-tuned language model to predict user intent during visual inquiries using the last two sentences [41], while SayCan extracted and leveraged knowledge priors within LLMs to reason about, and execute, robot commands [1].…”
Section: Large Language Models in HCI
confidence: 99%