2022
DOI: 10.48550/arxiv.2209.08655
Preprint
Enabling Conversational Interaction with Mobile UI using Large Language Models

Abstract: Conversational agents show the promise to allow users to interact with mobile devices using language. However, to perform diverse UI tasks with natural language, developers typically need to create separate datasets and models for each specific task, which is expensive and effort-consuming. Recently, pre-trained large language models (LLMs) have been shown capable of generalizing to various downstream tasks when prompted with a handful of examples from the target task. This paper investigates the feasibility o…
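The abstract refers to few-shot prompting: a handful of input/output examples from the target task are concatenated ahead of a new input, and the LLM completes the pattern. A minimal sketch of that prompt-assembly step is below; the screen-summarization task and all example texts are hypothetical illustrations, not taken from the paper.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt.

    Each example is an (input, output) pair; the model is expected to
    continue the pattern after the final "Summary:" line.
    """
    parts = [instruction, ""]
    for ui_text, summary in examples:
        parts.append(f"Screen: {ui_text}")
        parts.append(f"Summary: {summary}")
        parts.append("")
    parts.append(f"Screen: {query}")
    parts.append("Summary:")
    return "\n".join(parts)


prompt = build_few_shot_prompt(
    "Summarize the mobile UI screen in one sentence.",
    [
        ("Buttons: Play, Pause, Skip; Label: Now Playing",
         "A music player screen with playback controls."),
        ("Fields: Email, Password; Button: Log in",
         "A login screen asking for email and password."),
    ],
    "Buttons: Add to cart, Buy now; Label: Product details",
)
print(prompt)
```

The assembled string would be sent to an LLM as-is; only the prompt construction is shown here, since the paper's actual prompt formats are not reproduced on this page.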

Cited by 4 publications (2 citation statements)
References 23 publications (40 reference statements)
“…Annotated datasets of mobile UI screens can serve as a basis for generating accessibility metadata labels for UI components [81], generating screen summaries [74], and performance testing [11]. Additionally, these datasets can be leveraged for semantic embeddings [3,52] in mobile UI to decode into other forms of UIs such as automatic UI generation [21,22], smartphone shortcuts [51], custom UI mashup [40,47], and conversational UIs [53,73].…”
Section: Leveraging Existing UIs for Design
confidence: 99%
“…Artificial Intelligence (AI) has been widely used in the Human-Computer Interaction (HCI) community, with LLMs experiencing a surge of usage in recent years [1, 16, 27, 31-33, 48, 49, 60-62]. LLMs' abilities to understand common knowledge and reason within a given context have been leveraged for interactive code support [60], social computing [47,48] and accessibility support [30]. For example, Visual Caption employed a fine-tuned language model to predict user intent during visual inquiries using the last two sentences [41], while SayCan extracted and leveraged knowledge priors within LLMs to reason about, and execute, robot commands [1].…”
Section: Large Language Models in HCI
confidence: 99%