2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) 2019
DOI: 10.1109/msr.2019.00014

Import2vec: Learning Embeddings for Software Libraries

Abstract: We consider the problem of developing suitable learning representations (embeddings) for library packages that capture semantic similarity among libraries. Such representations are known to improve the performance of downstream learning tasks (e.g. classification) or applications such as contextual search and analogical reasoning. We apply word embedding techniques from natural language processing (NLP) to train embeddings for library packages ("library vectors"). Library vectors represent libraries by similar …

Cited by 27 publications (14 citation statements)
References 17 publications (24 reference statements)
“…The more recent Import2Vec [37] paper produces embeddings for each imported package. The authors do such embeddings for JavaScript, Python, and Java, and provide some qualitative evidence suggesting that these embeddings of APIs accurately reflect different functionality profiles by providing a number of examples where the similar APIs also appear to implement similar functionalities.…”
Section: A Developer Expertise (mentioning)
confidence: 99%
“…There are many studies on the representation of source code, including recent studies proposing distributed representations for identifiers [17], APIs [46,47], and software libraries [56]. A comprehensive survey of learning the representation of source code has been done by Allamanis et al [1].…”
Section: Related Work (mentioning)
confidence: 99%
“…Dependency similarity. As a proxy for functional similarity between two repositories (H5), we compute the degree to which their sets of dependencies are similar [70]. Specifically, we consider each set of dependencies as a "document" and compute the TF-IDF correlation matrix similarity score of the two documents [10].…”
Section: Other Operationalizations (mentioning)
confidence: 99%
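
The dependency-similarity idea quoted above can be illustrated with a minimal sketch. It treats each repository's dependency list as a "document", weights each package by TF-IDF, and compares two repositories by cosine similarity. The repository contents, the function names (`tfidf_vectors`, `cosine`), the smoothed IDF formula, and the use of cosine similarity in place of the "correlation matrix similarity score" are all assumptions made for illustration; the citing paper's exact procedure is not given here.

```python
import math
from collections import Counter


def tfidf_vectors(dependency_sets):
    """Map each repository's dependency list to a sparse TF-IDF vector (a dict)."""
    n_docs = len(dependency_sets)
    # Document frequency: how many repositories depend on each package.
    df = Counter(dep for deps in dependency_sets for dep in set(deps))
    vectors = []
    for deps in dependency_sets:
        tf = Counter(deps)
        # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1
        vectors.append({
            dep: count * (math.log((1 + n_docs) / (1 + df[dep])) + 1.0)
            for dep, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical repositories and their dependencies (illustrative data only).
repos = [
    ["numpy", "pandas", "scipy"],
    ["numpy", "pandas", "matplotlib"],
    ["flask", "jinja2"],
]
vectors = tfidf_vectors(repos)

print(cosine(vectors[0], vectors[1]))  # repositories sharing two dependencies
print(cosine(vectors[0], vectors[2]))  # repositories sharing no dependencies
```

In this toy data the first two repositories share `numpy` and `pandas`, so their score is positive, while the third shares nothing with the first and scores zero. Packages that appear in many repositories receive a lower IDF weight, so shared rare dependencies contribute more to the score than shared common ones.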