Job Title Mapping (JTM) is an essential prerequisite for many downstream tasks in career analytics (e.g., career trajectory analysis, job mobility prediction, and job recommendation); its goal is to map user-created (noisy and non-standard) job titles to predefined, standard job titles. However, solving JTM is domain-specific and non-trivial due to its inherent challenges: (1) user-created job titles are messy, (2) different job titles often overlap in their job requirements, (3) job transition trajectories are inconsistent, and (4) the number of job titles in real-world applications is very large. To address the JTM problem, we propose a novel solution, named JAMES, that constructs three unique embeddings of a target job title (topological, semantic, and syntactic embeddings) and combines them with multi-aspect co-attention. In addition, we employ logical reasoning representations to collaboratively estimate similarities between messy job titles and standard job titles in the reasoning space. We conduct comprehensive experiments against ten competing models on a large-scale real-world dataset with more than 350,000 job titles. Our results show that JAMES significantly outperforms the best baseline by 10.06% in Precision@10 and by 17.52% in NDCG@10, respectively.

Recently, [20] built a 30,000 job title taxonomy on LinkedIn for a job understanding task. However, little is known about how the JTM task with the aforementioned challenges can be solved.

Proposed Ideas. Toward these challenges, in this paper, we propose JAMES (Job title mApping with Multi-aspect Embeddings and reaSoning) to solve the JTM task. We use a large-scale and real-world career dataset with more than 350,000 job titles that a
Misinformation is one of the most fundamental problems in social media, with a growing number of cases and harmful effects on users. To mitigate this problem, misinformation warnings have been developed, including alerting users with warning messages and hiding the content. Previous studies mainly searched for the single most effective, one-size-fits-all design. As a result, little is known about customizable and flexible warning designs. In this study, we propose a “topic-aware misinformation warning,” which recognizes that users’ preferences for warning designs can vary by topic. To illustrate our ideas, we developed Twitter-like pages using three topics (i.e., politics, gossip, and Covid-19) and three designs (i.e., interstitial, contextual, and highlight). We conducted semi-structured interviews with 18 participants to explore their preferences and opinions on the designs. Our results show that users’ preferences for misinformation warnings vary across topics. Thus, topic-aware misinformation warning is a promising way to alleviate misinformation problems on Twitter.
Japanese Katakana is one component of the Japanese writing system and is used to express English terms, loanwords, and onomatopoeia in Japanese characters based on their phonemes. The main purpose of this research is to find the best entity matching methods between English and Katakana. We pose two research questions to clarify which types of entity matching systems work better than others. The first question is which transliteration should be used for conversion. We need to transliterate English or Katakana terms into the same form in order to compute string similarity. We consider five conversions: English to Katakana directly, Katakana to English directly, English to Katakana via phonemes, Katakana to English via phonemes, and both English and Katakana to phonemes. The second question is which similarity measure should be used for entity matching. To investigate this, we compare six methods: Overlap Coefficient, Cosine, Jaccard, Jaro-Winkler, Levenshtein, and the similarity of the phoneme probabilities predicted by an RNN. Our results show that 1) matching using phonemes and conversion of Katakana to English works better than other methods, and 2) the similarity of phonemes outperforms other methods, while the other similarity scores vary depending on the data and models.

* The author is now at Google Inc.

which is a phoneme of Japanese characters (Smith, 1996). Romaji is used in any context for non-Japanese speakers who cannot read Japanese characters, such as for names, passports, and any Japanese entities. Romaji is the most common way to input Japanese into computers and to display Japanese on devices that do not support Japanese characters (DeFrancis, 1984), and almost all Japanese people learn Romaji and are able to read and write Japanese using Romaji. Therefore, generally speaking, Japanese people who do not write in English usually use Romaji to express Katakana or foreign terms without Japanese characters.
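
Four of the string similarity measures named in the Katakana abstract above (Overlap Coefficient, Jaccard, Cosine, and Levenshtein) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the choice of character bigrams as the unit of comparison, the normalization of Levenshtein distance into a similarity, and all function names are assumptions made here. Jaro-Winkler and the RNN-based phoneme similarity are omitted.

```python
from collections import Counter
from math import sqrt


def bigrams(s: str) -> Counter:
    """Multiset of character bigrams (assumed unit of comparison)."""
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def overlap_coefficient(a: str, b: str) -> float:
    """|X ∩ Y| / min(|X|, |Y|) over the sets of bigrams."""
    x, y = set(bigrams(a)), set(bigrams(b))
    if not x or not y:
        return 0.0
    return len(x & y) / min(len(x), len(y))


def jaccard(a: str, b: str) -> float:
    """|X ∩ Y| / |X ∪ Y| over the sets of bigrams."""
    x, y = set(bigrams(a)), set(bigrams(b))
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def cosine(a: str, b: str) -> float:
    """Cosine of the angle between bigram count vectors."""
    x, y = bigrams(a), bigrams(b)
    dot = sum(x[g] * y[g] for g in x)
    norm = sqrt(sum(v * v for v in x.values())) * sqrt(sum(v * v for v in y.values()))
    return dot / norm if norm else 0.0


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Distance rescaled to [0, 1] by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 - levenshtein(a, b) / longest if longest else 1.0


if __name__ == "__main__":
    # Hypothetical pair: an English term and a rough romanization of its Katakana form.
    english, romanized = "computer", "konpyuta"
    for measure in (overlap_coefficient, jaccard, cosine, levenshtein_similarity):
        print(measure.__name__, round(measure(english, romanized), 3))
```

All four functions expect both inputs to be in the same written form already, which is why the choice of transliteration direction has to be settled before any of these scores can be computed.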