Classic public switched telephone networks (PSTN) are often a black box for VoIP network providers, as they have no access to performance indicators, such as delay or packet loss. Only the degraded output speech signal can be used to monitor the speech quality of these networks. However, the current state-of-the-art speech quality models are not reliable enough to be used for live monitoring. One of the reasons for this is that PSTN distortions can be unique depending on the provider and country, which makes it difficult to train a model that generalizes well across different PSTN networks. In this paper, we present a new open-source PSTN speech quality test set with over 1000 crowdsourced real phone calls. Our proposed no-reference model outperforms the full-reference POLQA and no-reference P.563 on the validation and test sets. Further, we analyzed the influence of file cropping on the perceived speech quality and the influence of the number of ratings and training size on the model accuracy.
A primary goal of remote collaboration tools is to provide effective and inclusive meetings for all participants. To study meeting effectiveness and meeting inclusiveness, we first conducted a large-scale email survey (N=4,425; after filtering N=3,290) at a large technology company (pre-COVID-19); using this data we derived a multivariate model of meeting effectiveness and show how it correlates with meeting inclusiveness, participation, and feeling comfortable to contribute. We believe this is the first such model of meeting effectiveness and inclusiveness. The large size of the data provided the opportunity to analyze correlations that are specific to sub-populations, such as the impact of video. The model shows the following factors are correlated with inclusiveness, effectiveness, participation, and feeling comfortable to contribute in meetings: sending a pre-meeting communication, sending a post-meeting summary, including a meeting agenda, attendee location, remote-only meeting, audio/video quality and reliability, video usage, and meeting size. The model and survey results give a quantitative understanding of how and where to improve meeting effectiveness and inclusiveness and what the potential returns are. Motivated by the email survey results, we implemented a post-meeting survey in a leading computer-mediated communication (CMC) system to directly measure meeting effectiveness and inclusiveness (during COVID-19). Using initial results based on internal flighting we created a similar model of effectiveness and inclusiveness, with many of the same findings as the email survey. This shows a method of measuring and understanding these metrics that is both practical and useful in a commercial CMC system. By improving meeting effectiveness, companies can save significant time and money. Improving meeting inclusiveness is hypothesized to improve meeting effectiveness, and it also improves the working environment and employee retention at organizations.
User-perceived quality-of-experience (QoE) in internet telephony systems is commonly evaluated using subjective ratings computed as a Mean Opinion Score (MOS). In such systems, while user MOS can be tracked on an ongoing basis, it does not give insight into which factors of a call induced any perceived degradation in QoE: it does not tell us what caused a user to have a sub-optimal experience. For effective planning of product improvements, we are interested in understanding the impact of each of these degrading factors, allowing the estimation of the return (i.e., the improvement in user QoE) for a given investment. To obtain such insights, we advocate the use of an end-of-call "problem token questionnaire" (PTQ) which probes the user about common call quality issues (e.g., distorted audio or frozen video) which they may have experienced. In this paper, we show the efficacy of this questionnaire using data from over 700,000 end-of-call surveys gathered from Skype (a large commercial VoIP application). We present a method to rank call quality and reliability issues and address the challenge of isolating independent factors impacting the QoE. Finally, we present representative examples of how these problem tokens have proven to be useful in practice.
Meetings are a pervasive method of communication within all types of companies and organizations, and using remote collaboration systems to conduct meetings has increased dramatically since the COVID-19 pandemic. However, not all meetings are inclusive, especially in terms of the participation rates among attendees. In a recent large-scale survey conducted at Microsoft, the top suggestion given by meeting participants for improving inclusiveness is to improve the ability of remote participants to interrupt and acquire the floor during meetings. We show that the use of the virtual raise hand (VRH) feature can lead to an increase in predicted meeting inclusiveness at Microsoft. One challenge is that VRH is used in less than 1% of all meetings. In order to drive adoption of VRH to improve inclusiveness (and participation), we present a machine learning-based system that predicts when a meeting participant attempts to obtain the floor, but fails to interrupt (termed a 'failed interruption'). This prediction can be used to nudge the user to raise their virtual hand within the meeting. We believe this is the first failed speech interruption detector, and the performance on a realistic test set has an area under curve (AUC) of 0.95 with a true positive rate (TPR) of 50% at a false positive rate (FPR) of < 1%. To our knowledge, this is also the first dataset of interruption categories (including the failed interruption category) for remote meetings. Finally, we believe this is the first such system designed to improve meeting inclusiveness through speech interruption analysis and active intervention.
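The operating point quoted above (a TPR at a bounded FPR) can be read off a detector's scores with a simple threshold sweep. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name `tpr_at_fpr` and the toy scores and labels are assumptions made for illustration.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Return the highest TPR reachable while keeping FPR <= max_fpr.

    scores: detector outputs, higher means "more likely positive".
    labels: 1 for a positive example, 0 for a negative one.
    (Hypothetical helper for illustration only.)
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    # Lower the decision threshold step by step, from the highest score down.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best_tpr = 0.0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Examples with identical scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / negatives > max_fpr:
            break  # FPR only grows as the threshold drops further
        best_tpr = max(best_tpr, tp / positives)
        i = j
    return best_tpr


# Toy data (made up): three positives and three negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
strict_tpr = tpr_at_fpr(scores, labels, max_fpr=0.0)
relaxed_tpr = tpr_at_fpr(scores, labels, max_fpr=0.34)
```

With these toy values, allowing no false positives catches two of the three positives, while tolerating one false positive in three catches all of them. This is the trade-off a nudging system has to tune.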
A stochastic graph process with a Markov property is introduced to model the flow of an infectious disease over a known contact network. The model provides a probability distribution over unobserved infectious pathways. The basic reproductive number in compartmental models is generalized to a dynamic reproductive number based on the sequence of outdegrees in the graph process. The cumulative resistance and threat associated with each individual are also measured based on the cumulative indegree and outdegree of the graph process. The model is applied to the outbreak data from the 2001 foot-and-mouth disease (FMD) outbreak in the United Kingdom. The Canadian Journal of Statistics 40: 55–67; 2012 © 2012 Statistical Society of Canada
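To make the idea of an outdegree-based reproductive number concrete, here is a minimal sketch for a single, fully known infection pathway. It is an illustrative simplification and not the paper's definition: the abstract does not spell out the estimator, and the paper works with a distribution over unobserved pathways. The function name, the cohort-by-infection-time averaging, and the toy outbreak are all assumptions.

```python
from collections import defaultdict


def mean_outdegree_by_infection_time(infection_time, transmissions):
    """Average number of secondary infections per case, grouped by infection time.

    infection_time: dict mapping each infected individual to the time step
                    at which it became infected.
    transmissions:  list of (infector, infectee) edges of one pathway.
    (Assumed simplification: one observed pathway, discrete time steps.)
    """
    # Outdegree of a node = number of individuals it infected.
    out_degree = defaultdict(int)
    for infector, _infectee in transmissions:
        out_degree[infector] += 1

    # Group cases by infection time and average their outdegrees.
    cohorts = defaultdict(list)
    for node, t in infection_time.items():
        cohorts[t].append(out_degree[node])
    return {t: sum(degs) / len(degs) for t, degs in sorted(cohorts.items())}


# Toy outbreak (made up): A infects B and C; B infects D and E; C infects F.
infection_time = {"A": 0, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
transmissions = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F")]
r_by_time = mean_outdegree_by_infection_time(infection_time, transmissions)
```

In this toy pathway the average outdegree falls from 2.0 for the index case to 1.5 for the next cohort and 0.0 for the last, which is the kind of time-varying summary a dynamic reproductive number is meant to capture.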
Abstract. Principal components analysis is a well-known statistical method for dealing with large dependent data sets. It is also used with functional data, both for data reduction and for representing variation. Handwriting, meanwhile, is an object studied in various statistical fields such as pattern recognition and shape analysis. With time as the argument, handwriting is infinite-dimensional data: a functional object. In this paper we apply functional principal components analysis (FPCA) to Persian handwriting data, analyzing the word Mehr, which is the Persian term for Love.
Failure to accurately measure the outcomes of an experiment can lead to bias and incorrect conclusions. Online controlled experiments (also known as A/B tests) are increasingly being used to make decisions to improve websites as well as mobile and desktop applications. We argue that loss of telemetry data (during upload or post-processing) can skew the results of experiments, leading to loss of statistical power and inaccurate or erroneous conclusions. By systematically investigating the causes of telemetry loss, we argue that it is not practical to entirely eliminate it. Consequently, experimentation systems need to be robust to its effects. Furthermore, we note that it is nontrivial to measure the absolute level of telemetry loss in an experimentation system. In this paper, we take a top-down approach towards solving this problem. We motivate the impact of loss qualitatively using experiments in real applications deployed at scale, and formalize the problem by presenting a theoretical breakdown of the bias introduced by loss. Based on this foundation, we present a general framework for quantitatively evaluating the impact of telemetry loss, and present two solutions to measure the absolute levels of loss. This framework is used by well-known applications at Microsoft, with millions of users and billions of sessions. These general principles can be adopted by any application to improve the overall trustworthiness of experimentation and data-driven decision making.
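A small worked example shows how telemetry loss can bias an experiment even when the treatment has no real effect. This is an illustrative sketch only, not the framework described in the paper; the strata, counts, metric values, and loss rates are made-up assumptions.

```python
def observed_mean(strata):
    """Mean of a metric computed only from sessions whose telemetry survived.

    strata: list of (session_count, mean_metric_value, loss_rate) tuples,
            where loss_rate is the fraction of sessions in that stratum
            whose telemetry was lost.
    (Hypothetical helper for illustration only.)
    """
    # Expected number of surviving sessions per stratum, with that stratum's mean.
    kept = [(count * (1.0 - loss), mean) for count, mean, loss in strata]
    total_kept = sum(n for n, _ in kept)
    return sum(n * mean for n, mean in kept) / total_kept


# Both arms have the same true behaviour: 1000 short sessions (metric 10)
# and 1000 long sessions (metric 20), so the true mean is 15 in each arm.
control = [(1000, 10.0, 0.0), (1000, 20.0, 0.0)]
# Assumption: the treatment build loses half of the long sessions' telemetry.
treatment = [(1000, 10.0, 0.0), (1000, 20.0, 0.5)]

control_mean = observed_mean(control)
treatment_mean = observed_mean(treatment)
apparent_delta = treatment_mean - control_mean
```

Here the control arm reports 15 while the treatment arm reports roughly 13.3, an apparent regression produced entirely by loss that differs between arms and correlates with the metric.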