To perform visual data exploration, many dimensionality reduction methods have been developed. These tools allow data analysts to represent multidimensional data in a 2D or 3D space, while preserving as much relevant information as possible. Yet, they cannot preserve all structures simultaneously and they induce some unavoidable distortions. Hence, many criteria have been introduced to evaluate a map's overall quality, mostly based on the preservation of neighbourhoods. Such global indicators are currently used to compare several maps, which helps to choose the most appropriate mapping method and its hyperparameters. However, those aggregated indicators tend to hide the local distribution of distortions. Therefore, they need to be supplemented by local evaluation to ensure correct interpretation of maps. In this paper, we describe a new method, called MING, for "Map Interpretation using Neighbourhood Graphs". It offers a graphical interpretation of pairs of map quality indicators, as well as local evaluation of the distortions. This is done by displaying on the map the nearest-neighbour graphs computed in the data space and in the embedding. Shared and unshared edges exhibit, respectively, the reliable and unreliable neighbourhood information conveyed by the mapping. By this means, analysts may determine whether proximity (or remoteness) of points on the map faithfully represents similarity (or dissimilarity) of original data, within the meaning of a chosen map quality criterion. We apply this approach to two pairs of indicators, precision/recall and trustworthiness/continuity, chosen for their wide use in the community, which should make the method easy for users to adopt.
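The comparison of neighbourhood graphs described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `knn_edges` and `compare_neighbourhood_graphs`, the Euclidean metric, the directed k-nearest-neighbour graph and the toy data are all assumptions made for illustration. The sketch builds one graph in the data space and one in the embedding, then splits the edges into shared and unshared sets.

```python
import numpy as np

def knn_edges(points, k):
    """Directed edges (i, j) such that j is among the k nearest neighbours of i.

    Assumes Euclidean distance; the original method may use any domain metric.
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return {(i, int(j)) for i in range(len(points)) for j in nearest[i]}

def compare_neighbourhood_graphs(data, embedding, k):
    """Split edges into those shared by both graphs and those found in only one."""
    data_edges = knn_edges(data, k)
    map_edges = knn_edges(embedding, k)
    return {
        "shared": data_edges & map_edges,      # neighbourhoods preserved by the map
        "data_only": data_edges - map_edges,   # true neighbours missing on the map
        "map_only": map_edges - data_edges,    # map neighbours absent from the data
    }

# Hypothetical toy data: four 3D points, "mapped" by dropping the third coordinate.
data = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.4, 0.0, 5.0],
    [10.0, 0.0, 0.0],
])
embedding = data[:, :2]
edges = compare_neighbourhood_graphs(data, embedding, k=1)
```

In this toy case, point 2 is far from points 0 and 1 in the data space but lands between them on the map, so the edges `(0, 2)` and `(1, 2)` appear only in the map graph while `(0, 1)` and `(1, 0)` appear only in the data graph. Drawing the three edge sets on the map in different colours would give the kind of local reading the abstract describes.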
During the course of evolution, variation of a protein sequence is an ongoing phenomenon, limited however by the need to maintain the protein's structural and functional integrity. Deciphering the evolutionary path of a protein is thus of fundamental interest. With the development of new methods to visualize high-dimensional spaces and the improvement of phylogenetic analysis tools, it is possible to study the evolutionary trajectories of proteins in the sequence space. Using the Data-Driven High-Dimensional Scaling method, we show that it is possible to predict and represent potential evolutionary trajectories by embedding phylogenetic trees in a 3D projection of the sequence space. With the case of the aminodeoxychorismate synthase, an enzyme involved in folate synthesis, we show that this representation raises interesting questions about the complexity of the evolution of a given biological function, in particular concerning its capacity to explore the sequence space.
Dimensionality reduction enables analysts to perform visual exploration of multidimensional data with a low-dimensional map retaining as much as possible of the original data structure. The interpretation of such a map relies on the hypothesis of preservation of neighborhood relations. Namely, distances in the map are assumed to reflect faithfully dissimilarities in the data space, as measured with a domain-related metric. Yet, in most cases, this hypothesis is undermined by distortions of those relations by the mapping process, which need to be accounted for during map interpretation. In this paper, we describe an interpretative support method called Map Interpretation using Neighborhood Graphs (MING) displaying individual neighborhood relations on the map, as edges of nearest neighbors graphs. The level of distortion of those relations is shown through coloring of the edges. This allows analysts to assess the reliability of similarity relations inferred from the map, while hinting at the original structure of data by showing the missing relations. Moreover, MING provides a local interpretation for classical map quality indicators, since the quantitative measure of distortions is based on those indicators. Overall, the proposed method alleviates the mapping-induced bias in interpretation while constantly reminding users that the map is not the data.
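The idea of reading a quality indicator locally can be sketched as follows. The abstract above does not name the indicators, so precision and recall are used here purely as an assumed example. The function names, the Euclidean metric and the toy data are likewise hypothetical. For each point, the neighbours retrieved on the map are compared with the relevant neighbours in the data space.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbours of every point (Euclidean, assumed)."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbourhood
    return np.argsort(dist, axis=1)[:, :k]

def local_precision_recall(data, embedding, k_data, k_map):
    """Per-point precision and recall of map neighbourhoods.

    relevant  = k_data nearest neighbours in the data space
    retrieved = k_map nearest neighbours on the map
    """
    relevant = knn_indices(data, k_data)
    retrieved = knn_indices(embedding, k_map)
    n = len(relevant)
    precision = np.empty(n)
    recall = np.empty(n)
    for i in range(n):
        hits = len(set(relevant[i].tolist()) & set(retrieved[i].tolist()))
        precision[i] = hits / k_map   # share of map neighbours that are true neighbours
        recall[i] = hits / k_data     # share of true neighbours found on the map
    return precision, recall

# Hypothetical toy data: four 3D points, "mapped" by dropping the third coordinate.
data = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.4, 0.0, 5.0],
    [10.0, 0.0, 0.0],
])
embedding = data[:, :2]
precision, recall = local_precision_recall(data, embedding, k_data=1, k_map=1)
```

With equal neighbourhood sizes the two scores coincide for every point; choosing `k_map` different from `k_data` separates them. Averaging the per-point values gives a global score, which is how a local view of this kind relates to an aggregated indicator.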
It is well known that building energy use represents a significant part of total energy use, ca. 40% in the USA according to the Building Energy Data Book. As the efficiency of new construction improves, the share of equipment energy use within overall building energy use keeps increasing. This article proposes a new approach to Fault Detection and Diagnosis (FDD) for building systems, so as to provide an intuitive FDD tool for every operator of the maintenance staff regardless of their qualifications. This new approach uses a multivariate statistical method and provides easily understandable outputs, allowing building maintenance staff to grasp an equipment fault quickly. The number of unsolved problems can thereby be minimized and intervention time considerably reduced, avoiding unexpected energy use and premature equipment obsolescence.