Sensors in everyday devices, such as our phones, wearables, and computers, leave a stream of digital traces. Personal sensing refers to collecting and analyzing data from sensors embedded in the context of daily life with the aim of identifying human behaviors, thoughts, feelings, and traits. This article provides a critical review of personal sensing research related to mental health, focused principally on smartphones, but also including studies of wearables, social media, and computers. We provide a layered, hierarchical model for translating raw sensor data into markers of behaviors and states related to mental health. Also discussed are research methods as well as challenges, including privacy and problems of dimensionality. Although personal sensing is still in its infancy, it holds great promise as a method for conducting mental health research and as a clinical tool for monitoring at-risk populations and providing the foundation for the next generation of mobile health (or mHealth) interventions.
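The layered, hierarchical model described above can be illustrated with a minimal sketch: raw sensor samples at the bottom, low-level features in the middle, and a coarse behavioral marker at the top. Everything here is a hypothetical toy (the function names, the 1 km threshold, and the "homebound"/"mobile" labels are illustrative assumptions, not the article's actual model):

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def mobility_features(gps_trace):
    """Middle layer: turn a raw GPS trace into low-level features."""
    dist = sum(haversine_km(gps_trace[i], gps_trace[i + 1])
               for i in range(len(gps_trace) - 1))
    return {"distance_km": dist, "n_points": len(gps_trace)}

def mobility_marker(features, threshold_km=1.0):
    """Top layer: map features to a coarse behavioral marker.
    The threshold is an arbitrary illustrative choice."""
    return "homebound" if features["distance_km"] < threshold_km else "mobile"

trace = [(40.6782, -73.9442), (40.6850, -73.9400), (40.6900, -73.9350)]
feats = mobility_features(trace)
print(mobility_marker(feats))
```

Real personal-sensing pipelines would add many more sensor streams and validate each layer against clinical ground truth; the point here is only the raw-data → features → marker layering.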
For recommender systems that rank products primarily by a measure of similarity between items and the user query, the recommendation list often contains items that are highly similar to one another and therefore lacks diversity. In this article we argue that the motivation of diversity research is to increase the probability of retrieving unusual or novel items that are relevant to the user, and we introduce a methodology to evaluate diversification methods in terms of novel-item retrieval. Moreover, noting that retrieving a set of items matching a user query is a common problem across many applications of information retrieval, we formulate the trade-off between diversity and matching quality as a binary optimization problem, with an input control parameter that allows this trade-off to be tuned explicitly. We study solution strategies for the optimization problem and demonstrate the importance of the control parameter in obtaining the desired system performance. The methods are evaluated for collaborative recommendation using two datasets and for case-based recommendation using a synthetic dataset constructed from the public-domain Travel dataset.
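A diversity/matching trade-off with an explicit control parameter can be sketched with the classic greedy maximal-marginal-relevance (MMR) style re-ranking. This is a common relaxation of such binary optimization problems, not the authors' exact formulation; the function and parameter names are illustrative assumptions:

```python
def greedy_diverse_rerank(candidates, relevance, similarity, k, lam):
    """Greedily select k items, scoring each candidate as
    lam * relevance - (1 - lam) * (max similarity to items already picked).
    lam = 1.0 recovers pure matching quality; lam = 0.0 pure diversity."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(item):
            div_penalty = max((similarity(item, s) for s in selected),
                              default=0.0)
            return lam * relevance[item] - (1 - lam) * div_penalty
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

# Toy example: 'a' and 'b' are near-duplicates, 'c' is distinct.
rel = {'a': 0.9, 'b': 0.85, 'c': 0.6}
sim = lambda x, y: 0.95 if {x, y} == {'a', 'b'} else 0.1
print(greedy_diverse_rerank('abc', rel, sim, 2, 1.0))  # pure matching
print(greedy_diverse_rerank('abc', rel, sim, 2, 0.5))  # diversity-aware
```

With `lam = 1.0` the two near-duplicates win on relevance alone; lowering `lam` penalizes the redundant item and promotes the distinct one, which is exactly the effect the control parameter is meant to expose.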
In recent years, building change detection has made remarkable progress through the use of deep learning. The core problems of this technique are the need for additional data (e.g., Lidar or semantic labels) and the difficulty in extracting sufficient features. In this paper, we propose an end-to-end network, called the pyramid feature-based attention-guided Siamese network (PGA-SiamNet), to solve these problems. The network is trained to capture possible changes using a pyramid-structured convolutional neural network. It emphasizes the correlation among the input feature pairs by introducing a global co-attention mechanism. Furthermore, we improve the long-range dependencies of the features by utilizing various attention mechanisms and then aggregating the low-level and co-attention-level features, which helps to obtain richer object information. Finally, we evaluated our method on the publicly available WHU building dataset and a new EV-CD building dataset. The experiments demonstrate that the proposed method is effective for building change detection and outperforms existing state-of-the-art methods on high-resolution remote sensing orthoimages across various metrics.
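The global co-attention idea, in which each location of one feature map attends over all locations of its paired map via an affinity matrix, can be sketched in a few lines. This is a generic dot-product co-attention sketch on toy list-of-lists "feature maps", not the PGA-SiamNet implementation; all names here are illustrative:

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def co_attend(f1, f2):
    """Global co-attention sketch: every location in f1 attends over all
    locations in f2 through a dot-product affinity matrix, yielding a
    re-weighted summary of f2 from f1's point of view."""
    # affinity[i][j] = <f1[i], f2[j]>
    affinity = [[sum(a * b for a, b in zip(u, v)) for v in f2] for u in f1]
    attn = [softmax(row) for row in affinity]  # row-wise softmax weights
    d = len(f2[0])
    return [[sum(attn[i][j] * f2[j][k] for j in range(len(f2)))
             for k in range(d)] for i in range(len(f1))]

# A location aligned with f2's first feature attends mostly to it.
out = co_attend([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
```

In the real network this operation runs on deep convolutional feature pairs from the two input images, and the attended features are aggregated with lower-level ones; the sketch only shows the affinity-then-softmax mechanics.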
Informational support and nurturant support are two basic types of social support offered in online health communities. This study identifies types of social support in the QuitStop forum and offers insights into the exchange patterns of social support and user behaviors through content analysis and social network analysis. Motivated by user information behavior, this study defines two patterns of social support exchange: initiated support exchange and invited support exchange. It is found that users who quit longer ago tend to actively give initiated support, whereas recent quitters with shorter abstinence are likely to seek and receive invited support. This study also finds that givers of informational support quit longer ago than givers of nurturant support, and that receivers of informational support quit more recently than receivers of nurturant support. Typically, informational support is offered by users at late quit stages to users at early quit stages, whereas nurturant support is exchanged among users within the same quit stage. These findings help us understand how health consumers support each other and reveal new capabilities for online intervention programs designed to offer social support in a timely and effective manner.
Recently, a new paradigm has arisen in Natural Language Processing (NLP): building general-purpose language models (e.g., Google's BERT and OpenAI's GPT-2) for text feature extraction, a standard procedure in NLP systems that converts texts to vectors (i.e., embeddings) for downstream modeling. These models are finding application in various downstream NLP tasks and real-world systems (e.g., Google's search engine [6]). To obtain general-purpose text embeddings, these language models have highly complicated architectures with millions of learnable parameters and are usually pretrained on billions of sentences before being utilized. As is widely recognized, this practice does improve the state-of-the-art performance of many downstream NLP tasks. However, the improved utility is not free. We find that text embeddings from general-purpose language models capture much sensitive information from the plain text. Once accessed by an adversary, the embeddings can be reverse-engineered to disclose sensitive information about the victims for further harassment. Although this privacy risk poses a real threat to the future use of these promising NLP tools, there are to date neither published attacks nor systematic evaluations for mainstream industry-level language models. To bridge this gap, we present the first systematic study of the privacy risks of 8 state-of-the-art language models, with 4 diverse case studies. By constructing 2 novel attack classes, our study demonstrates that the aforementioned privacy risks do exist and can pose practical threats to the application of general-purpose language models on sensitive data covering identity, genome, healthcare, and location. For example, we show that an adversary with nearly no prior knowledge can achieve about 75% accuracy when inferring the precise disease site from BERT embeddings of patients' medical descriptions.
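The general shape of such an inference attack can be sketched with a simple nearest-neighbor attacker: the adversary embeds candidate texts whose sensitive attribute is known, then labels a victim's embedding with the attribute of its closest candidate. The toy 3-dimensional vectors and labels below are illustrative stand-ins for real model embeddings, and this sketch is not one of the paper's two attack classes:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def infer_attribute(victim_embedding, labeled_candidates):
    """Nearest-neighbor attribute inference: assign the victim embedding
    the sensitive label of the most similar known candidate embedding."""
    return max(labeled_candidates,
               key=lambda c: cosine(victim_embedding, c[0]))[1]

# Hypothetical candidate embeddings with known disease-site labels.
candidates = [([1.0, 0.0, 0.0], "lung"),
              ([0.0, 1.0, 0.0], "breast")]
victim = [0.9, 0.1, 0.0]  # intercepted embedding, attribute unknown
print(infer_attribute(victim, candidates))
```

The attack works precisely because general-purpose embeddings preserve semantic similarity: if two medical descriptions are about the same disease site, their embeddings tend to be close, which is the property the adversary exploits.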
As possible countermeasures, we propose 4 different defenses (via rounding, differential privacy, adversarial training, and subspace projection) to obfuscate the unprotected embeddings for mitigation purposes. With extensive evaluations, we also provide a preliminary analysis of the utility-privacy trade-off introduced by each defense, which we hope will foster future mitigation research.
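Two of the simpler defense families, rounding and noise addition in the spirit of differential privacy, can be sketched as coordinate-wise transformations of an embedding. These sketches assume the paper's defenses only in spirit; the parameter choices are illustrative, and a rigorous DP guarantee would additionally require bounding the embedding's sensitivity:

```python
import math
import random

def defend_rounding(embedding, decimals=1):
    """Rounding defense: coarsen each coordinate so fine-grained
    information an attacker could exploit is destroyed."""
    return [round(x, decimals) for x in embedding]

def defend_laplace(embedding, epsilon=1.0, sensitivity=1.0):
    """Laplace-noise defense: add i.i.d. Laplace(sensitivity / epsilon)
    noise per coordinate. Smaller epsilon means more noise, hence more
    privacy and less downstream utility."""
    scale = sensitivity / epsilon

    def laplace():
        # Inverse-CDF sampling for the Laplace distribution.
        u = random.random() - 0.5
        sign = 1.0 if u >= 0 else -1.0
        return -scale * sign * math.log(1 - 2 * abs(u))

    return [x + laplace() for x in embedding]

emb = [0.123, 0.456, -0.789]
print(defend_rounding(emb))
print(defend_laplace(emb, epsilon=0.5))
```

Both defenses trade utility for privacy through a single knob (the number of retained decimals, or epsilon), which is what makes the utility-privacy trade-off directly measurable in evaluations.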