Transformers, which are popular for language modeling, have recently been explored for vision tasks, e.g., the Vision Transformer (ViT) for image classification. The ViT model splits each image into a fixed-length sequence of tokens and then applies multiple Transformer layers to model their global relations for classification. However, ViT achieves inferior performance compared with CNNs when trained from scratch on a midsize dataset (e.g., ImageNet). We find this is because: 1) the simple tokenization of input images fails to model important local structures (e.g., edges and lines) among neighboring pixels, leading to low training sample efficiency; 2) the redundant attention backbone design of ViT leads to limited feature richness under fixed computation budgets and limited training samples. To overcome these limitations, we propose a new Tokens-To-Token Vision Transformer (T2T-ViT), which introduces 1) a layer-wise Tokens-to-Token (T2T) transformation that progressively structurizes the image into tokens by recursively aggregating neighboring tokens into one token (Tokens-to-Token), such that local structure represented by surrounding tokens can be modeled and the token length can be reduced; 2) an efficient backbone with a deep-narrow structure for vision transformers, motivated by CNN architecture design after extensive study. Notably, T2T-ViT reduces the parameter count and MACs of vanilla ViT by half, while achieving an improvement of more than 2.5% when trained from scratch on ImageNet. It also outperforms ResNets and achieves performance comparable to MobileNets when trained directly on ImageNet. For example, a T2T-ViT of size comparable to ResNet50 achieves 80.7% accuracy on ImageNet.
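The aggregation step described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the authors' implementation, which also applies a Transformer layer between aggregation steps and uses learned projections): a sequence of tokens is reshaped back to its 2D grid, and each overlapping k x k neighborhood of tokens is concatenated into a single longer token, shrinking the sequence length. The function name and parameter defaults here are assumptions for illustration only.

```python
import numpy as np

def tokens_to_token(tokens, h, w, k=3, stride=2, pad=1):
    """One hypothetical Tokens-to-Token aggregation step.

    Reshape the (h*w, c) token sequence to an (h, w, c) grid, then merge
    each overlapping k x k neighborhood of tokens into one concatenated
    token. With stride < k the windows overlap, so local structure shared
    by surrounding tokens is carried into the merged token while the
    sequence length shrinks.
    """
    n, c = tokens.shape
    assert n == h * w, "token count must match the grid size"
    grid = tokens.reshape(h, w, c)
    grid = np.pad(grid, ((pad, pad), (pad, pad), (0, 0)))
    out_h = (h + 2 * pad - k) // stride + 1
    out_w = (w + 2 * pad - k) // stride + 1
    out = np.empty((out_h * out_w, k * k * c))
    for i in range(out_h):
        for j in range(out_w):
            # Concatenate the k x k neighborhood into one longer token.
            patch = grid[i * stride:i * stride + k, j * stride:j * stride + k, :]
            out[i * out_w + j] = patch.reshape(-1)
    return out, out_h, out_w

tokens = np.random.randn(16 * 16, 64)          # 256 tokens, 64-dim each
merged, h2, w2 = tokens_to_token(tokens, 16, 16)
print(merged.shape)                             # (64, 576): 4x fewer, 9x longer tokens
```

Applying this step recursively (with a projection to keep the channel dimension bounded, as the paper's fixed computation budget requires) progressively reduces the token length while baking local neighborhoods into each token.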
A smart face mask that can conveniently monitor breath information is beneficial for maintaining personal health and preventing the spread of disease. However, some challenges still need to be addressed before such devices can be of practical use. One key challenge is to develop a pressure sensor that is easily triggered by low pressure and has excellent stability as well as electrical and mechanical properties. In this study, a wireless smart face mask is designed by integrating an ultrathin self‐powered pressure sensor and a compact readout circuit with a normal face mask. The pressure sensor is the thinnest (total compressed thickness of ≈5.5 µm) and lightest (total weight of ≈4.5 mg) electrostatic pressure sensor capable of achieving a peak open‐circuit voltage of up to ≈10 V when stimulated by airflow, which enables miniaturization of the readout circuit and makes the breath‐monitoring system portable and wearable. To demonstrate the capabilities of the smart face mask, it is used to wirelessly measure and analyze the various breath conditions of multiple testers.
Wearable Healthcare Devices
Convenient breath monitoring via wearable devices is helpful for personal healthcare, especially during the COVID‐19 pandemic. In article number 2107758, Kenjiro Fukuda, Takao Someya, and co‐workers develop a wearable smart face mask based on an ultrathin self‐powered pressure sensor with high output ability, and various breath conditions from multiple testers are wirelessly detected and analyzed.