2020
DOI: 10.1007/s11571-020-09615-4
End-to-end face parsing via interlinked convolutional neural networks

Cited by 28 publications (15 citation statements)
References 28 publications
“…The localized regions are then sent to TaNet's segmentation pathways for pixel-wise prediction as shown in Figure 1. After segmenting the ROIs using TaNet's pathways, these regions can be remapped [23] to their original positions using a reverse grid transformer (G⁻¹).…”
Section: Bilinear Sampler (S)
Mentioning, confidence: 99%
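The excerpt describes the remapping step only at a high level. Below is a minimal, hypothetical PyTorch sketch of that step, assuming the usual spatial-transformer convention in which θ is a batch of 2×3 affine matrices: the reverse grid transformer G⁻¹ is obtained by inverting θ, and the ROI prediction is resampled bilinearly back into the original frame. Function names and tensor shapes are illustrative assumptions, not TaNet's actual code.

    import torch
    import torch.nn.functional as F

    def invert_affine(theta):
        # theta: [N, 2, 3] affine matrices; augment to 3x3, invert, drop the last row.
        n = theta.shape[0]
        bottom = torch.tensor([0.0, 0.0, 1.0], device=theta.device).expand(n, 1, 3)
        full = torch.cat([theta, bottom], dim=1)        # [N, 3, 3]
        return torch.inverse(full)[:, :2, :]            # [N, 2, 3]

    def remap_to_original(roi_pred, theta, out_size):
        # Paste per-ROI predictions back into the original frame
        # using G^-1 and bilinear sampling. out_size = [N, C, H, W].
        grid = F.affine_grid(invert_affine(theta), out_size, align_corners=False)
        return F.grid_sample(roi_pred, grid, mode="bilinear", align_corners=False)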
“…The ground truth transformation matrix is calculated for each region (θ_r^gt, r ∈ {1, 2, …, N}) as described in [23]. Specifically, we calculated the central coordinates (x, y) for each coarsely segmented ROI r (r ∈ {1, …, N}) and estimated θ^gt as:…”
Section: TaNet Training
Mentioning, confidence: 99%
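The quoted equation for θ^gt is truncated in the excerpt. As a hypothetical illustration only (not the paper's actual formula), one common construction encodes each coarse ROI's normalized center and extent as the translation and scale terms of a 2×3 affine matrix:

    import torch

    def theta_gt_from_mask(roi_mask):
        # roi_mask: [H, W] binary tensor for one coarsely segmented ROI.
        # Returns a 2x3 matrix in the normalized [-1, 1] coordinates used by affine_grid.
        H, W = roi_mask.shape
        ys, xs = torch.nonzero(roi_mask, as_tuple=True)
        cx, cy = xs.float().mean().item(), ys.float().mean().item()   # central coordinates
        sx = (xs.max() - xs.min()).item() / (W - 1)                   # normalized width
        sy = (ys.max() - ys.min()).item() / (H - 1)                   # normalized height
        tx = 2.0 * cx / (W - 1) - 1.0                                 # normalized center x
        ty = 2.0 * cy / (H - 1) - 1.0                                 # normalized center y
        return torch.tensor([[sx, 0.0, tx],
                             [0.0, sy, ty]])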
“…Localization Network. Prior to the use of the localization network, we use the FCN-8 [16,26] model for coarse segmentation. Providing a coarse segmentation of the different ROIs allows the localization network to 1) generate the transformation parameters (θ) for these regions and 2) learn the contextual relationships among them.…”
Section: ROIs Localization
Mentioning, confidence: 99%
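A tiny sketch of the two-stage flow the excerpt describes; fcn8, localization_net, and num_rois are placeholders for components not shown in the excerpt.

    # Hypothetical wiring of the two stages: coarse segmentation, then theta regression.
    coarse_seg = fcn8(image)                    # coarse per-ROI segmentation (FCN-8)
    theta = localization_net(coarse_seg)        # regress transformation parameters
    theta = theta.view(-1, num_rois, 2, 3)      # one 2x3 matrix per ROI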
“…where θ ∈ R^(N×2×3); here N = 5, as there are five cardiac regions, as shown in Figure 1. As for the localization network (L), we used a simplified version of VGG16 [26] that has 8 convolutional layers and a final regression layer to generate the N × 2 × 3 spatial transformation matrix (θ). L outputs the spatial transformation matrix (θ) as shown in Figure 1.…”
Section: ROIs Localization
Mentioning, confidence: 99%
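The excerpt specifies only that the localization network is a simplified VGG16 with 8 convolutional layers and a final regression layer producing θ ∈ R^(N×2×3). Here is a minimal PyTorch sketch under those constraints; channel widths, pooling schedule, and input channels are assumptions, not the authors' configuration.

    import torch.nn as nn

    class LocalizationNet(nn.Module):
        # Simplified VGG-style regressor: 8 conv layers, then a linear layer
        # that outputs an N x 2 x 3 transformation matrix (N = 5 cardiac ROIs).
        def __init__(self, in_ch=6, num_rois=5):
            super().__init__()
            chans = [in_ch, 32, 32, 64, 64, 128, 128, 256, 256]
            layers = []
            for i in range(8):                          # 8 convolutional layers
                layers += [nn.Conv2d(chans[i], chans[i + 1], 3, padding=1),
                           nn.ReLU(inplace=True)]
                if i % 2 == 1:                          # downsample after every 2 convs
                    layers.append(nn.MaxPool2d(2))
            self.features = nn.Sequential(*layers)
            self.regress = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                         nn.Linear(chans[-1], num_rois * 2 * 3))
            self.num_rois = num_rois

        def forward(self, x):
            theta = self.regress(self.features(x))
            return theta.view(-1, self.num_rois, 2, 3)  # theta in R^(N x 2 x 3)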
“…We removed all skip-connections from S-FRCNN and obtained a model called S-FRCNN (no-skip). A CNN model with a similar architecture has been used in face parsing [42,39], but different blocks do not share […]. From Table 1, it is seen that S-FRCNN (no-skip) achieved worse results than the original S-FRCNN.…”
† https://cslikai.cn/project/AFRCNN
‡ https://github.com/etzinis/sudo_rm_rf/blob/master/sudo_rm_rf/dnn/models/improved_sudormrf.py
Section: Comparison Of Micro-level Updating Schemes
Mentioning, confidence: 99%
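The ablation in this excerpt hinges on removing skip-connections. As a generic illustration only (not the S-FRCNN code, whose blocks and parameter sharing differ), the change amounts to dropping the identity path around each block:

    import torch.nn as nn

    class Block(nn.Module):
        # With a skip connection the block refines its input (y = x + f(x));
        # the "no-skip" ablation keeps only y = f(x).
        def __init__(self, ch, use_skip=True):
            super().__init__()
            self.f = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
            self.use_skip = use_skip

        def forward(self, x):
            out = self.f(x)
            return x + out if self.use_skip else out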