2019
DOI: 10.48550/arxiv.1904.03816
Preprint
Towards Real-Time Automatic Portrait Matting on Mobile Devices

Abstract: We tackle the problem of automatic portrait matting on mobile devices. The proposed model is aimed at attaining real-time inference on mobile devices with minimal degradation of model performance. Our model MMNet, based on multi-branch dilated convolution with linear bottleneck blocks, outperforms the state-of-the-art model and is orders of magnitude faster. The model can be accelerated four times to attain 30 FPS on a Xiaomi Mi 5 device with a moderate increase in the gradient error. Under the same conditions, ou…
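The abstract names MMNet's two building blocks: multi-branch dilated convolutions and linear bottlenecks. The excerpt does not give the exact block configuration, so the PyTorch sketch below is only an illustration of how such a block could be assembled; the class name, the branch dilation rates (1, 2, 4), and the expansion factor are assumptions, not the paper's values.

```python
import torch
import torch.nn as nn

class MultiBranchDilatedBottleneck(nn.Module):
    """Hypothetical block: 1x1 expansion, parallel dilated depthwise
    branches, and a linear (activation-free) 1x1 projection."""

    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4), expansion=2):
        super().__init__()
        mid = in_ch * expansion
        # 1x1 expansion into a wider representation
        self.expand = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
        )
        # Parallel depthwise 3x3 branches, one dilation rate each
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(mid, mid, 3, padding=d, dilation=d,
                          groups=mid, bias=False),
                nn.BatchNorm2d(mid),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # Linear bottleneck: 1x1 projection with no activation
        self.project = nn.Sequential(
            nn.Conv2d(mid * len(dilations), out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.skip = in_ch == out_ch

    def forward(self, x):
        h = self.expand(x)
        h = torch.cat([b(h) for b in self.branches], dim=1)
        h = self.project(h)
        return x + h if self.skip else h
```

The "linear" in linear bottleneck refers to the final 1×1 projection having no activation, which (as in MobileNetV2) avoids destroying information in the narrow representation.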

Cited by 2 publications with 3 citation statements (all classified as mentioning); references 36 publications.
“…Seo et al [23] explore an approach targeting real-time execution on mobile devices. Their network is lightweight, using depthwise-separable convolutions and weight quantization.…”
Section: Image Matting (mentioning)
confidence: 99%
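The citation above attributes MMNet's efficiency to depthwise-separable convolutions, which factor a standard convolution into a per-channel (depthwise) filter followed by a 1×1 pointwise mix. A minimal PyTorch sketch, with illustrative channel counts:

```python
import torch.nn as nn

def depthwise_separable(in_ch, out_ch):
    """Depthwise-separable convolution: per-channel 3x3 filtering
    (groups=in_ch) followed by a 1x1 pointwise channel mix."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1,
                  groups=in_ch, bias=False),   # depthwise: one filter per channel
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),  # pointwise mix
    )
```

For a 3×3 kernel this cuts the parameter count from 9·Cin·Cout to 9·Cin + Cin·Cout, which is where most of the speedup comes from; weight quantization, the other technique mentioned, further shrinks the model by storing those weights in a lower-precision format such as int8.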
“…COSNet [17], despite being a video method, showed considerable flickering. MMNet [23] exhibited the worst result, likely because it is unsuited to large images. In most comparisons, our proposed approach was preferred over the rest.…”
Section: Subjective Evaluation (mentioning)
confidence: 99%
“…It contains 2000 images of 600 × 800 resolution, where 1700 and 300 images are split as training and testing sets respectively. To overcome the lack of training data, we augment images by utilizing rotation and left-right flips, as suggested in [36]. Each training image is rotated by [−15°, 15°] in steps of 5° and left-right flipped, which means that a total of 23800 training images are obtained.…”
Section: Dataset (mentioning)
confidence: 99%
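The augmentation arithmetic in this quote checks out: rotating from −15° to 15° in 5° steps gives 7 angles, the left-right flip doubles each variant, and 1700 × 7 × 2 = 23800. A minimal Pillow sketch of such an augmentation loop (the function name is illustrative, not from the cited paper):

```python
from PIL import Image

ANGLES = range(-15, 16, 5)  # -15, -10, -5, 0, 5, 10, 15: seven angles

def augment(image: Image.Image):
    """Yield 14 variants of one image: 7 rotations x {original, flipped}."""
    for angle in ANGLES:
        rotated = image.rotate(angle)
        yield rotated
        yield rotated.transpose(Image.Transpose.FLIP_LEFT_RIGHT)

# 1700 training images x 7 angles x 2 (flip) = 23800 augmented images
```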