Places: An Image Database for Deep Scene Understanding

Zhou, Bolei; Khosla, Aditya; Lapedriza, Àgata; Torralba, Antonio; Oliva, Aude

doi:10.1167/17.10.296

Cited by 290 publications

(249 citation statements)

References 31 publications

Supporting

Mentioning

240

Contrasting

Unclassified

“…While classification tasks require a single ground-truth label, using images with many labels gives the data more context. Furthermore, while large classification datasets exist, such as Places2 [25] with more than 10 million images and Tiny Images [26] containing 80 million images, the CAM2 system can retrieve more than 95 million images in a single day. Moreover, the data from network cameras can provide long-term observations.…”

Section: Automatic Labeling (mentioning)

confidence: 99%

Comparison of Visual Datasets for Machine Learning

Gauen, Dailey, Laiman, et al. 2017

2017 IEEE International Conference on Information Reuse and Integration (IRI)

Abstract: One of the greatest technological improvements in recent years is the rapid progress using machine learning for processing visual data. Among all factors that contribute to this development, datasets with labels play crucial roles. Several datasets are widely reused for investigating and analyzing different solutions in machine learning. Many systems, such as autonomous vehicles, rely on components using machine learning for recognizing objects. This paper compares different visual datasets and frameworks for machine learning. The comparison is both qualitative and quantitative and investigates object detection labels with respect to size, location, and contextual information. This paper also presents a new approach creating datasets using real-time, geo-tagged visual data, greatly improving the contextual information of the data. The data could be automatically labeled by cross-referencing information from other sources (such as weather).

“…For pose estimation we added two dense layers ('x' and 'q') with 3 and 4 neurons, respectively. All weighting parameters up to the 'pool10'-layer are initialized from a pre-trained version of SqueezeNet that was trained on a subset of the Places365 data set (Zhou et al., 2016). The additional layers are pre-trained using the Shop Façade data set and later on our test set of the Atrium.…”

Section: Methods (mentioning)

confidence: 99%

“…It was shown that for pose regression, training on the Places data set (Zhou et al., 2014, 2016) leads to further improvement in terms of accuracy (Kendall et al., 2015), as it is a more suitable data set for pose regression. Since then, training was subsequently carried out on the Places data set.…”

Section: Training (mentioning)

confidence: 99%

Squeezeposenet: Image Based Pose Regression With Small Convolutional Neural Networks for Real Time Uas Navigation

Müller, Urban, Jutzi 2017

ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.

Abstract: The number of unmanned aerial vehicles (UAVs) is increasing since low-cost airborne systems are available for a wide range of users. The outdoor navigation of such vehicles is mostly based on global navigation satellite system (GNSS) methods to obtain the vehicle's trajectory. The drawbacks of satellite-based navigation are failures caused by occlusions and multi-path interference. Besides this, local image-based solutions like Simultaneous Localization and Mapping (SLAM) and Visual Odometry (VO) can, for example, be used to support the GNSS solution by closing trajectory gaps, but they are computationally expensive. However, if the trajectory estimation is interrupted or not available, a re-localization is mandatory. In this paper we provide a novel method for GNSS-free and fast image-based pose regression in a known area by utilizing a small convolutional neural network (CNN). With on-board processing in mind, we employ a lightweight CNN called SqueezeNet and use transfer learning to adapt the network to pose regression. Our experiments show promising results for GNSS-free and fast localization.

“…Eventually, a model is built that can later be used to solve a particular problem, which is known as fine tuning. Among the existing popular models, AlexNet [70], Places-CNN [71], and VGG_S [72] are widely used because they cover diversified applications. Despite the growing popularity of deep learning, it is very computationally intensive and requires expensive hardware and a large set of training data.…”

Section: Literature Review (mentioning)

confidence: 99%

DTCTH: a discriminative local pattern descriptor for image classification

Rahman et al. 2017

J Image Video Proc.

Despite lots of effort being exerted in designing feature descriptors, it is still challenging to find generalized feature descriptors, with acceptable discrimination ability, which are able to capture prominent features in various image processing applications. To address this issue, we propose a computationally feasible discriminative ternary census transform histogram (DTCTH) for image representation which uses dynamic thresholds to perceive the key properties of a feature descriptor. The code produced by DTCTH is more stable against intensity fluctuation, and it mainly captures the discriminative structural properties of an image by suppressing unnecessary background information. Thus, DTCTH becomes more generalized to be used in different applications with reasonable accuracies. To validate the generalizability of DTCTH, we have conducted rigorous experiments on five different applications considering nine benchmark datasets. The experimental results demonstrate that DTCTH performs as high as 28.08% better than the existing state-of-the-art feature descriptors such as GIST, SIFT, HOG, LBP, CLBP, OC-LBP, LGP, LTP, LAID, and CENTRIST.
