2014 IEEE International Conference on Image Processing (ICIP)
DOI: 10.1109/icip.2014.7026188
A dataset for Hand-Held Object Recognition

Abstract: Visual object recognition is just one of the many applications of camera-equipped smartphones. The ability to recognise objects through photos taken with wearable and handheld cameras is already possible through some of the larger internet search providers; yet, there is little rigorous analysis of the quality of search results, particularly where there is great disparity in image quality. This has motivated us to develop the Small Hand-held Object Recognition Test (SHORT). This includes a dataset that is suit…

Cited by 8 publications (3 citation statements)
References 15 publications (13 reference statements)
“…There are few datasets available for this task. The majority of relevant datasets, such as HOD [28] and SHORT [29], are focused on single object detection, where each object is fully centered in the image as opposed to "in the wild" detection of handheld objects. In order to further research into handheld object detection, we created our own dataset with train, validation, and test splits based on data from the COCO-train 2017 dataset [30], an 80 class dataset of 118,287 images that is an industry-standard benchmark for object detection models.…”
Section: E. Addendum
confidence: 99%
“…Exemplary techniques are the well-known bar code or quick response (QR)-code, identifying a reference pattern on the floor, the image recognition of products by means of so-called planograms, cf. (Rivera-Rubio et al, 2014), or the transmission of coded light pulses above the critical flicker frequency of approximately 100 Hz. Using visual odometry, which correlates sequential images taken by the smart phone, a relative position estimate can be obtained without any knowledge of the environment, see (Shangguan et al, 2014).…”
Section: Sensors, Actuators and Localization Techniques
confidence: 99%
“…In total, more than 90,000 frames of video were labelled with positional ground-truth. The dataset is publicly available for download at http://rsm.bicv.org (Rivera-Rubio et al, 2014).…”
Section: The Dataset
confidence: 99%