Indoor positioning and navigation in areas without GPS coverage is a challenging problem, and it is crucial for applications such as augmented reality, autonomous driving, and drone navigation inside tunnels. In this paper, a tandem architecture of deep network-based systems is developed, to the best of our knowledge for the first time, to address this problem. The structure is trained on scene images obtained by scanning the segments of the target area using photogrammetry. A CNN based on EfficientNet is trained as a scene classifier, followed by a MobileNet CNN trained as a regressor. The proposed system achieves high precision for both the Cartesian position and the quaternion orientation of the camera.
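The abstract does not give implementation details, but the tandem idea can be illustrated with a minimal PyTorch sketch: an EfficientNet stage classifies which scanned segment the image belongs to, and a MobileNet stage regresses the 7-D pose (3-D position plus unit quaternion), conditioned on the predicted segment. The class name, the number of segments, and the way the two stages are fused are all assumptions for illustration, not the authors' configuration.

```python
# Hypothetical sketch of a classifier-then-regressor tandem, assuming the
# segment prediction is fed to the pose regressor as a soft label. All
# names and sizes here are illustrative, not taken from the paper.
import torch
import torch.nn as nn
import torchvision.models as models

class TandemPoseNet(nn.Module):
    def __init__(self, num_segments: int = 16):
        super().__init__()
        # Stage 1: EfficientNet-B0 classifies the scene segment.
        self.classifier = models.efficientnet_b0(weights=None)
        self.classifier.classifier[1] = nn.Linear(1280, num_segments)
        # Stage 2: MobileNetV2 backbone extracts features for pose regression.
        backbone = models.mobilenet_v2(weights=None)
        self.features = backbone.features
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Segment probabilities are appended so the regressor is segment-aware.
        self.head = nn.Linear(1280 + num_segments, 7)  # xyz + quaternion

    def forward(self, x):
        seg_logits = self.classifier(x)                  # which segment?
        seg_probs = torch.softmax(seg_logits, dim=1)
        feat = self.pool(self.features(x)).flatten(1)    # B x 1280
        out = self.head(torch.cat([feat, seg_probs], dim=1))
        xyz, quat = out[:, :3], out[:, 3:]
        quat = quat / (quat.norm(dim=1, keepdim=True) + 1e-8)  # unit quaternion
        return seg_logits, xyz, quat
```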
This paper addresses the problem of single-image indoor camera localization. This is a difficult task: no GPS is available, and the training data gathered for an indoor positioning system can be affected at test time by occlusion, illumination changes, and repetitive textures and patterns, effects that can easily fool any positioning system. Following the idea of self-attention and transformer networks, we customize the feature-extraction and output blocks of a transformer recently used for image recognition so that it predicts the camera's 3D position and 4D quaternion orientation. An engineering implementation trick is also employed. The results are evaluated on the 7-Scenes dataset and compared to other state-of-the-art methods, showing consistent improvements with a simpler and faster configuration.
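A minimal sketch of the kind of transformer pose regressor the abstract describes is shown below: patch embedding, a learned pose token read out by a 7-D head. The architecture details (token scheme, depth, dimensions) are assumptions in the spirit of ViT-style models, not the paper's actual blocks.

```python
# Hypothetical ViT-style pose regressor: patches + a learned pose token go
# through a transformer encoder; the pose token is decoded to xyz + quaternion.
# Sizes and the single-token readout are illustrative assumptions.
import torch
import torch.nn as nn

class TransPose(nn.Module):
    def __init__(self, img=224, patch=16, dim=256, depth=6, heads=8):
        super().__init__()
        n = (img // patch) ** 2                     # number of patches
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pose_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, 7)               # xyz + quaternion

    def forward(self, x):
        tokens = self.embed(x).flatten(2).transpose(1, 2)   # B x N x dim
        tok = self.pose_token.expand(x.size(0), -1, -1)
        z = self.encoder(torch.cat([tok, tokens], dim=1) + self.pos)
        out = self.head(z[:, 0])                    # read the pose token
        xyz, quat = out[:, :3], out[:, 3:]
        return xyz, quat / (quat.norm(dim=1, keepdim=True) + 1e-8)
```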
Navigation inside a closed area with no GPS-signal accessibility is a highly challenging task. To tackle this problem, imaging-based methods have recently attracted many researchers. These methods either extract features (e.g., with SIFT or SOSNet) and map the descriptive ones to the camera's position and rotation, or deploy an end-to-end system that estimates this information directly from RGB images, as PoseNet does. While the former methods suffer from a heavy computational burden at test time, the latter lack accuracy and robustness against environmental changes and object movements. End-to-end systems are, however, fast at inference and well suited to real-world applications, even though their training phase can be longer. In this paper, a novel multi-modal end-to-end system for large-scale indoor positioning is proposed, named APS (Alpha Positioning System), which integrates a Pix2Pix GAN that reconstructs the point-cloud counterpart of the input query image with a deep CNN that robustly estimates the camera's position and rotation. Existing indoor datasets lack the paired RGB/point-cloud images this integration requires, so we created a new dataset for this purpose. With the proposed APS system, we achieve highly accurate camera positioning with sub-centimeter precision.
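The multi-modal integration can be sketched as follows: a Pix2Pix-style generator (any image-to-image network with 3-channel output would serve here) renders the point-cloud image from the RGB query, and the two are stacked as a 6-channel input to a CNN pose regressor. The ResNet-34 backbone, the channel-stacking fusion, and the frozen generator are all assumptions for illustration; the paper's actual APS components may differ.

```python
# Hypothetical sketch of the RGB + generated-point-cloud fusion, assuming a
# pretrained Pix2Pix-style generator with 3-channel output and early fusion
# into a ResNet regressor. Illustrative only, not the authors' implementation.
import torch
import torch.nn as nn
import torchvision.models as models

class APSRegressor(nn.Module):
    def __init__(self, generator: nn.Module):
        super().__init__()
        self.generator = generator               # pretrained image-to-image net
        regressor = models.resnet34(weights=None)
        # Accept RGB + generated point-cloud image (3 + 3 channels).
        regressor.conv1 = nn.Conv2d(6, 64, 7, stride=2, padding=3, bias=False)
        regressor.fc = nn.Linear(regressor.fc.in_features, 7)  # xyz + quaternion
        self.regressor = regressor

    def forward(self, rgb):
        with torch.no_grad():                    # generator kept frozen here
            pc = self.generator(rgb)             # reconstructed point-cloud image
        out = self.regressor(torch.cat([rgb, pc], dim=1))
        xyz, quat = out[:, :3], out[:, 3:]
        return xyz, quat / (quat.norm(dim=1, keepdim=True) + 1e-8)
```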