Detection of spatial relations between objects in images is currently a popular subject in image description research. A range of language and geometric object features have been used in this context, but existing methods have so far not used explicit information about the third dimension (depth), except where it was manually added to annotations. The lack of such information hampers detection of spatial relations that are inherently 3D. In this paper, we use a fully automatic method for creating a depth map of an image and derive several object-level depth features from it, which we add to an existing feature set to test the effect on spatial relation detection. We show that adding depth features improves performance in all scenarios tested.
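The abstract does not enumerate the depth features themselves, but the general recipe it describes, summarising an automatically estimated depth map inside each object's region and comparing those summaries across an object pair, can be sketched as follows. The specific feature names here (mean/median depth, depth difference, an in-front-of flag) are illustrative assumptions, not the authors' exact feature set.

```python
# Hypothetical sketch of object-level depth features derived from a depth map.
# The feature set below is an assumption for illustration only.
import numpy as np

def object_depth_features(depth_map, box):
    """Summarise per-pixel depth inside one object's bounding box.

    depth_map: 2D array of depth estimates (e.g. from a monocular
               depth estimator run on the image).
    box: (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    x0, y0, x1, y1 = box
    region = depth_map[y0:y1, x0:x1]
    return {
        "mean_depth": float(region.mean()),
        "median_depth": float(np.median(region)),
        "min_depth": float(region.min()),
        "max_depth": float(region.max()),
    }

def pairwise_depth_features(depth_map, box_a, box_b):
    """Relative depth cues for an object pair (hypothetical examples)."""
    fa = object_depth_features(depth_map, box_a)
    fb = object_depth_features(depth_map, box_b)
    return {
        "mean_depth_diff": fa["mean_depth"] - fb["mean_depth"],
        "a_in_front_of_b": fa["median_depth"] < fb["median_depth"],
    }
```

Features like these would simply be concatenated onto the existing 2D feature vector for each object pair before training the spatial relation classifier.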
In this paper, we look at automatic generation of spatial descriptions in French, more specifically, selecting a spatial preposition for a pair of objects in an image. Our focus is on assessing the effect on accuracy of (i) increasing data set size, (ii) removing synonyms from the set of prepositions used for annotation, (iii) optimising feature sets, and (iv) training on best prepositions only vs. training on all acceptable prepositions. We describe a new data set in which each object pair in each image is annotated with the best preposition and all acceptable prepositions that describe the spatial relationship between the two objects. We report results for three new methods for this task, and find that the best, at 75% accuracy, is 25 points higher than our previous best result for this task.
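For concreteness, here is a minimal sketch of the kind of feature-based preposition classifier this task implies. The abstract does not describe the three methods or their feature sets, so the random-forest model, the geometric features, and the example prepositions below are all hypothetical.

```python
# Minimal sketch, assuming a feature-based classifier; the model choice
# (random forest) and the geometric features are illustrative assumptions.
from sklearn.ensemble import RandomForestClassifier

def pair_features(box_a, box_b):
    """Simple geometric features for an object pair (hypothetical set)."""
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return [ax - bx, ay - by,                            # centre offsets
            (box_a[2] - box_a[0]) / (box_b[2] - box_b[0])]  # relative width

# X: one feature row per annotated object pair; y: the best preposition
# chosen by the annotators (training on all acceptable prepositions would
# instead duplicate each row once per acceptable label).
X = [pair_features((10, 40, 60, 90), (5, 80, 100, 160)),
     pair_features((30, 10, 70, 50), (20, 60, 90, 120))]
y = ["sur", "au-dessus de"]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([pair_features((12, 42, 58, 88), (6, 82, 98, 158))]))
```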
This paper presents a Keyword-driven and N-gram Graph based approach for Image Captioning (KENGIC). Most current state-of-the-art image caption generators are trained end-to-end on large-scale paired image-caption datasets, which are very laborious and expensive to collect. Such models are limited in terms of their explainability and their applicability across different domains. To address these limitations, a simple model based on n-gram graphs is proposed that does not require any end-to-end training on paired image captions. Starting with a set of image keywords considered as nodes, the generator forms a directed graph by connecting these nodes through overlapping n-grams as found in a given text corpus. The model then infers the caption by maximising the most probable n-gram sequences over the constructed graph. To analyse the use and choice of keywords in the context of this approach, this study examines caption generation based on (a) keywords extracted from gold-standard captions and (b) automatically detected keywords. Both quantitative and qualitative analyses demonstrate the effectiveness of KENGIC. Its performance is very close to that of current state-of-the-art image caption generators trained in the unpaired setting. The analysis of this approach could also shed light on the generation process behind current top-performing caption generators trained in the paired setting, and provide insights into the limitations of the most widely used evaluation metrics in automatic image captioning.
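The following is a minimal sketch of the n-gram graph idea described above, assuming trigrams and a greedy extension step in place of the paper's probability-maximising inference; the toy corpus, the start keyword, and the helper names are illustrative, not the authors' implementation.

```python
# Sketch of the n-gram graph idea: nodes are corpus n-grams, edges connect
# n-grams that overlap by n-1 words, and a caption is read off a
# high-probability path through the graph. Greedy search is a simplifying
# stand-in for the paper's inference procedure.
from collections import Counter, defaultdict

def build_ngram_graph(corpus_sentences, n=3):
    """Directed graph of corpus n-grams linked by (n-1)-word overlaps."""
    counts = Counter()
    for sent in corpus_sentences:
        toks = sent.lower().split()
        for i in range(len(toks) - n + 1):
            counts[tuple(toks[i:i + n])] += 1
    edges = defaultdict(list)
    for gram, c in counts.items():
        edges[gram[:-1]].append((gram, c))  # key: (n-1)-word prefix
    return edges

def greedy_path(start_gram, edges, max_len=10):
    """Greedily follow the most frequent overlapping n-gram."""
    path, gram = list(start_gram), start_gram
    for _ in range(max_len):
        successors = edges.get(gram[1:])
        if not successors:
            break
        gram, _ = max(successors, key=lambda gc: gc[1])
        path.append(gram[-1])
    return " ".join(path)

corpus = ["a dog is running on the grass",
          "a dog is playing with a ball",
          "the dog is running in the park"]
edges = build_ngram_graph(corpus, n=3)
print(greedy_path(("a", "dog", "is"), edges))
# -> "a dog is running on the grass" (or "... in the park")
```

In the full approach, keyword nodes anchor the graph and the search scores whole n-gram sequences rather than taking a single greedy step, but the overlap-based graph construction is the core mechanism.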