Nikita Bhalla scite author profile

One of the ways blind people understand their surroundings is by taking pictures and relying on descriptions generated by image captioning systems. Current work on captioning images for the visually impaired does not use the textual data present in the image when generating captions. This problem is critical, as many visual scenes contain text. Moreover, up to 21% of the questions asked by blind people about the pictures they take pertain to the text present in them (Bigham et al., 2010). In this work, we propose altering AoANet, a state-of-the-art image captioning model, to leverage the text detected in the image as an input feature. In addition, we use a pointer-generator mechanism to copy the detected text to the caption when tokens need to be reproduced accurately. Our model outperforms AoANet on the benchmark dataset VizWiz, giving a 35% and 16.2% performance improvement on CIDEr and SPICE scores, respectively.
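The copy step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the model's equations, so this assumes the standard pointer-generator mixture, in which a generation probability weights the vocabulary distribution and the remaining mass is spread over the source tokens by attention. The function name `pointer_generator_mix`, the toy vocabulary, and the detected tokens are hypothetical and chosen only for illustration.

```python
def pointer_generator_mix(p_gen, vocab_dist, attention, ocr_tokens):
    """Blend a vocabulary distribution with a copy distribution over detected text.

    p_gen      -- probability of generating from the vocabulary (0..1)
    vocab_dist -- dict mapping vocabulary word -> probability (sums to 1)
    attention  -- attention weights over the detected tokens (sums to 1)
    ocr_tokens -- tokens detected in the image, aligned with `attention`
    """
    # Generation part: scale every vocabulary probability by p_gen.
    final = {word: p_gen * prob for word, prob in vocab_dist.items()}
    # Copy part: add (1 - p_gen) * attention to each detected token,
    # which lets out-of-vocabulary tokens receive probability mass.
    for token, weight in zip(ocr_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


# Toy example (hypothetical values): "Heinz" is not in the vocabulary,
# but the attention over detected text makes it the most likely next word.
dist = pointer_generator_mix(
    p_gen=0.4,
    vocab_dist={"a": 0.5, "bottle": 0.3, "of": 0.2},
    attention=[0.9, 0.1],
    ocr_tokens=["Heinz", "of"],
)
best_word = max(dist, key=dist.get)
```

In this sketch, a low `p_gen` shifts probability toward copying, which is the behavior one would want when a brand name or label must be reproduced exactly.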


Multi-Modal Image Captioning for the Visually Impaired

Ahsan, Bhalla, Bhatt, et al. 2021

Preprint


Local Edge Dynamics and Opinion Polarization

Bhalla, Adam, Musco 2023


Local Edge Dynamics and Opinion Polarization

Bhalla, Lechowicz, Musco 2021

Preprint


The proliferation of social media platforms, recommender systems, and their joint societal impacts have prompted significant interest in opinion formation and evolution within social networks. In this work, we study how local dynamics in a network can drive opinion polarization. In particular, we study time-evolving networks under the classic Friedkin-Johnsen opinion model. Edges are iteratively added or deleted according to simple local rules, modeling decisions based on individual preferences and network recommendations. We give theoretical bounds showing how individual edge updates affect polarization and a related measure of disagreement across edges. Via simulations on synthetic and real-world graphs, we find that the presence of two simple dynamics gives rise to high polarization: 1) confirmation bias, i.e., the preference for nodes to connect to other nodes with similar expressed opinions, and 2) friend-of-friend link recommendations, which encourage new connections between closely connected nodes. We also investigate the role of fixed connections which are not subject to these dynamics. We find that even a small number of fixed edges can significantly limit polarization, but still lead to multimodal opinion distributions, which may be considered polarized in a different sense.
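To make the setting concrete, the following is a minimal sketch of the Friedkin-Johnsen model on a small weighted graph. It assumes the usual formulation in which each node repeatedly averages its innate opinion with its neighbors' expressed opinions. The abstract does not define its measures, so polarization is taken here to be the sum of squared deviations of expressed opinions from their mean and disagreement to be the weighted squared difference across edges; both are assumptions, and all function names are hypothetical.

```python
def fj_equilibrium(adj, innate, iters=1000, tol=1e-12):
    """Iterate the Friedkin-Johnsen update until expressed opinions settle.

    adj    -- symmetric adjacency matrix (list of lists of edge weights)
    innate -- innate opinion of each node
    """
    n = len(innate)
    z = list(innate)
    for _ in range(iters):
        new_z = []
        for i in range(n):
            degree = sum(adj[i])
            neighbor_sum = sum(adj[i][j] * z[j] for j in range(n))
            # Each node averages its innate opinion with its neighbors' opinions.
            new_z.append((innate[i] + neighbor_sum) / (1.0 + degree))
        if max(abs(a - b) for a, b in zip(new_z, z)) < tol:
            return new_z
        z = new_z
    return z


def polarization(z):
    """Sum of squared deviations of expressed opinions from their mean (assumed definition)."""
    mean = sum(z) / len(z)
    return sum((x - mean) ** 2 for x in z)


def disagreement(adj, z):
    """Weighted squared opinion difference summed over edges (assumed definition)."""
    n = len(z)
    return sum(adj[i][j] * (z[i] - z[j]) ** 2
               for i in range(n) for j in range(i + 1, n))


# Two nodes with opposite innate opinions, first disconnected, then joined by one edge.
innate = [0.0, 1.0]
z_apart = fj_equilibrium([[0, 0], [0, 0]], innate)
z_linked = fj_equilibrium([[0, 1], [1, 0]], innate)
```

In this toy case the single added edge pulls the two expressed opinions toward each other, so `polarization(z_linked)` is smaller than `polarization(z_apart)`. This is the kind of single-edge effect the theoretical bounds in the abstract concern.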


Nikita Bhalla

Fluoxetine improves the memory deficits caused by the chemotherapy agent 5-fluorouracil
