Aveek Purohit scite author profile

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

Large foundation models can exhibit unique capabilities depending on the domain of data they are trained on. While these domains are generic, they may only barely overlap. For example, visual-language models (VLMs) are trained on Internet-scale image captions, but large language models (LMs) are further trained on Internet-scale text with no images (e.g., from spreadsheets to SAT questions). As a result, these models store different forms of commonsense knowledge across different domains. In this work, we show that this model diversity is symbiotic and can be leveraged to build AI systems with structured Socratic dialogue, in which new multimodal tasks are formulated as a guided language-based exchange between different pre-existing foundation models, without additional finetuning. In the context of egocentric perception, we present a case study of Socratic Models (SMs) that can provide meaningful results for complex tasks such as generating free-form answers to contextual questions about egocentric video, by formulating video Q&A as short story Q&A, i.e., summarizing the video into a short story, then answering questions about it. Additionally, SMs can generate captions for Internet images, and are competitive with the state of the art on zero-shot video-to-text retrieval with 42.8 R@1 on MSR-VTT 1k-A. SMs demonstrate how to compose foundation models zero-shot to capture new multimodal functionalities, without domain-specific data collection. Prototypes are available at socraticmodels.github.io.
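The "video Q&A as short story Q&A" formulation in the abstract above can be pictured as a chain of three language-mediated steps. The following is only a minimal sketch under stated assumptions: the function `video_qa` and the three toy stand-ins are hypothetical names introduced here for illustration, not the authors' code, and in a real system each callable would wrap a pre-trained VLM or LM.

```python
from typing import Callable, List, Sequence


def video_qa(
    frames: Sequence[str],
    question: str,
    caption_frame: Callable[[str], str],
    summarize: Callable[[List[str]], str],
    answer: Callable[[str, str], str],
) -> str:
    """Answer a question about a video by going through text only.

    All three callables are hypothetical placeholders for existing models.
    """
    # Step 1: turn each frame into a caption (role of a visual-language model).
    captions = [caption_frame(frame) for frame in frames]
    # Step 2: condense the captions into a short story (role of a language model).
    story = summarize(captions)
    # Step 3: answer the question from the story alone (role of a language model).
    return answer(story, question)


# Toy stand-ins so the sketch runs without any model; they only format strings.
def toy_caption(frame: str) -> str:
    return f"I see {frame}."


def toy_summarize(captions: List[str]) -> str:
    return " ".join(captions)


def toy_answer(story: str, question: str) -> str:
    return f"Story: {story} Question: {question}"


result = video_qa(
    ["a kitchen", "a kettle"],
    "What did I do?",
    toy_caption,
    toy_summarize,
    toy_answer,
)
print(result)
```

Passing the models in as plain callables mirrors the idea that the composition needs no finetuning: only text crosses the boundary between the steps.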

A compliant multi-module robot for climbing big step-like obstacles

Avinash, Srivastava, Purohit, et al. 2014

A novel compliant robot is proposed for traversing unstructured terrain. The robot consists of modules, each containing a link and an active wheel-pair, and neighboring modules are connected by a passive joint. Robots of this type are lighter and highly durable owing to the absence of link actuators. However, their climbing ability is limited by a tendency to tip over while climbing big obstacles. To overcome this disadvantage, the use of compliant joints is proposed in this work. The stiffness of each compliant joint is estimated by formulating an optimization problem whose objective is to minimize link joint moments while maintaining static equilibrium. This is one of the key novelties of the proposed work. A design methodology is also proposed for developing an n-module compliant robot for climbing a given height on a known surface. The efficacy of the proposed formulation is illustrated using numerical simulations of three-module and five-module robots. The robot climbs maximum heights of up to three and six times the wheel diameter using three and five modules, respectively. A working prototype was developed, and the simulation results were successfully validated on it.

SugarTrail: Indoor navigation in retail environments without surveys and maps

Purohit, Sun, Pan, et al. 2013

Deployment of swarms of micro-aerial vehicles: From theory to practice

Purohit, Zhang, Sadler, et al. 2014

We study the problem of deploying a large number of low-cost, low-complexity robots inside a known environment with the objective that at least one robotic platform reaches each of N preassigned goal locations. Our study is inspired by SensorFly, a micro-aerial vehicle successfully used for mobile sensor network applications. SensorFly nodes feature limited on-board sensors, so one has to rely on simple navigation strategies and increase performance through redundancy in the team. We introduce a simple, fully scalable deployment algorithm exploiting the limited capabilities offered by the SensorFly platform, and we explore its performance by feeding the simulation system with parameters extracted from the real SensorFly platform.
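To give a feel for why team redundancy helps reach every goal, here is a back-of-the-envelope sketch. It is not the deployment algorithm from the paper: it assumes, purely for illustration, that each robot reaches its assigned goal independently with the same probability `p`, and that the same number of robots is sent to each of the N goals. The function names are hypothetical.

```python
def coverage_probability(p: float, robots_per_goal: int, num_goals: int) -> float:
    """Probability that every goal is reached by at least one robot.

    Assumption (not from the paper): robots succeed independently, each with
    probability p, and robots_per_goal robots are assigned to every goal.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # A goal is missed only if all robots assigned to it fail.
    per_goal = 1.0 - (1.0 - p) ** robots_per_goal
    # All num_goals goals must be covered.
    return per_goal ** num_goals


def robots_needed_per_goal(p: float, num_goals: int, target: float) -> int:
    """Smallest team size per goal whose coverage probability meets target."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 <= target < 1.0 and p < 1.0:
        raise ValueError("target must lie in [0, 1) when p < 1")
    k = 1
    while coverage_probability(p, k, num_goals) < target:
        k += 1
    return k


# Example: unreliable robots (p = 0.5), one goal, 75% coverage wanted.
print(robots_needed_per_goal(0.5, 1, 0.75))
```

Under these assumptions the chance of missing a goal shrinks geometrically with team size, which is the intuition behind trading individual capability for numbers.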

Demo abstract: Collaborative indoor sensing with the SensorFly aerial sensor network

Purohit, Mokaya, Zhang 2012
