Advances in Large Language Models (LLMs) have inspired a surge of research exploring their expansion into the visual domain. While recent models exhibit promise in generating abstract captions for images and conducting natural conversations, their performance on text-rich images leaves room for improvement. In this paper, we propose the Contrastive Reading Model (Cream), a novel neural architecture designed to enhance the language-image understanding capability of LLMs by capturing intricate details typically overlooked by existing methods. Cream integrates vision and auxiliary encoders, complemented by a contrastive feature alignment technique, resulting in a more effective understanding of textual information within document images. Our approach thus seeks to bridge the gap between vision and language understanding, paving the way for more sophisticated Document Intelligence Assistants. Rigorous evaluations across diverse tasks, such as visual question answering on document images, demonstrate the efficacy of Cream as a state-of-the-art model in the field of visual document understanding. We provide our codebase and newly generated datasets at https://github.com/naver-ai/cream.
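The abstract above names a contrastive feature alignment technique between the vision and auxiliary encoders but does not give its form. As a rough illustration only, the sketch below uses a generic symmetric InfoNCE loss, a common way to align two sets of paired features; it is not taken from the Cream codebase, and the function name `info_nce_loss`, the `temperature` value, and the feature shapes are all assumptions.

```python
import numpy as np


def info_nce_loss(vision_feats, aux_feats, temperature=0.07):
    """Symmetric InfoNCE loss for N paired feature vectors (hypothetical helper).

    Row i of `vision_feats` and row i of `aux_feats` are treated as a
    positive pair; every other row in the batch acts as a negative.
    """
    vision_feats = np.asarray(vision_feats, dtype=float)
    aux_feats = np.asarray(aux_feats, dtype=float)

    # L2-normalize so that the dot product is a cosine similarity.
    v = vision_feats / np.linalg.norm(vision_feats, axis=1, keepdims=True)
    a = aux_feats / np.linalg.norm(aux_feats, axis=1, keepdims=True)

    # (N, N) similarity matrix; the diagonal holds the positive pairs.
    logits = (v @ a.T) / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        shifted = scores - scores.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the vision-to-auxiliary and auxiliary-to-vision directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))
```

With a loss of this kind, correctly paired features score lower than mismatched ones, which is the property a training loop would exploit to pull the two encoders' representations together.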
Low-modulus materials that can shape-morph into different three-dimensional (3D) configurations in response to external stimuli have wide-ranging applications in flexible/stretchable electronics, surgical instruments, soft machines and soft robotics. This paper reports a shape-programmable system that exploits liquid metal microfluidic networks embedded in an elastomer matrix, with electromagnetic forms of actuation, to achieve a unique set of properties. Specifically, this materials structure is capable of fast, continuous morphing into a diverse set of continuous, complex 3D surfaces starting from a two-dimensional (2D) planar configuration, with fully reversible operation. Computational, multi-physics modeling methods and advanced 3D imaging techniques enable rapid, real-time transformations between target shapes. The liquid-solid phase transition of the liquid metal allows for shape fixation and reprogramming on demand. An unusual vibration-insensitive, dynamic 3D display screen serves as an application example of this type of morphable surface.
Dynamic shape-morphing soft materials systems are ubiquitous in living organisms; they are also of rapidly increasing relevance to emerging technologies in soft machines [1-4], flexible electronics [5-7], and smart medicines [8,9]. Soft matter equipped with responsive components can switch between designed shapes or structures, but cannot support the types of dynamic morphing capabilities needed to reproduce natural, continuous processes of interest for many applications [10-27]. Challenges lie in the development of schemes to reprogram target shapes post fabrication, especially when complexities associated with the operating physics and disturbances from the environment can prohibit the use of deterministic theoretical models to guide inverse design and control strategies [3,28-32]. Here, we present a mechanical metasurface constructed from a matrix of filamentary metal traces, driven by reprogrammable, distributed Lorentz forces that follow from passage of electrical currents in the presence of a static magnetic field. The resulting system demonstrates complex, dynamic morphing capabilities with response times within 0.1 s. Implementing an in-situ stereo-imaging feedback strategy with a digitally controlled actuation scheme guided by an optimization algorithm yields surfaces that can self-evolve into a wide range of three-dimensional (3D) target shapes with high precision, including an ability to morph against extrinsic or intrinsic perturbations. These concepts support a data-driven approach to the design of dynamic soft matter with many unique characteristics.
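The actuation principle named above, a Lorentz force on a current-carrying trace in a static magnetic field, follows the textbook relation F = I (L x B) for a straight segment. The sketch below only evaluates that relation; the function name `lorentz_force` and all numerical values in the usage lines are hypothetical and are not taken from the paper.

```python
import numpy as np


def lorentz_force(current_a, segment_m, b_field_t):
    """Force in newtons on a straight wire segment: F = I * (L x B).

    current_a : current in amperes
    segment_m : 3-vector along the current direction, length in meters
    b_field_t : 3-vector of the static magnetic field in tesla
    """
    segment = np.asarray(segment_m, dtype=float)
    b_field = np.asarray(b_field_t, dtype=float)
    return current_a * np.cross(segment, b_field)


# Hypothetical values: a 1 cm trace along x carrying 20 mA in a 0.2 T field along y.
force = lorentz_force(0.02, [0.01, 0.0, 0.0], [0.0, 0.2, 0.0])
print(force)  # force vector in newtons, directed out of the x-y plane
```

Because the force scales linearly with the current and reverses with its sign, the current in each trace can be set independently to raise or lower that region of the surface, which is consistent with the abstract's description of the forces as distributed and reprogrammable.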
Physically transient forms of electronics enable unique classes of technologies, ranging from biomedical implants that disappear through processes of bioresorption after serving a clinical need to internet-of-things devices that harmlessly dissolve into the environment following a relevant period of use. Here, we develop a sustainable manufacturing pathway, based on ultrafast pulsed laser ablation, that can support high-volume, cost-effective manipulation of a diverse collection of organic and inorganic materials, each designed to degrade by hydrolysis or enzymatic activity, into patterned, multi-layered architectures with high resolution and accurate overlay registration. The technology can operate in patterning, thinning and/or cutting modes with (ultra)thin eco/bioresorbable materials of different types of semiconductors, dielectrics, and conductors on flexible substrates. Component-level demonstrations span passive and active devices, including diodes and field-effect transistors. Patterning these devices into interconnected layouts yields functional systems, as illustrated in examples that range from wireless implants as monitors of neural and cardiac activity, to thermal probes of microvascular flow, and multi-electrode arrays for biopotential sensing. These advances create important processing options for eco/bioresorbable materials and associated electronic systems, with immediate applicability across nearly all types of bioelectronic studies.
Recently reported winged microelectronic systems offer passive flight mechanisms as a dispersal strategy for purposes in environmental monitoring, population surveillance, pathogen tracking, and other applications. Initial studies indicate potential for technologies of this type, but advances in structural and responsive materials and in aerodynamically optimized geometries are necessary to improve the functionality and expand the modes of operation. Here, we introduce environmentally degradable materials as the basis of 3D fliers that allow remote, colorimetric assessments of multiple environmental parameters—pH, heavy metal concentrations, and ultraviolet exposure, along with humidity levels and temperature. Experimental and theoretical investigations of the aerodynamics of these systems reveal design considerations that include not only the geometries of the structures but also their mass distributions across a range of bioinspired designs. Preliminary field studies that rely on drones for deployment and for remote colorimetric analysis by machine learning interpretation of digital images illustrate scenarios for practical use.
In vivo optogenetics and photopharmacology are two techniques for controlling neuronal activity that have immense potential in neuroscience research. Their applications in tether-free groups of animals have been limited, in part by the availability of suitable tools. Here, we present a wireless, battery-free, programmable multilateral optofluidic platform with user-selected modalities for optogenetics, pharmacology and photopharmacology. This system features mechanically compliant microfluidic and electronic interconnects, capabilities for dynamic control over the rates of drug delivery, and real-time programmability, simultaneously for up to 256 separate devices in a single cage environment. Our behavioral experiments demonstrate control of motor behaviors in grouped mice through in vivo optogenetics with co-located gene delivery and controlled photolysis of caged glutamate. These optofluidic systems may expand the scope of wireless techniques to study neural processing in animal models.
Thermal sensations contribute to our ability to perceive and explore the physical world. Reproducing these sensations in a spatiotemporally programmable manner through wireless computer control could enhance virtual experiences beyond those supported by video, audio and, increasingly, haptic inputs. Flexible, lightweight and thin devices that deliver patterns of thermal stimulation across large areas of the skin at any location of the body are of great interest in this context. Applications range from those in gaming and remote socioemotional communications, to medical therapies and physical rehabilitation. Here, we present a set of ideas that form the foundations of a skin-integrated technology for power-efficient generation of thermal sensations across the skin, with real-time, closed-loop control. The systems exploit passive cooling mechanisms, actively switchable thermal barrier interfaces, thin resistive heaters and flexible electronics configured in a pixelated layout with wireless interfaces to portable devices, the internet and cloud data infrastructure. Systematic experimental studies and simulation results explore the essential mechanisms and guide the selection of optimized choices in design. Demonstration examples with human subjects feature active thermoregulation, virtual social interactions, and sensory expansion.