Advances in Large Language Models (LLMs) have inspired a surge of research exploring their expansion into the visual domain. While recent models exhibit promise in generating abstract captions for images and conducting natural conversations, their performance on text-rich images leaves room for improvement. In this paper, we propose the Contrastive Reading Model (Cream), a novel neural architecture designed to enhance the language-image understanding capability of LLMs by capturing intricate details typically overlooked by existing methods. Cream integrates vision and auxiliary encoders, complemented by a contrastive feature alignment technique, resulting in a more effective understanding of textual information within document images. Our approach thus seeks to bridge the gap between vision and language understanding, paving the way for more sophisticated Document Intelligence Assistants. Rigorous evaluations across diverse tasks, such as visual question answering on document images, demonstrate the efficacy of Cream as a state-of-the-art model in the field of visual document understanding. We provide our codebase and newly generated datasets at https://github.com/naver-ai/cream.
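The abstract above does not spell out how the contrastive feature alignment is computed. As a rough illustration only, the sketch below shows a generic symmetric InfoNCE-style loss that pulls paired vision and auxiliary feature vectors together and pushes unpaired ones apart. The function names, the temperature value, and the pairing scheme are assumptions made for this example and are not taken from the Cream codebase.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def contrastive_alignment_loss(vision_feats, aux_feats, temperature=0.07):
    """Generic symmetric InfoNCE-style loss (hypothetical helper, not Cream's API).

    vision_feats[i] and aux_feats[i] are assumed to describe the same
    image region, so the i-th pair is the positive and all others are negatives.
    """
    v = [l2_normalize(x) for x in vision_feats]
    a = [l2_normalize(x) for x in aux_feats]
    n = len(v)

    # Cosine similarity between every vision/auxiliary pair, scaled by temperature.
    logits = [
        [sum(p * q for p, q in zip(v[i], a[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy(rows):
        # Mean of -log softmax(row)[i], with the diagonal entry as the target.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    # Average the vision-to-auxiliary and auxiliary-to-vision directions.
    columns = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy(logits) + cross_entropy(columns))
```

With this definition, correctly paired features yield a loss near zero, while mismatched pairs yield a large loss, which is the behaviour a contrastive alignment objective is meant to have.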
Dynamic shape-morphing soft materials systems are ubiquitous in living organisms; they are also of rapidly increasing relevance to emerging technologies in soft machines [1-4], flexible electronics [5-7], and smart medicines [8,9]. Soft matter equipped with responsive components can switch between designed shapes or structures, but cannot support the types of dynamic morphing capabilities needed to reproduce natural, continuous processes of interest for many applications [10-27]. Challenges lie in the development of schemes to reprogram target shapes post fabrication, especially when complexities associated with the operating physics and disturbances from the environment can prohibit the use of deterministic theoretical models to guide inverse design and control strategies [3,28-32]. Here, we present a mechanical metasurface constructed from a matrix of filamentary metal traces, driven by reprogrammable, distributed Lorentz forces that follow from passage of electrical currents in the presence of a static magnetic field. The resulting system demonstrates complex, dynamic morphing capabilities with response times within 0.1 s. Implementing an in-situ stereo-imaging feedback strategy with a digitally controlled actuation scheme guided by an optimization algorithm yields surfaces that can self-evolve into a wide range of 3-dimensional (3D) target shapes with high precision, including an ability to morph against extrinsic or intrinsic perturbations. These concepts support a data-driven approach to the design of dynamic soft matter with many unique characteristics.
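The actuation described above relies on the Lorentz force on a current-carrying conductor in a static magnetic field. For a straight segment this is F = I (L × B), where I is the current, L the segment vector, and B the field. The sketch below evaluates that textbook relation; the function names and the numerical values are illustrative assumptions and do not come from the paper.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as (x, y, z) tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def lorentz_force_on_segment(current_a, segment_m, field_t):
    """Force in newtons on a straight wire segment: F = I * (L x B).

    current_a: current in amperes
    segment_m: segment vector in metres, pointing along the current
    field_t:   uniform magnetic flux density in tesla
    """
    lx, ly, lz = cross(segment_m, field_t)
    return (current_a * lx, current_a * ly, current_a * lz)


# Illustrative values only: 2 A along +x in a 1 T field along +z.
force = lorentz_force_on_segment(2.0, (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(force)
```

Reversing the current direction reverses the force, which suggests how digitally controlled currents in individual traces could push different parts of a surface in different directions.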
Low-modulus materials that can shape-morph into different three-dimensional (3D) configurations in response to external stimuli have wide-ranging applications in flexible/stretchable electronics, surgical instruments, soft machines and soft robotics. This paper reports a shape-programmable system that exploits liquid metal microfluidic networks embedded in an elastomer matrix, with electromagnetic forms of actuation, to achieve a unique set of properties. Specifically, this materials structure is capable of fast, continuous morphing into a diverse set of continuous, complex 3D surfaces starting from a two-dimensional (2D) planar configuration, with fully reversible operation. Computational, multi-physics modeling methods and advanced 3D imaging techniques enable rapid, real-time transformations between target shapes. The liquid-solid phase transition of the liquid metal allows for shape fixation and reprogramming on demand. An unusual vibration-insensitive, dynamic 3D display screen serves as an application example of this type of morphable surface.
Physically transient forms of electronics enable unique classes of technologies, ranging from biomedical implants that disappear through processes of bioresorption after serving a clinical need to internet-of-things devices that harmlessly dissolve into the environment following a relevant period of use. Here, we develop a sustainable manufacturing pathway, based on ultrafast pulsed laser ablation, that can support high-volume, cost-effective manipulation of a diverse collection of organic and inorganic materials, each designed to degrade by hydrolysis or enzymatic activity, into patterned, multi-layered architectures with high resolution and accurate overlay registration. The technology can operate in patterning, thinning and/or cutting modes with (ultra)thin eco/bioresorbable materials of different types of semiconductors, dielectrics, and conductors on flexible substrates. Component-level demonstrations span passive and active devices, including diodes and field-effect transistors. Patterning these devices into interconnected layouts yields functional systems, as illustrated in examples that range from wireless implants as monitors of neural and cardiac activity, to thermal probes of microvascular flow, and multi-electrode arrays for biopotential sensing. These advances create important processing options for eco/bioresorbable materials and associated electronic systems, with immediate applicability across nearly all types of bioelectronic studies.