Nicholas Chiang scite author profile

Hardware component databases are vital resources in designing embedded systems. Since creating these databases requires hundreds of thousands of hours of manual data entry, they are proprietary, limited in the data they provide, and contain random data entry errors. We present a machine-learning-based approach for creating hardware component databases directly from datasheets. Extracting data directly from datasheets is challenging because: (1) the data is relational in nature and relies on non-local context, (2) the documents are filled with technical jargon, and (3) the datasheets are PDFs, a format that decouples visual locality from locality in the document. Addressing this complexity has traditionally relied on human input, making it costly to scale. Our approach uses a rich data model, weak supervision, data augmentation, and multi-task learning to create these knowledge bases in a matter of days. We evaluate the approach on datasheets of three types of components and achieve an average quality of 77 F1 points, which is comparable to existing human-curated knowledge bases. We perform application studies that demonstrate the extraction of multiple data modalities, including numerical properties and images. We show how different sources of supervision, such as heuristics and human labels, have distinct advantages that can be used together to improve knowledge base quality. Finally, we present a case study to show how this approach changes the way practitioners create hardware component knowledge bases.

Automating the generation of hardware component knowledge bases

Hsiao, Chiang³, et al. 2019

Nurturing Strong Families in Singapore

Goh¹, Chiang², Suhaiemi³, et al. 2021

Nicholas Chiang

VISTA.COMP — an engineered checkpoint receptor agonist that potently suppresses T cell–mediated immune responses

Development and Validation of a Uremic Pruritus Treatment Algorithm and Patient Information Toolkit in Patients With Chronic Kidney Disease and End Stage Kidney Disease

Creating Hardware Component Knowledge Bases with Training Data Generation and Multi-task Learning

Automating the generation of hardware component knowledge bases

Nurturing Strong Families in Singapore
