Mark Roantree scite author profile

Benchmarking Parametric and Machine Learning Models for Genomic Prediction of Complex Traits

The usefulness of genomic prediction in crop and livestock breeding programs has prompted efforts to develop new and improved genomic prediction algorithms, such as artificial neural networks and gradient tree boosting. However, the performance of these algorithms has not been compared in a systematic manner using a wide range of datasets and models. Using data of 18 traits across six plant species with different marker densities and training population sizes, we compared the performance of six linear and six non-linear algorithms. First, we found that hyperparameter selection was necessary for all non-linear algorithms and that feature selection prior to model training was critical for artificial neural networks when the markers greatly outnumbered the training lines. Across all species and trait combinations, no single algorithm performed best; however, predictions based on a combination of results from multiple algorithms (i.e., ensemble predictions) performed consistently well. While linear and non-linear algorithms performed best for a similar number of traits, the performance of non-linear algorithms varied more between traits. Although artificial neural networks did not perform best for any trait, we identified strategies (i.e., feature selection, seeded starting weights) that boosted their performance to near the level of other algorithms. Our results highlight the importance of algorithm selection for the prediction of trait values.

A Path-Oriented RDF Index for Keyword Search Query Processing

Cappellari, Virgilio, Maccioni, et al. (2011)

Most of the recent approaches to keyword search employ a graph-structured representation of data. Answers to queries are generally sub-structures of the graph, containing one or more keywords. While finding the nodes matching keywords is relatively easy, determining the connections between such nodes is a complex problem requiring time-consuming, on-the-fly graph exploration. Current techniques suffer from poor worst-case performance or from indexing schemes that provide little support for the discovery of connections between nodes. In this paper, we present an indexing scheme for RDF that exposes the structural characteristics of the graph, its paths and the information on the reachability of nodes. This knowledge is exploited to expedite the retrieval of the sub-structures representing the query results. In addition, the index is organized to facilitate maintenance operations as the dataset evolves. Experimental results demonstrate the feasibility of our index, which significantly improves query execution performance.

Using a Metadata Software Layer in Information Systems Integration

Roantree, Kennedy, Barclay (2001)

A Federated Information System requires that multiple (often heterogeneous) information systems are integrated to an extent that they can share data. This shared data often takes the form of a federated schema, which is a global view of data taken from distributed sources. One of the issues faced in the engineering of a federated schema is the continuous need to extract metadata from cooperating systems. Where cooperating systems employ an object-oriented common model to interact with each other, this requirement can become a problem due to the type and complexity of metadata queries. In this research, we specified and implemented a metadata software layer in the form of a high-level query interface for the ODMG schema repository, in order to simplify the task of integration system engineers. Two clear benefits have emerged: reduced complexity of metadata queries during system integration (and federated schema construction) and a reduced learning curve for programmers who need to use the ODMG schema repository.

Benchmarking algorithms for genomic prediction of complex traits

Azodi, McCarren, Roantree, et al. (2019)

Preprint

The usefulness of Genomic Prediction (GP) in crop and livestock breeding programs has led to efforts to develop new and improved GP approaches, including non-linear algorithms such as artificial neural networks (ANNs) (i.e. deep learning) and gradient tree boosting. However, the performance of these algorithms has not been compared in a systematic manner using a wide range of GP datasets and models. Using data of 18 traits across six plant species with different marker densities and training population sizes, we compared the performance of six linear and five non-linear algorithms, including ANNs. First, we found that hyperparameter selection was critical for all non-linear algorithms and that feature selection prior to model training was necessary for ANNs when the markers greatly outnumbered the training lines. Across all species and trait combinations, no single algorithm performed best; however, predictions based on a combination of results from multiple GP algorithms (i.e. ensemble predictions) performed consistently well. While linear and non-linear algorithms performed best for a similar number of traits, the performance of non-linear algorithms varied more between traits than that of linear algorithms. Although ANNs did not perform best for any trait, we identified strategies (i.e. feature selection, seeded starting weights) that boosted their performance to near the level of other algorithms. These results, together with the fact that even small improvements in GP performance could accumulate into large genetic gains over the course of a breeding program, highlight the importance of algorithm selection for the prediction of trait values.

Capturing Personal Health Data from Wearable Sensors

Camous, McCann, Roantree (2008)

Providing views and closure for the object data management group object model

Roantree, Kennedy, Barclay (1999)

Information and Software Technology

Integrating View Schemata Using an Extended Object Definition Language

Roantree, Kennedy, Barclay (2001)

View mechanisms play an important role in restructuring data for users, while maintaining the integrity and autonomy of the underlying database schema. Although far more complex than their relational counterparts, numerous object-oriented view mechanisms have been specified and implemented over the last decade. These view mechanisms have served different functions: view schemata for object-oriented databases, object views of relational (and other) database systems, and the formation of federated schemata for distributed information systems. In the last category, a significant amount of research is still required to construct a view language powerful enough to support federated views. Such a language (or set of languages) should support not only object views, but also a wrapper specification language for external information sources, and a set of restructuring and integration operators. Furthermore, with the advent of standard models and technologies such as CORBA for distribution, ODMG for storage, and XML for web publishing, these languages should be based upon, or cooperate with, these standards. In this research, we present a view mechanism that retains the semantic information incorporated in ODMG schemata, provides a set of operators that facilitate the restructuring and integration necessary to merge schemata, and provides wrappers to heterogeneous systems such as legacy systems, ODBC databases, and XML data sources.

Data transformation and query management in personal health sensor networks

Roantree, Song, Cappellari, et al. (2012)

Journal of Network and Computer Applications

Sensor technology has been exploited in many application areas, ranging from climate monitoring to traffic management and healthcare. The role of these sensors is to monitor human beings, the environment or instrumentation and provide continuous streams of information regarding their status or well-being. In the case study presented in this work, the network is provided by football teams with sensors generating continuous heart rate values during a number of different sporting activities. In wireless networks such as these, the requirement is for methods of data management and transformation in order to present data in a format suited to high-level queries. In effect, what is required is a traditional database-style query interface where domain experts can continue to probe for the answers required in more specialised environments. The challenge arises from the gap between the low-level sensor output and the high-level user requirements of the domain experts. This paper describes a process to close this gap by automatically harvesting the raw sensor data and providing semantic enrichment through the addition of context data.
