Guilhem Sempéré scite author profile

Motivation: Modern genomic breeding methods rely heavily on very large amounts of phenotyping and genotyping data, presenting new challenges in effective data management and integration. Recently, the size and complexity of datasets have increased significantly, with the result that data are often stored on multiple systems. As analyses of interest increasingly require aggregation of datasets from diverse sources, data exchange between disparate systems becomes a challenge. Results: To facilitate interoperability among breeding applications, we present the public plant Breeding Application Programming Interface (BrAPI). BrAPI is a standardized web service API specification. The development of BrAPI is a collaborative, community-based initiative involving a growing global community of over a hundred participants representing several dozen institutions and companies. Development of such a standard is recognized as critical to a number of important large breeding system initiatives as a foundational technology. The focus of the first version of the API is on providing services for connecting systems and retrieving basic breeding data including germplasm, study, observation, and marker data. A number of BrAPI-enabled applications, termed BrAPPs, have been written that take advantage of the emerging support of BrAPI by many databases. Availability and implementation: More information on BrAPI, including links to the specification, test suites, BrAPPs, and sample implementations, is available at https://brapi.org/. The BrAPI specification and the developer tools are provided as free and open source.

WIDDE: a Web-Interfaced next generation database for genetic diversity exploration, with a first application in cattle

Sempéré

Moazami‐Goudarzi

Eggen

et al. 2015

BMC Genomics

Background: The advent and democratization of next generation sequencing and genotyping technologies have led to a huge amount of data for the characterization of population genetic diversity in model and non-model species. However, efficient storage, management, cross-analysis and exploration of such dense genotyping datasets remain challenging. This is particularly true for the bovine species, where many SNP datasets have been generated in various cattle populations with different genotyping tools. Description: We developed WIDDE, a Web-Interfaced Next Generation Database that stands as a generic tool applicable to a wide range of species and marker types (http://widde.toulouse.inra.fr). As a first illustration, we hereby describe its first version dedicated to cattle biodiversity, which includes a large and evolving cattle genotyping dataset for over 750,000 SNPs available on 129 (89 public) different cattle populations representative of the world-wide bovine genetic diversity and on 7 outgroup bovid species. This version proposes an optional marker and individual filtering step, an export of genotyping data in different popular formats, and an exploration of genetic diversity through a principal component analysis. Users can also explore their own genotyping data together with data from WIDDE, assign their samples to WIDDE populations based on distance assignment method and supervised clustering, and estimate their ancestry composition relative to the populations represented in the database. Conclusion: The cattle version of WIDDE represents to our knowledge the first database dedicated to cattle biodiversity and SNP genotyping data that will be very useful for researchers interested in this field. As a generic tool applicable to a wide range of marker types, WIDDE is intended for the genetic diversity exploration of any species and will be extended to other species shortly. The structure makes it easy to include additional output formats and new tools dedicated to genetic diversity exploration. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-2181-1) contains supplementary material, which is available to authorized users.

MGIS: managing banana (Musa spp.) genetic resources information and high-throughput genotyping data

Ruas

Guignon

Sempéré

et al. 2017

Unraveling the genetic diversity held in genebanks on a large scale is underway, due to advances in Next-generation sequence (NGS) based technologies that produce high-density genetic markers for a large number of samples at low cost. Genebank users should be in a position to identify and select germplasm from the global genepool based on a combination of passport, genotypic and phenotypic data. To facilitate this, a new generation of information systems is being designed to efficiently handle data and link it with other external resources such as genome or breeding databases. The Musa Germplasm Information System (MGIS), the database for global ex situ-held banana genetic resources, has been developed to address those needs in a user-friendly way. In developing MGIS, we selected a generic database schema (Chado), the robust content management system Drupal for the user interface, and Tripal, a set of Drupal modules which links the Chado schema to Drupal. MGIS allows germplasm collection examination, accession browsing, advanced search functions, and germplasm orders. Additionally, we developed unique graphical interfaces to compare accessions and to explore them based on their taxonomic information. Accession-based data has been enriched with publications, genotyping studies and associated genotyping datasets reporting on germplasm use. Finally, an interoperability layer has been implemented to facilitate the link with complementary databases like the Banana Genome Hub and the MusaBase breeding database. Database URL: https://www.crop-diversity.org/mgis/

A genomic map of climate adaptation in Mediterranean cattle breeds

et al. 2019

Domestic species such as cattle (Bos taurus taurus and B. t. indicus) represent attractive biological models to characterize the genetic basis of short‐term evolutionary response to climate pressure induced by their post‐domestication history. Here, using newly generated dense SNP genotyping data, we assessed the structuring of genetic diversity of 21 autochthonous cattle breeds from the whole Mediterranean basin and performed genome‐wide association analyses with covariables discriminating the different Mediterranean climate subtypes. This provided insights into both the demographic and adaptive histories of Mediterranean cattle. In particular, a detailed functional annotation of genes surrounding variants associated with climate variations highlighted several biological functions involved in Mediterranean climate adaptation such as thermotolerance, UV protection, pathogen resistance or metabolism with strong candidate genes identified (e.g., NDUFB3, FBN1, METTL3, LEF1, ANTXR2 and TCF7). Accordingly, our results suggest that the main selective pressures affecting cattle in the Mediterranean area may have been related to variation in heat and UV exposure, in food resource availability and in exposure to pathogens, such as anthrax bacteria (Bacillus anthracis). Furthermore, the observed contribution of the three main bovine ancestries (indicine, European and African taurine) in these different populations suggested that adaptation to local climate conditions may have either relied on standing genomic variation of taurine origin, or adaptive introgression from indicine origin, depending on the local breed origins. Taken together, our results highlight the genetic uniqueness of local Mediterranean cattle breeds and strongly support conservation of these populations.

SNiPlay3: a web-based application for exploration and large scale analyses of genomic variations

et al. 2015

SNiPlay is a web-based tool for detection, management and analysis of genetic variants including both single nucleotide polymorphisms (SNPs) and InDels. Version 3 now extends functionalities in order to easily manage and exploit SNPs derived from next generation sequencing technologies, such as GBS (genotyping by sequencing), WGRS (whole genome re-sequencing) and RNA-Seq technologies. Based on the standard VCF (variant call format), the application offers an intuitive interface for filtering and comparing polymorphisms using user-defined sets of individuals and then establishing a reliable genotyping data matrix for further analyses. Namely, in addition to the various scaled-up analyses allowed by the application (genomic annotation of SNP, diversity analysis, haplotype reconstruction and network, linkage disequilibrium), SNiPlay3 proposes new modules for GWAS (genome-wide association studies), population stratification, distance tree analysis and visualization of SNP density. Additionally, we developed a suite of Galaxy wrappers for each step of the SNiPlay3 process, so that the complete pipeline can also be deployed on a Galaxy instance using the Galaxy ToolShed procedure and then be computed as a Galaxy workflow. SNiPlay is accessible at http://sniplay.southgreen.fr.

Gigwa—Genotype investigator for genome-wide analyses

Sempéré

Florian

Dereeper

et al. 2016

GigaSci

Background: Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. Description: Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. Conclusions: The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers. Electronic supplementary material: The online version of this article (doi:10.1186/s13742-016-0131-8) contains supplementary material, which is available to authorized users.

The composition and abundance of bacterial communities residing in the gut of Glossina palpalis palpalis captured in two sites of southern Cameroon

et al. 2019

Background: A number of reports have demonstrated the role of insect bacterial flora in their host’s physiology and metabolism. The tsetse fly, host and vector of trypanosomes responsible for human sleeping sickness (human African trypanosomiasis, HAT) and nagana in animals (African animal trypanosomiasis, AAT), carries bacteria that influence its diet and immune processes. However, the mechanisms involved in these processes remain poorly documented. This underscores the need for increased research into the bacterial flora composition and structure of tsetse flies. The aim of this study was to identify the diversity and relative abundance of bacterial genera in Glossina palpalis palpalis flies collected in two trypanosomiasis foci in Cameroon. Methods: Samples of G. p. palpalis which were either negative or naturally trypanosome-positive were collected in two foci located in southern Cameroon (Campo and Bipindi). Using the V3V4 and V4 variable regions of the small subunit of the 16S ribosomal RNA gene, we analyzed the respective bacteriome of the flies’ midguts. Results: We identified ten bacterial genera. In addition, we observed that the relative abundance of the obligate endosymbiont Wigglesworthia was highly prominent (around 99%), regardless of the analyzed region. The remaining genera represented approximately 1% of the bacterial flora, and were composed of Salmonella, Spiroplasma, Sphingomonas, Methylobacterium, Acidibacter, Tsukamurella, Serratia, Kluyvera and an unidentified bacterium. The genus Sodalis was present but with a very low abundance. Overall, no statistically significant difference was found between the bacterial compositions of flies from the two foci, or between trypanosome-positive and trypanosome-negative flies. However, Salmonella and Serratia were only described in trypanosome-negative flies, suggesting a potential role for these two bacteria in fly refractoriness to trypanosome infection. In addition, our study showed the V4 region of the small subunit of the 16S ribosomal RNA gene was more efficient than the V3V4 region at describing the totality of the bacterial diversity. Conclusions: A very large diversity of bacteria was identified, with the discovery of species reported to secrete anti-parasitic compounds or to modulate vector competence in other insects. For future studies, the analyses should be enlarged with larger sampling including foci from several countries. Electronic supplementary material: The online version of this article (10.11...

TropGeneDB, the multi-tropical crop information system updated and extended

Hamelin

Sempéré

Jouffe

et al. 2012

TropGeneDB (http://tropgenedb.cirad.fr) was created to store genetic, molecular and phenotypic data on tropical crop species. The most common data stored in TropGeneDB are molecular markers, quantitative trait loci, genetic and physical maps, genetic diversity, phenotypic diversity studies and information on genetic resources (geographic origin, parentage, collection). TropGeneDB is organized on a crop basis with currently nine public modules (banana, cocoa, coconut, coffee, cotton, oil palm, rice, rubber tree, sugarcane). Crop-specific Web consultation interfaces have been designed to allow quick consultations and personalized complex queries. TropGeneDB is a component of the South Green Bioinformatics Platform (http://southgreen.cirad.fr/).
