A new database, RISSC (Ribosomal Intergenic Spacer Sequence Collection), has been created. It compiles more than 1600 entries of edited DNA sequence data from the 16S-23S ribosomal spacers present in most prokaryotes and organelles (e.g. mitochondria and chloroplasts). It is accessible through the Internet (http://ulises.umh.es/RISSC), where both systematic keyword searches and BLAST-type sequence searches can be conducted. Additionally, a characteristic feature of this region, the presence or absence and the nature of tRNA genes within the spacer, is included in every entry, even when it was not indicated in the original database. Together, these features could provide a useful documentation tool for studies on evolution, identification, typing and strain characterization, among others.
The estimation of production frontiers is an active topic in the Economics, Engineering and Operations Research literature. Many parametric and nonparametric methodologies have been introduced for estimating the technical efficiency of a set of units (for example, firms) relative to the production frontier. However, few of these methodologies are based on machine learning techniques, even though machine learning is a rising field of research. Recently, a bridge between machine learning and production theory has been built through a new technique proposed in Esteve et al. (2020), called Efficiency Analysis Trees (EAT). The EAT algorithm, based on the well-known Classification and Regression Trees (CART) machine learning technique, is a greedy technique that uses a particular heuristic to select the next node to be split as the decision tree is grown. Nevertheless, as we show in this paper, for different sample sizes and numbers of variables, the heuristic used by EAT is not capable of obtaining the tree with the minimum mean squared error (MSE). For this reason, a backtracking technique is implemented in this paper to improve the MSE obtained by the EAT algorithm. Additionally, two new algorithms are introduced that combine the heuristic used by standard EAT with the backtracking algorithm, to improve the reduction in MSE while decreasing computation time. Our research is based on simulated experiments. According to our computational results, the combination of the heuristic and backtracking, in particular the variant in which tree growth starts with the heuristic and ends with backtracking, achieves an accuracy similar to that of backtracking within a reasonable computation time.
The contribution of the paper may be of particular interest to industrial engineers who measure the efficiency and productivity of industrial processes in sectors such as energy, agri-food or service industries.
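As a rough illustration of why a greedy split rule can miss the minimum-MSE tree while backtracking finds it, the following sketch uses an ordinary regression tree on a single, already sorted input with a fixed number of leaves. It is not the EAT estimator (no production-frontier envelope is imposed), and the greedy rule here is a plain best-gain rule rather than the specific heuristic of Esteve et al. (2020). All function names are hypothetical.

```python
def sse(y, i, j):
    """Sum of squared errors of y[i:j] around its own mean."""
    seg = y[i:j]
    m = sum(seg) / len(seg)
    return sum((v - m) ** 2 for v in seg)


def mse(y, cuts):
    """MSE of the step function whose leaves are y[cuts[k]:cuts[k+1]]."""
    return sum(sse(y, a, b) for a, b in zip(cuts, cuts[1:])) / len(y)


def greedy_cuts(y, n_leaves):
    """Assumed greedy rule: repeatedly apply the single split with the largest SSE gain."""
    cuts = [0, len(y)]
    while len(cuts) - 1 < n_leaves:
        best = None
        for a, b in zip(cuts, cuts[1:]):
            for c in range(a + 1, b):
                gain = sse(y, a, b) - sse(y, a, c) - sse(y, c, b)
                if best is None or gain > best[0]:
                    best = (gain, c)
        if best is None:  # no leaf can be split further
            break
        cuts = sorted(cuts + [best[1]])
    return cuts


def backtracking_cuts(y, n_leaves):
    """Depth-first search over all cut sets, abandoning branches that cannot improve."""
    n = len(y)
    best = {"sse": float("inf"), "cuts": None}

    def extend(start, leaves_left, cost, cuts):
        if cost >= best["sse"]:  # prune: partial cost already too high
            return
        if leaves_left == 1:
            total = cost + sse(y, start, n)
            if total < best["sse"]:
                best["sse"], best["cuts"] = total, cuts + [n]
            return
        # leave at least one observation for each remaining leaf
        for c in range(start + 1, n - leaves_left + 2):
            extend(c, leaves_left - 1, cost + sse(y, start, c), cuts + [c])

    extend(0, n_leaves, 0.0, [0])
    return best["cuts"]


# Toy data: outputs of six units, ordered by their single input.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
greedy = greedy_cuts(y, 3)
optimal = backtracking_cuts(y, 3)
print("greedy      ", greedy, mse(y, greedy))
print("backtracking", optimal, mse(y, optimal))
```

On this toy data the greedy rule commits to the middle split first and cannot undo it, so its three-leaf tree has a higher MSE than the one found by exhaustive backtracking. The price is that the backtracking search visits many more candidate trees, which is the computation-time trade-off the hybrid algorithms are meant to address.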