“…Following these initial studies, gene catalogs have become ubiquitous in the analysis of metagenomic datasets, and have been created for the gut microbiota of multiple animals [e.g. mouse (Xiao et al., 2015), rat (Pan et al., 2018), pig (Xiao et al., 2016a,b), dog (Coelho et al., 2018), cow (Li et al., 2020), macaque (Li et al., 2018), chicken (Huang et al., 2018), lion, leopard and tiger (Mittal et al., 2020)], ocean bacteria (Sunagawa et al., 2015), soil bacteria (Lou et al., 2019) and the human vagina (Ma et al., 2020) and respiratory tract (Dai et al., 2019). Gene catalogs are commonly used to: (i) reduce redundancy in the data, thereby improving estimates of diversity (Yooseph et al., 2007); (ii) act as a common frame of reference across samples and studies; (iii) serve as a basis for metagenome-wide association studies (Wang and Jia, 2016); and (iv) guide the binning of metagenomic contigs into organism-specific groups (Nielsen et al., 2014; Plaza Oñate et al., 2018).…”