Though all biologists deal with information, only recently have the computational challenges of systematically collecting, storing, organizing, manipulating, visualizing and analysing large amounts of biological information come to be widely appreciated. The cause is the explosive growth of genomics. The term bioinformatics was originally coined for the application of information technology to large volumes of biological, and particularly genomic, data. The field has since become intermingled with traditional computational biology and biostatistics, which, strictly speaking, are concerned not with how to handle the information itself but with how to extract biological meaning from it. Thus bioinformatics, in its broad sense, can be seen as providing both the infrastructure and the scientific framework in which biologists take information and use computers to help convert it into knowledge.
Despite the relative youth of the field as a recognized discipline, an impressive diversity of bioinformatics resources is currently available. By necessity, we focus on only a small slice of this diversity here. We pay particular attention to sequence analysis because of its centrality to genomics. Though a wide array of commercial resources exists, some of which are ideally suited to specific tasks, many of the most fundamental and long‐lived bioinformatics tools are freely available. For this reason, we primarily describe non‐commercial software in this chapter. Many of the databases and analysis tools we describe are hosted by government or academic research centres and can be accessed via user‐friendly web interfaces.