The adaptive immune system uses two main types of antigen receptors: T-cell receptors (TCRs) and antibodies. While both proteins share a globally similar β-sandwich architecture, TCRs are specialised to recognise peptide antigens in the binding groove of the major histocompatibility complex, whereas antibodies can bind an almost infinite range of molecules. For both proteins, the main determinants of target recognition are the complementarity-determining region (CDR) loops. Five of the six CDRs adopt a limited number of backbone conformations, known as the 'canonical classes'; the remaining CDR (β3 in TCRs and H3 in antibodies) is more structurally diverse. In this paper, we first update the definition of canonical forms in TCRs, build an auto-updating sequence-based prediction tool (available at http://opig.stats.ox.ac.uk/resources) and demonstrate its application to large-scale sequencing studies. Given the global similarity of TCRs and antibodies, we then examine the structural similarity of their CDRs. We find that TCR and antibody CDRs tend to have different length distributions, and where they have similar lengths, they mostly occupy distinct structural spaces. In the rare cases where we found structural similarity, the underlying sequence patterns for the TCR and antibody versions are different. Finally, where multiple structures have been solved for the same CDR sequence, the structural variability in TCR loops is higher than that in antibodies, suggesting TCR CDRs are more flexible. These structural differences between TCR and antibody CDRs may be important to their different biological functions.

1 Introduction

The adaptive immune system defends the host organism against a wide range of foreign molecules, or antigens, using two types of receptors: T-cell receptors (TCRs) and antibodies (23).
TCRs typically recognise peptide antigens presented via the major histocompatibility complex (MHC; 44), while antibodies can bind almost any antigen, including proteins, peptides, and haptens (46).
Despite their different roles in the immune response, these proteins share a β-sandwich fold (Figure 1; 16, 51).

In humans, most TCRs are αβTCRs, consisting of one TCRα chain and one TCRβ chain, while most antibodies are composed of two heavy (H)-light (L) chain dimers (Figure 1). All four types of chains (α, β, H and L) are formed from the somatic rearrangement of the respective V, D, and J genes of the TCR or antibody loci. The random combination of these genes, alongside further diversification mechanisms (e.g. random nucleotide addition), is estimated to yield trillions of unique TCRs and antibodies (5, 20). TCRα and antibody light chains are made from the V and J genes, while TCRβ and antibody heavy chains are assembled from the V, D, and J genes, making the L chain equivalent to the α chain and the H chain equivalent to the β chain (5, 45). In both types of antigen receptors, sequence and structural diversity is concentrated i...