Gene duplication, from single genes to whole genomes, has been observed in organisms across all taxa. Despite its prevalence, the evolutionary benefits of this mechanism are the subject of ongoing debate. Gene duplication can significantly alter the self-assembly of protein quaternary structures, impacting the dosage or interaction proclivity. Here we use a lattice model of self-assembly as a coarse-grained representation of protein complex assembly, and show that it can be used to examine potential evolutionary advantages of duplication. Duplication provides a unique mechanism for increasing the evolvability of protein complexes by enabling the transformation of symmetric homomeric interactions into heteromeric ones. This transformation is extensively observed in in silico evolutionary simulations of the lattice model, with duplication events significantly accelerating the rate at which structural complexity increases. These coarse-grained simulation results are corroborated with a large-scale analysis of complexes from the Protein Data Bank.

ing. Importantly, both symmetric homomeric and heteromeric interactions are possible, as well as combinations thereof, allowing for more general assembly dynamics than previously permitted in polyomino models. We additionally allow variable genetic length, with genotypes growing and shrinking dynamically through duplication and deletion respectively.

Manuscript submitted to eLife

The importance of duplication in the structural evolution of proteins was unknown when first encountered in the 1970s, but was quickly realised to be ubiquitous (McLachlan, 1979). Whether duplication is predominantly a driver of innovation (Magadum et al., 2013), distributor of subfunctions (Gibson and Goldberg, 2009), safeguard against extinction (Crow and Wagner, 2005), or merely a passive passenger (Kimura, 1991) remains an open question.
However, evolutionary histories are rife with duplication events, from single genes to whole genomes, and there is some consensus

site changes the interaction strength by two increments, affecting both the location of the point mutation and the counter-aligned bit. This property of symmetric homomeric interactions is not unique to polyomino models, with greater variance in interaction energetics favouring symmetric homomeric formation quite generally (Lukatsky et al., 2007; Lukatsky and Shakhnovich, 2008).

The simplicity of binary string binding sites allows the formation rates of both symmetric homomeric and heteromeric interactions to be calculated analytically. The formation of interactions is modelled as an absorbing Markov chain over interaction strengths, which performs a random walk below the strength threshold. Point mutations correspond to stepping to adjacent strength states. Symmetric homomeric interactions, which can only increment by two states, can be mapped to an equivalent
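As a minimal sketch of such an absorbing Markov chain, the Python below computes the expected number of point mutations before an interaction first reaches the strength threshold. It rests on assumptions that are not taken from the text: the walk is neutral below the threshold, every mutation flips the matched status of exactly one bit pair, and a chain with `k` of `n_pairs` pairs matched steps down with probability `k / n_pairs` and up otherwise. The function name `expected_mutations_to_bind` and its parameters are hypothetical. The expected absorption times `t` solve the standard linear system `(I - Q) t = 1`, where `Q` holds the transitions among sub-threshold states.

```python
from fractions import Fraction


def expected_mutations_to_bind(n_pairs, threshold):
    """Expected point mutations until `threshold` of `n_pairs` bit pairs match.

    Returns a list t, where t[k] is the expectation when starting from k
    matched pairs (k = 0 .. threshold - 1). Assumed dynamics, for illustration
    only: each mutation breaks a matched pair with probability k / n_pairs
    and otherwise creates a new matched pair.
    """
    if not 0 < threshold <= n_pairs:
        raise ValueError("threshold must lie in 1..n_pairs")
    m = threshold  # transient (sub-threshold) states are 0 .. m - 1

    # Augmented matrix for (I - Q) t = 1; the last column is the right-hand side.
    rows = []
    for k in range(m):
        row = [Fraction(0)] * (m + 1)
        row[k] = Fraction(1)
        row[m] = Fraction(1)
        down = Fraction(k, n_pairs)
        up = 1 - down
        if k > 0:
            row[k - 1] -= down
        if k + 1 < m:  # a step from m - 1 up to m is absorbed, so it is not in Q
            row[k + 1] -= up
        rows.append(row)

    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(m):
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [x / scale for x in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[k][m] for k in range(m)]


if __name__ == "__main__":
    # Hypothetical parameters: 4 bit pairs, binding once 3 of them match.
    for start, steps in enumerate(expected_mutations_to_bind(4, 3)):
        print(f"start with {start} matched pairs: {steps} expected mutations")
```

Under these assumptions, a heteromeric interaction between two sites of length L would use `n_pairs = L`, with each step changing the strength by one. One way to treat a symmetric homomeric interaction in the same sketch is to use `n_pairs = L // 2` and count each matched pair as two units of strength, so the threshold in pairs is half the strength threshold, rounded up. This pairing is an illustrative choice and may differ from the mapping used in the manuscript.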