We present a multimodel analysis for mechanistic hypothesis testing in landscape evolution theory. The study site is a watershed with well-constrained initial and boundary conditions in which a river network locally incised 50 m over the last 13 ka. We calibrate and validate a set of 37 landscape evolution models designed to hierarchically test elements of complexity from four categories: hillslope processes, channel processes, surface hydrology, and representation of geologic materials. Comparison of each model to a base model, which uses stream power channel incision, uniform lithology, hillslope transport by linear diffusion, and surface water discharge proportional to drainage area, serves as a formal test of which elements of complexity improve model performance. Model fit is assessed using an objective function based on a direct difference between observed and simulated modern topography. A hybrid optimization scheme identifies optimal parameter values and their uncertainty. Multimodel analysis determines which elements of complexity improve simulation performance. Validation tests which model improvements persist when models are applied to an independent watershed. The three most important model elements are (1) spatial variation in lithology (differentiation between shale and glacial till), (2) a fluvial erosion threshold, and (3) a nonlinear relationship between slope and hillslope sediment flux. Due to nonlinear interactions between model elements, some process representations (e.g., nonlinear hillslopes) become important only when paired with other processes (e.g., erosion thresholds). This interdependence emphasizes the need for caution in identifying the minimally sufficient process set. Our approach provides a general framework for hypothesis testing in landscape evolution.
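
As an illustration of the comparison described above, the following minimal sketch scores a simulated topography against an observed one and reports how much an alternative model improves on the base model. It is a sketch under stated assumptions, not the objective function used in the study: the sum-of-squared-differences form, the function names `misfit` and `improvement_over_base`, and the toy elevation grids are all hypothetical.

```python
import numpy as np


def misfit(observed, simulated):
    """Sum of squared elevation differences over cells valid in both grids.

    ASSUMPTION: a plain sum of squares stands in for the study's
    objective function, whose exact form is not given in the abstract.
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    if observed.shape != simulated.shape:
        raise ValueError("observed and simulated grids must have the same shape")
    # Ignore cells outside the watershed (marked NaN in either grid).
    valid = np.isfinite(observed) & np.isfinite(simulated)
    residual = observed[valid] - simulated[valid]
    return float(np.sum(residual ** 2))


def improvement_over_base(observed, base_simulated, alt_simulated):
    """Positive values mean the alternative model fits better than the base."""
    return misfit(observed, base_simulated) - misfit(observed, alt_simulated)


# Hypothetical 2 x 2 elevation grids in meters; NaN marks a no-data cell.
observed = np.array([[100.0, 98.0], [95.0, np.nan]])
base_sim = np.array([[101.0, 99.0], [97.0, 90.0]])
alt_sim = np.array([[100.5, 98.0], [96.0, 90.0]])

base_misfit = misfit(observed, base_sim)
alt_misfit = misfit(observed, alt_sim)
gain = improvement_over_base(observed, base_sim, alt_sim)
print(f"base misfit = {base_misfit}, alternative misfit = {alt_misfit}, gain = {gain}")
```

In this toy case the alternative model has the lower misfit, so the gain is positive. In a real multimodel analysis, each model would first be calibrated, and the comparison would also need to account for differences in the number of free parameters.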