Massively parallel reporter assays (MPRAs) have emerged as a popular means of understanding noncoding variation in a variety of conditions. However, the development of statistical analysis methods has not kept pace with the use of this assay. We present a linear model framework, mpralm, for the differential analysis of activity measures from these experiments, and show that it is calibrated and powerful. In the first comprehensive evaluation of statistical methods on several datasets, we show that it outperforms statistical tests commonly used in the literature. We investigate the theoretical and real-data properties of barcode summarization methods, and show that the choice of summarization method has a previously unappreciated impact for some datasets. Finally, we perform a power analysis and show substantial improvements in power from using up to 6 replicates per condition, whereas sequencing depth has limited impact; we recommend always using at least 4 replicates. These results inform recommendations for differential analysis, general group comparisons, and power analysis. By characterizing how statistical power depends on sample size and sequencing depth, our work will help MPRA practitioners make informed choices in study design and lead to improved inference.