On the Web, there is always a need to aggregate opinions from the crowd (as in posts, social networks, forums, etc.). Different mechanisms have been implemented to capture these opinions, such as Like in Facebook, Favorite in Twitter, thumbs-up/down, flagging, and so on. However, in more contested domains (e.g., Wikipedia, political discussion, and climate change discussion) these mechanisms are not sufficient, since they only deal with each issue independently without considering the relationships between different claims. We can view a set of conflicting arguments as a graph in which the nodes represent arguments and the arcs between these nodes represent the defeat relation. A group of people can then collectively evaluate such graphs. To do this, the group must use a rule to aggregate their individual opinions about the entire argument graph. Here, we present the first experimental evaluation of different principles commonly employed by aggregation rules presented in the literature. We use randomized controlled experiments to investigate which principles people consider better at aggregating opinions under different conditions. Our analysis reveals a number of factors, not captured by traditional formal models, that play an important role in determining the efficacy of aggregation. These results help bring formal models of argumentation closer to real-world application.

CCS Concepts: • Computing methodologies → Nonmonotonic, default reasoning and belief revision; Multi-agent systems; • Applied computing → Law, social and behavioral science;
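To make the setup concrete, here is a minimal, hypothetical sketch (not the paper's own rule or data): an argument graph is a set of arguments plus defeat arcs, each voter reports the set of arcs they accept, and a simple majority rule aggregates these opinions. All names and the majority rule are illustrative assumptions.

```python
# Illustrative sketch: arguments are nodes, defeat arcs are (attacker,
# attacked) pairs, and each voter's opinion is the set of arcs they
# accept. A majority rule keeps an arc iff most voters accept it.
from collections import Counter

arguments = {"A", "B", "C"}

votes = [
    {("A", "B"), ("B", "C")},  # voter 1 accepts both defeats
    {("A", "B")},              # voter 2 accepts only A defeats B
    {("A", "B"), ("B", "C")},  # voter 3 accepts both defeats
]

def majority_aggregate(votes):
    """Keep a defeat arc iff a strict majority of voters accept it."""
    counts = Counter(arc for vote in votes for arc in vote)
    return {arc for arc, n in counts.items() if n > len(votes) / 2}

aggregated = majority_aggregate(votes)
print(sorted(aggregated))  # both arcs pass, each accepted by 2 of 3 voters
```

The paper's point is that arc-by-arc rules like this one ignore relationships between claims; the experiments evaluate principles that constrain how the whole graph is aggregated.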