IntroductionBig data or data-intensive applications (DIAs) process large amounts of data for the purpose of gaining key business intelligence through complex analytics using machinelearning techniques [20,35]. These applications are receiving increased attention in the last years given their ability to yield competitive advantage by direct investigation of user needs and trends hidden in the enormous quantities of data produced daily by the average Internet user. According to Gartner [1] business intelligence and analytics applications will remain a top focus for Chief-Information Officers (CIOs) of most Fortune 500 companies until at least 2019-2021. However, the cost of ownership of the systems that process big data analytics are high due to infrastructure costs, steep learning curves for the different frameworks (such as Apache Storm [21], Apache Spark [2] or Apache Hadoop [3]) typically involved in design and development of big data applications and complexities in large-scale architectures.A key complexity of the above design and development activity lies in quickly and continuously refining the configuration parameters of the middleware and service platforms on top of which the DIA is running [12]. The process in question is especially complex as the number of middleware involved in DIAs design increases; the more middleware are involved the more parameters need co-evaluation (e.g., latency or beaconing times, caching policies, queue retention and more)-fine-tuning these "knobs" on so many Abstract Big data architectures have been gaining momentum in recent years. For instance, Twitter uses stream processing frameworks like Apache Storm to analyse billions of tweets per minute and learn the trending topics. However, architectures that process big data involve many different components interconnected via semantically different connectors. Such complex architectures make possible refactoring of the applications a difficult task for software architects, as applications might be very different with respect to the initial designs. As an aid to designers and developers, we developed OSTIA (Ordinary Static Topology Inference Analysis) that allows detecting the occurrence of common anti-patterns across big data architectures and exploiting software verification techniques on the elicited architectural models. This paper illustrates OSTIA and evaluates its uses and benefits on three industrial-scale case-studies. which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.