Todor Ivanov scite author profile

The ethical and societal implications of artificial intelligence systems raise concerns. In this paper we outline a novel process based on applied ethics, namely Z-inspection®, to assess if an AI system is trustworthy. We use the definition of trustworthy AI given by the European Commission's High-Level Expert Group on AI. Z-inspection® is a general inspection process that can be applied to a variety of domains where AI systems are used, such as business, healthcare, and the public sector, among many others. To the best of our knowledge, Z-inspection® is the first process to assess trustworthy AI in practice.

BigBench V2: The New and Improved BigBench

Ghazal, Ivanov, Kostamaa, et al. 2017

Setting Up a Big Data Project: Challenges, Opportunities, Technologies and Optimization

Zicari, Rosselli, Ivanov, et al. 2016

In the first part of this chapter we illustrate how a big data project can be set up and optimized. We explain the general value of big data analytics for the enterprise and how value can be derived by analyzing big data. We go on to introduce the characteristics of big data projects and how such projects can be set up, optimized and managed. Two exemplary real-world use cases of big data projects are described at the end of the first part. To be able to choose the optimal big data tools for given requirements, the relevant technologies for handling big data are outlined in the second part of this chapter. This part includes technologies such as NoSQL and NewSQL systems, in-memory databases, analytical platforms and Hadoop-based solutions. Finally, the chapter concludes with an overview of big data benchmarks that allow for performance optimization and evaluation of big data technologies. Especially with the new big data applications, there are requirements that make the platforms more complex and more heterogeneous. The relevant benchmarks designed for big data technologies are categorized in the last part.

The impact of columnar file formats on SQL-on-Hadoop engine performance: A study on ORC and Parquet

Ivanov, Pergolesi. 2019. Concurrency and Computation

Columnar file formats provide an efficient way to store data to be queried by SQL-on-Hadoop engines. Related work considers the performance of the processing engine and the file format together, which makes it impossible to predict their individual impact. In this work, we propose an alternative approach: by executing each file format on the same processing engine, we compare the different file formats as well as their different parameter settings. We apply our strategy to two processing engines, Hive and SparkSQL, and evaluate the performance of two columnar file formats, ORC and Parquet. We use BigBench (TPCx-BB), a standardized application-level benchmark for Big Data scenarios. Our experiments confirm that the file format selection and its configuration significantly affect the overall performance. We show that ORC generally performs better on Hive, whereas Parquet achieves the best performance with SparkSQL. Using ZLIB compression brings up to 60.2% improvement with ORC, while Parquet achieves up to 7% improvement with Snappy. Exceptions are the queries involving text processing, which do not benefit from using any compression.

Performance Evaluation of Spark SQL Using BigBench

Ivanov, Beer. 2016

Z-Inspection®: A Process to Assess Trustworthy AI