“…First, the dynamic implementation of each scenario takes longer to run than the static Spark implementation. This matches the results of the earlier experiments done for spark-dynamic (Lazovik et al, 2017). The reason for this is that we have added extra functionality on top of the existing static Spark code.…”
Section: Runtime Overhead (supporting)
confidence: 87%
“…In a previous work done by the authors (Lazovik et al, 2017), we have investigated the feasibility of dynamically updating the processing pipeline of an Apache Spark application. Apache Spark is one of the most popular big data processing platforms.…”
Section: Spark-dynamic (mentioning)
confidence: 99%
“…The performance of the prototype was also measured as part of the feasibility study, with promising results (Lazovik et al, 2017). The solutions from this paper are applied on top of this earlier system.…”
Section: Spark-dynamic (mentioning)
confidence: 99%
“…Next, we compare the adaptive framework with the spark-dynamic framework (Lazovik et al, 2017) and with regular Spark implementations. These experiments are based on three implemented scenarios inspired by real projects, each built using commonly used operations and increasing in complexity.…”
Section: Dynamic Versus Static (mentioning)
confidence: 99%
“…In a previous paper, we developed a framework, spark-dynamic (Lazovik et al, 2017), built on top of the popular distributed data processing platform Apache Spark (The Apache Software Foundation, 2015b) to enable the updating of the steps and algorithm parameters of running pipelines without restarting them. This process is called reconfiguration.…”
Distributed data processing systems have become the standard means for big data analytics. These systems are based on processing pipelines where operations on data are performed in a chain of consecutive steps. Normally, the operations performed by these pipelines are set at design time, and any changes to their functionality require the applications to be restarted. This is not always acceptable, for example, when we cannot afford downtime or when a long-running calculation would lose significant progress. The introduction of variation points to distributed processing pipelines allows for on-the-fly updating of individual analysis steps. In this paper, we extend such basic variation point functionality to provide fully automated reconfiguration of the processing steps within a running pipeline through an automated planner. We have enabled pipeline modeling through constraints. Based on these constraints, we not only ensure that configurations are compatible with type but also verify that expected pipeline functionality is achieved. Furthermore, automating the reconfiguration process simplifies its use, in turn allowing users with less development experience to make changes. The system can automatically generate and validate pipeline configurations that achieve a specified goal, selecting from operation definitions available at planning time. It then automatically integrates these configurations into the running pipeline. We verify the system through the testing of a proof-of-concept implementation. The proof of concept also shows promising results when reconfiguration is performed frequently.
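The abstract's core ideas — variation points that allow a running pipeline step to be swapped, type constraints that keep replacements compatible, and a planner that chains operations toward a goal — can be illustrated with a small sketch. This is a toy illustration only: the names (`VariationPoint`, `Operation`, `plan`) and the string-based typing are hypothetical and do not reflect the spark-dynamic API or the paper's actual planner.

```python
# Toy sketch: a swappable pipeline step ("variation point") plus a
# greedy type-driven planner. Illustrative only; not the paper's API.
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Operation:
    name: str
    in_type: str      # type the operation consumes
    out_type: str     # type the operation produces
    fn: Callable[[Any], Any]

class VariationPoint:
    """A pipeline step whose implementation can be replaced on the fly."""
    def __init__(self, op: Operation):
        self.op = op

    def reconfigure(self, new_op: Operation) -> None:
        # Type constraint: the replacement must consume and produce
        # the same types as the step it replaces.
        if (new_op.in_type, new_op.out_type) != (self.op.in_type, self.op.out_type):
            raise TypeError("replacement operation is not type-compatible")
        self.op = new_op

    def __call__(self, data: Any) -> Any:
        return self.op.fn(data)

def plan(ops: List[Operation], start_type: str, goal_type: str) -> List[Operation]:
    """Greedily chain operations whose types line up until the goal type
    is reached (a stand-in for the paper's constraint-based planner)."""
    chain, current = [], start_type
    while current != goal_type:
        nxt = next((o for o in ops if o.in_type == current), None)
        if nxt is None:
            raise ValueError(f"no operation consumes type {current!r}")
        chain.append(nxt)
        current = nxt.out_type
    return chain

# Example: swap an aggregation step without restarting the pipeline.
mean = Operation("mean", "series", "value", lambda xs: sum(xs) / len(xs))
median = Operation("median", "series", "value",
                   lambda xs: sorted(xs)[len(xs) // 2])

step = VariationPoint(mean)
print(step([1, 2, 3, 4]))   # 2.5
step.reconfigure(median)    # on-the-fly update, no restart
print(step([1, 2, 3, 4]))   # 3
```

In the real system the replacement code is injected into a live Spark job and validated against the pipeline's constraints before integration; here the reconfiguration is just a guarded attribute swap, which is enough to show the type-compatibility check the abstract describes.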
Dynamic software product lines offer software developers a way to exploit common and variable features across a set of requirements and thereby build entire families of products, making it possible to switch from one product configuration to another at runtime. These are product lines in which derivation occurs at runtime and involves reconfiguration both of the available services and of the underlying platform. The cloud, in turn, has enabled developers to build applications that can be reconfigured and redeployed dynamically and autonomously, independently of the underlying physical hardware infrastructure. Combined, these two strategies have the potential to produce highly reusable and reconfigurable software applications. In this paper we present an approach to achieving a DSPL using microservices. We propose two different derivation processes: one at design time based on binary replacement, and one at runtime that uses a feature model of the user's context and adaptation based on independent modular services.