Digital pen and paper technologies provide the basis for linking digital content and services to printed materials in the form of interactive paper publications. To realise the potential of these technologies, it is important to develop platforms and tools that can support the large-scale publishing of interactive paper documents. We show how an extensible content management system that was developed to support context-aware publishing was used for the production of interactive paper documents. The publishing process consists of two phases and requires one channel to support the production of the document together with cross-media link definitions and a second channel to support interaction with the document.
The support vector machine (SVM) is a powerful learning algorithm, e.g., for classification and clustering tasks, that works even for complex data structures such as strings, trees, lists and general graphs. It relies on a kernel function that measures scalar products between data units. For analyzing string data, Lodhi et al. (J Mach Learn Res 2:419-444, 2002) introduced the String Subsequence Kernel (SSK). In this paper we propose an approximation to SSK, called SSK-LP, that drops higher-order terms (i.e., subsequences that are spread out more than a certain threshold) and thereby reduces the computational burden of SSK. Because we are also concerned with the practical application of complex kernels with high computational complexity and memory consumption, we provide an empirical model that predicts the runtime and memory use of both the approximation and the original SSK from easily measurable properties of the input data. We report extensive results on the properties of SSK-LP with respect to prediction accuracy, runtime and memory consumption. Using real-life datasets from text mining tasks, we show that models based on SSK and SSK-LP perform similarly, and that the empirical runtime model is also useful for roughly estimating the total learning time of an SVM using either kernel.
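To make the idea of dropping spread-out subsequences concrete, the following is a minimal brute-force sketch, not the algorithm from the paper. It assumes the standard unnormalized SSK feature map of Lodhi et al., in which each occurrence of a length-n subsequence is weighted by a decay factor raised to the span it covers in the string. The function names and the `max_span` parameter are hypothetical; `max_span` only illustrates a span threshold and may differ from how SSK-LP defines its cut-off. The enumeration is exponential in n and is meant for tiny inputs only.

```python
from collections import defaultdict
from itertools import combinations


def subsequence_features(s, n, lam, max_span=None):
    """Map each length-n subsequence of s to its summed weight lam**span.

    span is the distance covered in s by one occurrence (last index minus
    first index plus one). Occurrences wider than max_span are skipped;
    max_span is a hypothetical illustration of a span threshold.
    """
    feats = defaultdict(float)
    for idx in combinations(range(len(s)), n):
        span = idx[-1] - idx[0] + 1
        if max_span is not None and span > max_span:
            continue  # drop occurrences that are spread out too far
        feats["".join(s[i] for i in idx)] += lam ** span
    return feats


def ssk(s, t, n, lam, max_span=None):
    """Unnormalized subsequence kernel: inner product of the feature maps."""
    fs = subsequence_features(s, n, lam, max_span)
    ft = subsequence_features(t, n, lam, max_span)
    return sum(w * ft[u] for u, w in fs.items() if u in ft)


if __name__ == "__main__":
    # "cat" and "car" share only the subsequence "ca" (span 2 in both).
    print(ssk("cat", "car", n=2, lam=0.5))
    # Full kernel versus the thresholded variant on identical strings.
    print(ssk("cat", "cat", n=2, lam=0.5))
    print(ssk("cat", "cat", n=2, lam=0.5, max_span=2))
```

With `max_span=2`, the occurrence of "ct" in "cat" (span 3) no longer contributes, so the thresholded value is smaller than the full kernel value while the contiguous matches are unaffected.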