Rudolf Mayer scite author profile

Unsupervised Learning and Clustering

Unsupervised learning is very important in the processing of multimedia content as clustering or partitioning of data in the absence of class labels is often a requirement. This chapter begins with a review of the classic clustering techniques of k-means clustering and hierarchical clustering. Modern advances in clustering are covered with an analysis of kernel-based clustering and spectral clustering. One of the most popular unsupervised learning techniques for processing multimedia content is the self-organizing map, so a review of self-organizing maps and variants is presented in this chapter. The absence of class labels in unsupervised learning makes the question of evaluation and cluster quality assessment more complicated than in supervised learning. So this chapter also includes a comprehensive analysis of cluster validity assessment techniques.


Combination of audio and lyrics features for genre classification in digital audio collections

Mayer

Neumayer

Rauber

2008


In many areas, multimedia technology has made its way into the mainstream. In the case of digital audio, this is manifested in numerous online music stores having turned into profitable businesses. The widespread user adoption of digital audio on both home computers and mobile players shows the size of this market. Thus, ways to automatically process and handle the growing size of private and commercial collections become increasingly important; along with this goes a need to make music interpretable by computers. The most obvious representation of audio files is their sound; there are, however, more ways of describing a song, for instance its lyrics, which describe songs in terms of content words. Lyrics of music may be orthogonal to its sound, and differ greatly from other texts regarding their (rhyme) structure. Consequently, the exploitation of these properties has potential for typical music information retrieval tasks such as musical genre classification; so far, there is a lack of means to efficiently combine these modalities. In this paper, we present findings from investigating advanced lyrics features such as the frequency of certain rhyme patterns, several part-of-speech features, and statistical features such as words per minute (WPM). We further analyse to what extent a combination of these features with existing acoustic feature sets can be exploited for genre classification and provide experiments on two test collections.


On the Utility of Synthetic Data

Hittmeir

Ekelhart

Mayer

2019


With the recent advances and increasing activities in data mining and analysis, the protection of the privacy of individuals is crucial. Several approaches address this concern, from techniques like data anonymisation to secure, non-disclosive computation, all of which have their specific strengths and weaknesses, depending on the specific requirements. A slightly different approach is the generation of synthetic data, which tries to preserve the overall properties and characteristics of the original data without revealing information about actual individual data samples. The promise is that, for most purposes, models trained on the synthetic data instead of the real data do not show a significant loss of performance. In this paper, we give an overview of currently available approaches for synthetic data generation, and empirically evaluate the utility of the generated synthetic data by testing them on a number of supervised machine learning tasks on several publicly available datasets.


Utility and Privacy Assessments of Synthetic Data for Regression Tasks

Hittmeir

Ekelhart

Mayer

2019


With ever increasing capacity for collecting, storing, and processing of data, there is also a high demand for intelligent data analysis methods. While there have been impressive advances in machine learning and similar domains in recent years, this also gives rise to concerns regarding the protection of personal and otherwise sensitive data, especially if it is to be analysed by third parties. Besides anonymisation, which becomes challenging with high dimensional data, one approach for privacy-preserving data mining lies in the usage of synthetic data, which comes with the promise of protecting the users' data and producing analysis results close to those achieved by using real data. In this paper, we analyse a number of different approaches for creating synthetic data, and study the utility of the created datasets for regression tasks, i.e. the prediction of a numeric value. We further investigate the similarity of real and synthetic data samples. Finally, we contribute to privacy assessments and measurements of the risk of attribute disclosure on synthetic data by extending an approach developed for categorical data.


Improving Scientific Conferences by Enhancing Conference Management Systems with Information Mining Capabilities

Pesenhofer

Mayer

Rauber

2007


Heterocyclic Amines with Antihistaminic Activity

Huttrer

Djerassi

Beears

et al. 1946

J. Am. Chem. Soc.


1. Procedures have been developed for the synthesis of 5,6-diamino-2,4-dihydroxypyrimidine and 4-hydroxy-2,5,6-triaminopyrimidine bisulfite in appreciably better yields and involving fewer isolations of intermediate products than previously reported. 2. These compounds have been condensed with several dicarbonyl compounds to yield pyrimido[4,5-b]pyrazines symmetrically substituted in the 6- and 7-positions. 3. Ultraviolet absorption spectra of alkaline solutions of the compounds have been measured. Ithaca, N. Y.


A Baseline for Attribute Disclosure Risk in Synthetic Data

Hittmeir

Mayer

Ekelhart

2020


The generation of synthetic data is widely considered a viable method for alleviating privacy concerns and for reducing identification and attribute disclosure risk in micro-data. The records in a synthetic dataset are artificially created and thus do not directly relate to individuals in the original data in terms of a 1-to-1 correspondence. As a result, inferences about said individuals appear to be infeasible and, simultaneously, the utility of the data may be kept at a high level. In this paper, we challenge this belief by interpreting the standard attacker model for attribute disclosure as a classification problem. We show how disclosure risk measures presented in recent publications may be compared to or even be reformulated as machine learning classification models. Our overall goal is to empirically analyze attribute disclosure risk in synthetic data and to discuss its close relationship to data utility. Moreover, we improve the baseline for attribute disclosure risk from the attacker's perspective by applying variants of the RadiusNearestNeighbor and the EnsembleVote classifiers.


Ensuring sustainability of web services dependent processes

Miksa

Mayer

Rauber

2015

IJCSE


High dependence on web services and service-oriented architecture affects not only business solutions, but also scientific research. Web services may be delivered by third parties, and thus are candidates for outsourcing. However, they represent a source of risks, which can jeopardise the robustness of processes. Hence, there is a need for actions which can contribute to the mitigation of possible threats to the continuity of processes. In this paper, risks affecting processes are classified, followed by a discussion of particular changes stemming from web services. Three distinct approaches allowing improvements are described: a newly proposed web services monitoring framework supported by a software solution; the concept of resilient web services, which specifies new design requirements for web services; and digital preservation strategies, which apart from long-term benefits can support the sustainability of currently running processes.


