Pheromones associated with coleopteran pests of stored products

The popularity of collaborative tagging sites has created new challenges and opportunities for designers of web items, such as electronics products, travel itineraries, popular blogs, etc. An increasing number of people are turning to online reviews and user-specified tags to choose from among competing items. This creates an opportunity for designers to build items that are likely to attract desirable tags when published. In this paper, we consider a novel optimization problem: given a training dataset of existing items with their user-submitted tags, and a query set of desirable tags, design the k best new items expected to attract the maximum number of desirable tags. We show that this problem is NP-Complete, even if simple Naive Bayes Classifiers are used for tag prediction. We present two principled algorithms for solving this problem: (a) an exact "two-tier" algorithm (based on top-k querying techniques), which performs much better than the naive brute-force algorithm and works well for moderate problem instances, and (b) a novel polynomial-time approximation algorithm with provable error bound for larger problem instances. We conduct detailed experiments on synthetic and real data crawled from the web to evaluate the efficiency and quality of our proposed algorithms.
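To make the problem statement concrete, the following is a minimal sketch of the naive brute-force baseline that the abstract mentions, not the paper's two-tier or approximation algorithm. It assumes that items are tuples of boolean attributes, that one Bernoulli Naive Bayes model with Laplace smoothing is fitted per desirable tag, and that an item's score is the sum of the predicted tag probabilities. All function names and the scoring choice are illustrative assumptions.

```python
from itertools import product
import heapq


def train_nb(items, tag_sets, tag, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model for one tag (Laplace smoothing, assumed)."""
    n_attrs = len(items[0])
    pos = [x for x, tags in zip(items, tag_sets) if tag in tags]
    neg = [x for x, tags in zip(items, tag_sets) if tag not in tags]
    prior = (len(pos) + alpha) / (len(items) + 2 * alpha)

    def cond(rows):
        # Smoothed P(attribute j = 1 | class) for every attribute j.
        return [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                for j in range(n_attrs)]

    return prior, cond(pos), cond(neg)


def posterior(model, item):
    """Return P(tag | item) under the fitted model."""
    prior, p_pos, p_neg = model
    like_pos, like_neg = prior, 1.0 - prior
    for v, pp, pn in zip(item, p_pos, p_neg):
        like_pos *= pp if v else 1.0 - pp
        like_neg *= pn if v else 1.0 - pn
    return like_pos / (like_pos + like_neg)


def best_items_brute_force(items, tag_sets, desirable, k):
    """Enumerate all 2^n attribute combinations and keep the k highest scores."""
    models = [train_nb(items, tag_sets, t) for t in desirable]
    n_attrs = len(items[0])
    scored = ((sum(posterior(m, c) for m in models), c)
              for c in product((0, 1), repeat=n_attrs))
    return heapq.nlargest(k, scored)
```

The enumeration is exponential in the number of attributes, which is why a brute-force search of this kind is only practical for very small instances.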

Fast Rule Mining Over Multi-Dimensional Windows

Das, Deepak, Deshpande et al. 2011

Association rule mining is an indispensable tool for discovering insights from large databases and data warehouses. Because the data in a warehouse is multi-dimensional, it is often useful to mine rules over subsets of data defined by selections over the dimensions. Such interactive rule mining over multi-dimensional query windows is difficult since rule mining is computationally expensive. Current methods using pre-computation of frequent itemsets require counting of some itemsets by revisiting the transaction database at query time, which is very expensive. We develop a method (RMW) that identifies the minimal set of itemsets to compute and store for each cell, so that rule mining over any query window may be performed without going back to the transaction database. We give formal proofs that the set of itemsets chosen by RMW is sufficient to answer any query and also prove that it is the optimal set to be computed for one-dimensional queries. We demonstrate through an extensive empirical evaluation that RMW achieves extremely fast query response time compared to existing methods, with only moderate overhead in pre-computation and storage.
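The idea of answering window queries from per-cell pre-computation can be illustrated with a small sketch. This is not RMW: instead of selecting a minimal set of itemsets, it naively stores counts for every itemset up to a fixed size in each cell, trading storage for simplicity. The cell keys, function names, and the `max_size` parameter are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def precompute_cells(transactions, max_size=2):
    """Map each cell to (itemset counts, number of transactions in the cell).

    `transactions` is an iterable of (cell, items) pairs, where `cell` is a
    hashable tuple of dimension values (an assumed representation).
    """
    cells = {}
    for cell, items in transactions:
        counts, n = cells.get(cell, (Counter(), 0))
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for itemset in combinations(unique, size):
                counts[itemset] += 1
        cells[cell] = (counts, n + 1)
    return cells


def rule_stats(cells, window, antecedent, consequent):
    """Support and confidence of antecedent -> consequent over a query window.

    Only the stored per-cell counts are summed; the transactions themselves
    are never revisited at query time.
    """
    ante = tuple(sorted(set(antecedent)))
    both = tuple(sorted(set(antecedent) | set(consequent)))
    n = n_ante = n_both = 0
    for cell in window:
        counts, size = cells.get(cell, (Counter(), 0))
        n += size
        n_ante += counts[ante]
        n_both += counts[both]
    support = n_both / n if n else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence
```

Storing every itemset grows quickly with `max_size`, which is the overhead that a method choosing a minimal sufficient set of itemsets per cell aims to avoid.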

Learning to question

Das, Morales, Gionis et al. 2013

Who tags what?

Das, Thirumuruganathan, Amer-Yahia et al. 2012

Proc. VLDB Endow.

The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components: user, item (e.g., photos in Flickr), and tags (i.e., words or phrases). Analyzing how tags are assigned by certain users to certain items has important implications in helping users search for desired information. In this paper, we explore common analysis tasks and propose a dual mining framework for social tagging behavior mining. This framework is centered around two opposing measures, similarity and diversity, being applied to one or more tagging components, and therefore enables a wide range of analysis scenarios such as characterizing similar users tagging diverse items with similar tags, or diverse users tagging similar items with diverse tags, etc. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and that they are NP-Complete in general. We design efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly outperform the exact brute-force approach without compromising analysis result quality.
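As a rough illustration of applying similarity and diversity to the three tagging components, the sketch below scores a group of tagging actions on each component. The abstract leaves the concrete measures open, so the choice of average pairwise Jaccard similarity, diversity defined as one minus that similarity, and the dictionary layout of a tagging action are all assumptions made here for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def avg_pairwise_similarity(sets):
    """Average Jaccard similarity over all pairs (1.0 if fewer than two sets)."""
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def describe_group(actions):
    """Similarity and diversity per component for a group of tagging actions.

    Each action is a dict with set-valued keys 'user', 'item', and 'tags'
    (a hypothetical representation of user attributes, item attributes,
    and assigned tags).
    """
    report = {}
    for component in ("user", "item", "tags"):
        sim = avg_pairwise_similarity([a[component] for a in actions])
        report[component] = {"similarity": sim, "diversity": 1.0 - sim}
    return report
```

Under these assumed measures, a scenario such as "similar users tagging diverse items with similar tags" corresponds to a group with high user similarity, high item diversity, and high tag similarity.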

Mahashweta Das

An expressive framework and efficient algorithms for the analysis of collaborative tagging

Leveraging collaborative tagging for web item design
