On the Complexity of Mining Itemsets from the Crowd Using Taxonomies

Antoine Amarilli; Yael Amsterdamer; Tova Milo

Conference paper, Year: 2014

Antoine Amarilli

Role: Author
PersonId: 4934
IdHAL: a3nm
ORCID: 0000-0002-7977-4441
IdRef: 195286677

School of Computer Science

Data, Intelligence and Graphs

Département Informatique et Réseaux

Yael Amsterdamer

Role: Author

School of Computer Science

Tova Milo

Role: Author

School of Computer Science

Abstract

We study the problem of frequent itemset mining in domains where data is not recorded in a conventional database but only exists in human knowledge. We provide examples of such scenarios, and present a crowdsourcing model for them. The model uses the crowd as an oracle to find out whether an itemset is frequent or not, and relies on a known taxonomy of the item domain to guide the search for frequent itemsets. In the spirit of data mining with oracles, we analyze the complexity of this problem in terms of (i) crowd complexity, which measures the number of crowd questions required to identify the frequent itemsets; and (ii) computational complexity, which measures the computational effort required to choose the questions. We provide lower and upper complexity bounds in terms of the size and structure of the input taxonomy, as well as the size of a concise description of the output itemsets. We also provide constructive algorithms that achieve the upper bounds, and consider more efficient variants for practical situations.
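To make the oracle model concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a simplification to single items instead of itemsets, a taxonomy given as a parent-to-children mapping, and a monotone oracle (if an item is frequent, so is every more general item). The names `most_specific_frequent`, `children`, `truth`, and `is_frequent` are hypothetical and introduced only for illustration. The sketch walks the taxonomy top-down, prunes below infrequent items, and counts the distinct questions asked, which is a naive stand-in for the crowd complexity measure described above.

```python
from typing import Callable, Dict, List, Set, Tuple


def most_specific_frequent(
    children: Dict[str, List[str]],
    root: str,
    is_frequent: Callable[[str], bool],
) -> Tuple[Set[str], int]:
    """Return (most specific frequent items, number of distinct oracle questions).

    Assumes the oracle is monotone: every generalization of a frequent
    item is itself frequent. This is a naive traversal for illustration.
    """
    answers: Dict[str, bool] = {}  # cache, so each item is asked at most once

    def ask(item: str) -> bool:
        if item not in answers:
            answers[item] = is_frequent(item)
        return answers[item]

    result: Set[str] = set()
    visited: Set[str] = set()
    stack = [root]
    while stack:
        item = stack.pop()
        if item in visited:
            continue
        visited.add(item)
        if not ask(item):
            continue  # by monotonicity, nothing below can be frequent
        frequent_children = [c for c in children.get(item, []) if ask(c)]
        if not frequent_children:
            result.add(item)  # frequent, with no frequent specialization
        stack.extend(frequent_children)
    return result, len(answers)


# Hypothetical toy taxonomy and a simulated crowd answer set.
children = {
    "activity": ["sport", "food"],
    "sport": ["biking", "swimming"],
    "food": ["pasta"],
}
truth = {"activity", "sport", "biking"}

found, questions = most_specific_frequent(
    children, "activity", lambda item: item in truth
)
print(found, questions)
```

In this toy run, "pasta" is never asked about because its parent "food" is already infrequent, which shows how the taxonomy can reduce the number of crowd questions.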

Domains

Databases [cs.DB]; Web

https://imt.hal.science/hal-00986184

Submitted on: Thursday, May 1, 2014, 16:54:18

Last modified on: Monday, October 9, 2023, 12:49:40

Dates and versions

hal-00986184, version 1 (01-05-2014)

Identifiers

HAL Id: hal-00986184, version 1

Cite

Antoine Amarilli, Yael Amsterdamer, Tova Milo. On the Complexity of Mining Itemsets from the Crowd Using Taxonomies. ICDT (International Conference on Database Theory), Mar 2014, Athens, Greece. pp.15-25. ⟨hal-00986184⟩

Collections

INSTITUT-TELECOM PARISTECH LTCI INFRES DIG
