WebChild: Harvesting and Organizing Commonsense Knowledge from the Web

Niket Tandon; Gerard de Melo; Fabian M. Suchanek; Gerhard Weikum

doi:10.1145/2556195.2556245

Conference paper, Year: 2014

WebChild: Sammeln und Organisieren von Wissen aus dem Internet

Niket Tandon

Role: Author

Gerard de Melo

Role: Author

Max-Planck-Institut für Informatik

Fabian M. Suchanek

Role: Author
PersonId : 12540
IdHAL : fabian-suchanek
ORCID : 0000-0001-7189-2796
IdRef : 203477707

Laboratoire Traitement et Communication de l'Information

Gerhard Weikum

Role: Author

Max-Planck-Institut für Informatik

Abstract

This paper presents a method for automatically constructing a large commonsense knowledge base, called WebChild, from Web contents. WebChild contains triples that connect nouns with adjectives via fine-grained relations like hasShape, hasTaste, evokesEmotion, etc. The arguments of these assertions, nouns and adjectives, are disambiguated by mapping them onto their proper WordNet senses. Our method is based on semi-supervised Label Propagation over graphs of noisy candidate assertions. We automatically derive seeds from WordNet and by pattern matching from Web text collections. The Label Propagation algorithm provides us with domain sets and range sets for 19 different relations, and with confidence-ranked assertions between WordNet senses. Large-scale experiments demonstrate the high accuracy (more than 80 percent) and coverage (more than four million fine-grained disambiguated assertions) of WebChild.
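To make the idea of label propagation over a graph of candidate assertions concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say which propagation variant is used, so this shows the generic scheme with clamped seeds and weighted neighbor averaging. The node names, edge weights, and the function `propagate_labels` are hypothetical and chosen only for illustration.

```python
def propagate_labels(edges, seeds, iterations=50):
    """Generic iterative label propagation with clamped seed scores.

    edges: dict mapping (node_a, node_b) -> non-negative weight (undirected).
    seeds: dict mapping node -> fixed score in [0, 1].
    Returns a dict mapping every node to its propagated score.
    """
    # Build an undirected adjacency list from the weighted edges.
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, []).append((b, weight))
        neighbors.setdefault(b, []).append((a, weight))

    # Seeds start at their given score, all other nodes at 0.
    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for node, score in seeds.items():
        scores.setdefault(node, score)

    for _ in range(iterations):
        updated = {}
        for node in scores:
            if node in seeds:
                # Seed scores are clamped and never change.
                updated[node] = seeds[node]
                continue
            nbrs = neighbors.get(node, [])
            total = sum(weight for _, weight in nbrs)
            if total > 0:
                # Weighted average of the neighbors' current scores.
                updated[node] = sum(w * scores[n] for n, w in nbrs) / total
            else:
                updated[node] = scores[node]
        scores = updated
    return scores


# Hypothetical toy graph of candidate assertions for a hasTaste-like relation.
# Each node is a noun/adjective pair; the weights are made-up similarities.
edges = {
    ("chocolate/sweet", "candy/sweet"): 0.9,
    ("candy/sweet", "car/sweet"): 0.1,
    ("car/sweet", "car/fast"): 0.8,
}
# One positive seed and one negative seed (assumed, for illustration only).
seeds = {"chocolate/sweet": 1.0, "car/fast": 0.0}

scores = propagate_labels(edges, seeds)
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node}: {scores[node]:.3f}")
```

In this toy graph the unlabeled candidate that is strongly tied to the positive seed ends up with a high score, while the one tied to the negative seed ends up with a low score. Ranking candidates by such scores is one simple way to obtain confidence-ranked assertions.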

Keywords

Categories: H.4 [Information Systems Applications]: Miscellaneous; I.2.6 [Artificial Intelligence]: Learning
Keywords: Knowledge Bases, Commonsense Knowledge, Web Mining, Label Propagation, Word Sense Disambiguation

Domains

Web; Databases [cs.DB]

Main file

wsdm2014.pdf (293.22 KB)

Origin: Files produced by the author(s)

Contributor: Fabian Suchanek

https://imt.hal.science/hal-01699891

Submitted on: Friday, February 2, 2018, 18:26:44

Last modified on: Tuesday, February 28, 2023, 15:36:24

Dates and versions

hal-01699891 , version 1 (02-02-2018)

Identifiers

HAL Id : hal-01699891 , version 1
DOI : 10.1145/2556195.2556245

Cite

Niket Tandon, Gerard de Melo, Fabian M. Suchanek, Gerhard Weikum. WebChild: Harvesting and Organizing Commonsense Knowledge from the Web. WSDM, Feb 2014, New York, United States. ⟨10.1145/2556195.2556245⟩. ⟨hal-01699891⟩

Collections

INSTITUT-TELECOM PARISTECH LTCI INFRES DIG

215 views

446 downloads
