Semantic knowledge integration for learning from semantically imprecise data

Brust, Clemens-Alexander

Published

Low availability of labeled training data often poses a fundamental limit to the accuracy of computer vision applications using machine learning methods. While these methods are improved continuously, e.g., through better neural network architectures, there cannot be a single methodical change that increases the accuracy on all possible tasks. This statement, known as the no free lunch theorem, suggests that we should consider aspects of machine learning other than learning algorithms for opportunities to escape the limits set by the available training data. In this thesis, we focus on two main aspects: the nature of the training data, where we introduce structure into the label set using concept hierarchies, and the learning paradigm, which we change in accordance with the requirements of real-world applications as opposed to more academic setups.

Concept hierarchies represent semantic relations, which are sets of statements such as "a bird is an animal." We propose a hierarchical classifier to integrate this domain knowledge into a pre-existing task, thereby increasing the information the classifier has access to. While the hierarchy's leaf nodes correspond to the original set of classes, the inner nodes are "new" concepts that do not exist in the original training data. However, we posit that such "imprecise" labels are valuable and should occur naturally, e.g., as an annotator's way of expressing their uncertainty. Furthermore, the increased number of concepts leads to more possible search terms when assembling a web-crawled dataset or using an image search. We propose CHILLAX, a method that learns from semantically imprecise training data while still offering precise predictions, so that it integrates seamlessly into a pre-existing application.
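The relationship between leaf classes and imprecise inner-node labels described in the abstract can be illustrated with a small toy example. This is only a sketch under stated assumptions, not the CHILLAX method or the thesis's hierarchical classifier: the concept names, the `IS_A` dictionary, and the helper functions `compatible_leaves` and `predict_precise` are all hypothetical and chosen for illustration. It shows how an imprecise label such as "bird" narrows the set of admissible leaf classes while the final prediction remains a precise leaf.

```python
# Toy concept hierarchy as child -> parent "is-a" statements.
# All concept names here are illustrative assumptions, not from the thesis.
IS_A = {
    "animal": None,
    "bird": "animal",
    "mammal": "animal",
    "sparrow": "bird",
    "eagle": "bird",
    "cat": "mammal",
}


def ancestors(concept):
    """Return the concept itself followed by all of its ancestors."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = IS_A[concept]
    return chain


def leaves():
    """Leaf nodes are concepts that never appear as a parent."""
    parents = set(IS_A.values())
    return sorted(c for c in IS_A if c not in parents)


def compatible_leaves(label):
    """Leaf classes consistent with a possibly imprecise label."""
    return [leaf for leaf in leaves() if label in ancestors(leaf)]


def predict_precise(leaf_scores, label=None):
    """Pick the best-scoring leaf, optionally restricted by a label."""
    candidates = compatible_leaves(label) if label else leaves()
    return max(candidates, key=lambda leaf: leaf_scores[leaf])


# Hypothetical classifier scores over the leaf classes.
scores = {"cat": 0.5, "eagle": 0.2, "sparrow": 0.3}
unrestricted = predict_precise(scores)
restricted = predict_precise(scores, label="bird")
```

In this sketch, a precise label such as "cat" is compatible with exactly one leaf, whereas the root label "animal" is compatible with all of them and therefore carries no class information.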

Classification

Reviewers:

Denzler, Joachim ; Mäder, Patrick

Date of acceptance of the doctorate / degree:

7 April 2022

Date of publication:

2022

URN:

urn:nbn:de:gbv:27-dbt-20220428-105521-006

PPN:

1800480393

Language:

English

Resource type:

Text

Extent:

205 pages

Place of publication:

Jena

Keywords:

Classification (GND); Data (GND); Prior knowledge (GND); Machine learning (GND)

DDC subject group of the DNB:

004 Computer science

Library shelf mark:

2022 J 260

Institution:

Friedrich-Schiller-Universität Jena, Faculty of Mathematics and Computer Science

Thesis note:

Dissertation, Friedrich-Schiller-Universität Jena, 2022