Fractality and variability in canonical and non-canonical English fiction and in non-fictional texts

GND
1217812954
Affiliation
Experimental Aesthetics Group, Institute of Anatomy I, Jena University Hospital, University of Jena
Mohseni, Mahdi;
GND
173663133
ORCID
0000-0002-3034-0052
Affiliation
Department of English and American Studies, University of Jena
Gast, Volker;
GND
1100898638
ORCID
0000-0002-5220-8319
Affiliation
Experimental Aesthetics Group, Institute of Anatomy I, Jena University Hospital, University of Jena
Redies, Christoph

This study investigates global properties of three categories of English text: canonical fiction, non-canonical fiction, and non-fictional texts. The central hypothesis is that structural design features differ systematically between canonical and non-canonical fiction, and between fictional and non-fictional texts. To investigate these differences, we compiled a corpus containing texts of the three categories of interest, the Jena Corpus of Expository and Fictional Prose (JEFP Corpus). Two aspects of global structure are investigated: variability and self-similar (fractal) patterns, which reflect long-range correlations along texts. We use four types of basic observations: (i) the frequency of POS-tags per sentence, (ii) sentence length, (iii) lexical diversity, and (iv) the distribution of topic probabilities in segments of texts. These basic observations are grouped into two more general categories: (a) the lower-level properties (i) and (ii), which are observed at the level of the sentence (reflecting linguistic decoding), and (b) the higher-level properties (iii) and (iv), which are observed at the textual level (reflecting comprehension/integration). The observations for each property are transformed into series, which are analyzed in terms of variance and subjected to Multi-Fractal Detrended Fluctuation Analysis (MFDFA), giving rise to three statistics: (i) the degree of fractality (H), (ii) the degree of multifractality (D), i.e., the width of the fractal spectrum, and (iii) the degree of asymmetry (A) of the fractal spectrum. The statistics thus obtained are compared individually across text categories and jointly fed into a classification model (Support Vector Machine). Our results show that the three text categories of interest do differ. In general, lower-level text properties are better discriminators than higher-level text properties. Canonical fictional texts differ from non-canonical ones primarily in terms of variability in lower-level text properties. Fractality seems to be a universal feature of text, slightly more pronounced in non-fictional than in fictional texts. Based on these corpus-derived results, we point out some avenues for future research toward a more comprehensive analysis of textual aesthetics, e.g., using experimental methodologies.
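
To make the MFDFA step named in the abstract more concrete, the following is a minimal sketch of the standard procedure applied to a single numeric series (for example, a sequence of sentence lengths). It is not the authors' implementation: the function name `mfdfa`, the choice of scales, the range of q values, and the linear detrending order are all assumptions made for illustration. The sketch returns the generalized Hurst exponents h(q), takes h(2) as an estimate of H, and uses the range of the singularity strengths as a simple proxy for the spectrum width D. The asymmetry statistic A is not computed here, since its exact definition is not given in the abstract.

```python
import numpy as np

def mfdfa(series, scales, qs, order=1):
    """Sketch of MFDFA: returns (h, width) for the given series.

    h     : generalized Hurst exponents h(q), one per entry of qs
    width : range of singularity strengths alpha, a proxy for D
    All parameter choices here are illustrative assumptions.
    """
    x = np.asarray(series, dtype=float)
    qs = np.asarray(qs, dtype=float)
    # Step 1: profile (cumulative sum of the mean-centred series).
    profile = np.cumsum(x - x.mean())

    log_fq = np.empty((len(scales), len(qs)))
    for j, s in enumerate(scales):
        # Step 2: cut the profile into non-overlapping segments of length s.
        n_seg = len(profile) // s
        segs = profile[: n_seg * s].reshape(n_seg, s).T   # shape (s, n_seg)
        # Step 3: fit a polynomial trend per segment by least squares.
        design = np.vander(np.arange(s, dtype=float), order + 1)
        coeffs = np.linalg.lstsq(design, segs, rcond=None)[0]
        resid = segs - design @ coeffs
        f2 = np.mean(resid ** 2, axis=0)                  # variance per segment
        f2 = np.maximum(f2, 1e-12)                        # guard against log(0)
        # Step 4: q-th order fluctuation function (q = 0 uses the log average).
        for i, q in enumerate(qs):
            if q == 0:
                log_fq[j, i] = 0.5 * np.mean(np.log(f2))
            else:
                log_fq[j, i] = np.log(np.mean(f2 ** (q / 2.0))) / q

    # Step 5: h(q) is the slope of log F_q(s) against log s.
    h = np.polyfit(np.log(scales), log_fq, 1)[0]
    # Singularity strengths from tau(q) = q * h(q) - 1, alpha = d tau / d q.
    tau = qs * h - 1.0
    alpha = np.gradient(tau, qs)
    return h, float(alpha.max() - alpha.min())

# Illustrative use on synthetic data (not corpus data).
rng = np.random.default_rng(0)
noise = rng.standard_normal(8192)
qs = np.arange(-3, 4)
scales = [16, 32, 64, 128, 256]
h, width = mfdfa(noise, scales, qs)
H = h[qs == 2][0]
print("H estimate:", H, "spectrum width:", width)
```

For uncorrelated noise, H estimated this way should lie near 0.5, while values above 0.5 indicate persistent long-range correlations of the kind the study looks for in text series.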

Rights

Rights holder: Copyright © 2021 Mohseni, Gast and Redies.

Use and reproduction:
This contribution is freely accessible with the consent of the rights holder under a (DFG-funded) Alliance or National Licence.