On 9 November 2021, Amsterdam University Press published Het raadsel literatuur. Is literaire kwaliteit meetbaar? (The riddle of literature: is literary quality measurable?). In this book, Karina van Dalen-Oskam presents a synthesis of the research results from the project The Riddle of Literary Quality. The book is accompanied by a dedicated website with a great deal of additional information, including details of the R package containing data that came out of the project. This allows other researchers to repeat and verify many of the measurements Karina describes in her book, and to ask other questions of the data.
Replication of The Riddle in the UK
The research done in The Riddle of Literary Quality for the Netherlands is now being replicated in the United Kingdom. Thanks to a grant from the AHRC, Professor of English Literature Bas Groes and Karina van Dalen-Oskam were able to launch the project Novel Perceptions: Towards an Inclusive Canon on 16 November 2020.
The British project at the University of Wolverhampton also comes with a large survey to gather readers’ opinions about recent novels: The 2020 Reader Review also launched on 16 November. All readers of fiction published in English or translated into English are invited to participate in the survey. It is also possible to share your opinion about novels you have not read.
Corina Koolen: Dit is geen vrouwenboek
On 15 September 2020 HarperCollins Holland published Dit is geen vrouwenboek. De waarheid achter man-vrouw-verschillen in de literatuur. In this book Corina Koolen presents Dutch readers with an overview of the research she reported on in her much-praised PhD-thesis from 2018. She expands on her earlier work with results of additional research, leading to new observations and a deeper insight into – as the Dutch subtitle states – ‘the truth behind gender differences in literature’.
From one Riddle to the next
The project The Riddle of Literary Quality officially ended on December 31st 2019. The project was funded by the Computational Humanities Programme of the Royal Netherlands Academy of Arts and Sciences.
Project summary: The attribution of literary value is an intriguing social process, and difficult to grasp. What is ‘high-brow’ literature, and can we measure it? That was the key question of the Dutch computational humanities project The Riddle of Literary Quality (2012-2019). The project combined computational analysis of writing style with the results of a large online survey of readers, completed by almost 14,000 participants. Correlating readers’ opinions and stylometric analyses makes visible which linguistic features play a role, but also which cultural biases are in place. Why are some authors and some works attributed with more literary prestige than others? What about translations? And what does this tell us about contemporary Dutch society?
Plenty of articles and small follow-up projects are still in the pipeline, however. It is worthwhile to keep following the work and the new publications of the now dispersed but still collaborating team.
A large follow-up project is currently in preparation: The Riddle of the Literary Canon. It will apply the results to novels from the last 200 years, both originally written in Dutch and translated into Dutch, in search of changing conventions of literariness. More information will follow in due time.
For information please contact the principal investigator, Karina van Dalen-Oskam.
Vector space explorations of literary language
A new article (open access, peer-reviewed) resulting from The Riddle of Literary Quality has appeared in the journal Language Resources and Evaluation. Previous work already showed that literary novels can be recognized successfully with textual features. This article shows that literature can even be recognized with short fragments (2-3 pages), and also considers judgments of quality and novels that respondents had not read. The textual features are automatically learned document representations that require no feature engineering and are only based on word frequencies.
In addition, the article tests the hypothesis that literary novels are more complex than non-literary novels. By measuring the similarity and variety of topics in the novels, the article shows that literary novels stand out more than non-literary novels.
A keyword analysis uncovers some of the stylistic markers that explain the success of the predictive models. It also highlights certain biases related to genre and gender, both in the data and the models. Nevertheless, we find that the greater part of factors affecting judgments of literariness are explicable in terms of word frequencies, even in short text fragments and among novels with higher literary ratings.
van Cranenburgh, A., van Dalen-Oskam, K. & van Zundert, J. (2019), Vector space explorations of literary language. Language Resources & Evaluation. https://doi.org/10.1007/s10579-018-09442-4
The Riddle in PLOS One
Do novel readers want to be swept away by a novel? Or do they seek an intellectual challenge? Or both at the same time?
In 2013, the research team of The Riddle of Literary Quality, led by Professor Karina van Dalen-Oskam in the Netherlands, completed The National Reader Survey. Almost 14,000 respondents shared their opinions about recently published novels. The participants also responded to six statements that related to the question of what they personally considered important to reading fiction.
Today, an initial analysis of these reactions appears in the international (open access, peer-reviewed) journal PLOS One. To analyze the responses, the American researcher Allen Riddell applied methods previously used in research into voting behavior. He discovered an interesting pattern in the responses of the representative group of participants in the National Reader Survey. Readers who indicated that they read in order to be intellectually challenged also indicated that they wanted to be carried away by a novel. The converse did not hold: not all readers who wanted to be carried away also wanted to be challenged intellectually.
Readers who want more than just being swept away by a story are usually ‘literary’ readers. They usually have more demands and higher expectations when they read: they do not restrict themselves to the story, but also want to enjoy the use of language in a novel. This group is often critical of popular books that are “devoured” by others. But this research shows that they, as readers, do report seeking emotional engagement with works: they report wanting to be carried away by a novel. This is an interesting outcome, not only because it tells us something about the readers of literature but also because it tells us something about the literature itself.
In the article, Riddell and van Dalen-Oskam state that they suspect that literary readers have come to appreciate this way of reading through their education or by developing their own reading experience. Literary readers are sometimes critical of ‘non-literary’ readers and the books they read, but this research shows that they do want the same. Only a little more …
Allen Riddell & Karina van Dalen-Oskam, Readers and their roles: evidence from readers of contemporary fiction in the Netherlands. PLOS One, 26 July 2018, https://doi.org/10.1371/journal.pone.0201157
“The mother, the wife” and The Riddle
Corina Koolen’s PhD defense had already received much attention in Dutch newspapers, but when the Dutch organization CPNB (Collectieve Propaganda voor het Nederlandse Boek; Collective Propaganda for the Dutch Book) announced its theme for the 2019 book week, new interest sprang up. CPNB chose as its theme “De moeder, de vrouw” (“The mother, the wife” or “The mother, the woman”), after a poem by Martinus Nijhoff. This might not have caused much of a stir, had they not asked two male authors to write the accompanying Book Week novella and essay. This resulted in a protest by both female and male authors. On June 18th Koolen was interviewed for Nieuwsuur, a Dutch human interest programme, and asked to explain the problematic nature of this CPNB action. Radio programme Focus also paid attention to the discussion. Finally, Dutch newspaper Trouw published an interview, asking Koolen to discuss five possible solutions to the problem of female authors still not receiving equal attention in the Netherlands.
PhD defense ‘Reading beyond the female’
The second PhD project in The Riddle of Literary Quality is about to be finished. On Friday the 18th of May, 11:00 sharp, Corina Koolen will publicly defend her PhD thesis Reading beyond the female. The relationship between perception of author gender and literary quality in the Aula of the University of Amsterdam, Singel 411, Amsterdam.
After the Riddle project gathered the results of the National Reader Survey, the issue with author gender became obvious. Readers did not consider works by female authors to be of high literary quality. In her thesis, Koolen researches why this is the case. First, she examines the position of female authors in the Netherlands: in which genres do they publish? Why are their novels judged to be of lesser literary quality by the Dutch public? Second, she looks at the content of the novels: is gender really as important a factor in text style, topics, etc.? And is ‘attention to physical appearance’ typically a topic for female authors or not? With her analyses she debunks a number of the myths about literary quality that still exist in the current Dutch literary field: that female authors are judged equally; that they have equal chances; and that, if they do not, this is solely due to the literary quality of their work.
The PhD was funded by the digital humanities special research area of the Faculty of Humanities, University of Amsterdam.
Summaries of the thesis are available in Dutch and English.
Title: Reading beyond the female. The relationship between perception of author gender and literary quality
Date/time: Friday May 18th, 2018, 11:00 sharp (until about 12:15)
Location: Aula of the University of Amsterdam, Singel 411, Amsterdam
A reception will be held at the same location after the defense.
For more information, you can contact Thijs van der Veen (Huygens ING).
Let’s LIWC at The Riddle from a slightly different perspective…
The Riddle is in its final phase. In the next two years, many of the experiments that have been done will be reported on in one more PhD thesis, in scholarly articles and in a book for all those who enjoy reading and may even have participated in our National Reader Survey in 2013. One of the things still on our to-do list was applying stylometric tools to older fiction, using the knowledge we gained about the literary conventions of contemporary novels. In April 2017, Floor Naber started a short stint at Huygens ING’s Riddle Team to do an experiment with LIWC. She will use the software Linguistic Inquiry and Word Count on the Riddle corpus and on a corpus of late nineteenth-century Dutch novels to test where old and new are comparable and where they differ.
PhD defense Andreas van Cranenburgh
2 November 2016: Andreas van Cranenburgh defends his PhD thesis Rich statistical parsing and literary language.
This thesis studies parsing and literature with the Data-Oriented Parsing framework, which assumes that chunks of previous experience can be exploited to analyze new sentences. As chunks we consider syntactic tree fragments.
After presenting a method to efficiently extract such fragments from treebanks based on heuristics of re-occurrence, we employ them to develop a multi-lingual statistical parser. We show how a mildly context-sensitive grammar can be employed to produce discontinuous constituents, and compare this to an approximation that stays within the efficiently parsable context-free framework. We show that tree fragments allow the grammar to adequately capture the statistical regularities of non-local relations, without the need for the increased generative capacity of mildly context-sensitive grammar.
The second part investigates what separates literary from other novels. We work with a corpus of novels and a reader survey with ratings of how literary they are perceived to be. The main goal is to find out the extent to which the literary ratings can be predicted from the texts. We first evaluate simple measures such as vocabulary richness, text compressibility, and the number of cliché expressions. In addition we apply more sophisticated, predictive models: a topic model, bag-of-words model, and a model based on syntactic tree fragments. We find that literary ratings are predictable from textual features to a large extent. While it is not possible to infer a causal relation, this result clearly rules out the notion that these value-judgments of literary merit were arbitrary, or predominantly determined by factors beyond the text.
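To give a rough idea of what two of the simple measures mentioned above might look like in practice, here is a minimal Python sketch. It is an illustration only, not the code used in the thesis: the function names are hypothetical, vocabulary richness is approximated here by a plain type-token ratio, and compressibility by the ratio of zlib-compressed size to original size. The measures in the thesis may be defined differently.

```python
import re
import zlib


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def type_token_ratio(text: str) -> float:
    """Distinct words divided by total words (assumed proxy for vocabulary richness)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def compression_ratio(text: str) -> float:
    """Compressed size divided by original size; lower means more repetitive text."""
    raw = text.encode("utf-8")
    if not raw:
        return 0.0
    return len(zlib.compress(raw, 9)) / len(raw)


# Two toy fragments (invented for illustration, not taken from the corpus).
repetitive = "the cat sat on the mat. " * 50
varied = "A quick brown fox jumps over the lazy dog while distant thunder rolls."

if __name__ == "__main__":
    for name, fragment in [("repetitive", repetitive), ("varied", varied)]:
        print(name, type_token_ratio(fragment), compression_ratio(fragment))
```

Note that the type-token ratio is sensitive to text length, so a comparison between novels would need fragments of equal size or a length-corrected measure.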
Link to the thesis: http://dare.uva.nl/record/1/543163