The Riddle is in its final phase. In the next two years, many of the experiments that have been done will be reported on in one more PhD thesis, in scholarly articles, and in a book for all those who enjoy reading and may even have participated in our National Reader Survey in 2013. One of the things still on our to-do list was applying stylometric tools, together with the knowledge we have gained about the literary conventions of contemporary novels, to older fiction. In April 2017, Floor Naber started a short stint with Huygens ING’s Riddle Team to do an experiment with LIWC. She will use the software Linguistic Inquiry and Word Count on the Riddle corpus and on a corpus of late nineteenth-century Dutch novels to test where old and new fiction are comparable and where they differ.
2 November 2016: Andreas van Cranenburgh defends his PhD thesis, Rich Statistical Parsing and Literary Language.
This thesis studies parsing and literature with the Data-Oriented Parsing framework, which assumes that chunks of previous experience can be exploited to analyze new sentences. As chunks we consider syntactic tree fragments.
After presenting a method to efficiently extract such fragments from treebanks based on heuristics of re-occurrence, we employ them to develop a multi-lingual statistical parser. We show how a mildly context-sensitive grammar can be employed to produce discontinuous constituents, and compare this to an approximation that stays within the efficiently parsable context-free framework. We show that tree fragments allow the grammar to adequately capture the statistical regularities of non-local relations, without the need for the increased generative capacity of mildly context-sensitive grammar.
The second part investigates what separates literary from other novels. We work with a corpus of novels and a reader survey with ratings of how literary they are perceived to be. The main goal is to find out the extent to which the literary ratings can be predicted from the texts. We first evaluate simple measures such as vocabulary richness, text compressibility, and the number of cliché expressions. In addition we apply more sophisticated, predictive models: a topic model, bag-of-words model, and a model based on syntactic tree fragments. We find that literary ratings are predictable from textual features to a large extent. While it is not possible to infer a causal relation, this result clearly rules out the notion that these value-judgments of literary merit were arbitrary, or predominantly determined by factors beyond the text.
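To give a feel for the "simple measures" mentioned above, here is a minimal sketch of two of them: vocabulary richness, approximated as a type-token ratio, and text compressibility, approximated with zlib. The function names, the tokenization, and the choice of compressor are assumptions made for illustration; they are not taken from the thesis, which may define these measures differently.

```python
import re
import zlib


def type_token_ratio(text: str) -> float:
    """Vocabulary richness as distinct words divided by total words.

    Illustrative assumption: tokens are lowercased runs of word characters.
    """
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def compression_ratio(text: str) -> float:
    """Compressed size divided by original size, in bytes.

    Lower values mean the text is more repetitive (more compressible).
    Illustrative assumption: zlib at maximum compression level.
    """
    raw = text.encode("utf-8")
    if not raw:
        return 0.0
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    sample = "De kat zat op de mat en de hond lag naast de kat."
    print(type_token_ratio(sample))
    print(compression_ratio(sample))
```

Both measures are sensitive to text length, so a fair comparison between novels would need samples of equal size.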
Link to the thesis: http://dare.uva.nl/record/1/543163
On Saturday 7 November, the Vereniging van Schrijvers en Vertalers and the Genootschap van Nederlandstalige Misdaadauteurs organized an event where literary authors, crime authors and translators met to discuss a range of topics. The afternoon session featured an interview with thriller writer Charles den Tex and literary author Nausicaa Marbe. The interview focused on thriller elements in literary novels and literary ingredients in thrillers. In other words: what happens when the genres make use of each other's ingredients?
On 4 March 2013, the survey of the project The Riddle of Literary Quality was launched on http://www.hetnationalelezersonderzoek.nl/ . This “National Readers’ Enquiry” hopes to reach many thousands of respondents. As might be expected given the nature of our project, the language of the survey is Dutch. All readers of Dutch among you are welcome to rate a set of novels (originals and translations) published during the last five years in the Netherlands. The list contains the novels that were borrowed most often from public libraries and that ranked highest on the bestseller lists of the last three years. Enjoy!
The Riddle team has recently grown considerably. PhD student Corina Koolen has been attached to the project since September 2012, project assistant Fernie Maas joined on 1 November, and Kim Jautze started work on her PhD on 15 November. More news is sure to follow now that the team is complete!