Corpus linguistics can be defined most simply as the set of studies into the form and/or function of language that incorporate computerised corpora in their analyses. It is a form of text linguistics and, as such, is evidence-driven. Like other types of text linguistics, it aims to describe the interactions between writers/speakers and readers/hearers as evidenced in the linguistic trace (that is, the texts) that these interactions leave behind. It also shares the overarching endeavour of describing how the language system, or some part of it, is organised and of explaining why it functions as it does. It differs from most other forms of text linguistics in incorporating statistical analyses of large numbers of texts at some stage in the research, generally using dedicated software.
The SiBol Group of corpus linguists has a particular interest in Corpus-assisted Discourse Studies (CADS), in which such statistical analyses are married to the more traditional kinds of analysis employed in discourse studies; there is typically a “shunting” (Halliday 2002) between statistical analyses and close textual reading. One aim of the CADS approach is to uncover, in the discourse type under study, what we might call non-obvious meaning, that is, meaning which might not be readily available to naked-eye perusal and simple introspection. Since meaning also resides in the interaction between text and reader, a further aim is to generate fresh insights into reader-text interaction.
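The “shunting” between quantitative overview and close reading can be illustrated with a minimal sketch in Python. It first counts word frequencies and then displays a key-word-in-context (KWIC) concordance for one item, so that each occurrence can be read in its co-text. The sample sentences, the function names and the choice of node word are all invented for illustration; this is not the Group's own software or procedure.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, node, window=4):
    """Return KWIC lines as (left context, node, right context) tuples."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Toy text, invented for this example (not taken from any real corpus).
sample = ("The flood of migrants was reported widely. "
          "Critics said the flood of cheap imports hurt local firms.")
tokens = tokenize(sample)

# Quantitative step: which items are frequent?
frequencies = Counter(tokens)

# Qualitative step: read every occurrence of one item in context.
lines = concordance(tokens, "flood", window=3)
for left, node, right in lines:
    print(f"{left:>20} [{node}] {right}")
```

In practice the analyst would move back and forth between the two steps: a frequency or collocation list suggests an item worth reading, and the concordance lines in turn suggest new items to count.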
The SiBol Group has conducted research applying corpus-assisted techniques in a wide variety of linguistic areas, including lexical grammar, conversation analysis (particularly im/politeness strategies), semantic (or evaluative) prosody, forced lexical priming, evaluation and control, irony, humour, metaphor, and translation / comparative studies. The Group has also conducted ground-breaking research into thorny methodological issues in corpus linguistics, such as how to search for “absences” from a corpus, how to search for similarities as well as differences among datasets, and how dividing a dataset in different ways can lead the analyst to differing observations. Having its origins in faculties of political science, it has also conducted a variety of corpus-assisted investigations into socio-political issues, such as immigration, the reporting of antisemitism in the UK press, the representation of so-called “underclasses” in various parts of the world, China’s self-reporting of its relations with its near neighbours, and the reporting of the actors and events of the so-called Arab Spring in the USA and UK.
Researchers of the SiBol Group were also among the first to devise a new form of CADS, called Modern-Diachronic Corpus-assisted Discourse Studies (MD-CADS) (Partington 2010), in which large corpora of parallel structure and content from different moments of contemporary time are employed in order to track changes in modern language usage and also the social, cultural and political changes of modern times, as reflected in language.
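One common way of comparing two parallel corpora from different moments in time is a keyness calculation. The sketch below uses the log-likelihood statistic, a widely used measure in corpus linguistics, to rank words whose relative frequency differs most between an earlier and a later sample. It is offered only as an illustration of the general idea: the source does not state which statistics the Group's MD-CADS work relies on, and the function names and the two tiny token lists are invented.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a, b: frequency of the word in corpus 1 and corpus 2
    c, d: total number of tokens in corpus 1 and corpus 2
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in corpus 1
    e2 = d * (a + b) / (c + d)  # expected frequency in corpus 2
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(earlier_tokens, later_tokens, top=5):
    """Rank words by keyness and label the direction of change."""
    f1, f2 = Counter(earlier_tokens), Counter(later_tokens)
    c, d = len(earlier_tokens), len(later_tokens)
    scored = []
    for word in set(f1) | set(f2):
        a, b = f1[word], f2[word]
        if b / d > a / c:
            direction = "rise"
        elif b / d < a / c:
            direction = "fall"
        else:
            direction = "stable"
        scored.append((word, log_likelihood(a, b, c, d), direction))
    # Highest keyness first; ties broken alphabetically.
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top]


# Toy token lists, invented for this example.
earlier = "letter letter post letter phone".split()
later = "email email phone email post".split()

result = keywords(earlier, later)
for word, score, direction in result:
    print(f"{word:<8} {score:6.2f} {direction}")
```

A word that is present in one time-slice and entirely missing from the other, such as "email" and "letter" in the toy data, still receives a score, which is one simple way of bringing such absences to the analyst's attention.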
The SiBol Group was responsible for founding the now biennial international conference in Corpus and Discourse Studies.
The Coordinator of the Bologna unit, Alan Partington, is a member of the Challenge Panel of the “Corpus Approaches to the Social Sciences” project at the University of Lancaster (UK), funded by the Economic and Social Research Council.
Members of the SiBol Group participated in the following inter-university collaborative projects:
- Integrated and united? A quest for citizenship in an “Ever Closer Europe”, financed by the European Union within the scope of the 6th Framework Programme.
- Corpora and discourse: A quantitative and qualitative linguistic analysis of political and media discourse on the conflict in Iraq in 2003, a national research project (PRIN) involving the universities of Siena, Bologna, and LUISS in Rome, financed by the Italian Ministry of Education, Universities and Research (MIUR).