I’m still working on a FutureLearn/Lancaster University course on Corpus Linguistics (CL). It runs for 8 weeks and is much more work than any of the previous FutureLearn courses that I have undertaken, so whether I’ll get to the end of it remains to be seen, given my new teaching commitments and other roles to juggle. In the meantime, I’ll share with you my thoughts as each week progresses.

Although I decided last week that I would just store away the CQPweb videos for later, in the end, I did watch several of them after I posted my blog post, trying to find some interesting collocates on the EEBO corpus.  This led me to wondering how to get round the spelling issue – as words are spelled differently in historical corpora, the only thing I could think of was to put in a lot of variant search terms, which is a little tedious but, more importantly, is limited by what variations I could come up with.  I did try looking into the issue in a bit more depth, as the video interview with Steve Pumfrey suggested that he and Paul Rayson had discussed this very point and that there was a simple solution.  It wasn’t one that I could find among their publications, though, so I posted a comment asking about the problem and the course leader, Tony McEnery, replied suggesting a paper that I could read.  I’ve now downloaded it, and it’s waiting in my ‘to do’ pile.

And so on to Week 7, where we began by thinking about our experience of using corpora for language learning and teaching, and how useful they would be. I don’t have any direct experience of teaching languages, apart from teaching my own children as a native speaker. I suppose we are all using our own corpora all the time, in that respect – there are the words and phrases we use all the time, and the ones which are less frequent, and it is this everyday usage that feeds into the electronic corpora we have been talking about. I wonder if my own language-learning experiences (yes, we’re back to the Spanish!!!) have been informed by corpora or not…?

It might well be useful to use corpora for language teaching, albeit with the caveats we discussed last week – some of the most frequently used ideas and phrases might require a higher level of thinking and ability than beginners have, while the sheer number of meanings of the most frequently used words might be confusing for beginners too. So there is a compromise to be had between teaching words which will be most useful and ‘flexible’ – giving a lot of usage with a small vocabulary – and keeping things straightforward, perhaps by introducing a few frequent meanings first and less common ones later.

The first video looked at how corpora could help with language learning research, because both are concerned with patterns in language. We can compare how native speakers and second language learners use language. It raised the interesting point that learner corpora are influenced by what learners are taught and when, as well as by how often they need those words and phrases. The next video was about what different learner corpora could tell us about the development of language proficiency, while the third talked about the use of this information in developing corpus-based teaching materials. Exposure to authentic language use allows language to be learned implicitly, which leads to greater fluency. Another interesting point that it raised was about active learning – that knowledge which learners gain by discovering things for themselves is more robust and more likely to be retained in the long term. The final video described how data-driven learning could be used in language teaching and learning by allowing learners access to the corpora in order to explore language of personal relevance to them or complete controlled tasks set by the teacher, as well as the challenges these techniques raised. Learners needed preparing for the use of corpora in language learning as they found some aspects difficult – for example, they needed clear instructions, but they also needed training on interpreting the corpus results, such as judging whether something was a pattern or not. BNClab contains teaching materials which help language learners use the corpora effectively: they get an introduction to the linguistic topic explaining why it is important, then a series of tasks which allow them to access the data directly and interpret the results.

The practical activity had us looking at the use of utterly and perfectly, which appear to have more or less the same meaning according to dictionary definitions. This means that learners of English would expect to use them interchangeably, but their actual usage by native English speakers tells a rather different story.

Your query “utterly” returned 33 matches in 26 different texts (in 1,147,097 words [500 texts]; frequency: 28.77 instances per million words)
I noticed that 18 of the 33 were ‘JJ’, although I’m not sure what that classifies them as.
Your query “perfectly” returned 55 matches in 47 different texts (in 1,147,097 words [500 texts]; frequency: 47.95 instances per million words)
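
The ‘instances per million words’ figure in those results is just the number of matches scaled to the size of the corpus, which makes counts from corpora of different sizes comparable. As a rough sketch (the function name and layout are my own, not anything taken from BNClab or CQPweb), the figures above can be reproduced like this:

```python
def per_million(matches: int, corpus_words: int) -> float:
    """Normalised frequency: number of matches per million words of corpus."""
    return matches / corpus_words * 1_000_000


# Corpus size and match counts as reported in the query results above.
CORPUS_WORDS = 1_147_097

for word, hits in [("utterly", 33), ("perfectly", 55)]:
    print(f"{word}: {per_million(hits, CORPUS_WORDS):.2f} instances per million words")
```

Rounded to two decimal places, this gives 28.77 for utterly and 47.95 for perfectly, matching the query output.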

Before beginning the task, I would have guessed that utterly was negative and perfectly was positive, but looking at the situations which cropped up, I wonder if perfectly is often used in a passive-aggressive way! Utterly is used more in explicitly negative situations, whereas perfectly can be implicitly negative. Nevertheless, the teacher notes backed up my gut response to the positive/negative division.

So despite my worries about getting through the course, I’ve made it to the final week! On to Week 8…