PhD defence by Simon Hellemann Flachs

PhD defence: On 6 May 2021, Simon Hellemann Flachs, PhD student at the Department of Computer Science, will defend his PhD thesis titled "Computational Grammatical Error Correction: Bridging the Gap from Academia to Industry".


Date & Time:

Zoom: https://ucph-ku.zoom.us/j/63156025235?pwd=RWp4c0Iwbyt4UVZWc3hBM2pRSnFPZz09

Hosted by:
Department of Computer Science



Computational Grammatical Error Correction:
Bridging the Gap from Academia to Industry


Grammatical Error Correction (GEC) is the research field studying computational methods for correcting grammatical errors in text. These methods have the potential to improve human communication by enabling clear and error-free text.

While GEC has been studied thoroughly in academia, industrial adoption has been limited. This thesis aims to overcome some of the obstacles preventing industrial use.
In the first part of the thesis, we look into two methods for developing GEC systems without large amounts of expensive training data. Firstly, we show that artificially generated training data can be used to train robust systems for detecting subject-verb agreement errors. Secondly, we show that language models can be used for creating useful GEC systems, without using annotated training data. 
In the second part of the thesis, we look into how GEC systems perform on text that was not written by English language learners. We release a new GEC benchmark, CWEB, consisting of website text with corrections, and show that current GEC systems perform poorly on this domain.
In the final part, we focus on GEC for non-English languages and show that GEC systems pre-trained on noisy data can be fine-tuned effectively on only small amounts of expert-annotated data, which opens the way to building inexpensive GEC systems for new languages.

Assessment Committee

  • Chair: Associate professor, Isabelle Augenstein, Department of Computer Science, UCPH
  • Engineering Manager, Zornitza Kozareva, Google
  • Senior Director of Research, Joel Tetreault, Dataminr


Academic supervisor

Professor, Anders Søgaard, Department of Computer Science, University of Copenhagen


Moderator at this defence will be

Assistant professor, Daniel Hershcovich, Department of Computer Science, University of Copenhagen

For an electronic copy of the thesis, please go to: https://di.ku.dk/english/research/phd/