Poster presentation

Issues and concerns in the automatic generation of vocabulary training and testing items

Sat, Jun 18, 11:30-12:15 Asia/Tokyo

Vocabulary learning is a nearly universal feature of language education curricula. A common assessment approach uses multiple-choice cloze (MCC) items, known as 'ana-ume' in Japanese. However, with the coronavirus pandemic and the shift of assessment to online platforms, MCC items present difficulties: once used online, they should be considered "public" and cannot be reused; producing items manually is laborious; and very few end-to-end tools are available to produce items automatically (though see Word Quiz Constructor, discussed in Rose, 2020).

This poster is a progress report on a project that aims to build a system to generate items automatically for vocabulary training and testing. We will identify and summarize some of the key issues involved in this process and the approaches we are taking to resolve these issues. In particular, the poster will focus on the preparatory steps for using machine learning methods to generate items: the construction and validation of a "gold-standard" set of items.
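As an illustration of what a single generated item could look like, the following is a minimal sketch of building one multiple-choice cloze item from a sentence, a target word, and a set of distractors. It is not the project's implementation: the names `ClozeItem` and `make_cloze_item`, the example sentence, and the distractors are all hypothetical, and the sketch assumes the distractors have already been chosen by some other step.

```python
import random
import re
from dataclasses import dataclass


@dataclass
class ClozeItem:
    stem: str            # sentence with the target word blanked out
    options: list[str]   # answer choices in presentation order
    answer_index: int    # position of the correct option in `options`


def make_cloze_item(sentence, target, distractors, rng):
    """Blank out `target` in `sentence` and shuffle it in with the distractors.

    Hypothetical helper for illustration only.
    """
    if target in distractors:
        raise ValueError("distractors must not include the target word")
    # Replace only the first whole-word occurrence of the target.
    pattern = re.compile(rf"\b{re.escape(target)}\b", flags=re.IGNORECASE)
    stem, n_subs = pattern.subn("_____", sentence, count=1)
    if n_subs == 0:
        raise ValueError(f"{target!r} does not occur in the sentence")
    options = [target, *distractors]
    rng.shuffle(options)
    return ClozeItem(stem, options, options.index(target))


# Example usage with made-up data; a seeded generator keeps runs repeatable.
item = make_cloze_item(
    "The committee will evaluate the proposal next week.",
    "evaluate",
    ["establish", "indicate", "require"],
    random.Random(0),
)
print(item.stem)
for label, option in zip("ABCD", item.options):
    print(f"({label}) {option}")
```

Passing the random generator in as an argument, rather than using the module-level one, is a design choice that makes it easier to regenerate fresh option orders for items that must not be reused verbatim online.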

At present, the project has created a total of 2,786 gold-standard items using both the General Service List (GSL; West, 1953) and the Academic Word List (AWL; Coxhead, 2000). This gold-standard set is currently being tested in a pilot experiment together with an original vocabulary learning app that includes some gamification. We plan to expand the set over time to at least 4,000 items and to evaluate the items, as well as the training and testing methods, with a large-scale population (>1,000 students) in spring 2022.

Issues that have arisen thus far include the estimation of item difficulty (cf. Kurdi et al., 2019), the suitability of items for various audiences, and the list coverage of items. These issues will be discussed along with our resolutions in the present project. This poster should be of interest to vocabulary specialists and to programmers working on vocabulary-related educational applications.

JALTCALL 2022 - VocaTT Issues (Rose et al).pdf

Ralph Rose

Professor at the Center for English Language Education in Science and Engineering (CELESE) in the Faculty of Science and Engineering.

Ralph L. Rose, Qiao Wang, Naho Orita, and Ayaka Sugawara
Center for English Language Education in Science and Engineering (CELESE)
Faculty of Science and Engineering, Waseda University

All poster sessions will be held in our Gather.Town space here: https://app.gather.town/invite?token=OjJ4Wf96xzIjVGDgu-qJSVQMmnXjyxiM