AI teaches spoken English in Taiwan
Taiwan education ministry adds AI to English speaking test:
New system gives students instant feedback on spoken English
Lai Jyun-tang, Taiwan News | Feb. 3, 2026
Is this a first in the whole world? Or is it already common in many countries?
TAIPEI (Taiwan News) — Taiwan’s education ministry has added artificial intelligence to its English speaking assessment system to help students better learn and practice spoken English.
Liberty Times reported Monday that the upgraded system uses artificial intelligence to score pronunciation and analyze spoken answers in real time. Education officials said the move supports Taiwan’s 2030 bilingual policy by placing greater emphasis on practical communication skills.
Tsai I-ching (蔡宜靜), a division chief at the ministry’s K-12 Education Administration, said the system is free for students from elementary school to university and covers listening, speaking, reading, and writing. She said the new speaking tasks include open-ended questions and information-based responses to mirror real-life situations and international test formats.
The system evaluates pronunciation accuracy, fluency, rhythm, vocabulary use, grammar, and how well responses match the question, according to the education ministry. After each test, students receive instant, personalized feedback and learning suggestions, the ministry said.
At Changhua County’s Shengang Junior High School, students use the system to practice speaking beyond textbook exercises and gain a clearer understanding of their strengths and weaknesses. Teachers also guide students to use the feedback to refine pronunciation and sentence structure.
In Keelung City, Cheng Kung Junior High School applies the test results to build a learning cycle that links assessment, feedback, and improvement. The approach has helped boost student motivation and engagement, per CNA.
Selected readings
- "Artificial Intelligence in Language Education: with a note on GPT-3" (2/4/23)
- "AI panics" (11/27/16)
- "English as a prestige language in Taiwan" (11/29/20)
- "English as an official language in Taiwan" (12/8/18)
- Ralph Jennings, "Isolation-wary, Chinese-speaking Taiwan moves to make English an official language", Los Angeles Times (10/15/18)
Jarek Weckwerth said,
February 10, 2026 @ 4:38 am
Automated spoken English testing has been around for quite some time already. I don't know if it has been done at this (official/ministerial) level, but it's quite common, especially in Asian contexts. So far, the approach has been rather crude in that you simply do ASR and see if the result makes sense and is grammatical in English. The wet dream of "communicative" teaching approaches: If you get your meaning across, you're good. Fine-grained testing of pronunciation (as in "accent detection/reduction") is probably fake at this stage, even though efforts are being made in that direction. At least last week when I tested ChatGPT and Gemini on accent stuff using audio, ChatGPT couldn't do it and Gemini cheated.
For general spoken English testing, you can probably inspect these quick results (I haven't had a detailed look but looking at the headings that's exactly what it is):
Google search
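For readers curious what the "crude" approach Jarek Weckwerth describes looks like in practice, here is a minimal sketch: run ASR on a recorded answer, then check whether the transcript parses as grammatical English. The library choices (the speech_recognition package with its Google Web Speech backend, language_tool_python) and the file name are illustrative assumptions, not a description of any ministry system.

```python
# Minimal sketch of the "ASR plus grammar check" pipeline (illustrative only).
import speech_recognition as sr
import language_tool_python

recognizer = sr.Recognizer()
with sr.AudioFile("student_answer.wav") as source:   # hypothetical recording
    audio = recognizer.record(source)
transcript = recognizer.recognize_google(audio)      # plain ASR, no pronunciation scoring

tool = language_tool_python.LanguageTool("en-US")
issues = tool.check(transcript)                      # grammar/usage matches on the transcript

print(transcript)
for issue in issues:
    print(f"- {issue.ruleId}: {issue.message}")
# If the transcript comes back as sensible, grammatical English,
# the "communicative" criterion is taken to be met.
```

Note that nothing in this pipeline ever looks at the audio again once the transcript exists, which is exactly the limitation discussed below.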
Philip Taylor said,
February 10, 2026 @ 6:47 am
Only partially on-topic, but I think sufficiently relevant for it to be permitted as a follow-up :
How would readers of this forum (especially those with TEFL / TESOL experience) recommend that my Vietnamese brother-in-law acquire fluency in spoken, and competence in written, English ? He is now resident in the U.K., speaks almost no English whatsoever, and the local council are not offering TESOL classes for complete beginners ?
Victor Mair said,
February 10, 2026 @ 8:33 am
Further following up on Jarek Weckwerth's comment:
=====
Automatic Speech Recognition (ASR) is a rapidly advancing technology used in English Language Teaching (ELT) to convert spoken English into text, primarily focusing on improving pronunciation, speaking fluency, and listening skills. As a form of Computer-Assisted Pronunciation Training (CAPT), ASR offers immediate, objective feedback, acting as an automated, patient tutor that allows students to practice independently without fear of judgment. (Google AI Overview, citing Cambridge University Press and three other sources)
=====
Jerry Packard said,
February 10, 2026 @ 8:50 am
One of my students did her PhD on automatic AI testing of oral production of L2 Chinese – it is remarkable how the algorithm uses rate of syllable production and length of inter-syllable segment to calculate a fluency metric.
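A toy version of that kind of fluency metric, with everything invented for illustration (the syllable timings, the weights, the normalisation); a real system would take the timings from forced alignment or from ASR word and phone boundaries.

```python
# Toy fluency metric from syllable timings: speaking rate plus mean inter-syllable gap.
syllables = [  # hypothetical (onset_s, offset_s) for each produced syllable
    (0.00, 0.21), (0.25, 0.43), (0.48, 0.70), (1.10, 1.32), (1.36, 1.58),
]

total_time = syllables[-1][1] - syllables[0][0]
rate = len(syllables) / total_time                       # syllables per second
gaps = [b[0] - a[1] for a, b in zip(syllables, syllables[1:])]
mean_gap = sum(gaps) / len(gaps)                         # mean inter-syllable pause in seconds

# Higher rate and shorter pauses -> higher score; the weights and caps are arbitrary here.
fluency = 0.6 * min(rate / 5.0, 1.0) + 0.4 * max(1.0 - mean_gap / 0.5, 0.0)
print(f"rate={rate:.2f} syll/s, mean gap={mean_gap:.2f} s, fluency={fluency:.2f}")
```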
Jarek Weckwerth said,
February 10, 2026 @ 10:02 am
I think it's useful to note that that paragraph from CUP is mostly advertising.
As I said, pronunciation testing by so-called AI is based around looking at the ASRed text. A typical test some time ago would be to get the student to read a piece of text, and then compare the output of ASR to that text.
In other words, it tests intelligibility. The results will depend on the details of the solution used, and in particular how "generous" it is towards mispronunciations. For example, the more it uses a language model (let us say, the more "autocomplete on steroids" it does), the less useful the results will be in a teaching situation. Importantly, today there is a very real possibility of the system doing a better recognition job than a human, especially one unfamiliar with the accent, which will make the approach essentially pointless.
(BTW in pronunciation research these days, "intelligibility" has the technical meaning of actual (word) recognition success, while "comprehensibility" is the processing difficulty as perceived by the listener.)
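As a concrete illustration of the read-aloud comparison described above (not of any particular product), the sketch below scores intelligibility as word error rate between the known prompt and the ASR transcript; the prompt text and the simple whitespace tokenisation are arbitrary choices here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance, normalised by reference length."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # d[i][j] = edits needed to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # word deleted
                          d[i][j - 1] + 1,         # word inserted
                          d[i - 1][j - 1] + cost)  # word substituted
    return d[-1][-1] / max(len(ref), 1)

# Hypothetical read-aloud prompt and ASR output for one student.
prompt = "the north wind and the sun were disputing which was the stronger"
asr_transcript = "the north wind and the sun were disputing which was a stronger"
print(f"WER = {word_error_rate(prompt, asr_transcript):.2f}")  # 0.08: one word in twelve missed
```

The point made above then follows directly: the more the recogniser's language model "repairs" a mispronounced word, the lower this number gets, and the less it tells a teacher.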
Importantly, all of that is a long way from "tutoring" as advertised by CUP above. In order to teach something, the system would have to know exactly what was wrong (beyond perhaps offering the "misrecognized" word) and what to do with it.
People are certainly trying to develop systems that would be capable of doing just that, but at least on the circuit I'm on, there's quite some skepticism.
You can check out the CAPT papers at this conference, or this one, or of course ICPhS for a sense of what is being done.
The claim of patience is of course true. That of no judgement — not so sure. If the system says, your score is 45%, then that is judgement. But the current intellectual climate is that any "judgement" is bad, so it's a useful advertising claim.
Victor Mair said,
February 10, 2026 @ 1:31 pm
Trying to add the penultimate paragraph with the links in Jarek Weckwerth's comment:
You can check out the CAPT papers at this conference, or this one, or of course ICPhS for a sense of what is being done.
[Apparently it worked.]
Jarek Weckwerth said,
February 10, 2026 @ 10:16 am
Oh I can see my post has been eaten because of links. So here's a version without them.
I think it's useful to note that that paragraph from CUP is mostly advertising.
As I said, pronunciation testing by so-called AI is based around looking at the ASRed text. A typical test some time ago would be to get the student to read a piece of text, and then compare the output of ASR to that text.
In other words, it tests intelligibility. The results will depend on the details of the solution used, and in particular how "generous" it is towards mispronunciations. For example, the more it uses a language model (let us say, the more "educated guessing" it does), the less useful the results will be in a teaching situation. Importantly, today there is the possibility of the system doing a better recognition job than a human, especially one unfamiliar with the accent, which will make the approach essentially pointless.
(BTW in pronunciation research these days, "intelligibility" has the technical meaning of actual (word) recognition success, while "comprehensibility" is the processing difficulty perceived by the listener.)
Importantly, all of that is a long way from "tutoring" as advertised by CUP above. In order to teach something, the system would have to know exactly what was wrong (beyond perhaps offering the "misrecognized" word) and what to do with it.
People are certainly trying to develop systems that would be capable of doing just that, but at least on the circuit I'm on, there's quite some skepticism.
You can check out the CAPT sections of several conferences, notably of course the International Congress of Phonetic Sciences, for a sense of what is being done.
The claim of patience is of course true, though. The claim of no judgement: not so sure. If the system says your score is 45%, then that is judgement. But in the current intellectual climate, where any "judgement" is bad, it is a useful advertising claim.
cliff arroyo said,
February 10, 2026 @ 11:25 am
"my Vietnamese brother-in-law acquire fluency in spoken, and competence in written, English ?"
It's gonna be messy… throw things against the wall and see what sticks. Start with online stuff (YouTube seems to have no shortage of simple English for Viet speakers), Duolingo, whatever is free, and see what seems to get his interest more.
Find some beginning text books and work thru them with him (or find someone who can).
And it won't be fast, you're looking for survival English first and then hopefully you can build on that.
You might try to find online Viet teachers of English (esp to explain sounds that don't occur in Vietnamese).