Posted on 14 November 2023 in artificial intelligence, bulletin, cities, digital humanities

AI and Hangul Learning

In this contribution to Bulletin 28, ISRF Director of Research Christopher Newfield uses his recent experience with an AI-powered translation app to think through the challenges facing the humanities.

Chris Newfield

ISRF Director of Research

Image in public domain by Daniel Bernard (via Unsplash).


For my first trip to South Korea and first immersion in its language, I made two main preparations. First, I learned Hangul, the logical and surprisingly fun alphabet invented in the 1440s and then kept out of circulation by various powers until the end of World War II. I wanted to be able to read the names of hotels, restaurants, museums, cities, and street signs, for starters. The word “learned” is extremely elastic, and in my case it means that I can piece together syllables one character at a time, like a three-year-old who unfortunately doesn’t actually speak Korean. I can make out the right direction if I’m standing at a street corner and come to a full stop; were I driving at highway speed, I wouldn’t be fast enough and would miss my exit. Naver Maps labels smaller locations in Korean rather than English, and I can figure those out. I can read the names of the candidates and locations on this display in Seoul’s National History Museum about the crucial 1987 election, in which the two non-military candidates (the pro-business Kim Young-sam, in red, and the longtime democracy campaigner Kim Dae-jung, in yellow) split the centre and left vote and handed the right (Roh Tae-woo) a further five years in power. But I can’t read their slogans.

Figure 1 (image by author)
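The alphabet really is logical in a mechanical sense: every precomposed syllable block is built from its component letters (jamo) by fixed arithmetic, which is what makes one-character-at-a-time decoding possible in the first place. Here is a minimal Python sketch of the standard Unicode decomposition (purely illustrative, and no part of my travel kit):

```python
# Illustrative sketch: the Unicode Hangul composition formula,
#   syllable = 0xAC00 + (lead * 21 + vowel) * 28 + tail,
# run in reverse to split a precomposed syllable block into its jamo.

LEADS  = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")        # 19 initial consonants
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")     # 21 medial vowels
TAILS  = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 finals, or none

def decompose(syllable: str):
    """Split one Hangul syllable block into its (lead, vowel, tail) jamo."""
    code = ord(syllable) - 0xAC00
    if not 0 <= code < 11172:
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(code, 21 * 28)
    vowel, tail = divmod(rest, 28)
    return LEADS[lead], VOWELS[vowel], TAILS[tail]

for ch in "한글":  # "Hangul", written in Hangul
    print(ch, "=", " + ".join(j for j in decompose(ch) if j))
# 한 = ㅎ + ㅏ + ㄴ
# 글 = ㄱ + ㅡ + ㄹ
```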

I got to this low level—plus about ten spoken phrases—in around four hours of total study, spread over the last few days before the trip and intermittently since. A colleague who teaches German to speakers of other languages told me that basic functionality—ordering food, filling a prescription—takes a hundred hours of serious coursework plus some outside study. That’s much more time than most people will commit if there’s an easier way.

But 100 hours is only the entry threshold of language instruction. I horrified my University of California Education Abroad students in France when they would say, “I’m going to go home in 12 weeks fluent in French,” and I’d reply, “fluency will take you 10,000 hours.” I used the old Malcolm Gladwell number that people contest, but I think it’s a decent rule of thumb for serious skill in anything. His example was The Beatles becoming The Beatles by taking the only job they could get: playing all-night clubs in Hamburg 8 hours at a time, 7 days a week, for weeks or months on end. The language equivalent for German, French, or Korean is using the language 8 hours a day, 7 days a week, for about three and a half years.
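For the record, the back-of-the-envelope arithmetic behind that figure:

\[
8 \times 7 = 56 \ \text{hours/week}, \qquad \frac{10{,}000 \ \text{hours}}{56 \ \text{hours/week}} \approx 179 \ \text{weeks} \approx 3.4 \ \text{years}.
\]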

As it turns out, South Korea has bilingual Romanised street names even in smaller towns, and most foreigner-facing places in Seoul and Busan have English speakers, though in my experience not so much in the remarkable southern cities of Gwangju, Yeosu, Suncheon, and Gyeongju. And yet Hangul is part of every minute in the country when anyone else is around. It is the crossroads of the culture and history of Korea on its very difficult road towards democratisation. Learning the basics is a first step toward functionality, connection, and putting contradictory pieces together. Since half the buildings in the country are covered with Hangul, I’ve been “reading” nonstop since I landed at Incheon airport.

The second preparation for the trip took 60 seconds. I downloaded Naver Papago’s “AI” translator, which seems generally better than Google’s for Asian languages. It does what I can’t do at all: it turns Korean words into Romanised and English words, and words into sentences. I used Papago constantly. For example, smaller exhibit labels often give the title in English, Chinese, or Japanese but not the text itself, so I used Papago to “read” that. It makes mistakes. In Yeosu, I saw some people wearing official high-visibility vests and wanted to know why. Here’s the Papago for “crossing guard”:

Figure 2 (image by author)

He’s a safety king while she’s a “children’s eye.” Such results raised as many questions as they answered. Elsewhere, Papago translated a sign on the bridge where police killed many dissidents in 1948 as “martyrdom.” A Korean friend translated it correctly as “Suncheon Bridge,” saying that a Hangul character had the ghost of a Chinese character that in some contexts means “one who sacrifices.” The user does not control Papago’s associations, or know them in the first place.

But what Papago did, on top of the 3–4 hours I spent getting a rudimentary grasp of Hangul, was put thousands of words of Korean history into my brain. It also allowed communication across the language divide my short study had not crossed: “could I have a second towel please?” at the hotel, or “both kinds of fried chicken are delicious, thank you!” to the woman who cooks, serves, cleans, and bartends by herself in the chicken restaurant called Buffalo in Gyeongju. Papago also finds poetry.

Figure 3 (image by author)

My partner was doing research on democracy rebellions—their brutal repression, but even more what they had hoped for in the first place. Papago helped with this. In Suncheon there are displays explaining the purposes of the dissidents and rebels who were killed in large numbers there. Here’s one panel.

Figure 4 (image by author)

Which Papago renders like this.

Figure 5 (image by author)

Papago got the spirit of the research, and greatly expanded my intelligence during the trip.

As a translator, Papago seems so much more powerful than my individual brain that it’s tempting to see learning Hangul for one visit as a waste of time. Digital tools that now travel under the marketing label “AI” are often heralded not as complements to the difficult humanities skill of language learning—as a learning accelerator—but as its replacement. This idea that we won’t need to take languages into ourselves, to speak them ourselves, to “have” the language in that annoying usage, seems self-evidently ridiculous. It certainly does to anyone who has tried to say more than one thing to the interesting-seeming woman running a good restaurant in southeastern Korea by typing and then pointing at their phone. And yet in the past year some journalists have proposed exactly this: that the digital-human partnership represented by Duolingo will be replaced by “generative AI” doing all the work by itself.[1] The inevitable accompanying claim is that, sure, there are problems with the details, but soon enough they will all be fixed.

The idea is that large language models (LLMs) are now so good that they have arrived at the threshold of human consciousness, and are on the verge, in another version or two, of attaining consciousness itself.[2] This is a discourse of manifest destiny, which seems to come naturally to a certain type of American. From the first moment, it confuses two things: the intelligence of the programme itself, and a person’s use of the programme as (part of) their own intelligence. Many of the attempted new technologies of the tech goliaths—Google Glass, Facebook’s Metaverse—intend to fuse, or at least confuse, the two: the software programme and the user’s intellect.

The mystifications have been tremendous, and undoing them is very hard,[3] but the world should still be much clearer than it is that LLMs do not have intelligence. Elsewhere I’ve argued this point about intelligence as such,[4] and here I’d like to focus on how an “AI” technology like Papago’s translation service functions as part of a person’s own intelligence.

The most obvious thing when I use Papago to “read” Korean is that I do not know Korean. The chicken chef instantly sees this about me. When she sat down to chat with a table of local customers for a few minutes I could not possibly have joined them. Papago signals my total personal inadequacy in Korean conversation and also my outsider status in the web of social relations that the language helps to constitute. 

In contrast, I was part of many professional-level conversations about Korean history, politics, and political economy on the trip. There I did function at my normal level of intelligence. The reason was that the Korean interlocutor spoke English—“had” the English language as a personal capacity, not a phone-based programme—or that a multilingual interpreter had been hired for the occasion. The capability was theirs, not mine. Mine consisted of a few hours memorizing Hangul and my handy Papago application.

In the current geopolitical period, we English speakers depend on the personal capabilities of people from all the countries of the world to speak to us in our language. Again, the language capability is theirs, not ours, and yet we often take it for granted. When we visit their countries, we expect them to give us their English capability for free. If this happens enough, their capability becomes invisible. The most invisible thing about it is the sheer human labour that went into it—someone’s 10,000 hours of real work that we get for free, as when the hotel valet in Busan, who speaks like he grew up in California, gets us to the museum about forced mobilization during the Japanese occupation of Korea.

My intuition is that some people’s expectation of possessing the capabilities of others, for free, allows the general confusion in which the capability of a programme can be possessed as our capability. I don’t see how else we could think that Korean AI could replace learning Korean.

Possessing other people’s capabilities and their labour: these paired conditions underlie most of the worst human endeavours in history. During its occupation of Korea, imperial Japan forced at least 7 million Koreans into various modes of supporting the war machine, from industrial overwork to military service to sex slavery as “comfort women.” We see the problem now. The technologically advanced and philosophically sophisticated Japanese didn’t see it then. One reason is the process of dehumanization, which is again a major subject of philosophical analysis in the work of Judith Butler and many others: a system strips away the human person to leave the capability or the labour, of which we then take possession as though it were ours. 

LLMs are not colonial occupations. But the companies behind them are being sued by artists and writers for taking those creators’ work as their own—for training company models on their works without acknowledgement, credit, or consent.[5] Since our Athens meeting, a hue and cry has been spreading to the effect that LLMs function as a device for the appropriation of human capabilities by large corporations intent on building platforms on the ownership and control of materials created by the capabilities of thousands or millions of others. Copyright, and intellectual property in general, were designed to combine authorship and public use by allowing authors to be paid by the public for the use of their creations, not to empower corporate intermediaries through free bulk data collection. My sense is that a tech corporation’s determined, self-righteous entitlement to absorb the entirety of artistic production into its proprietary model assumes the dehumanization of artistic capability, and perhaps of the actual artists, as the lower order in a Two Cultures tech future.

Translation is one instance of the general phenomenon of education. Education focuses primarily on creating capabilities in individuals; training in the digital extensions of those capabilities must come second. The AI wave reverses that priority, not by making real arguments for putting the assist before the skill, but by marketing convenience joined to inevitability. The motto for foreign travel becomes: “talk to the phone.”

One of my colleagues teaches at Korea University. Her students are among the highest testers in the country, which makes them among the highest testers in the world. She’s spending extra time teaching them how to write papers in the social sciences. “They don’t know how to make a thesis,” she said. “They also don’t know how to come up with a research question. So we have to start from scratch.”

I taught writing in various settings for forty years and know how hard it is to teach people to develop their own position on something and argue for it. It’s a higher-order intellectual activity. It never gets that much easier, but you do learn how to launch the process, stay in it, and make it work. Because it is hard, it’s tempting for students and everyone else to skip the part where you generate your own idea. A great deal of learning research shows that learning happens most when one is struggling—that is, precisely when one feels stupid.[6] AI models like ChatGPT have now arrived to make sure one can instead feel smart—always.

This point was perfectly articulated by an undergraduate at Columbia University, Owen Kichizo Terry.[7] In Spring 2023, he wrote, 

The common fear among teachers is that AI is actually writing our essays for us, but that isn’t what happens. You can hand ChatGPT a prompt and ask it for a finished product, but you’ll probably get an essay with a very general claim, middle-school-level sentence structure, and half as many words as you wanted. The more effective, and increasingly popular, strategy is to have the AI walk you through the writing process step by step. You tell the algorithm what your topic is and ask for a central claim, then have it give you an outline to argue this claim. Depending on the topic, you might even be able to have it write each paragraph the outline calls for, one by one, then rewrite them yourself to make them flow better.

He then takes the reader through his process of GPTing a close reading of The Iliad. “[O]ne of the main challenges of writing an essay is just thinking through the subject matter and coming up with a strong, debatable claim. With one snap of the fingers and almost zero brain activity, I suddenly had one.”

Terry’s point isn’t only that instructors won’t be able to catch AI-based cheating, but that students are using AI to bypass thinking. It’s the worst of both worlds: students aren’t being taught to use AI for activities where it will be useful, and they aren’t “being forced to think anymore.” 

This is a completely unnecessary outcome. Avoiding it starts with university administrators, the media, governments, everyone, simply refusing to accept the frame of AI as replacing—after taking—people’s capabilities.

Universities have a particular obligation as, first and foremost, formers of human capabilities and generators of Bildung. In the current world, this means Bildung for all. My definition of “humanities knowledge” includes a range of subject knowledges, but also a set of capabilities that enable independent intellectual work, very much including creating a thesis statement.

Here is the list:[8]

1. Knowledge of what a research question is

2. Basic subject knowledge in a chosen topic area; its major research questions 

3. Developed capacity for being interested in questions where the answer is “nonobvious”

4. Ability to inquire into one’s own core interests

5. Development of the project topic research question (with self-reflexivity/metacognition)

6. Identifying a thesis or hypothesis about the topic (interesting and nonobvious)

7. Planning the investigation (identification of steps; ongoing revision of methods)

8. Organized research, including recording and sorting of conflicting information

9. Interpretation of research results (incl. divergent, disorganized, unsanctioned, anomalous)

10. Development of analysis and narrative into a coherent narrative (gaps included)

11. Public/social presentation of findings and responding to criticism

12. Ability to reformulate conclusions and narrative in response to new info and contexts

13. Ability to fight opposition, to develop within institutions, to negotiate with society

Everyone graduating from university should be able to do each of these things, and all of them together.

Educators always see this list as asking a lot. And yet it’s less than what we ask in domains that society loves and cares about. At one point at lunch, I mentioned the power of K-pop to my colleague at Korea University. “I’m always impressed,” I said, “by the world-level training, the immaculate dancing, the perfectionist production of everything.” “Those performers,” she said, “they move into dorms at age 15, dancing and singing is all they do, they do it every day, endlessly, all day.” K-pop follows the rule of 10,000 hours. Why not Bildung?

I see no way around this: it will one day be obvious that LLMs are tools that extend our powers and are not replacements for them. But only when we’ve raised our intellectual expectations for people.

But let’s ask Papago what it thinks.


Footnotes

[1] For example, Pan Kwan Yuk, “The Lex Newsletter: AI Translation May Supersede AI-Aided Language Learning,” Financial Times, August 23, 2023, sec. Lex, https://www.ft.com/content/cd30d49c-b38b-4843-a516-aba62e4b347f.

[2] Sébastien Bubeck et al., “Sparks of Artificial General Intelligence: Early Experiments with GPT-4” (arXiv, March 27, 2023), https://doi.org/10.48550/arXiv.2303.12712.

[3] Lucas Ropek, “ChatGPT Is Powered by Human Contractors Paid $15 Per Hour,” Gizmodo, May 8, 2023, https://gizmodo.com/chatgpt-openai-ai-contractors-15-dollars-per-hour-1850415474.

[4] Christopher Newfield, “How to Make ‘AI’ Intelligent; or, The Question of Epistemic Equality,” Critical AI 1, no. 1–2 (October 1, 2023), https://doi.org/10.1215/2834703X-10734076.

[5] For an op-ed on AI appropriation of artists’ work, see Molly Crabapple, “Op-Ed: Beware a World Where Artists Are Replaced by Robots. It’s Starting Now,” Los Angeles Times, December 21, 2022, https://www.latimes.com/opinion/story/2022-12-21/artificial-intelligence-artists-stability-ai-digital-images. Crabapple made particularly powerful arguments on Paris Marx’s podcast: “Why AI is a Threat to Artists with Molly Crabapple,” Tech Won’t Save Us, June 29, 2023, https://podcasts.apple.com/us/podcast/tech-wont-save-us/id1507621076?i=1000618717251. On the suit against Stable Diffusion, see Matthew Butterick, “Stable Diffusion Litigation,” Joseph Saveri Law Firm & Matthew Butterick, January 13, 2023, https://stablediffusionlitigation.com/. On one of the training corpora, Books3, see Alex Reisner, “Revealed: The Authors Whose Pirated Books Are Powering Generative AI,” The Atlantic, August 19, 2023, https://www.theatlantic.com/technology/archive/2023/08/books3-ai-meta-llama-pirated-books/675063/. For a sign that the mainstream business press sees the problem with AI as a device for the appropriation of human capabilities, see John Gapper, “Generative AI Should Pay Human Artists for Training,” Financial Times, January 27, 2023, sec. Artificial intelligence, https://www.ft.com/content/c42189e0-4069-4e17-8dc0-72544dc1d51b; and Rana Foroohar, “Workers Could Be the Ones to Regulate AI,” Financial Times, October 2, 2023, sec. Artificial intelligence, https://www.ft.com/content/edd17fbc-b0aa-4d96-b7ec-382394d7c4f3.

[6] An accessible synthesis is Peter C. Brown, Henry L. Roediger III, and Mark A. McDaniel, Make It Stick: The Science of Successful Learning (Cambridge, Massachusetts: Belknap Press, 2014).

[7] Owen Kichizo Terry, “I’m a Student. You Have No Idea How Much We’re Using ChatGPT.,” The Chronicle of Higher Education, May 12, 2023, https://www.chronicle.com/article/im-a-student-you-have-no-idea-how-much-were-using-chatgpt.

[8] See Christopher Newfield, The Great Mistake: How We Wrecked Public Universities and How We Can Fix Them (Baltimore: Johns Hopkins University Press, 2016), 323-33.


Christopher Newfield

Christopher Newfield is the Director of Research at the Independent Social Research Foundation in London. He is a Distinguished Professor Emeritus of English at the University of California, Santa Barbara. Newfield has recently published two books on the metrics of higher education: Metrics That Matter: Counting What’s Really Important to College Students (Baltimore: Johns Hopkins University Press, 2023) and The Limits of the Numerical: The Abuses and Uses of Quantification (Chicago: University of Chicago Press, 2022). Newfield is currently conducting research on the nature and effects of literary knowledge.