Ting-Hao “Kenneth” Huang (PENN STATE) – “What Roles Will Humans Play in the Future of Data Annotation?”
3400 N CHARLES ST
Abstract
Large language models (LLMs) have rapidly and impressively taken over many tasks once handled by human annotators in constructing text datasets. But does this mean we no longer need humans in the annotation loop? What roles should humans play in future data annotation pipelines? In this talk, I will present two recent studies that explore the evolving role of humans in the landscape of text data annotation. First, we ask whether a well-designed and carefully executed traditional crowdsourcing pipeline can still outperform LLMs in labeling quality. Our study offers an in-depth and holistic comparison between human and LLM annotation performance. Second, we turn to a future where LLMs increasingly replace manual annotation labor. In this scenario, the human role shifts toward instructing the models, often through prompting. But how effective are humans at prompting LLMs for annotation tasks, especially when working without access to gold-standard labels? We investigate this growing practice, which we call “prompting in the dark,” and assess its implications for the quality and reliability of LLM-generated annotations.
Bio
Dr. Ting-Hao ‘Kenneth’ Huang is an Associate Professor at the Pennsylvania State University’s College of Information Sciences and Technology. With expertise in natural language processing (NLP) and human-computer interaction (HCI), he built interactive and intelligent systems that support people in achieving their social and creative goals in day-to-day activities. His systems generated captions for scientific figures (Best Long Paper of INLG 2023), enabled collective action that empowers community advocates to fight air pollution (Outstanding Paper Award of IUI 2019), conversed with hundreds of users about any topic well before ChatGPT (Best Paper Honorable Mention Award of CHI 2018), and aided creative writers in developing intricate story arcs. His research has been published in top-tier HCI, NLP, and AI conferences, including CHI, IUI, ACL, NAACL, EMNLP, HCOMP, and AAAI, and has been supported by Adobe, Meta, NSF, NIH, and the Sloan Foundation. He co-founded and co-organizes the Workshop on Intelligent and Interactive Writing Assistants (In2Writing, https://in2writing.glitch.me/). Dr. Huang earned his Ph.D. in Computer Science from Carnegie Mellon University in 2018. More about his work can be found at http://kennethhuang.cc/.