Tutorial on Artificial Text Detection

Tutorial at the INLG 2022 Conference


Overview

Recent advances in natural language generation have led to models capable of producing high-quality, human-like texts across many languages and domains. However, such models can be misused for malicious purposes, including but not limited to generating fake news, spreading propaganda, and facilitating fraud. This tutorial aims to raise awareness of artificial text detection, a fast-growing niche field devoted to mitigating the misuse of these models. It targets NLP researchers and industry practitioners who work with text generation models and/or on mitigating ethical, social, and privacy harms. The tutorial provides attendees with a comprehensive background on the topic and holistically reviews: (1) issues of generative models that can exacerbate their misuse, (2) terminology and task definitions, (3) models well studied for the task, (4) existing datasets and benchmarks, (5) approaches to detecting generated texts, (6) standard crowdsourcing practices and related critical studies, (7) downstream applications, and (8) established risks of harm. We conclude by outlining unresolved methodological problems and directions for future work.
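For attendees new to the task, below is a minimal, illustrative sketch of how artificial text detection is commonly framed: binary classification of a text as human-written or machine-generated. The checkpoint name ("roberta-base"), the toy example texts, and the label convention are assumptions made purely for illustration and are not the tutorial's prescribed setup; the tutorial itself covers a much broader range of detectors, datasets, and evaluation practices.

# A minimal, illustrative sketch (not the tutorial's method): artificial text
# detection framed as binary classification with a pretrained transformer.
# The checkpoint "roberta-base", the toy texts, and the label convention
# (label 1 = machine-generated) are assumptions for illustration only; a real
# detector must first be fine-tuned on labeled human/machine text pairs.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
model.eval()

texts = [
    "The committee will meet on Tuesday to review the revised proposal.",
    "In conclusion, the conclusion concludes that conclusions are conclusive.",
]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    # Softmax over the two logits gives per-text class probabilities.
    probs = torch.softmax(model(**inputs).logits, dim=-1)

for text, p in zip(texts, probs[:, 1].tolist()):
    # Before fine-tuning, the classification head is untrained, so these
    # scores are arbitrary; the snippet only shows the task framing.
    print(f"p(machine-generated) = {p:.2f} | {text[:60]}")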

The tutorial consists of four parts:

Part 1: Introduction

Part 2: Landscape

Part 3: Automatic and Human Artificial Text Detectors

Part 4: Conclusion - Applications, Ethics & Summary


Materials

Tutorial Overview

Slides

ATD Tutorial slides


Organizers

Adaku Uchendu

The Pennsylvania State University

Vladislav Mikhailov

HSE University

Jooyoung Lee

The Pennsylvania State University

Saranya Venkatraman

The Pennsylvania State University

Tatiana Shavrina

AI Research Institute

Ekaterina Artemova

Huawei Noah’s Ark Lab