CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning

Bill Yuchen Lin, Ming Shen, Wangchunshu Zhou, Pei Zhou, Chandra Bhagavatula, Yejin Choi, Xiang Ren

Recently, large-scale pre-trained language models have demonstrated impressive performance on several commonsense benchmark datasets. However, building machines with commonsense to compose realistically plausible sentences remains challenging. In this paper, we present CommonGen, a constrained text generation task with an associated benchmark dataset, designed to explicitly test machines for the ability of generative commonsense reasoning. Given a set of common concepts (e.g., {dog, frisbee, catch, throw}), the task is to generate a coherent sentence describing an everyday scenario using these concepts (e.g., "a man throws a frisbee and his dog catches it"). CommonGen is challenging because it inherently requires 1) relational reasoning using background commonsense knowledge, and 2) compositional generalization ability to work on unseen concept combinations. Our dataset, constructed through a combination of crowdsourcing and existing caption corpora, consists of 30k concept-sets and 50k sentences. Experiments show that there is a large gap between state-of-the-art text generation models (e.g., T5) and human performance (30.6% vs. 63.5% in SPICE metric). The models struggle at the task, often generating grammatically sound yet realistically implausible sentences, pointing to interesting directions for future research.
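
To make the concept-to-sentence task format concrete, here is a minimal sketch using the Hugging Face Transformers library with a generic t5-base checkpoint. The prompt string is a hypothetical format chosen for illustration; it is not the paper's exact setup, and the stock checkpoint is not fine-tuned on CommonGen, so outputs will differ from the reported results.

```python
# Minimal sketch of the CommonGen task format, assuming the Hugging Face
# Transformers library and a generic (not CommonGen-fine-tuned) T5 checkpoint.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# A concept-set: the model must weave all concepts into one plausible sentence.
concepts = ["dog", "frisbee", "catch", "throw"]
prompt = "generate a sentence with: " + " ".join(concepts)  # hypothetical prompt format

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=32, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```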
