Reducing Sequence Length by Predicting Edit Operations with Large Language Models

Masahiro Kaneko, Naoaki Okazaki

Large Language Models (LLMs) have demonstrated remarkable performance in various tasks and gained significant attention. LLMs are also used for local sequence transduction tasks, including grammatical error correction (GEC) and formality style transfer, where most tokens in a source text are kept unchanged. However, it is inefficient to generate all target tokens because a prediction error of a target token may cause a catastrophe in predicting subsequent tokens and because the computational cost grows quadratically with the target sequence length. This paper proposes to predict a set of edit operations for the source text for local sequence transduction tasks. Representing an edit operation with a span of the source text and changed tokens, we can reduce the length of the target sequence and thus the computational cost for inference. We apply instruction tuning for LLMs on the supervision data of edit operations. Experiments show that the proposed method achieves comparable performance to the baseline in four tasks, paraphrasing, formality style transfer, GEC, and text simplification, despite reducing the length of the target text by as small as 21\%. Furthermore, we report that the instruction tuning with the proposed method achieved the state-of-the-art performance in the four tasks.

Knowledge Graph



Sign up or login to leave a comment