DeepTitle -- Leveraging BERT to generate Search Engine Optimized Headlines

Cristian Anastasiu, Hanna Behnke, Sarah Lück, Viktor Malesevic, Aamna Najmi, Javier Poveda-Panter

Automated headline generation for online news articles is not a trivial task: machine-generated titles need to be grammatically correct, informative, attention-grabbing, and able to generate search traffic without being "clickbait" or "fake news". In this paper we showcase how a pre-trained language model can be leveraged to create an abstractive news headline generator for the German language. We incorporate state-of-the-art fine-tuning techniques for abstractive text summarization, i.e., we use different optimizers for the encoder and decoder, where the former is pre-trained and the latter is trained from scratch. We modify the headline generation to incorporate frequently sought keywords relevant for search engine optimization. We conduct experiments on a German news data set and achieve a ROUGE-L-gram F-score of 40.02. Furthermore, we address the limitations of ROUGE for measuring the quality of text summarization by introducing a sentence similarity metric and human evaluation.
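To make the reported metric concrete, the following is a minimal sketch of how a ROUGE-L F-score can be computed from the longest common subsequence (LCS) of a generated headline and a reference headline. It is an illustration only: whitespace tokenization, lowercasing, the balanced F1 weighting, and the function names `lcs_length` and `rouge_l_f1` are assumptions made here, not the evaluation setup used in the paper.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Balanced ROUGE-L F-score in [0, 1] (assumed variant, see lead-in)."""
    cand = candidate.lower().split()  # assumption: whitespace tokenization
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

Scores such as 40.02 are conventionally reported on a 0 to 100 scale, so the value returned here would be multiplied by 100 for comparison. Because the measure only rewards token overlap in order, a fluent paraphrase with different wording scores low, which is the limitation the sentence similarity metric and human evaluation are meant to address.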
