Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization

Authors: Lei Xu, Mohammed Asad Karim, Saket Dingliwal, Aparna Elangovan

Abstract: Large language models (LLMs) can generate fluent summaries across domains
using prompting techniques, reducing the need to train models for summarization
applications. However, crafting effective prompts that guide LLMs to generate
summaries with the appropriate level of detail and writing style remains a
challenge. In this paper, we explore the use of salient information extracted
from the source document to enhance summarization prompts. We show that adding
keyphrases in prompts can improve ROUGE F1 and recall, making the generated
summaries more similar to the reference and more complete. The number of
keyphrases can control the precision-recall trade-off. Furthermore, our
analysis reveals that incorporating phrase-level salient information is
superior to using word- or sentence-level information. However, the impact on
hallucination is
not universally positive across LLMs. To conduct this analysis, we introduce
Keyphrase Signal Extractor (SigExt), a lightweight model that can be finetuned
to extract salient keyphrases. By using SigExt, we achieve consistent ROUGE
improvements across datasets and open-weight and proprietary LLMs without any
LLM customization. Our findings provide insights into leveraging salient
information in building prompt-based summarization systems.

Source: http://arxiv.org/abs/2410.02741v1
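The abstract describes steering prompt-based summarization by listing extracted keyphrases in the prompt, with the number of keyphrases acting as a precision-recall knob. Below is a minimal, hypothetical sketch of that idea: a helper that composes a summarization prompt from a document and a list of keyphrases. The template wording, the `build_prompt` function, and the toy keyphrases are illustrative assumptions, not the paper's actual SigExt extractor or prompt.

```python
# Minimal sketch: augmenting a summarization prompt with salient keyphrases.
# The template wording and the toy keyphrases below are illustrative only;
# they are not the exact prompt or extractor output used in the paper.

def build_prompt(document: str, keyphrases: list[str], max_keyphrases: int = 10) -> str:
    """Compose a summarization prompt that lists extracted keyphrases.

    Capping the number of keyphrases is one way to trade precision for
    recall: listing more phrases nudges the model to cover more content.
    """
    selected = keyphrases[:max_keyphrases]
    phrase_block = "\n".join(f"- {p}" for p in selected)
    return (
        "Summarize the following document. Make sure the summary covers "
        "these key phrases:\n"
        f"{phrase_block}\n\n"
        f"Document:\n{document}\n\n"
        "Summary:"
    )


if __name__ == "__main__":
    doc = "The city council approved a new transit budget on Tuesday ..."
    phrases = ["transit budget", "city council", "Tuesday vote"]
    print(build_prompt(doc, phrases, max_keyphrases=3))
```

The `max_keyphrases` parameter mirrors the abstract's observation that the number of keyphrases controls the precision-recall trade-off; in practice the phrases would come from a trained extractor such as SigExt rather than a hand-written list.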
