Please use this identifier to cite or link to this item: http://idr.nitk.ac.in/jspui/handle/123456789/15170
Title: Benchmarking semantic, centroid, and graph-based approaches for multi-document summarization
Authors: Agrawal A.
George R.A.
Ravi S.S.
Kamath S.S.
Issue Date: 2021
Citation: Advances in Intelligent Systems and Computing, Vol. 1177, p. 255-263
Abstract: Multi-document summarization (MDS) is an automated process for extracting information from multiple documents on similar topics. We employ three techniques for generating summaries from document collections on the same topic. The first approach calculates an importance score for each sentence using features including a TF-IDF matrix as well as semantic and syntactic similarity; the algorithm then sorts the sentences by importance and adds them to the summary. In the second approach, we use the k-means clustering algorithm to generate the summary. The third approach uses the PageRank algorithm, wherein edges of the graph are formed between sentences that are syntactically similar but not semantically similar. All three techniques were used to generate 100–200 word summaries for the DUC 2004 dataset. We use ROUGE scores to evaluate the system-generated summaries against the manually generated summaries. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd 2021.
URI: https://doi.org/10.1007/978-981-15-5679-1_24
http://idr.nitk.ac.in/jspui/handle/123456789/15170
Appears in Collections: 2. Conference Papers
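
Illustrative sketch (not taken from the paper): the abstract describes ranking sentences by a TF-IDF-based importance score and by a PageRank-style graph algorithm. The Python below is a minimal, dependency-free approximation of those two ideas only. Every function name (tokenize, tfidf_vectors, cosine, rank_by_tfidf, rank_by_pagerank, summarize) and every parameter value (threshold, damping, iterations, max_words) is a hypothetical choice made for this example. Cosine similarity over TF-IDF vectors stands in for the paper's semantic and syntactic similarity measures, which the abstract does not specify, and the k-means approach is not shown.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_vectors(sentences):
    """Return one {term: weight} dict per sentence (each sentence is a 'document')."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        # Smoothed IDF (an assumption of this sketch): log((1 + n) / (1 + df)) + 1
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_tfidf(sentences):
    """Rank sentence indices by the sum of their TF-IDF weights, best first."""
    scores = [sum(vec.values()) for vec in tfidf_vectors(sentences)]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


def rank_by_pagerank(sentences, threshold=0.1, damping=0.85, iterations=50):
    """Rank sentence indices by PageRank over a similarity graph, best first.

    An edge joins two sentences whose TF-IDF cosine similarity is at least
    `threshold`. This edge rule is a stand-in, not the paper's criterion.
    """
    n = len(sentences)
    if n == 0:
        return []
    vectors = tfidf_vectors(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sim = cosine(vectors[i], vectors[j])
                if sim >= threshold:
                    weights[i][j] = sim
    out_sum = [sum(row) for row in weights]
    ranks = [1.0 / n] * n
    for _ in range(iterations):
        # Sentences with no edges pass on no rank; acceptable for a sketch.
        ranks = [
            (1 - damping) / n + damping * sum(
                ranks[i] * weights[i][j] / out_sum[i]
                for i in range(n) if out_sum[i] > 0
            )
            for j in range(n)
        ]
    return sorted(range(n), key=lambda i: ranks[i], reverse=True)


def summarize(sentences, ranking, max_words=100):
    """Greedily take top-ranked sentences within a word budget, in source order."""
    chosen, used = [], 0
    for i in ranking:
        length = len(sentences[i].split())
        if used + length <= max_words:
            chosen.append(i)
            used += length
    return [sentences[i] for i in sorted(chosen)]
```

Under these assumptions, a call such as summarize(sentences, rank_by_pagerank(sentences), max_words=200) would produce a summary within the 100–200 word range mentioned in the abstract; evaluating it with ROUGE would require a separate tool and reference summaries.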
