
Summary Extraction Algorithm (SEA)

The Summary Extraction Algorithm (SEA) was added in version 4.35 of DataparkSearch (December 2005). This algorithm for automatic summary construction is based on ideas of Rada Mihalcea described in the paper: Rada Mihalcea and Paul Tarau, "An Algorithm for Language Independent Single and Multiple Document Summarization", in Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP), Korea, October 2005.
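To give a feel for the kind of approach the cited paper takes, here is a minimal Python sketch of graph-based sentence ranking: sentences are nodes, edges are weighted by word overlap, and a PageRank-style iteration scores each sentence. This is only an illustration of the general idea under simplifying assumptions; it is not DataparkSearch's implementation. The function names, the similarity measure, the damping factor and the iteration count are all hypothetical choices made for this example.

```python
import math
import re


def words(sentence):
    # Lowercased set of word tokens; a deliberately naive tokenizer (assumption).
    return set(re.findall(r"\w+", sentence.lower()))


def similarity(a, b):
    # Word overlap normalized by the log of each sentence's vocabulary size.
    wa, wb = words(a), words(b)
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) == 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def rank_sentences(sentences, damping=0.85, iterations=30):
    # Build a weighted, undirected sentence graph and iterate PageRank-style.
    n = len(sentences)
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total edge weight leaving each node
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    # Keep the k best-scored sentences, returned in their original order.
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In this sketch, a sentence that shares no words with the rest of the page receives only the baseline score and is therefore the least likely to be selected.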

Differences in DataparkSearch's SEA:

To enable SEA in DataparkSearch, you only need to define a section in your sections.conf file:

Section sea 29 1024

After indexing a document collection with this section defined, you can use the $(sea) meta-variable in your template to show a summary for each search result.

Limitations of the current implementation: a page must have four or more sentences longer than 32 characters; only the first 64 sentences of a page (if available) are used to construct the summary.
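The limits above can be pictured as a simple pre-filter. The Python sketch below is an illustration only, not DataparkSearch code: the function name, the regex-based sentence splitter, and the order in which the two limits are applied (the 64-sentence cap first, then the length check) are assumptions made for this example.

```python
import re

MIN_SENTENCE_CHARS = 32  # a sentence must be longer than this
MIN_SENTENCES = 4        # a page needs at least this many such sentences
MAX_SENTENCES = 64       # only the first 64 sentences are considered


def candidate_sentences(text):
    # Naive splitter on ., ! and ? followed by whitespace (assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Assumed order: cap at the first 64 sentences, then drop short ones.
    first = sentences[:MAX_SENTENCES]
    long_enough = [s for s in first if len(s) > MIN_SENTENCE_CHARS]
    # Too few usable sentences: no summary would be produced for this page.
    return long_enough if len(long_enough) >= MIN_SENTENCES else []
```

Under these assumptions, a page with only two or three long sentences yields an empty candidate list, so no summary would be available for it.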