Changes since version 4.49:
- A word break has been added for French-style contractions.
A break is inserted between an apostrophe and a following vowel, according to UAX #29, rule WB5a.
- Big lists of Russian and English synonyms have been added.
- The MaxSiteLevel command now accepts a negative argument to group URLs on a per-subdirectory basis.
For example, with a value of -1, the URLs http://www.site.com/dir1/ and http://www.site.com/dir2/ are grouped as documents from different sites.
- The SkipUnreferred command has been extended to delete unreferred documents if necessary.
Use it to automatically remove "dead" documents from your database.
- Del log processing has been fixed in the splitter for the case when the cache log is empty.
- Some German letters are now automatically replaced by two-letter combinations in accent-free search mode:
ß -> ss, ä -> ae, ö -> oe, ü -> ue.
- SQLite3 support has been added. Use the --with-sqlite3 option for configure to enable it.
- Indexing has been fixed for documents with several versions in different languages. You need to run the "indexer -Erehashstored" command when upgrading.
- The HTML parser now understands the <!-- google_ad_section_start -->, <!-- google_ad_section_start(weight=ignore) --> and
<!-- google_ad_section_end --> comments as tags to include/exclude content for indexing.
- Relevance calculation has been improved for the case when acronyms and abbreviations are used.
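
As an illustration of the German letter replacement listed above, here is a minimal Python sketch of the mapping. It is not DataparkSearch code: the names GERMAN_FOLD and fold_german are hypothetical, and only the four lowercase letters named in the list are handled, which is an assumption about the scope of the feature.

```python
# Hypothetical helper, not part of DataparkSearch: folds the four German
# letters named in the changelog into their two-letter combinations.
GERMAN_FOLD = str.maketrans({"ß": "ss", "ä": "ae", "ö": "oe", "ü": "ue"})


def fold_german(text):
    """Replace ß, ä, ö, ü with ss, ae, oe, ue; leave all other characters as is."""
    return text.translate(GERMAN_FOLD)


if __name__ == "__main__":
    print(fold_german("straße"))  # strasse
    print(fold_german("müller"))  # mueller
```

Applying the same folding to both the indexed words and the query is what would let a search for "mueller" match "müller" in such a scheme.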
This snapshot can be downloaded from the DataparkSearch site.
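
The word break for French-style contractions mentioned in the list above can be pictured with a small Python sketch. It only illustrates the idea behind UAX #29 rule WB5a (break between an apostrophe and a following vowel) and is not the DataparkSearch implementation: the function name split_contraction is hypothetical, and the vowel set below is an assumption chosen for the example.

```python
# Illustrative sketch only; the names and the vowel set are assumptions.
APOSTROPHES = "'\u2019"             # U+0027 and U+2019
VOWELS = "aeiouyàâäéèêëîïôöùûü"     # assumed vowel set, compared in lowercase


def split_contraction(token):
    """Split a token wherever an apostrophe is directly followed by a vowel."""
    parts = []
    start = 0
    for i in range(1, len(token)):
        if token[i - 1] in APOSTROPHES and token[i].lower() in VOWELS:
            parts.append(token[start:i])  # the apostrophe stays with the left part
            start = i
    parts.append(token[start:])
    return parts


if __name__ == "__main__":
    print(split_contraction("l'objectif"))  # ["l'", 'objectif']
    print(split_contraction("don't"))       # ["don't"]
```

With this rule, "l'objectif" yields the separate word "objectif", while English forms such as "don't" stay whole because the apostrophe is followed by a consonant.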