
Changes since version 4.49:

  • A word break has been added for French-style contractions.

    A break between an apostrophe and a following vowel is now added, according to UAX #29, rule WB5a.

  • Big lists of Russian and English synonyms have been added.
  • The MaxSiteLevel command now accepts a negative argument to group URLs on a subdirectory basis.

    For example, with a value of -1, the URLs http://www.site.com/dir1/ and http://www.site.com/dir2/ are grouped as documents from different sites.

  • The SkipUnreferred command has been extended to delete unreferred documents if necessary.

    Use it to automatically remove "dead" documents from your database.

  • Del log processing has been fixed in the splitter for the case when the cache log is empty.
  • Some German letters are now automatically replaced by two-letter combinations in accent-free search mode:
    ß -> ss, ä -> ae, ö -> oe, ü -> ue.
  • SQLite3 support has been added. Use the --with-sqlite3 configure option to enable it.
  • Indexing has been fixed for documents with several versions in different languages. You need to execute the "indexer -Erehashstored" command when upgrading.
  • The HTML parser now understands the <!-- google_ad_section_start -->, <!-- google_ad_section_start(weight=ignore) --> and
    <!-- google_ad_section_end --> comments as tags to include or exclude content for indexing.
  • Relevance calculation has been improved for the case when acronyms and abbreviations are used.
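
The German letter replacement listed above can be illustrated with a minimal Python sketch. This is not DataparkSearch's own code; the function name fold_german is hypothetical, and only the four lowercase letters named in the changelog entry are handled.

```python
# Illustrative sketch only; not DataparkSearch's actual implementation.
# The mapping is copied from the changelog entry above.
GERMAN_FOLD = str.maketrans({
    "ß": "ss",
    "ä": "ae",
    "ö": "oe",
    "ü": "ue",
})


def fold_german(text: str) -> str:
    """Replace the listed German letters with their two-letter forms."""
    return text.translate(GERMAN_FOLD)


if __name__ == "__main__":
    print(fold_german("größe"))   # groesse
    print(fold_german("müller"))  # mueller
```

If the same folding is applied to both the indexed text and the query, a search for "mueller" would also match "müller". Uppercase letters are left out here because the changelog does not mention them.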

This snapshot can be downloaded from the DataparkSearch site.

Last spring I studied English at the EF School in Dublin, Ireland. It was something like a mind correction experience and a wide white stripe in my life, although it flew by in a single instant. Many thanks to all the EF staff, and especially to Eoin Ryan: you're the best teacher indeed.

By chance I found a photo of Dublin (a view of Christ Church Cathedral from O'Donovan Rossa Bridge) taken in the mid-1950s, as I was told. Some days later I took a photo from about the same spot. So you can see how Dublin has changed in half a century:

A new version of DataparkSearch, 4.49, has been released. Changes since the previous release are:

  • String tokenization has been improved. For example, "c--" and "c#" are now considered words.
  • A subdocument indexing technique has been implemented.
  • LongestTextItems command has been added. Use it to specify the number of longest text items to index.
  • Support has been added for the georgian-academy and georgian-ps charsets.
  • URL data preloading has been fixed for multi-DBAddr configurations.
  • The HTML parser now skips indexing within tags whose style attribute sets visibility to none or hidden.
  • Subnet command has been fixed.
  • A $*(x) type of template meta-variable has been added. Use it to HTML-escape a value without search word highlighting.
  • $(np) and $(p) have been fixed in the "resbot" and "bottom" sections of the search template.
  • The PagesInGroup command has been added. Use it to specify the number of additional pages from the same site when Google-like grouping is enabled.
  • ServerWeight command has been fixed.
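
To show what the tokenization change above means in practice, here is a rough Python sketch. It is only an assumption about the behaviour, not the tokenizer DataparkSearch uses: the function tokenize and its regular expression are hypothetical, and they keep a trailing "--" or "#" attached to a word only when no further word character follows.

```python
import re

# Hypothetical pattern, not DataparkSearch's real tokenizer: a run of word
# characters, optionally followed by "--" or "#", provided that suffix is
# not immediately followed by another word character.
TOKEN_RE = re.compile(r"\w+(?:(?:--|#)(?!\w))?")


def tokenize(text: str) -> list[str]:
    """Split text into words, keeping names such as "c--" and "c#" whole."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("I use c-- and c# daily"))
    # ['I', 'use', 'c--', 'and', 'c#', 'daily']
```

The lookahead keeps strings such as "well--known" from producing a token "well--"; whether the real tokenizer makes the same choice is not stated in the changelog.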

Changes since version 4.48:

  • A subdocument indexing technique has been implemented.
  • LongestTextItems command has been added. Use it to specify the number of longest text items to index.
  • Support has been added for the georgian-academy and georgian-ps charsets.
  • URL data preloading has been fixed for multi-DBAddr configurations.
  • The HTML parser now skips indexing within tags whose style attribute sets visibility to none or hidden.
  • Subnet command has been fixed.
  • A $*(x) type of template meta-variable has been added. Use it to HTML-escape a value without search word highlighting.
  • $(np) and $(p) have been fixed in the "resbot" and "bottom" sections of the search template.
  • The PagesInGroup command has been added. Use it to specify the number of additional pages from the same site when Google-like grouping is enabled.
  • ServerWeight command has been fixed.
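
The idea of skipping hidden markup during indexing, mentioned in the list above, can be sketched with Python's standard html.parser module. This is a simplified illustration under stated assumptions, not the DataparkSearch parser: the class VisibleTextExtractor and the helper visible_text are hypothetical names, only inline style attributes are inspected, and the input is assumed to be reasonably well nested.

```python
import re
from html.parser import HTMLParser

# Assumption: hidden content is marked inline, e.g. style="visibility: hidden".
HIDDEN_RE = re.compile(r"visibility\s*:\s*(none|hidden)", re.IGNORECASE)

# Void elements never get an end tag, so they are not tracked on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Collect text, ignoring anything nested inside a hidden element."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one bool per open element: is it hidden?
        self.chunks = []  # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = self.stack[-1] if self.stack else False
        # An element is hidden if its parent is, or if its own style says so.
        self.stack.append(parent_hidden or bool(HIDDEN_RE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        hidden = self.stack[-1] if self.stack else False
        if not hidden and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text of an HTML fragment that would remain indexable."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    page = ('<p>shown</p><div style="visibility: hidden">secret '
            '<b>nested</b></div><p>also shown</p>')
    print(visible_text(page))  # shown also shown
```

Hidden state is inherited through the stack, so text in a child of a hidden element is skipped as well. Stylesheets and unbalanced markup are outside the scope of this sketch.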

Most importantly, a memory leak has been fixed in searchd that occurred when the configuration was reloaded with URLData preloaded. However, this affects only previous snapshots of version 4.49.