
Changes in this snapshot since version 4.47:

  • A label parameter has been added to the DBAddr command.
  • The "Robots no" command has been fixed.
  • The -f switch can now be used to give indexer a list of files to index or reindex.
  • Several bugs have been fixed.

The label parameter. Format: label=DBAlabel. This parameter assigns a label to a DBAddr command. If you pass the label CGI variable to DataparkSearch, only the DBAddr commands marked with that label value are used to perform the search. Thus, a single searchd daemon can answer queries for several search databases, selectable by the label variable.

Note: If no label is passed as a CGI parameter, only DBAddr commands without a label are used to perform the search query.
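The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not DataparkSearch code: the DBAddr tuple, the select_dbaddrs function and the example URLs are hypothetical, and treating an empty label value the same as a missing one is an assumption.

```python
from typing import List, NamedTuple, Optional


class DBAddr(NamedTuple):
    """Hypothetical stand-in for one DBAddr command from the configuration."""
    url: str
    label: Optional[str] = None  # None means the DBAddr carries no label


def select_dbaddrs(dbaddrs: List[DBAddr], cgi_label: Optional[str] = None) -> List[DBAddr]:
    """Pick the DBAddr entries a query should use.

    With a label, only entries marked with that label are used.
    Without one, only unlabeled entries are used.
    Assumption: an empty string counts as "no label passed".
    """
    wanted = cgi_label or None
    return [d for d in dbaddrs if d.label == wanted]


# Example databases (made-up addresses).
dbs = [
    DBAddr("pgsql://localhost/news/", "news"),
    DBAddr("pgsql://localhost/main/"),
    DBAddr("pgsql://localhost/docs/", "docs"),
]
news_only = select_dbaddrs(dbs, "news")   # the "news" database only
unlabeled = select_dbaddrs(dbs)           # the unlabeled database only
```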

During indexing, DataparkSearch divides every document into sections. A section is any part of a document; for HTML documents, for example, this may be the TITLE or the META Description tag.

In addition to sections, some document factors are also taken into account for relevance calculation: the average distance between query words, the number of query word occurrences, the position of the first occurrence of a query word, and the difference between the distribution of query word counts and the uniform distribution.

During searching, DataparkSearch compares every document found against an "ideal" document. The "ideal" document has query words in every defined section and also has predefined values of the additional factors.

A full method of relevance calculation.

Let x be the weighted sum of all sections. The weights for these sections are defined by the wf parameter (see Section 8.1.3). Let y be the weighted sum of the differences between the values of the additional factors of the document found and the corresponding values of the "ideal" document. And let xy be the weighted sum of the sections where at least one query word has been found. Then the relevance of the document found is calculated as: 0.5 * ( x + xy ) / ( x + y ).
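As a rough illustration of the full method, the sketch below computes x, xy and y from small dictionaries and applies the formula. The function name, the section and factor names, and the use of absolute differences for y are assumptions made for the example; the real weights come from the wf parameter and the engine's internal factor settings.

```python
from typing import Dict, Set


def full_relevance(section_weights: Dict[str, float],
                   matched_sections: Set[str],
                   factor_diffs: Dict[str, float],
                   factor_weights: Dict[str, float]) -> float:
    """Relevance by the full method: 0.5 * (x + xy) / (x + y)."""
    # x: weighted sum of all defined sections.
    x = sum(section_weights.values())
    # xy: weighted sum of the sections where a query word was found.
    xy = sum(w for name, w in section_weights.items() if name in matched_sections)
    # y: weighted sum of the differences from the "ideal" document
    # (absolute differences are an assumption of this sketch).
    y = sum(factor_weights[name] * abs(diff) for name, diff in factor_diffs.items())
    if x + y == 0:
        return 0.0  # no sections defined and no penalties: nothing to score
    return 0.5 * (x + xy) / (x + y)


weights = {"title": 4.0, "body": 1.0, "meta": 2.0}  # hypothetical wf-style weights
partial = full_relevance(weights, {"title", "body"},
                         {"distance": 3.0}, {"distance": 1.0})
perfect = full_relevance(weights, set(weights),
                         {"distance": 0.0}, {"distance": 1.0})
```

With these numbers x = 7, xy = 5 and y = 3 for the partial match, so the score is 0.5 * 12 / 10. A document that matches every section and equals the "ideal" on all factors scores 1.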

A fast method of relevance calculation.

Let x be the number of bits used in the weighted values of all defined sections. Let y be the weighted sum of the differences between the additional factors of the document found and the corresponding values of the "ideal" document. And let xy be the number of bits in which the weighted section values of the "ideal" document differ from those of the document found. Then the relevance of the document is calculated as: ( x - xy ) / ( x + y ).
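The fast method can be sketched with integer bit masks. The sketch assumes that the weighted section values of the "ideal" document and of the document found are each packed into one integer, that x is the number of set bits in the "ideal" value, and that xy is the number of bit positions where the two values differ. This packing and the function name are illustrative assumptions, not the engine's actual data layout.

```python
def fast_relevance(ideal_bits: int, doc_bits: int, y: float) -> float:
    """Relevance by the fast method: (x - xy) / (x + y)."""
    # x: number of bits used by the "ideal" document (assumed: its set bits).
    x = bin(ideal_bits).count("1")
    # xy: number of bit positions where the document differs from the "ideal".
    xy = bin(ideal_bits ^ doc_bits).count("1")
    if x + y == 0:
        return 0.0  # nothing defined to compare against
    return (x - xy) / (x + y)


# The document misses one of four "ideal" bits and has a factor penalty of 1.
score = fast_relevance(0b1111, 0b1011, 1.0)
# An exact match with no factor penalty.
exact = fast_relevance(0b1111, 0b1111, 0.0)
```

Counting differing bits with XOR avoids any per-section arithmetic, which is presumably why this variant is the faster one.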

//PS: This is a DataparkSearch documentation update; it will appear in the next release.

We are not building yet-another-search-engine, we are putting our
efforts into making building ANY search engine easier, better tools,
better methods, more shared systems, etc. This isn't one project,
it's tens or even hundreds of them, and likely to take years.

//Jeremie Miller, Wikia Search mailing list

DataparkSearch 4.47 has been released. Changes since the previous version:

  • Tags and categories are now stored in the urlinfo table and can be set on a per-document basis.
  • Navigation through result pages has been fixed for search results caching.
  • Support for crosswords has been implemented for cache dbmode.
  • A possible trap has been fixed in indexing via NNTP.
  • Automatic phrase search has been implemented for compound words having dots, commas, dashes, underscores and slashes (.,-_/\) as delimiters between word parts.
  • Reconnection to MySQL has been improved for the case of an unexpectedly lost connection.
  • The full method of relevance calculation has been modified.
  • Conditional operators can now be used in the variables section of a template.
  • Storing documents in the stored database has been fixed for non-default values of StoredFiles.
  • Word form construction has been improved for words not found in ispell dictionaries.
  • mod_dpsearch now supplies BrowserCharset in server reply headers.
  • The -f switch has been added to cached, searchd and stored; use it to run them in the foreground (don't daemonize).
  • Several bugs have been fixed.
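The automatic phrase search for compound words mentioned in the list above can be pictured with a small sketch. It splits a word on the listed delimiters (.,-_/\) and, if more than one part results, turns the parts into a quoted phrase. The function name is hypothetical, and rendering the phrase in double quotes is an assumption of the example, not a description of how the engine rewrites queries internally.

```python
import re

# Delimiters named in the release notes: dot, comma, dash, underscore, slashes.
_DELIMS = re.compile(r"[.,\-_/\\]+")


def compound_to_phrase(word: str) -> str:
    """Turn a compound word into a quoted phrase of its parts."""
    parts = [p for p in _DELIMS.split(word) if p]
    if len(parts) < 2:
        return word  # not a compound word: leave it unchanged
    return '"' + " ".join(parts) + '"'


phrase = compound_to_phrase("e-mail")
plain = compound_to_phrase("search")
```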

On July 4th (July 5th Moscow time), Sochi was elected as the host city of the XXII Winter Olympic Games in 2014.

Other photos of Sochi can be seen here: photos.sochi.org.ru/view-sochi. Photos of Krasnaya Polyana (a district of Sochi), where most Olympic venues will be located, are here: KrPol-Sep2004, On July 3rd, 2005, On July 2nd, 2005, Winter 2005.

Addendum: in the mountains near Sochi, more photos in the mountains near Sochi.