Feature | Google Mini | DataparkSearch |
---|---|---|
License type | Commercial, no source code available | GPL, open source |
Number of documents indexed and pricing | | Up to several million, depending on the hardware used. Free software. |
File formats indexed | 220 different file formats, including HTML, PDF and Microsoft Office documents. | Plain text, HTML, XML, MP3, GIF, plus any other format via external parsers |
Languages | 28 languages | 25 language groups, can segment sentences in Chinese, Japanese, Korean and Thai. |
Accessing files via | HTTP, HTTPS, networked file systems. | HTTP, HTTPS, FTP, NNTP, HTTP Proxy, local file system, htdb:// scheme for SQL databases. |
Accessing content protected by | HTTP Basic, NTLM v1 and v2, LDAP | HTTP Basic |
Collections | Yes | Yes, each collection may be divided into subsections (tags and categories) |
Integrate search results into your site's look and feel | User's XSLT style sheet, results exported as XML | Own template language to produce result pages in any text-based format. |
Synonyms | Yes | Yes |
Display key attributes of search results | Meta tags | Meta tags, specified HTML attributes, specified XML tags, regex excerpts from text (collectively called sections) |
Filter results through meta tags | Yes | Yes, plus through any section or combination of sections. |
Assign different weights for meta tags/sections | No | Yes |
Integration with Google Desktop and Google Toolbar for Enterprise | Yes | No |
Excluding pages from the search index | Yes | Yes |
Spell-checker | Self-learning | Uses aspell |
Cached versions of documents | Yes | Yes |
Number Range Search | Yes | No |
Date Range Search | Yes | Yes |
Sort search results by | Relevance, Date | Relevance, Date, Popularity, Importance, and each of these in reverse order |
Reporting | | No reports. Each query can be tracked along with all search parameters for further processing. |
Automatic sitemap construction | Yes | No |
OneBox for Enterprise | Yes | No |
Customer support | Customer support site; email support; guaranteed replacement in case of any hardware failure | A forum on the project's site |
Addendum, 15 Mar. 2007 | | |
Automatic document summarization | No | Yes, the Summary Extraction Algorithm |
HTTP Content negotiation for specified languages | No | Yes |
Link analysis algorithm | No | Yes, the Neo PopRank and the Goo PopRank |
//Google Mini features, Google Mini Administrator features, DataparkSearch.