udmsearch -- the Ruby interface to DataparkSearch. The site is mostly in Russian. A short intro on getting the package:
svn checkout http://svn.maxidoors.ru/udmsearch-ruby
Just DataparkSearch weblog
A new version of DataparkSearch, 4.44, has been released. Changes since the previous release are:
People have dreamed of flying like birds since the most ancient times… And at the beginning of the 20th century man did take off, though not quite as dreamed: inside “iron birds”, a.k.a. airplanes.
For around two centuries (maybe more), people have dreamed of a time machine. The end of the 20th century gave hope of realizing this dream as well, through virtual cyber-worlds.
At present, a fight between Google and Microsoft over a virtual model of the Earth is emerging. Yes, for now everything is focused on searching for information about objects on the surface at the current moment in time. But with a virtual Earth and indexed historical archives (yes, there is still a lot of digging to do here), it becomes possible to reconstruct in detail fragments of the Earth's surface in the past: for example famous battlefields, the propagation of the tsunami wave in Asia in December 2004, or the only voyage of the "Titanic". In principle it is possible to "rewind" the whole history of the Earth, back to the moment the Sun was born.
OFF-TOPIC: Using a model of the virtual Earth, it is also possible to make a "real" online Civilization game on the "real" Earth.
The «Andromeda», a new search engine result page layout:
This is a draft version. I need some feedback: is it worth perfecting?
Definitely, those aren't top query words; those are top reply words. Actually, those top lists were constructed in two stages. First, an automatic summary was created for every indexed page (usually this summary consists of the three most common sentences from the page). Second, for every word, the total number of summaries in which the word occurred was counted, and the list of all words was sorted in decreasing order of occurrences.
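The second stage can be sketched in a few lines of Python. This is only an illustration under assumptions: the summaries are taken as ready-made strings, the function name `top_summary_words` and the simple `\w+` tokenizer are hypothetical, and the real DataparkSearch code may tokenize and count differently.

```python
import re
from collections import Counter


def top_summary_words(summaries, limit=10):
    """Rank words by the number of summaries that contain them."""
    counts = Counter()
    for summary in summaries:
        # A set makes each word count once per summary, however often it repeats.
        words = set(re.findall(r"\w+", summary.lower()))
        counts.update(words)
    # Decreasing order of occurrences; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Made-up summaries, one per indexed page.
summaries = [
    "Search engine indexes pages.",
    "The engine ranks pages by popularity.",
    "Popularity depends on links.",
]
print(top_summary_words(summaries, limit=3))
```

Counting summaries rather than raw word occurrences means a word repeated many times on a single page does not dominate the list.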
Happy New Year to all the World!
Percentage of hits by users geographically located in Russia that came from the three major search engines in Russia in 2006:
Month | Yandex | Rambler | |
---|---|---|---|
January 2006 | 60.6+0.6+0.1=61.3 | 21.7 | 6.3+0.3=6.6 |
February 2006 | 61.5+0.8+0.0=62.3 | 20.9 | 6.3+0.3=6.6 |
March 2006 | 61.4+0.9+0.1=62.4 | 20.9 | 6.4+0.3=6.7 |
April 2006 | 60.3+0.9+0.1=61.3 | 21.6+0.0=21.6 | 6.6+0.3=6.9 |
May 2006 | 60.6+1.0+0.1=61.7 | 21.7+0.1=21.8 | 6.6+0.3=6.9 |
June 2006 | 60.4+1.0+0.1=61.5 | 21.2+0.1=21.3 | 7.1+0.3=7.4 |
July 2006 | 59.9+1.1+0.1=61.1 | 21.2+0.0=21.2 | 7.8+0.3=8.1 |
August 2006 | 60.2+1.0+0.1=61.3 | 20.8+0.1=20.9 | 7.8+0.3=8.1 |
September 2006 | 60.2+1.0+0.1=61.3 | 21.0+0.1=21.1 | 8.1+0.3=8.4 |
October 2006 | 60.6+1.0+0.1=61.7 | 20.3+0.1=20.4 | 8.3+0.3=8.6 |
November 2006 | 60.0+1.0+0.1=61.1 | 20.3+0.1=20.4 | 8.8+0.3=9.1 |
December 2006 | 59.5+0.6+0.1=60.2 | 20.3+0.1=20.4 | 9.4+0.4=9.8 |
The first addend is the share of hits from the main search, the second addend is the share of hits from image search, and the third addend is the share of hits from blog search.
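The cell notation can be checked mechanically. The sketch below splits a cell such as `60.6+0.6+0.1=61.3` into its addends and verifies the stated total; the helper name `check_cell` is hypothetical, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal


def check_cell(cell):
    """Return (addends, total, ok) for a cell like '60.6+0.6+0.1=61.3'."""
    if "=" not in cell:
        # A bare number such as '21.7' has one component and no sum to verify.
        value = Decimal(cell.strip())
        return [value], value, True
    left, right = cell.split("=")
    addends = [Decimal(part.strip()) for part in left.split("+")]
    total = Decimal(right.strip())
    return addends, total, sum(addends) == total


addends, total, ok = check_cell("60.6+0.6+0.1=61.3")
print(addends, total, ok)
```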
Following the fashion of giving forecasts for the next year, I would like to offer an assumption: if not in 2007 then soon after, there will be a new standard for a company site where, instead of the commonly used home page with a hierarchical menu, a Google-like interface appears. The home page will contain the company's “visiting card” plus a search box covering all the information on the site. By the way, Google has already released the Google Apps for Domain package to create your own “google.com”; it remains to integrate it with Google Appliance or Google Mini, and it is highly desirable to revive Google Answers in a local variant (a sort of Google Answers Mini), and the prototype of a next-generation CMS will be ready.
For a long time already, big companies have been able to offer a potential client much more information about their goods and services than can be conveniently arranged in hierarchical menus in such a way that the user understands the structure on the first visit. And so the "intuitively clear" interface of a search box will take over.
Yes, of course, a search box is already present as standard on almost every site, but it is conceived as a detached element in the upper right corner of the screen and is often there only nominally, because the search features of many CMSs are rather scanty. It will be the other way round: the search box will be the focus of attention and the largest object on the page, while the menu becomes an auxiliary tool that varies depending on what the user searches for, helping to navigate the search results more quickly or to refine queries more precisely in one or two clicks.
The color of a point on the NZhole map corresponds to the Popularity Rank value of a web page, and the map's points are ordered from left to right and from top to bottom in ascending order of hop count (the number of "mouse clicks" from a start page). Pages explicitly specified in the search engine config file receive a hops value of 0; pages submitted for indexing via a web form or picked up from an internet directory receive a hops value of 1. Every other page, when inserted into the search engine database, receives a hops value one greater than that of the page where the link to it was found. With this ordering, the smoothed map looks like this:
If we now order first by a page's number of inbound links, and then by hop count, the map looks like this:
Ordering by the number of outbound links, and then by hop count:
Ordering first by the difference between the number of inbound and the number of outbound links:
Ordering by the difference between the number of outbound and the number of inbound links:
On these maps one can notice that the popularity rating (Popularity Rank) is usually higher for pages whose number of inbound links is relatively high and also higher than the number of outbound links. Conversely, if a page has more outbound links than inbound ones, its popularity rating will usually be lower.
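The hop-count rule and the five orderings used for the maps above can be sketched as follows. This is a minimal illustration, not DataparkSearch code: the link graph is a made-up dictionary, the names `assign_hops`, `link_counts` and `order_pages` are hypothetical, the crawl is assumed to be breadth-first (so the first page found linking to a target is also the closest one), and every ordering is assumed to be ascending.

```python
from collections import deque


def assign_hops(links, configured, submitted):
    """Configured pages get 0, submitted pages get 1, and every other page
    gets the hops value of the page that linked to it, plus 1."""
    hops = {url: 0 for url in configured}
    for url in submitted:
        hops.setdefault(url, 1)
    # Start from the seed pages, lowest hops first (assumed breadth-first crawl).
    queue = deque(sorted(hops, key=hops.get))
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in hops:
                hops[target] = hops[page] + 1
                queue.append(target)
    return hops


def link_counts(links):
    """Count inbound and outbound links per page."""
    inbound, outbound = {}, {}
    for page, targets in links.items():
        outbound[page] = len(targets)
        for target in targets:
            inbound[target] = inbound.get(target, 0) + 1
    return inbound, outbound


def order_pages(hops, links, mode="hops"):
    """Return pages in map order for one of the five orderings."""
    inbound, outbound = link_counts(links)

    def n_in(page):
        return inbound.get(page, 0)

    def n_out(page):
        return outbound.get(page, 0)

    keys = {
        "hops": lambda p: (hops[p],),
        "inbound": lambda p: (n_in(p), hops[p]),
        "outbound": lambda p: (n_out(p), hops[p]),
        "in_minus_out": lambda p: (n_in(p) - n_out(p),),
        "out_minus_in": lambda p: (n_out(p) - n_in(p),),
    }
    # The page name is appended as a final tie-breaker for reproducible output.
    return sorted(hops, key=lambda p: keys[mode](p) + (p,))


# Made-up graph: "a" is listed in the config file, "s" was submitted via a form.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "s": ["c", "d"]}
hops = assign_hops(links, configured=["a"], submitted=["s"])
for mode in ("hops", "inbound", "outbound", "in_minus_out", "out_minus_in"):
    print(mode, order_pages(hops, links, mode))
```

Coloring each page in the resulting order by its Popularity Rank would then give one map per ordering.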
Thus, it looks like the PopRank used in DataparkSearch is more robust against link spam than Google's PageRank is.
This article in Russian: Немного когнитивности.