It is well known that Google makes most of its money from contextual ads, and that the larger part of this income comes from contextual ads on SERPs. The mechanism is simple: ads are selected by the search keywords the user entered. Does anyone know of a system that selects ads based on the content of the pages whose links are shown on the SERP? I guess this might be the next step in contextual advertising: pages linked from a SERP (usually) carry much more information than the query, which allows ads to be selected more precisely and to vary more from one SERP to another. And with search result clustering techniques, it is possible to cluster contextual ads too.
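To make the idea a bit more concrete, here is a minimal sketch of such a selection. It is not how any existing ad system works: the function names, the ad inventory format and the plain keyword-count scoring are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def tokens(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_ads(result_pages, ads, top_n=2):
    """Rank ads by how often their keywords occur in the linked pages.

    result_pages: texts of the pages linked from one SERP (hypothetical input)
    ads: mapping of ad id -> list of keywords (hypothetical inventory format)
    """
    page_terms = Counter()
    for text in result_pages:
        page_terms.update(tokens(text))

    scored = []
    for ad_id, keywords in ads.items():
        score = sum(page_terms[k] for k in keywords)
        scored.append((score, ad_id))

    # Highest score first; ties are broken by ad id for a stable order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [ad_id for score, ad_id in scored[:top_n] if score > 0]


pages = ["Cheap flights to Sochi and hotels", "Sochi hotels near the sea"]
ads = {"hotel-ad": ["hotels", "hotel"], "car-ad": ["cars"], "flight-ad": ["flights"]}
print(select_ads(pages, ads))
```

A real system would probably weight terms (for example by TF-IDF) and could score ads per cluster of results rather than over the whole SERP.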
Author: Maxime
DataparkSearch 4.42
DataparkSearch 4.42 has been released. Changes since the previous release:
- Some modifications have been made to improve speed.
- XML parser has been improved.
- CRC32 has been totally replaced by Hash32. Collisions in clone detection are possible after upgrading.
- cache:// dbtype has been fixed for searchd.
- A minor bug in content decompression has been fixed.
- Indexer can now gather geopositions specified in special meta tags. See blog entry
- &empty= CGI parameter has been added. Use it to disable showing results based on search limits alone when no query words are entered.
- UseDateHeader command has been added. Use it to take the value of the Date: HTTP header as the document date if no Last-Modified header is specified.
- Asynchronous SQL command processing has been added for PgSQL.
- Clone detection has been modified for better performance.
- A possible trap in excerpt construction has been fixed.
- -z switch for indexer has been added. Use it to limit indexer to documents with a hops value no greater than the one specified.
- Several bugs (#175, #176) have been fixed.
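The fallback described for UseDateHeader can be illustrated roughly as follows. This is only a sketch of the described behaviour in Python, not DataparkSearch code; the function name and the `use_date_header` flag are hypothetical.

```python
from email.utils import parsedate_to_datetime


def document_date(headers, use_date_header=True):
    """Pick a document date from HTTP response headers.

    Prefers Last-Modified; falls back to Date only when the
    (hypothetical) use_date_header flag is enabled.
    """
    value = headers.get("Last-Modified")
    if value is None and use_date_header:
        value = headers.get("Date")
    return parsedate_to_datetime(value) if value else None


print(document_date({"Date": "Sun, 27 Aug 2006 10:00:00 GMT"}))
print(document_date({"Date": "Sun, 27 Aug 2006 10:00:00 GMT"}, use_date_header=False))
```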
dpsearch-4.42-27082006
XML parser has been improved. Now it accepts tag attributes in square brackets []. Also, a bug in processing the <![CDATA[ ... ]]> construction has been fixed.
More geopositions: two methods of specifying a geoposition in XML data are now supported.
- <icbm:latitude>xx.xxxx</icbm:latitude>
  <icbm:longitude>xx.xxxx</icbm:longitude>
- <geo:lat>xx.xxxx</geo:lat>
  <geo:lon>xx.xxxx</geo:lon>
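For illustration, here is a sketch of reading both tag pairs from an XML document in Python. It does not reflect how the DataparkSearch indexer parses them; the function name is hypothetical, and the namespace URIs bound to the `icbm` and `geo` prefixes are assumptions (the indexer may well match on the prefixed tag names alone).

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs for the two prefixes; adjust to match the actual feed.
NS = {
    "icbm": "http://postneo.com/icbm",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}

TAG_PAIRS = (
    ("icbm:latitude", "icbm:longitude"),
    ("geo:lat", "geo:lon"),
)


def extract_geoposition(xml_text):
    """Return (latitude, longitude) from the first tag pair found, else None."""
    root = ET.fromstring(xml_text)
    for lat_tag, lon_tag in TAG_PAIRS:
        lat = root.find(f".//{lat_tag}", NS)
        lon = root.find(f".//{lon_tag}", NS)
        if lat is not None and lon is not None:
            return float(lat.text.strip()), float(lon.text.strip())
    return None


sample = (
    '<item xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">'
    "<geo:lat>43.5855</geo:lat><geo:lon>39.7231 </geo:lon>"
    "</item>"
)
print(extract_geoposition(sample))
```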
Russian Answers
Today Mail.ru, a leading portal in Russia, launched a social search service, Ответы@Mail.Ru [Answers@Mail.ru]. It is a Q&A system in which users answer questions posted by other users. It looks more like a kind of forum, a well-known thing from the Old Internet, than a usual search engine. In its press release (see link below), Mail.ru emphasizes human beings giving the answers, as opposed to the machines and sophisticated algorithms used in conventional search engines. A special rating system (roughly: -1 for a question, +1 for an answer, and an additional +1 for the best answer as judged by the person who asked) will allow expert status to be assigned to active users who give correct and useful answers in some area of knowledge.
This is the first service of its kind running in Russia.
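As a quick worked example of the rough scoring rule above (the function and its arguments are hypothetical; the real service may count points differently):

```python
def rating(questions, answers, best_answers):
    """Rough user rating: -1 per question asked, +1 per answer given,
    and an extra +1 for each answer chosen as best."""
    return -questions + answers + best_answers


# A user who asked 2 questions and gave 5 answers, 3 of them chosen as best.
print(rating(questions=2, answers=5, best_answers=3))
```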
Mail.ru (in Russian)
dpsearch-4.42-21082006
In this snapshot, the CRC32 functions were totally replaced by Hash32 functions derived from "Hash Functions for Hash Table Lookup" by Robert J. Jenkins Jr. Apparently, Hash32 is faster than CRC32 and has slightly fewer collisions on the Russian, English and French ispell dictionaries.
Due to this change, some collisions in clone detection are possible after upgrading from previous versions of DataparkSearch, until all data has been fully reindexed with the new algorithm.
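A collision comparison of this kind can be sketched as below. Note the assumptions: the hash shown is Jenkins' simple one-at-a-time function, used here only as a stand-in, not the actual Hash32 code from DataparkSearch; `count_collisions` and the tiny word list are illustrative, and a real test would load an ispell dictionary instead.

```python
import zlib


def one_at_a_time(data: bytes) -> int:
    """Bob Jenkins' one-at-a-time 32-bit hash (stand-in, not Hash32 itself)."""
    h = 0
    for b in data:
        h = (h + b) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def count_collisions(words, hash_func):
    """Count distinct words whose hash value was already taken by another word."""
    seen = set()
    collisions = 0
    for word in set(words):
        h = hash_func(word.encode("utf-8"))
        if h in seen:
            collisions += 1
        else:
            seen.add(h)
    return collisions


words = ["search", "engine", "поиск", "moteur", "recherche"]  # placeholder word list
print("crc32:", count_collisions(words, zlib.crc32))
print("one-at-a-time:", count_collisions(words, one_at_a_time))
```

With a word list this small neither function should be expected to collide; differences only become visible on dictionaries with hundreds of thousands of entries.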
Books 2.0
It occurred to me that it is now technically possible to build a device that revives an old student pastime: one person starts writing a novel or story, another continues it, then a third, and so on. A collective work, so to speak, with rather non-trivial turns not only in the plot but also in the genre of the composition.
So, a present-day notepad PC with something like built-in Wi-Fi allows electronic texts to be read anywhere, and while wandering around the city it is possible "to gather" the "starters" of new novels and stories from similar devices you come across, selecting them by tags (of course 🙂) or by any other method. Yes, the first PDA models could already exchange electronic business cards on the fly, but today much more data can be transmitted between devices in less time.
Also, the developing micropayment systems will make it possible to collect payments from readers for a continuation, or to pay to read someone else's continuations.
And a second variant: a subscription to the novels of any writer who agrees to work in serial mode, where a freshly written chapter is delivered to subscribers' devices right away via the Internet. It will be relatively easy to distribute works this way as free-to-use metropolitan Wi-Fi networks roll out.
Yes, this also looks like paid reading/writing on LiveJournal...
Russian Scholar
Russian Scholar.ru, a service similar to Google Scholar, has opened. It aims to collect and index digital papers and publications in Russian. It is a non-commercial project and is looking for volunteers to help maintain and develop it.
Cplogs’ offensive
Cplog stands for «copypaster's blog».
Yesterday I discovered two "blogs" in which all posts were compiled from other blogs, including my own blog in Russian (notes.sochi.org.ru). Yes, this is a well-known trick for splogs (spammers' blogs), but cplogs aim to collect revenue from contextual ad placement.
This is the dark side of contextual ad services with revenue sharing (e.g. AdSense, YPN, Begun.ru). Obviously, those services should strictly pre-moderate every site in their system and/or require a certain share of unique content on a site before showing commercial ads on it. Otherwise we will have a tsunami of duplicated/copypasted/hijacked content in the blogosphere: for every blog that is actually written there will be dozens or hundreds of copypasting blogs...
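One way such a unique-content check could work is word shingling: measure what fraction of a candidate post's word sequences also appear in an earlier post. This is only a sketch under my own assumptions (function names, shingle size, whitespace tokenization); I do not know how any ad network actually checks for copies.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def copied_fraction(candidate, original, k=3):
    """Fraction of the candidate's shingles that also occur in the original."""
    cand = shingles(candidate, k)
    return len(cand & shingles(original, k)) / len(cand)


original = "the quick brown fox jumps over the lazy dog"
print(copied_fraction(original, original))                      # full copy
print(copied_fraction("the quick brown fox sleeps", original))  # partial copy
```

A site could then be held back from showing ads when most of its posts score above some chosen threshold against earlier-indexed posts.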
Microsoft Research @ SIGIR 2006
The Microsoft Research SIGIR Conference Papers page has been updated with the 2006 papers.
I intend to read the following two:
- Using Web-Graph Distance for Relevance Feedback in Web Search, Sergei Vassilvitskii (Stanford University), Eric Brill.
- High Accuracy Retrieval with Multiple Nested Ranker, Irina Matveeva (University of Chicago), Chris Burges, Timo Burkard, Andy Laucius, Leon Wong.