
Dmitry Koterov, co-founder of the professional social network "Moi Krug" (similar to LinkedIn), has announced the launch of a new project called RuTvit.

According to Dmitry, RuTvit combines the best of FriendFeed and Twitter and adds improvements aimed at Russian users.

As examples of the shortcomings of existing services, Dmitry points to the absence of word-form search in Twitter, the lack of real-time post updates (implemented in FriendFeed but not in Twitter), the lack of Russian hashtags, and an unrepresentative picture of popular Russian users in Twitter.

Although RuTvit has launched as an alpha version, it already implements real-time search and can import from Twitter, FriendFeed and other sources (19 popular online services and social networks in total), which should ease migration to RuTvit.


The www.roem.ru site has a mirror at roem.ru (i.e. the two are identical by design). The vast majority of websites on the Internet do the same. I wrote a Sidewiki comment at www.roem.ru, but it didn't appear at roem.ru:
...continue reading "A little caveat in Google SideWiki"

In the latest snapshot of DataparkSearch Engine, a new conditional operator, <!IFREGEX, has been added for search result templates. With it you can both check the value of a meta-variable before output and alter it according to the specified regex pattern.

E.g., the auxiliary phone directory search on the All Sochi's Internet site translates phone numbers from the canonical form +78622xxxxxx into the widely used local form xx-xx-xx; other numbers are translated from the canonical form +7xxxyyyzzzz into the more readable +7-xxx-yyy-zz-zz, using the following code in the search template:


<!IFREGEX NAME="tel" CONTENT="\+78622([0-9][0-9])([0-9][0-9])([0-9][0-9])(.*)">$1-$2-$3$4
<!EREGEX NAME="tel" CONTENT="\+7([0-9][0-9][0-9])([0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])(.*)">+7-$1-$2-$3-$4$5
<!ELSE>$&(tel)<!ENDIF>
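
To make the branching easier to follow, here is a minimal Python sketch of the same three-way logic. It is not DataparkSearch code: the function name `format_tel` and the rule list are hypothetical, the `$N` references are written as `\N`, and it assumes that a matching branch outputs only the text built from the captured groups.

```python
import re

# Rules mirror the two regex branches of the template, tried in order.
# Replacement strings use Python's \N syntax instead of the template's $N.
RULES = [
    (re.compile(r"\+78622([0-9][0-9])([0-9][0-9])([0-9][0-9])(.*)"),
     r"\1-\2-\3\4"),
    (re.compile(r"\+7([0-9][0-9][0-9])([0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])(.*)"),
     r"+7-\1-\2-\3-\4\5"),
]


def format_tel(tel: str) -> str:
    """Return a display form of a canonical phone number."""
    for pattern, replacement in RULES:
        match = pattern.search(tel)
        if match:
            # Output only what the replacement builds from the groups.
            return match.expand(replacement)
    # Corresponds to the <!ELSE> branch: print the value unchanged.
    return tel


print(format_tel("+78622642424"))
print(format_tel("+79181234567"))
```

The order of the rules matters: the local Sochi prefix is tested first, so the more general second rule only sees numbers that the first one rejected.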

Support for the libextractor library has been added in the latest snapshot of DataparkSearch Engine.

Using this library, DataparkSearch can now index keywords from files of the following formats: PDF, PS, OLE2 (DOC, XLS, PPT), OpenOffice (sxw), StarOffice (sdw), DVI, MAN, FLAC, MP3 (ID3v1 and ID3v2), NSF(E) (NES music), SID (C64 music), OGG, WAV, EXIV2, JPEG, GIF, PNG, TIFF, DEB, RPM, TAR(.GZ), ZIP, ELF, S3M (Scream Tracker 3), XM (eXtended Module), IT (Impulse Tracker), FLV, REAL, RIFF (AVI), MPEG, QT and ASF.

Below, the relationship between libextractor keyword types and DataparkSearch section names is given:
...continue reading "dpsearch-4.53-14072009"

A new feature, regex-based automatic query expansion, has been added to the latest snapshot of DataparkSearch Engine. It is primarily useful for expanding search requests that contain phone numbers, since these are frequently written in different notations.

E.g. the phone number in canonical notation +78622642424 is found by the request 8622-64-24-24 Сочи.

At the moment, neither Google nor Yandex (the leading search site in Russia) provides such a feature in its search engine.

Regex-based patterns are specified using special comments in a file of acronyms and abbreviations. Such a comment starts with the character pair #*, followed by arguments and options that are the same as for the ReverseAlias command. A special option, "last", has also been added; it forces the pattern matching process to stop right after the rule is executed (this option has also been added to the Alias and ReverseAlias commands).

An example of regex patterns that implement phone number expansion:


#* regex last "(\+7|8)[- \.\(]*(862)[- \.\)]*([0-9])[- \.\)]*([0-9]{2})[- \.]*([0-9])[- \.]*([0-9])[- \.]*([0-9]{2})" "+7$2$3$4$5$6$7"
#* regex last "(\+7|8)[- \.\(]*(9[0-9]{2})[- \.\)]*([0-9])[- \.\)]*([0-9]{2})[- \.]*([0-9])[- \.]*([0-9])[- \.]*([0-9]{2})" "+7$2$3$4$5$6$7"
#* regex last "\(862[- \.\)]*([0-9]?)[- \.\)]*([0-9]{2,3})[- \.]*([0-9]{2,3})[- \.]*([0-9]{2,3})" "+7862$1$2$3$4$5"
#* regex last "([0-9]{2})[- \.]?([0-9]{2})[- \.]?([0-9]{2})" "+78622$1$2$3"
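
The effect of these rules can be tried outside the search engine with a short Python sketch. It makes several assumptions that do not come from DataparkSearch itself: the helper name `canonical_phone` is hypothetical, each rule is applied to an isolated phone-like token and must match it completely, and "last" is modelled as returning on the first rule that fires. How the engine actually tokenizes a query and applies the patterns may differ.

```python
import re
from typing import Optional

# (pattern, replacement) pairs transcribed from the "#* regex last" lines.
# "$N" is written as "\N" for Python. The third rule captures only four
# groups, so the trailing "$5" of its replacement is left out here.
RULES = [
    (r"(\+7|8)[- \.\(]*(862)[- \.\)]*([0-9])[- \.\)]*([0-9]{2})"
     r"[- \.]*([0-9])[- \.]*([0-9])[- \.]*([0-9]{2})",
     r"+7\2\3\4\5\6\7"),
    (r"(\+7|8)[- \.\(]*(9[0-9]{2})[- \.\)]*([0-9])[- \.\)]*([0-9]{2})"
     r"[- \.]*([0-9])[- \.]*([0-9])[- \.]*([0-9]{2})",
     r"+7\2\3\4\5\6\7"),
    (r"\(862[- \.\)]*([0-9]?)[- \.\)]*([0-9]{2,3})[- \.]*([0-9]{2,3})"
     r"[- \.]*([0-9]{2,3})",
     r"+7862\1\2\3\4"),
    (r"([0-9]{2})[- \.]?([0-9]{2})[- \.]?([0-9]{2})",
     r"+78622\1\2\3"),
]
COMPILED = [(re.compile(pattern), repl) for pattern, repl in RULES]


def canonical_phone(token: str) -> Optional[str]:
    """Return the canonical +7... form of a phone-like token, or None."""
    for pattern, replacement in COMPILED:
        match = pattern.fullmatch(token)
        if match:
            # Models "last": no further rules are tried after this one.
            return match.expand(replacement)
    return None


for sample in ["8 (862) 264-24-24", "+7 918 123-45-67", "64-24-24"]:
    print(sample, "->", canonical_phone(sample))
```

Requiring a full match is a deliberate simplification: with an unanchored search, the short six-digit rule could fire on the first six digits of a longer number, so the sketch leaves such input unmatched.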