Changes from the previous snapshot are:
- The FastHrefCheck command has been added. Use it to skip checking hrefs against the server list during parsing.
- Support for the KOI8-C charset (an extension of KOI8-R with old-Russian letters) has been added.
- The ActionSQL command has been added. Use it to execute SQL queries with document-related data while indexing.
The "FastHrefCheck yes" command is useful for speeding up indexing when you have a huge list of Server/Realm/Subnet commands.
The syntax of the ActionSQL command is as follows:
ActionSQL <section> <pattern> <sql-template> [<dbaddr>]
where <section> is the name of the document section to check for a match against the regex pattern <pattern>. If a match is found, <sql-template> is filled in with the regex meta-variables $1-$9 as well as with search template meta-variables (for example $(Title), $(Last-Modified), etc.) to form an SQL query, which is executed on the first DBAddr defined in the indexer.conf file. If the optional <dbaddr> parameter of the ActionSQL command is set, a new connection is opened according to this DBAddr and the SQL query is executed over that connection.
Thus you can use ActionSQL commands to mine and collect data from pages while indexing. For example, the following command collects phone numbers (in Russian local notation) along with the titles of the pages where these phone numbers were found:
ActionSQL body "\(([0-9]{3})\)[ ]*([0-9]{3})[- \.]*([0-9]{2})[- \.]*([0-9]{2})" "INSERT INTO phonedata(phone,title)VALUES('+7$1$2$3$4','$(title)')"
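To make the substitution concrete, here is a minimal Python sketch of how the template in the command above could be filled in for a given page. It is an illustration only, not the indexer's actual code: the helper names fill_template and build_queries are hypothetical, and the sketch assumes one query per match, case-insensitive section names, and no quoting or escaping of values (none of which is stated above).

```python
import re

# Pattern and SQL template copied from the ActionSQL example above.
PATTERN = r"\(([0-9]{3})\)[ ]*([0-9]{3})[- \.]*([0-9]{2})[- \.]*([0-9]{2})"
TEMPLATE = "INSERT INTO phonedata(phone,title)VALUES('+7$1$2$3$4','$(title)')"


def fill_template(template, match, sections):
    """Hypothetical helper: substitute $1-$9 and $(name) in one pass."""
    # Assumption: section names are looked up case-insensitively.
    lowered = {name.lower(): value for name, value in sections.items()}

    def substitute(m):
        if m.group(1) is not None:
            # $(name): replace with the section value, or "" if unknown.
            return lowered.get(m.group(1).lower(), "")
        # $1-$9: replace with the capture group, or "" if it does not exist.
        index = int(m.group(2))
        if index > match.re.groups:
            return ""
        return match.group(index) or ""

    return re.sub(r"\$\(([^)]+)\)|\$([1-9])", substitute, template)


def build_queries(body, sections):
    """Return one SQL string per pattern match in the body text.

    Assumption: every match produces a query. Values are inserted as-is,
    so this sketch does no SQL escaping.
    """
    return [fill_template(TEMPLATE, m, sections) for m in re.finditer(PATTERN, body)]


if __name__ == "__main__":
    page_body = "Call us: (495) 123-45-67 today"
    for query in build_queries(page_body, {"title": "Contacts"}):
        print(query)
```

For a page titled "Contacts" whose body contains "(495) 123-45-67", the sketch produces an INSERT with the phone value '+74951234567'.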