KinoSearch::Docs::Cookbook::CachedSearcher - Improve search-time responsiveness with a cached Searcher.
When a Searcher object is created, a small portion of the invindex is loaded into memory; additional caches are filled as relevant queries arrive. For small document collections on lightly-loaded servers, the time it takes to warm up the Searcher isn't worth worrying about. For large document collections or busy servers, though, the warmup time may become significant, in which case reusing the Searcher is likely to speed up your application.
A script running under standard CGI runs once per request. In contrast, a script running on a FastCGI-enabled webserver using the CGI::Fast module from CPAN starts up on the first request, then executes a loop once per request.
Create your Searcher outside this loop, so that the object persists over multiple requests:
    my $searcher = KinoSearch::Searcher->new(
        invindex => MySchema->read('/path/to/invindex/')
    );
    while ( my $cgi = CGI::Fast->new ) {
        my $hits = $searcher->search( query => $cgi->param('q') || '' );
        ...
    }
Under mod_perl, the Searcher can be stored in a module loaded by startup.pl.
    package CachedSearcher;

    my $searcher;

    # Return the cached Searcher, creating it on first use.
    sub obtain {
        $searcher ||= KinoSearch::Searcher->new(
            invindex => MySchema->read('/path/to/invindex/')
        );
        return $searcher;
    }

    # Discard the cached Searcher and build a fresh one.
    sub refresh {
        undef $searcher;
        return obtain();
    }

    # Load at startup rather than wait for first request.
    obtain();
Individual search processes call CachedSearcher->obtain rather than create their own Searcher object. If an index gets updated, a special HTTP request can be made which triggers a call to CachedSearcher->refresh.
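The obtain/refresh behaviour can be tried out without an index. What follows is a minimal sketch, not the KinoSearch API: FakeSearcher is a hypothetical stand-in for an object that is expensive to construct, and CachedFake is a hypothetical package that mirrors the module above. It only illustrates that obtain hands back the same object until refresh is called.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an expensive-to-build Searcher.
package FakeSearcher;
my $built = 0;    # counts how many objects have been constructed
sub new {
    my $class = shift;
    $built++;
    return bless { id => $built }, $class;
}
sub times_built { return $built }

# Hypothetical cache module using the same lazy-initialization pattern.
package CachedFake;
my $searcher;
sub obtain  { $searcher ||= FakeSearcher->new; return $searcher }
sub refresh { undef $searcher; return obtain() }

package main;
my $first  = CachedFake->obtain;     # constructs the object
my $second = CachedFake->obtain;     # reuses the cached object
my $third  = CachedFake->refresh;    # discards it and constructs another

print "same object reused\n"       if $first == $second;
print "new object after refresh\n" if $third != $first;
print "objects built: ", FakeSearcher->times_built, "\n";
```

Because the cached object lives in a file-scoped lexical, each server process keeps its own copy, so a refresh only affects the process that handles the request.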
Using Benchmark::Stopwatch to measure a lightly-modified version of the sample search.cgi app, we get the following results for a query for "congress" under standard CGI...
    NAME              TIME     CUMULATIVE   PERCENTAGE
    load modules      0.121    0.121        73.754%
    init searcher     0.004    0.125         2.626%
    process search    0.032    0.158        19.735%
    fetch hits        0.006    0.164         3.877%
    _stop_            0.000    0.164         0.008%
... and these results under CGI::Fast:
    NAME              TIME     CUMULATIVE   PERCENTAGE
    process search    0.002    0.002        24.213%
    fetch hits        0.006    0.008        75.602%
    _stop_            0.000    0.008         0.186%
It's clear from those numbers that for a simple term query, startup costs (loading modules and warming up the Searcher) swamp the time it takes to execute the search and return results.
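Per-stage timings of this kind can be gathered with a simple lap timer. The sketch below uses the core Time::HiRes module rather than Benchmark::Stopwatch, so that it runs without installing anything from CPAN. The subroutines init_searcher and process_search are hypothetical placeholders that merely sleep; in a real script they would be replaced by the corresponding stages of the search application.

```perl
use strict;
use warnings;
use Time::HiRes qw( time sleep );

# Hypothetical placeholders standing in for real stages of a search script.
sub init_searcher  { sleep 0.02 }
sub process_search { sleep 0.01 }

my @laps;           # list of [ name, elapsed seconds ]
my $last = time;    # timestamp of the previous lap

# Record the time elapsed since the previous lap under the given name.
sub lap {
    my $name = shift;
    my $now  = time;
    push @laps, [ $name, $now - $last ];
    $last = $now;
}

init_searcher();
lap('init searcher');
process_search();
lap('process search');

# Report each lap with a running total and its share of the whole run.
my $total = 0;
$total += $_->[1] for @laps;
my $cumulative = 0;
printf "%-16s %8s %11s %11s\n", qw( NAME TIME CUMULATIVE PERCENTAGE );
for my $lap (@laps) {
    my ( $name, $elapsed ) = @$lap;
    $cumulative += $elapsed;
    printf "%-16s %8.3f %11.3f %10.3f%%\n",
        $name, $elapsed, $cumulative, 100 * $elapsed / $total;
}
```

Under a persistent environment, the timer would be started inside the request loop, so that one-time startup work is excluded from per-request figures.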
Copyright 2008 Marvin Humphrey
See KinoSearch version 0.20.