NAME

KinoSearch::Docs::Cookbook::CachedSearcher - Improve search-time responsiveness with a cached Searcher.

ABSTRACT

When a Searcher object is created, a small portion of the invindex is loaded into memory; additional caches are filled as relevant queries arrive. For small document collections on lightly-loaded servers, the time it takes to warm up the Searcher isn't worth worrying about. For large document collections or busy servers, though, the warmup time may become significant, in which case reusing the Searcher is likely to speed up your application.

FastCGI

A script running under standard CGI starts afresh for every request. In contrast, a script running on a FastCGI-enabled webserver using the CGI::Fast module from CPAN starts up once, on the first request, and then executes the body of a loop once per request.

Create your Searcher outside this loop, so that the object persists over multiple requests:

    my $searcher = KinoSearch::Searcher->new(
        invindex => MySchema->read('/path/to/invindex/') 
    );
    while ( my $cgi = CGI::Fast->new ) {
        my $hits = $searcher->search( query => $cgi->param('q') || '' );
        ...
    }

mod_perl

Under mod_perl, the Searcher can be stored in a module loaded by startup.pl.

    package CachedSearcher;

    my $searcher;

    sub obtain {
        $searcher ||= KinoSearch::Searcher->new(
            invindex => MySchema->read('/path/to/invindex/') 
        );
        return $searcher;
    }

    sub refresh {
        undef $searcher;
        return obtain();
    }

    # Load at startup rather than wait for first request.
    obtain();

Individual search processes call CachedSearcher->obtain rather than creating their own Searcher object. When the index is updated, a special HTTP request can trigger a call to CachedSearcher->refresh, which discards the cached Searcher and opens a fresh one.
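The obtain/refresh contract can be checked without an index. The following is a minimal sketch, not part of KinoSearch: it swaps in a hypothetical stand-in class, StubSearcher, for KinoSearch::Searcher so that it runs on its own, and the package name CachedSearcherDemo and the construction counter are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for KinoSearch::Searcher, so the sketch runs
# without an invindex. It only counts how often it is constructed.
package StubSearcher;

our $constructed = 0;

sub new {
    my ( $class, %args ) = @_;
    $constructed++;
    return bless {%args}, $class;
}

# Same caching pattern as the CachedSearcher module, using the stub.
package CachedSearcherDemo;

my $searcher;

sub obtain {
    $searcher ||= StubSearcher->new( invindex => 'placeholder' );
    return $searcher;
}

sub refresh {
    undef $searcher;      # drop the cached object
    return obtain();      # build and cache a replacement
}

package main;

my $first  = CachedSearcherDemo->obtain;
my $second = CachedSearcherDemo->obtain;     # reuses the cached object
my $third  = CachedSearcherDemo->refresh;    # discards it, builds a new one

printf "constructed %d searchers\n", $StubSearcher::constructed;
```

Keep in mind that a prefork mod_perl server typically gives each child process its own copy of $searcher, so a single refresh request only affects the child that happens to serve it.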

Benchmarks

Using Benchmark::Stopwatch to measure a lightly modified version of the sample search.cgi app, we get the following results for the query "congress" under standard CGI...

    NAME                        TIME        CUMULATIVE      PERCENTAGE
     load modules                0.121       0.121           73.754%
     init searcher               0.004       0.125           2.626%
     process search              0.032       0.158           19.735%
     fetch hits                  0.006       0.164           3.877%
     _stop_                      0.000       0.164           0.008%

... and these results under CGI::Fast:

    NAME                        TIME        CUMULATIVE      PERCENTAGE
     process search              0.002       0.002           24.213%
     fetch hits                  0.006       0.008           75.602%
     _stop_                      0.000       0.008           0.186% 

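It's clear from those numbers that for a simple term query, startup costs swamp the time it takes to execute the search and return results. Loading modules and initializing the Searcher account for about three quarters of the 0.164 seconds spent under standard CGI, and the first search itself is slower (0.032 versus 0.002 seconds), presumably because the Searcher's caches are still cold. Under CGI::Fast the whole request takes 0.008 seconds.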
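Benchmark::Stopwatch is not a core module, so the following sketch shows the same kind of per-phase measurement using Time::HiRes, which ships with Perl. It is an illustration only: the timed_phase helper and the phase bodies (a short sleep and a trivial loop) are hypothetical placeholders, not the instrumentation used to produce the tables above.

```perl
use strict;
use warnings;
use Time::HiRes qw( time sleep );

my @laps;    # [ phase name, elapsed seconds ] in execution order

# Hypothetical helper: run a code ref, record how long it took,
# and pass its return value through.
sub timed_phase {
    my ( $name, $code ) = @_;
    my $start  = time();
    my $result = $code->();
    push @laps, [ $name, time() - $start ];
    return $result;
}

# Placeholder phases. In a real script these would wrap the
# Searcher constructor and the search call.
timed_phase( 'init searcher', sub { sleep 0.01; return 1 } );
timed_phase(
    'process search',
    sub { my $n = 0; $n += $_ for 1 .. 1000; return $n }
);

my $total = 0;
$total += $_->[1] for @laps;

printf "%-20s %-10s %s\n", 'NAME', 'TIME', 'PERCENTAGE';
for my $lap (@laps) {
    my ( $name, $elapsed ) = @$lap;
    printf "%-20s %-10.3f %.3f%%\n", $name, $elapsed,
        100 * $elapsed / $total;
}
```

Timing each phase separately, rather than the request as a whole, is what makes it possible to see which costs disappear once the Searcher is reused.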

COPYRIGHT

Copyright 2008 Marvin Humphrey

LICENSE, DISCLAIMER, BUGS, etc.

See KinoSearch version 0.20.

Copyright © 2004-2008 Marvin Humphrey