Starting from version 3.2.2,
mnoGoSearch
is able to store compressed copies of the indexed
documents, so-called cached copies.
Cached copies are stored in the same
SQL database.
search.cgi
uses cached copies for two purposes:
To display smart excerpts from every
found document with the search query
words in their context.
To display the entire original copy of the document,
with the search words highlighted.
Note:
A cached copy is opened in the browser when
the user clicks on the
Display cached copy link
near every document in search results.
Viewing a cached copy can be especially useful
when the original site is temporarily down
or the document does not exist any longer.
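The smart excerpts mentioned above can be illustrated with a minimal Python sketch. It is not the algorithm search.cgi uses; the function name make_excerpt and the context parameter are hypothetical, and the sketch works on plain text only.

```python
import re

def make_excerpt(text, words, context=30):
    """Return a snippet around the first query word found, or None.

    Hypothetical helper for illustration; not part of mnoGoSearch.
    """
    if not words:
        return None
    # Match any of the query words as a whole word, ignoring case.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Keep up to `context` characters on each side of the match.
    start = max(0, match.start() - context)
    end = min(len(text), match.end() + context)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix
```

A real excerpt generator would also need to strip markup and may pick several fragments per document; this sketch only shows the idea of cutting text around a query word.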
Cached copies are displayed with the help of
search.cgi
executed with a special HTTP
query string parameter.
search.cgi
fetches a cached copy of the document
from the SQL database, decompresses it,
and displays the document in your web
browser, with search keywords highlighted.
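The store, fetch, and decompress cycle described above can be sketched in Python with the standard zlib and sqlite3 modules. This is only an illustration under assumptions: the table name cached_copy, its columns, and both function names are hypothetical and do not reflect the actual mnoGoSearch database schema or storage format.

```python
import sqlite3
import zlib

def store_cached_copy(conn, url, content):
    """Compress the document body and store it under its URL."""
    compressed = zlib.compress(content.encode("utf-8"))
    conn.execute(
        "INSERT OR REPLACE INTO cached_copy (url, body) VALUES (?, ?)",
        (url, compressed),
    )

def fetch_cached_copy(conn, url):
    """Return the decompressed document body, or None if nothing is stored."""
    row = conn.execute(
        "SELECT body FROM cached_copy WHERE url = ?", (url,)
    ).fetchone()
    if row is None:
        return None
    return zlib.decompress(row[0]).decode("utf-8")

# Hypothetical table layout, created in an in-memory database for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cached_copy (url TEXT PRIMARY KEY, body BLOB)")

original = "<html><body>Hello, cached world</body></html>"
store_cached_copy(conn, "http://example.com/", original)
restored = fetch_cached_copy(conn, "http://example.com/")
```

After the round trip, restored holds the same text that was stored, which is the property the cached copy display relies on.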
To enable cached copies support, compile
mnoGoSearch with zlib support:
./configure --with-zlib <other arguments>
Collecting cached copies is enabled in the default version
of indexer.conf using this line:
Section CachedCopy 0 64000
The number 64000 is the maximum
allowed cached copy size.
When crawling, indexer stores
a cached copy only if its compressed size is smaller
than the maximum allowed size. You can change
this number according to your needs and your
SQL database capabilities.
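The size rule can be sketched as follows. This is a minimal Python illustration, assuming zlib compression as in the configure option above; the function name compress_if_small_enough is hypothetical, and the limit is taken from the Section CachedCopy line.

```python
import zlib

# Maximum allowed cached copy size, as in "Section CachedCopy 0 64000".
MAX_CACHED_COPY_SIZE = 64000

def compress_if_small_enough(content, max_size=MAX_CACHED_COPY_SIZE):
    """Return the compressed copy, or None if it would exceed the limit.

    Hypothetical helper: the copy is kept only when its compressed size
    is smaller than max_size, mirroring the rule described in the text.
    """
    compressed = zlib.compress(content)
    if len(compressed) < max_size:
        return compressed
    return None
```

Note that the check applies to the compressed size, so a large but highly repetitive document may still be stored while a smaller, poorly compressible one is skipped.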
Note:
Storing very large cached copies can affect
search performance negatively.
You can disable collecting cached copies:
open indexer.conf
in your favorite text editor and delete
the Section CachedCopy line.
Disabling cached copies saves disk space,
but search results will not be presented
as well as with cached copies enabled.
Displaying cached copies is enabled
in the default search result template search.htm-dist.
To check if your template enables displaying
cached copies, open the template in a text
editor and make sure that you have this
HTML code in the section
<!--res-->:
<A HREF="$(stored_href)">Display cached copy</A>
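The manual check above could also be automated. The following Python sketch looks for the $(stored_href) variable inside the res section of a template. It assumes that the section opened by <!--res--> is closed by a <!--/res--> marker; the function name is hypothetical.

```python
import re

def res_section_has_cached_copy_link(template_text):
    """Return True if the res section references $(stored_href).

    Assumption: the section is delimited by <!--res--> and <!--/res-->.
    """
    match = re.search(r"<!--res-->(.*?)<!--/res-->", template_text, re.DOTALL)
    if match is None:
        return False
    return "$(stored_href)" in match.group(1)

# A tiny stand-in for a real template file.
template = """<!--res-->
<A HREF="$(stored_href)">Display cached copy</A>
<!--/res-->"""
has_link = res_section_has_cached_copy_link(template)
```

To check a real template, read the file contents into a string and pass it to the function.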
When using the default search template,
search.cgi refers
to itself recursively; that is, when you follow
the Display cached copy
link in your browser, you'll open
search.cgi again
(just with special query string parameters
which tell it to display a cached copy rather
than search results).
Once cached copies have been configured, the following steps take place at search time:
For each document a link to its cached copy is displayed;
When the user clicks the link,
search.cgi is executed. It sends a query
to the SQL database and fetches the cached copy content.
search.cgi decompresses
the requested cached copy and sends it to the web browser,
highlighting the search keywords using the highlighting method given in
the HlBeg and HlEnd
commands.
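The highlighting step can be sketched in Python as wrapping every query word in a begin and an end marker, which play the role of the HlBeg and HlEnd values. This is a simplified illustration, not the search.cgi implementation: the function name highlight is hypothetical, the default markers are placeholder values, and the sketch treats the document as plain text.

```python
import re

def highlight(text, words, hl_beg="<b>", hl_end="</b>"):
    """Wrap each whole-word, case-insensitive match in hl_beg and hl_end.

    hl_beg and hl_end stand in for the HlBeg and HlEnd settings; the
    defaults used here are placeholders, not mnoGoSearch defaults.
    """
    if not words:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    # A function is used as the replacement so that the markers are
    # inserted literally, without backslash or group interpretation.
    return pattern.sub(lambda m: hl_beg + m.group(0) + hl_end, text)

marked = highlight("Cached copies are cached.", ["cached"])
```

A production highlighter would also have to avoid matching inside HTML tags and attributes, which this sketch does not attempt.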
You can optionally specify an alternative
URL for the Display cached copy
links, to have cached copies reside under another location
of the same server, or even on another physical server.
For example:
<A HREF="http://site2/cgi-bin/search.cgi?$(stored_href)">Display cached copy</A>
Moving cached copies to another server can be useful
to distribute
CPU load between machines.
Note:
mnoGoSearch must be
installed on the machine site2.
Starting from version 3.3.8,
mnoGoSearch understands
the UseLocalCachedCopy command
in search.htm to force downloading
documents from their original locations when generating
smart excerpts for search results
as well as when generating the "Cached Copy"
documents.
This command can be useful when you index documents
residing on your local file system: it avoids
storing cached copies in the database and thus
makes the database smaller.