Run the MKSearch indexer

The MKSearch system has been designed to work with the GNU Compiler for Java (GCJ). These notes explain how to index Web content with two of the default configuration sets provided with the project.

Environment settings

The MKSearch build and execution scripts use variable substitution to run from an arbitrary installation directory. It is assumed that the MKSearch source is installed in a single base directory and reflects the original structure of the project in the Subversion repository.

Before running the scripts, the environment variables listed below must be set.

GNU/Linux environment settings

You can set these variables in your .bash_profile script, for instance:

 export mk_build=/home/mksearch/build
 export mk_home=/home/mksearch
 export CLASSPATH=/usr/share/java/libgcj-3.4.1.jar

  • Substitute the actual path to your MKSearch installation for the mk_home variable.

  • The temporary build directory (mk_build) may be outside the MKSearch home path.

  • Include the actual path of your core Java class repository in the CLASSPATH variable.

Exit your current session and log in again to apply the changes. To check the settings have been applied, use the env command piped through less:

$ env | less

Use the down arrow key to scroll. You should see three lines that look like this:

mk_build=/home/mksearch/build
mk_home=/home/mksearch
CLASSPATH=/usr/share/java/libgcj-3.4.1.jar

Press Q to exit less.
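
As a non-interactive alternative to scrolling through the env output, the check can be sketched as a small shell function. This is only an illustration: the function name check_mk_env is hypothetical and not part of MKSearch, and it tests only the three variables shown in these notes.

```shell
# Hypothetical helper, not part of the MKSearch distribution.
# Prints each of the three variables named in these notes and
# returns 1 if any of them is unset or empty, 0 otherwise.
check_mk_env() {
    missing=0
    for name in mk_build mk_home CLASSPATH; do
        # Look up the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=1
        else
            echo "$name=$value"
        fi
    done
    return $missing
}
```

After logging in again, calling check_mk_env prints one line per variable, and a non-zero return status indicates that at least one still needs to be set.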

N-Triple index example

The MKSearch project includes a static test site that is used to check the correct operation of the indexer. For simplicity, the "triple" configuration indexes a set of Web pages and generates an N-Triple output file for each page on the local file system. The example below runs the MKSearch indexer on the test site using the triple configuration.

  $mk_home/bin/java-jspider.sh http://test.mksearch.mkdoc.org/ triple

This run creates a new directory structure for its output at: $mk_home/output/org.mkdoc.mksearch.test.
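
To confirm that the run produced output, the files under that directory can be listed. The following is a minimal sketch: the function name list_index_output is hypothetical, and nothing is assumed about the names or extensions of the generated files, so it lists every regular file it finds.

```shell
# Hypothetical helper, not part of the MKSearch distribution.
# Lists every regular file under the output directory for the test site.
# An alternative directory may be passed as the first argument.
list_index_output() {
    out_dir="${1:-$mk_home/output/org.mkdoc.mksearch.test}"
    if [ ! -d "$out_dir" ]; then
        echo "No output directory at $out_dir" >&2
        return 1
    fi
    # Sort the listing so repeated runs are easy to compare.
    find "$out_dir" -type f | sort
}
```

Called with no arguments, list_index_output uses the directory named above. An empty listing or a non-zero return status suggests the indexer did not write any files.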

This document was last modified by Philip Shaw on 2005-03-30 07:24:08
Copyright MKDoc Ltd. and others.
The Free Documentation License http://www.gnu.org/copyleft/fdl.html