MKSearch beta 1 release notes

These release notes are a duplicate of the text file notes included in the MKSearch distribution.

This release is available from the downloads section.

MKSearch release notes

The MKSearch beta 1 release is the first essentially complete distribution. It has been released as an interim project milestone so that external developers can try the system, and to solicit feedback, bug reports, questions and suggestions for improvement.

If you would like to offer feedback on this release, please join the project mailing list and post your comments there.

http://www.email-lists.org/mailman/listinfo/mksearch-dev

The main source of information about MKSearch is the project Web site, which includes documentation, ‘how to’ guides, development plans and background research.

http://www.mksearch.mkdoc.org/

There are instructions on using the MKSearch distributions here:

http://www.mksearch.mkdoc.org/howto/unpack-and-configure-mksearch/

The remainder of this document includes notes on:

  • Current state of the distribution

  • Distribution contents

  • Licences

Current state of the distribution

This distribution has more than 88% code coverage and there are no known issues with the system itself. However, the source includes several amendments to the supporting packages to work around problems running the system using the GNU Compiler for Java (GCJ) and the GNU Interpreter for Java (GIJ). These amendments are recorded here:

http://www.mksearch.mkdoc.org/documentation/

To date the MKSearch JSpider indexing system has only been tested on a relatively limited scale, running against the project Web site, above, and a test document Web site:

http://test.mksearch.mkdoc.org/

The sample WAR file included in the binary distribution (and the RDF in the source) is a recent index of the project Web site. Guidance for deploying this WAR with Tomcat can be found here:

http://www.mksearch.mkdoc.org/documentation/tomcat-on-fc4/

The MKSearch system has not been tested with the full scope of site configuration options available through the JSpider engine. In principle it is possible to configure the indexer to run against multiple sites, to varying depths, with site-specific throttling, user agent and many other options. For full details, see the JSpider user manual:

http://j-spider.sourceforge.net/doc/

The Sesame RDF storage and query system has only been tested to any significant extent using file-based storage. The SQL storage features of Sesame have not been developed except for the abstract superclass com.mkdoc.store.AbstractDatabaseStoreManager.

The query interface of the sample Web application included in this release (see WAR above) includes an ‘echo’ parameter that returns the syntax of the Sesame RDF Query Language (SeRQL) query instead of search results. No knowledge of SeRQL is required to use MKSearch, but full details are available from the Sesame documentation:

http://www.openrdf.org/documentation.jsp
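
As an illustration of the ‘echo’ parameter described above, the following Java sketch builds a query URL for a deployed copy of the sample application. The base URL, the "q" parameter name and the "echo=true" value are assumptions made for this example only. These notes confirm only that a parameter called ‘echo’ exists, so check the sample application's query syntax page for the actual names and values.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EchoQueryUrl {

    /**
     * Builds a search URL and, when echo is true, appends the 'echo' parameter.
     * The parameter name "q" and the value "true" are hypothetical.
     */
    public static String build(String baseUrl, String terms, boolean echo) {
        String encoded;
        try {
            // Form-encode the search terms so spaces and punctuation are URL-safe.
            encoded = URLEncoder.encode(terms, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not occur.
            throw new IllegalStateException(e);
        }
        StringBuilder url = new StringBuilder(baseUrl);
        url.append("?q=").append(encoded);
        if (echo) {
            url.append("&echo=true");
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical local Tomcat deployment of the sample WAR.
        String base = "http://localhost:8080/mksearch/query";
        System.out.println(build(base, "release notes", false));
        System.out.println(build(base, "release notes", true));
    }
}
```

Requesting the second URL should, under these assumptions, return the SeRQL text of the query rather than a list of results, which can be useful when reporting unexpected search behaviour to the mailing list.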

The sample application's query syntax page is a reasonably complete first draft; the help page is not. If you have any questions that should be answered on the help page, please send them to the project mailing list.

Distribution contents

The binary distribution contains only a subset of these directories.

+--ant          Ant build, filter and property files
|   |
|   +--bin      Scripts to run Ant directly using Java
|
+--bin          Scripts to build the project without using Ant, see README.txt
|   |
|   +--util     Scripts to pass a list of Java source to the compiler
|
+--conf         Sample configuration sets for JSpider
|
+--coverage     MKSearch Hansel coverage test hierarchy
|
+--dist         The MKSearch JAR and WAR files
|
+--lib-opt      Optional supporting packages in binary JAR files, see Licences below
|
+--lib-src      Supporting package source hierarchy, mostly Java source
|
+--license      Licences for all distributed software
|
+--src          MKSearch source hierarchy
|
+--test         MKSearch JUnit test source hierarchy
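
Because the binary distribution omits some of these directories, it can be handy to see which ones an unpacked copy actually contains. The following Java sketch is one way to do that, assuming the top-level directory names in the listing above; the class and method names are invented for this example and are not part of MKSearch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DistributionLayout {

    /** Top-level directory names taken from the listing above. */
    static final List<String> EXPECTED = Arrays.asList(
            "ant", "bin", "conf", "coverage", "dist",
            "lib-opt", "lib-src", "license", "src", "test");

    /** Returns the listed directories that are not present under root. */
    public static List<String> missing(File root) {
        List<String> absent = new ArrayList<String>();
        for (String name : EXPECTED) {
            if (!new File(root, name).isDirectory()) {
                absent.add(name);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        // The unpack location is the first argument, defaulting to the current directory.
        File root = new File(args.length > 0 ? args[0] : ".");
        System.out.println("Not present: " + missing(root));
    }
}
```

A non-empty result for a binary distribution is expected and does not indicate a broken download.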

Licences

The MKSearch system is released under the GNU General Public License. A copy of this licence, named gpl.txt, can be found in the ‘license’ directory of the distribution, along with the terms for the other component parts of the system, which are as follows.

Copyright disclaimer for MKSearch

MKDoc Holdings Ltd hereby disclaims all copyright interest in the program MKSearch written by Philip Shaw.

Bruno Postle, 25 October 2005
Managing Director, MKDoc Holdings Ltd

Sesame Storage and Querying Framework for RDF and RDF Schema

GNU Lesser General Public License, lgpl.txt

The MKSearch version of Sesame includes only the minimal package dependencies necessary to compile and run the system for our project purposes. With further modifications, it may be possible to exclude these packages, since they are not currently used by the MKSearch system.

These ‘optional’ packages from the Apache Software Foundation are provided in binary form in the ‘lib-opt’ directory of the MKSearch distribution, distributed under the Apache 2.0 license, named apache2.txt.

Apache Web Services Project SOAP: soap.jar

Apache Jakarta Commons File Upload package: commons-fileupload-1.0.jar

http://www.openrdf.org/

http://ws.apache.org/soap/

http://jakarta.apache.org/commons/fileupload/

JSpider Web Spider Engine

GNU Lesser General Public License, lgpl.txt

JSpider also has minimal dependencies on two further Apache packages, provided in binary form in the ‘lib-opt’ directory. These are also distributed under the Apache 2.0 license, named apache2.txt.

Apache Jakarta Commons Logging: commons-logging.jar

Apache Jakarta Velocity: velocity-dep-1.3.1.jar

http://j-spider.sourceforge.net/

http://jakarta.apache.org/commons/logging/

http://jakarta.apache.org/velocity/

JTidy

W3C License, jtidy.txt

MKSearch includes a CVS snapshot of the JTidy package.

http://jtidy.sourceforge.net/

GNU Servlet API

GNU Lesser General Public License, lgpl.txt

MKSearch includes a CVS version of the GNU Classpath X Servlet API.

http://www.gnu.org/software/classpathx/

GNU JAXP 1.3

GNU Lesser General Public License, lgpl.txt

MKSearch includes a pure Java subset of the GNU Classpath X JAXP 1.3 snapshot release.

http://www.gnu.org/software/classpathx/jaxp/

PostgreSQL JDBC driver

BSD License, bsd.txt

The PostgreSQL JDBC driver provides database storage for Sesame RDF repositories as an alternative to the basic file system storage in this release.

http://jdbc.postgresql.org/

MySQL JDBC driver

GNU Lesser General Public License, lgpl.txt

The MySQL JDBC driver provides database storage for Sesame RDF repositories as an alternative to the basic file system storage in this release. The MySQL driver has special ‘FLOSS’ licence exceptions; see mysql-exceptions.txt.

http://dev.mysql.com/doc/refman/5.0/en/java-connector.html

Hansel Code Coverage Testing Framework

Hansel is used to measure JUnit test coverage for the MKSearch system. The Hansel coverage tests are an optional Ant build target and are not required to compile any part of the MKSearch system. Hansel is dependent on the Apache BCEL package and both are provided in binary form in the ‘lib-opt’ directory. Hansel is distributed under a BSD-style license, named hansel.txt. BCEL is distributed under the Apache 2.0 license, named apache2.txt.

http://hansel.sourceforge.net/

http://jakarta.apache.org/bcel/

This document was last modified by Philip Shaw on 2005-11-02 10:48:10
Copyright MKDoc Ltd. and others.
The Free Documentation License http://www.gnu.org/copyleft/fdl.html