MKSearch

A new kind of search engine

MKSearch is a research project to develop a metadata search engine. The system is composed of two linked components: an indexing Web crawler and a public query interface. The indexing component extracts Dublin Core metadata from Web documents and stores it in RDF format. The query interface matches documents in the index using an RDF query language and can return the results in a variety of formats, including standard HTML and a standing RSS feed.
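
As a rough illustration of what storing Dublin Core metadata in RDF format can look like, the sketch below writes each metadata field as one RDF statement in N-Triples syntax. It is not taken from the MKSearch source: the class name DublinCoreTriples, its methods and the example.org document address are hypothetical, and it assumes field names map directly onto the Dublin Core element namespace and that values need only basic escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not part of the MKSearch code base. */
class DublinCoreTriples {

    /** Namespace of the Dublin Core Metadata Element Set, version 1.1. */
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /** Escapes backslashes, quotes and line breaks in a literal value. */
    static String escape(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\"", "\\\"")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    /** Writes one statement per field: <document> <dc element> "value" . */
    static String toNTriples(String documentUri, Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            out.append('<').append(documentUri).append("> <")
               .append(DC_NS).append(field.getKey()).append("> \"")
               .append(escape(field.getValue())).append("\" .\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed sample data; a real index would be filled by the crawler.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "MKSearch");
        fields.put("creator", "MKDoc Ltd.");
        System.out.print(toNTriples("http://example.org/page.html", fields));
    }
}
```

Each statement names the document as its subject, a Dublin Core element as its predicate and the metadata value as a literal object, which is the form an RDF query language can then match against.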

Project status

On 2 November 2005 the MKSearch project made its first beta release. The system is available from the downloads section as a pre-compiled binary package and as a source-only package. We welcome your comments and questions on the MKSearch mailing list.

The MKSearch system is being developed using the Java programming language and is licensed under the GNU General Public License. All software is compiled and tested using both the Sun and GNU Java compilers. All project source material is available through the public MKSearch Subversion repository.

The MKSearch Java documentation is periodically updated in the project's Subversion repository.

System composition

The MKSearch system builds on several free software components. Further details are provided in the MKSearch development plans.

JSpider
JSpider is a Java Web crawler engine with pluggable interfaces for adding custom processing and content handling. MKSearch uses custom SAX-based content handlers for extracting metadata from Web documents.
Sesame
Sesame is a set of RDF processing and storage APIs and applications that includes RDF data query facilities. MKSearch uses Sesame to store indexed metadata in RDF format and to search the repository via the public query interface.
JTidy
JTidy is a utility for correcting common HTML markup errors and is used to convert HTML documents to XHTML so they can be processed using SAX.
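
The SAX-based extraction described above can be sketched with the standard Java SAX API. This is a minimal illustration, not the MKSearch content handler itself: the class name DublinCoreSketch and its extract method are hypothetical, and it assumes the input is already well-formed XHTML (for example after a JTidy pass) that embeds Dublin Core in meta elements whose names start with "DC.".

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch; not part of the MKSearch code base. */
class DublinCoreSketch {

    /** Collects name/content pairs from meta elements whose name starts with "DC.". */
    static Map<String, String> extract(String xhtml) {
        final Map<String, String> found = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (!"meta".equals(localName)) {
                    return;
                }
                // Unprefixed attributes have no namespace, hence the empty URI.
                String name = atts.getValue("", "name");
                String content = atts.getValue("", "content");
                if (name != null && content != null
                        && name.toLowerCase(Locale.ROOT).startsWith("dc.")) {
                    found.put(name, content);
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            // Malformed input is reported as an unchecked exception in this sketch.
            throw new IllegalStateException("Could not parse XHTML", e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed sample page; it has no DOCTYPE, so no DTD is fetched.
        String page = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head>"
                + "<title>Example</title>"
                + "<meta name=\"DC.title\" content=\"MKSearch\"/>"
                + "<meta name=\"keywords\" content=\"search\"/>"
                + "</head><body/></html>";
        System.out.println(extract(page));
    }
}
```

Because SAX reports elements as a stream of events, a handler like this can pick out the metadata without building the whole document in memory, which suits a crawler that processes many pages.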

Document Links

MKSearch development plans
Development plans for MKSearch
http://mksearch.mkdoc.org.archived.website/plans/
MKSearch Subversion repository
The Web interface to the MKSearch Subversion repository
https://svn.mkdoc.com/mksearch/
MKSearch mailing list
The MKSearch developers' email list
http://www.email-lists.org/mailman/listinfo/mksearch-dev
MKSearch Java documentation
Java documentation for the MKSearch system
https://svn.mkdoc.com/mksearch/trunk/doc/javadoc/
downloads section
Download the MKSearch system
http://mksearch.mkdoc.org.archived.website/downloads
This document was last modified on 2005-11-21 08:16:49.
Copyright MKDoc Ltd. and others.
The Free Documentation License http://www.gnu.org/copyleft/fdl.html