Spiders

Introduction

The spiders reviewed here range from relatively simple to highly complex and versatile. Because this project requires GPL licensing, some spiders have been dropped from the original long list; see Excluded spiders for details. The remaining spiders can be classified as follows.

Simple spiders
Basic single- and multi-threaded spiders designed for limited purposes. Examples include HouseSpider, Spindle and Arachnid.
Advanced spiders
Complex spiders that support multiple content types, variable post-processing options and advanced HTTP handling. Examples include JoBo, Metis and Heritrix.
Spidering engine
Not an application in its own right, but a framework for configuring spidering tasks. The only candidate in this class is J-Spider, which is recommended for the MKSearch project.
Link mappers
Link mappers traverse links like spiders but have a more limited or specific purpose. The HyperSpider and WebWader tools have been reclassified into this category.
RDF crawlers
This group of tools has very specific RDF-related document processing features. At present, none match the (X)HTML processing requirements for MKSearch.

Spider reviews

OCRA
A brief review of OCRA, the ontology crawler.
2004-12-02 01:36:31
DAML Crawler
A brief review of DAML Crawler.
2004-12-02 01:27:22
RDF Crawler
A brief review of RDF Crawler.
2004-12-02 00:54:47
Acme Spider
A review of the Acme Spider Web spider.
2004-11-03 08:20:23
WebWader
A review of the WebWader Web spider.
2004-11-03 08:16:27
Excluded spiders
A review of Web spiders that were excluded from consideration for the MKSearch project, primarily because of licence issues.
2004-11-03 06:54:56
WebLech
A review of the WebLech Web spider.
2004-11-03 06:52:15
JoBo
A review of the JoBo Web spider, to date the second-strongest candidate for the MKSearch project.
2004-11-03 06:50:27
J-Spider
A review of the J-Spider Web spider, to date the strongest candidate for the MKSearch project.
2004-11-03 06:48:28
HyperSpider
A review of the HyperSpider Web spider.
2004-11-03 06:46:32
This document was last modified on 2004-12-02 01:57:42.
Copyright MKDoc Ltd. and others.
The Free Documentation License http://www.gnu.org/copyleft/fdl.html