Excluded spiders

These spiders have been excluded because their licence terms are not explicit or are not compatible with the GPL, or because the project is not sufficiently mature to be considered.

Arale

Arale is declared open source but has no explicit licence terms. It seems to be intended primarily for personal use.

LARM

The project is in its very early days and is not yet suitable.

Nutch

Nutch is based on Apache Lucene, under the Apache Software License, and appears to have its own supplementary licence terms.

Oxyus

Oxyus is a search engine based on the Apache Lucene indexer and is released under a version of the Apache Software License; see the OpenSymphony Software License page for details.

Spider

Spider is designed to handle custom post-processing of data acquired through spidering. It is released under an OSI Approved licence. It has significant dependencies on Apache software, so it is not suitable. The dependencies found are:

  • The class com.tempeststrings.spider.util.SpiderHostnameVerifier depends on the com.sun.net.ssl.HostnameVerifier class.
  • Many classes depend on Apache packages released under the Apache Software License version 2.0:
    • The Avalon framework, org.apache.avalon.*.
    • The Command Line Interface (CLI) package, org.apache.commons.cli.
    • The Commons Digester package, org.apache.commons.digester.
    • The Commons HTTP Client package, org.apache.commons.httpclient.
    • The Commons Lang package, org.apache.commons.lang.
    • The Commons Logging package, org.apache.commons.logging.
    • The Excalibur package, org.apache.excalibur.*.
    • The Commons ORO package, org.apache.oro.*.
    • The Xerces XML parser packages, org.apache.xerces.* and org.apache.xml.*.
  • JUnit tests depend on the junit.framework package.
  • The class com.tempeststrings.spider.manager.FeedManagerSpiderImpl depends on the class java.security.Security, which may not be fully implemented by GNU Classpath.
  • Two classes depend on the javax.servlet.http.Cookie class, which should be compatible with the GNU Servlet API.
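
A dependency list like the one above can be gathered by scanning the import statements in the Java source files. The following is only a minimal sketch, not the method used for this review: the function name flagged_imports, the FLAGGED_PREFIXES table and its group labels are hypothetical, and the prefixes are simply taken from the packages named in the list.

```python
import re

# Hypothetical table: package prefix -> group label for the report.
# The prefixes come from the dependency list above; the labels are
# illustrative only.
FLAGGED_PREFIXES = {
    "org.apache.": "Apache packages",
    "com.sun.": "Sun packages",
    "junit.": "JUnit (tests only)",
}

# Matches "import a.b.C;", "import a.b.*;" and "import static a.b.C.m;".
# The captured name excludes any trailing ".*".
IMPORT_RE = re.compile(
    r"^\s*import\s+(?:static\s+)?([\w.]+)(?:\.\*)?\s*;", re.MULTILINE
)


def flagged_imports(java_source):
    """Return sorted (imported name, group label) pairs for flagged imports."""
    found = set()
    for name in IMPORT_RE.findall(java_source):
        for prefix, label in FLAGGED_PREFIXES.items():
            if name.startswith(prefix):
                found.add((name, label))
    return sorted(found)
```

Calling flagged_imports on the text of each source file and merging the results gives a rough per-package summary. It only sees explicit import statements, so classes referenced by fully qualified name or loaded by reflection would be missed.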

WebSphinx

WebSphinx is released under an "Apache-style" licence; see the master version for details.

This document was last modified on 2004-11-03 06:58:53.
Copyright MKDoc Ltd. and others.
The Free Documentation License http://www.gnu.org/copyleft/fdl.html