HyperSpider
HyperSpider is a tool for mapping the link structure of a Web site; it can output the structure in various formats, including RDF. Its SourceForge page says it is in the public domain, and it apparently originated from a newsgroup posting. It also includes code for connecting to a MySQL database.
- The class info.navigable.hyperspider.Databaser depends on com.mysql.jdbc.Driver.
HyperSpider also depends on standard Java extensions that may not be implemented by GNU Classpath:
- Several classes in the info.navigable.hyperspider package depend on the javax.swing, javax.swing.text, javax.swing.text.html and javax.swing.tree packages.
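One way to see whether a given Java runtime provides the dependencies listed above is to try loading a class from each of them. The following is a minimal sketch, not part of HyperSpider. The class name DependencyCheck is hypothetical, and the Swing classes probed are representative picks from each package, not necessarily the ones HyperSpider actually uses.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HyperSpider: reports which of the
// classes it is asked about can be loaded by the current runtime.
final class DependencyCheck {

    // Returns true if the named class can be loaded without initializing it.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, DependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The MySQL driver name comes from the review; the Swing classes are
        // assumed representatives of the packages named there.
        List<String> probes = Arrays.asList(
                "com.mysql.jdbc.Driver",
                "javax.swing.JTree",
                "javax.swing.text.Document",
                "javax.swing.text.html.HTMLEditorKit",
                "javax.swing.tree.DefaultMutableTreeNode");
        for (String name : probes) {
            System.out.println(name + ": " + (isAvailable(name) ? "found" : "missing"));
        }
    }
}
```

Running the class on the target runtime prints one line per probe. A "missing" result for a Swing class would suggest that the runtime does not implement that package.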
Initial review notes
HyperSpider does not fulfil the need to process Web content; it only maps the link structure of a Web site. Nonetheless, it is interesting because it uses the Swing HTML parser package and callback system, and can output serialized RDF and XML Topic Maps. Much of the code suggests it would not be flexible enough to adapt easily to other purposes, and it is not clear how well the Swing HTML parser deals with invalid markup.
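For readers unfamiliar with the Swing HTML parser and its callback system, the sketch below shows the general pattern for collecting link targets with it. This is an illustration using the standard javax.swing.text.html API, not HyperSpider's own code. The class name LinkLister and the method extractLinks are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical example class, not taken from HyperSpider.
final class LinkLister {

    // Parses the HTML from the reader and returns the href value of every
    // <a> start tag, in document order.
    static List<String> extractLinks(Reader html) throws IOException {
        final List<String> links = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };
        // The third argument tells the parser to ignore any charset
        // declaration, since the input is already a character stream.
        new ParserDelegator().parse(html, callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        String page = "<html><body><a href=\"a.html\">A</a> <a href=\"b.html\">B</a></body></html>";
        for (String link : extractLinks(new StringReader(page))) {
            System.out.println(link);
        }
    }
}
```

The parser invokes the callback once per start tag, so a link mapper only needs to override the handlers it cares about. Feeding the same method a deliberately malformed page is a quick way to probe how the parser copes with invalid markup.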
Overall, not suitable for MKSearch.
Copyright MKDoc Ltd. and others.
The Free Documentation License http://www.gnu.org/copyleft/fdl.html