edu.jhu.nlp.wikipedia
Class WikiXMLParser

java.lang.Object
  extended by edu.jhu.nlp.wikipedia.WikiXMLParser
Direct Known Subclasses:
WikiXMLDOMParser, WikiXMLSAXParser

public abstract class WikiXMLParser
extends java.lang.Object


Field Summary
protected  WikiPage currentPage
           
 
Constructor Summary
WikiXMLParser(java.lang.String fileName)
           
 
Method Summary
protected  org.xml.sax.InputSource getInputSource()
           
abstract  WikiPageIterator getIterator()
           
protected  void notifyPage(WikiPage page)
           
abstract  void parse()
          The main parse method.
abstract  void setPageCallback(PageCallbackHandler handler)
          Set a callback handler.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

currentPage

protected WikiPage currentPage

Constructor Detail

WikiXMLParser

public WikiXMLParser(java.lang.String fileName)

Method Detail

setPageCallback

public abstract void setPageCallback(PageCallbackHandler handler)
                              throws java.lang.Exception
Set a callback handler. The callback is executed each time a page is detected in the stream. Custom handlers implement PageCallbackHandler.

Parameters:
handler - the PageCallbackHandler to execute for each page detected in the stream
Throws:
java.lang.Exception
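
The callback flow described above (register a handler, call parse(), receive one call per page) can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: the nested types are hypothetical stand-ins, and the handler method name process(WikiPage) and the accessor getTitle() are assumptions that this page does not document. The stand-in parser reads titles from an in-memory list instead of an XML dump.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CallbackSketch {
    // Hypothetical stand-in: only the type name WikiPage appears on this page.
    static class WikiPage {
        private final String title;
        WikiPage(String title) { this.title = title; }
        String getTitle() { return title; }      // assumed accessor
    }

    // Hypothetical stand-in: the method name process(...) is an assumption.
    interface PageCallbackHandler {
        void process(WikiPage page);
    }

    // Stand-in parser that "detects" pages in a list rather than in XML.
    static class ListBackedParser {
        private final List<String> titles;
        private PageCallbackHandler handler;

        ListBackedParser(List<String> titles) { this.titles = titles; }

        void setPageCallback(PageCallbackHandler handler) { this.handler = handler; }

        void parse() {
            for (String title : titles) {
                notifyPage(new WikiPage(title));
            }
        }

        // Forwards each detected page to the registered handler, if any.
        protected void notifyPage(WikiPage page) {
            if (handler != null) {
                handler.process(page);
            }
        }
    }

    // Registers a handler that records titles, then runs the parse.
    public static List<String> collectTitles(List<String> input) {
        List<String> seen = new ArrayList<>();
        ListBackedParser parser = new ListBackedParser(input);
        parser.setPageCallback(page -> seen.add(page.getTitle()));
        parser.parse();
        return seen;
    }

    public static void main(String[] args) {
        // prints [Alpha, Beta]
        System.out.println(collectTitles(Arrays.asList("Alpha", "Beta")));
    }
}
```

The handler is registered before parse() is called, so no page is delivered while the handler is still unset.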

parse

public abstract void parse()
                    throws java.lang.Exception
The main parse method.

Throws:
java.lang.Exception

getIterator

public abstract WikiPageIterator getIterator()
                                      throws java.lang.Exception
Returns:
an iterator to the list of pages
Throws:
java.lang.Exception
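
As an alternative to callbacks, the iterator returned here lets the caller pull pages one at a time. The sketch below shows that consumption loop under stated assumptions: the nested WikiPage and WikiPageIterator classes are hypothetical stand-ins, and the method names hasMorePages(), nextPage(), and getTitle() are assumed rather than taken from this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorSketch {
    // Hypothetical stand-in for the library's page type.
    static class WikiPage {
        private final String title;
        WikiPage(String title) { this.title = title; }
        String getTitle() { return title; }          // assumed accessor
    }

    // Hypothetical stand-in for WikiPageIterator; method names are assumed.
    static class WikiPageIterator {
        private final Iterator<WikiPage> pages;
        WikiPageIterator(List<WikiPage> pages) { this.pages = pages.iterator(); }
        boolean hasMorePages() { return pages.hasNext(); }
        WikiPage nextPage() { return pages.next(); }
    }

    // Drains the iterator and returns the titles in encounter order.
    static List<String> titles(WikiPageIterator it) {
        List<String> out = new ArrayList<>();
        while (it.hasMorePages()) {
            out.add(it.nextPage().getTitle());
        }
        return out;
    }

    // Builds a two-page iterator and drains it.
    public static List<String> demoTitles() {
        List<WikiPage> pages = Arrays.asList(new WikiPage("Alpha"), new WikiPage("Beta"));
        return titles(new WikiPageIterator(pages));
    }

    public static void main(String[] args) {
        // prints [Alpha, Beta]
        System.out.println(demoTitles());
    }
}
```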

getInputSource

protected org.xml.sax.InputSource getInputSource()
                                          throws java.lang.Exception
Returns:
An InputSource created from wikiXMLFile
Throws:
java.lang.Exception
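
One plausible way to build an InputSource from a dump file name is sketched below, using only JDK classes. It is an assumption about how such a method could work, not the library's code: the helper name open, the UTF-8 decoding, and the gzip handling keyed on a ".gz" suffix are all illustrative choices.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;
import org.xml.sax.InputSource;

public class InputSourceSketch {
    // Hypothetical helper: opens fileName and wraps it in an InputSource.
    public static InputSource open(String fileName) throws IOException {
        InputStream in = Files.newInputStream(Paths.get(fileName));
        try {
            if (fileName.endsWith(".gz")) {
                // Assumed convention: a ".gz" suffix means gzip-compressed XML.
                in = new GZIPInputStream(in);
            }
        } catch (IOException e) {
            in.close();   // do not leak the file handle on a bad gzip header
            throw e;
        }
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return new InputSource(reader);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("dump", ".xml");
        Files.write(tmp, "<mediawiki/>".getBytes(StandardCharsets.UTF_8));
        InputSource src = open(tmp.toString());
        BufferedReader r = new BufferedReader(src.getCharacterStream());
        System.out.println(r.readLine());   // prints <mediawiki/>
        r.close();
        Files.delete(tmp);
    }
}
```

Passing a Reader rather than a raw byte stream fixes the character decoding up front, so the XML parser does not have to guess the encoding.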

notifyPage

protected void notifyPage(WikiPage page)