edu.jhu.nlp.wikipedia
Class WikiXMLParser
java.lang.Object
edu.jhu.nlp.wikipedia.WikiXMLParser
- Direct Known Subclasses:
- WikiXMLDOMParser, WikiXMLSAXParser
public abstract class WikiXMLParser
- extends java.lang.Object
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
currentPage
protected WikiPage currentPage
WikiXMLParser
public WikiXMLParser(java.lang.String fileName)
setPageCallback
public abstract void setPageCallback(PageCallbackHandler handler)
throws java.lang.Exception
- Sets the callback handler. The callback is executed every time a
page instance is detected in the stream. Custom handlers are
implementations of PageCallbackHandler.
- Parameters:
handler - the callback handler to execute for each page detected in the stream
- Throws:
java.lang.Exception
parse
public abstract void parse()
throws java.lang.Exception
- The main parse method.
- Throws:
java.lang.Exception
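The register-then-parse pattern described by setPageCallback and parse can be illustrated with a minimal, self-contained sketch. It does not use the library itself: Page, PageHandler, SketchParser and ListParser are hypothetical stand-ins, the handler method name process is an assumption, and how the concrete subclasses (WikiXMLDOMParser, WikiXMLSAXParser) actually dispatch pages is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CallbackSketch {

    // Hypothetical stand-in for WikiPage: carries only a title.
    static final class Page {
        final String title;
        Page(String title) { this.title = title; }
    }

    // Hypothetical stand-in for PageCallbackHandler.
    // The method name "process" is an assumption made for this sketch.
    interface PageHandler {
        void process(Page page);
    }

    // Mirrors the shape of the abstract parser: register a handler, then parse.
    abstract static class SketchParser {
        private PageHandler handler;

        void setPageCallback(PageHandler handler) {
            this.handler = handler;
        }

        abstract void parse() throws Exception;

        // Assumed role of notifyPage: forward a detected page to the handler, if one is set.
        protected void notifyPage(Page page) {
            if (handler != null) {
                handler.process(page);
            }
        }
    }

    // Toy subclass that "detects" pages from a fixed list instead of an XML stream.
    static final class ListParser extends SketchParser {
        private final List<String> titles;

        ListParser(List<String> titles) { this.titles = titles; }

        @Override
        void parse() {
            for (String title : titles) {
                notifyPage(new Page(title));
            }
        }
    }

    // Registers a handler that records titles, runs the parser, and returns what was seen.
    static List<String> collectTitles(List<String> input) {
        List<String> seen = new ArrayList<>();
        ListParser parser = new ListParser(input);
        parser.setPageCallback(page -> seen.add(page.title));
        parser.parse();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(collectTitles(Arrays.asList("Alpha", "Beta")));
    }
}
```

The handler is registered before parse is called, so every page the parser detects is delivered to it in order. With no handler set, this sketch silently skips notification, which is a design choice of the sketch and not a documented behavior of the library.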
getIterator
public abstract WikiPageIterator getIterator()
throws java.lang.Exception
- Returns:
- an iterator over the list of pages
- Throws:
java.lang.Exception
getInputSource
protected org.xml.sax.InputSource getInputSource()
throws java.lang.Exception
- Returns:
- An InputSource created from wikiXMLFile
- Throws:
java.lang.Exception
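As a rough illustration of building an org.xml.sax.InputSource from a file, the following sketch wraps a file's byte stream using only standard JDK classes. The class name InputSourceSketch, the helper names, and the UTF-8 encoding are assumptions; how this library's getInputSource opens wikiXMLFile (for example, whether it handles compressed dumps) is not stated here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import org.xml.sax.InputSource;

public class InputSourceSketch {

    // Opens the file as a byte stream and wraps it in a SAX InputSource.
    static InputSource open(Path xmlFile) throws IOException {
        InputStream in = Files.newInputStream(xmlFile);
        InputSource source = new InputSource(in);
        source.setEncoding("UTF-8"); // assumption: the input is UTF-8 encoded
        return source;
    }

    // Writes a tiny XML file, opens it, and reports the encoding set on the source.
    static String demoEncoding() {
        try {
            Path tmp = Files.createTempFile("wiki-sketch", ".xml");
            try {
                Files.write(tmp, "<mediawiki></mediawiki>".getBytes(StandardCharsets.UTF_8));
                InputSource source = open(tmp);
                String encoding = source.getEncoding();
                source.getByteStream().close(); // the caller owns the underlying stream
                return encoding;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoEncoding());
    }
}
```

An InputSource does not close its stream on its own, so whoever creates it stays responsible for closing the byte stream once parsing is done.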
notifyPage
protected void notifyPage(WikiPage page)