SAX (Simple API for XML) is an event-driven parser API that is supported by most of the widely available XML parsers. It does not provide all the information that an editor might want, such as comments, physical structure information (for example, unexpanded entities), CDATA sections, or complete DTD information.

SAX promotes a modular design of XML systems, since it places no requirement on parsers to provide particular data structures (such as W3C's DOM) representing XML documents. Since parsers plug into the API through a driver-style interface, different parsers can be used for different application environments.

For more information on SAX, see the SAX page.

Basic Use of a SAX Parser

To use SAX, first obtain a Parser instance. You will normally do this through the org.xml.sax.helpers.ParserFactory class, which instantiates the parser class named by the org.xml.sax.parser system property.

Then provide an implementation of DocumentHandler to receive your parsing events. (For example, you can build a DOM parse tree by using a com.sun.xml.tree.XmlDocumentBuilder instance as your document handler.) The AttributeList passed to startElement is only guaranteed to be valid for the duration of that call, so a handler that needs the attributes later must make its own copy of the data.

Finally, package your XML data as an InputSource (or just take its URL, since Parser.parse also accepts a system identifier) and call Parser.parse to send a stream of parse events to your document handler.
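The three steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: the class name SaxBasicDemo and its methods are hypothetical, and the fallback to JAXP's SAXParser.getParser() is an assumption made only so the sketch runs when no org.xml.sax.parser system property is set. The handler copies each AttributeList with AttributeListImpl so the attributes can still be read after parsing has finished.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;
import org.xml.sax.helpers.AttributeListImpl;
import org.xml.sax.helpers.ParserFactory;

// Hypothetical demo class; the SAX 1.0 types used here are deprecated but still ship with the JDK.
class SaxBasicDemo {

    // Step 1: obtain a Parser.
    static Parser newParser() throws Exception {
        if (System.getProperty("org.xml.sax.parser") != null) {
            // ParserFactory instantiates the class named by the system property.
            return ParserFactory.makeParser();
        }
        // Assumption: with no property set, borrow the JDK's default parser through JAXP.
        return SAXParserFactory.newInstance().newSAXParser().getParser();
    }

    // Returns one line per element: its name followed by its attributes.
    static List<String> describe(String xml) throws Exception {
        final List<String> names = new ArrayList<>();
        final List<AttributeList> saved = new ArrayList<>();

        Parser parser = newParser();

        // Step 2: supply a DocumentHandler (HandlerBase provides empty defaults).
        parser.setDocumentHandler(new HandlerBase() {
            @Override
            public void startElement(String name, AttributeList atts) {
                names.add(name);
                // The parser may reuse 'atts' once this call returns, so keep a copy.
                saved.add(new AttributeListImpl(atts));
            }
        });

        // Step 3: wrap the XML text in an InputSource and parse it.
        parser.parse(new InputSource(new StringReader(xml)));

        // The copies are read only now, after parsing has completed.
        List<String> out = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            StringBuilder sb = new StringBuilder(names.get(i));
            AttributeList atts = saved.get(i);
            for (int j = 0; j < atts.getLength(); j++) {
                sb.append(' ').append(atts.getName(j)).append('=').append(atts.getValue(j));
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        for (String line : describe("<doc id='1'><item kind='x'/></doc>")) {
            System.out.println(line);
        }
    }
}
```

To select a specific driver instead of the fallback, the parser class name would be supplied on the command line, for example with -Dorg.xml.sax.parser=some.parser.ClassName (the class name here is a placeholder).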

Advanced Features

Many applications will want to customize the parser by providing an ErrorHandler to handle errors, perhaps producing diagnostics with the help of the Locator that the parser supplies to the document handler. For example, when using a validating parser you will often want to stop processing documents which have errors.
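As a hedged sketch of this idea, the handler below stops at the first validity error and uses the Locator to report a line number. The names StrictValidationDemo, StrictHandler, and firstError are hypothetical, and obtaining a validating Parser through JAXP's SAXParserFactory is an assumption made to keep the example self-contained; a driver chosen through ParserFactory would be configured in its own way.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.Parser;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class illustrating an ErrorHandler that aborts on validity errors.
class StrictValidationDemo {

    // HandlerBase implements both DocumentHandler and ErrorHandler.
    static class StrictHandler extends HandlerBase {
        private Locator locator;

        @Override
        public void setDocumentLocator(Locator locator) {
            // The parser hands over a Locator before any other document event.
            this.locator = locator;
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            // Recoverable (validity) errors are ignored by default; throwing stops the parse.
            int line = (locator != null) ? locator.getLineNumber() : e.getLineNumber();
            throw new SAXException("line " + line + ": " + e.getMessage());
        }
    }

    // Returns null if the document parsed cleanly, otherwise a diagnostic message.
    static String firstError(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // assumption: DTD validation via the JAXP default parser
        Parser parser = factory.newSAXParser().getParser();

        StrictHandler handler = new StrictHandler();
        parser.setDocumentHandler(handler);
        parser.setErrorHandler(handler);
        try {
            parser.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE doc [<!ELEMENT doc (item)><!ELEMENT item EMPTY>]>";
        System.out.println(firstError(dtd + "<doc><item/></doc>"));
        System.out.println(firstError(dtd + "<doc><other/></doc>"));
    }
}
```

Throwing from error() is one design choice among several; a handler could instead collect every error and let the parse continue, which suits tools that report all problems at once.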

Applications can also provide an EntityResolver to participate in resolving the entities required by the XML document being parsed, arranging to use local or replicated copies. Some resolvers can consult catalogs that map XML public identifiers to URIs other than the system ID (perhaps stored in a local repository or a Java resource). A resolver may also make use of any available MIME type information, such as the character encoding.
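A very small catalog-style resolver might look like the sketch below. Everything named here is illustrative: the class EntityResolverDemo, the in-memory map standing in for a local repository, the public identifier, and the example.invalid URL are all assumptions. The parser is again obtained through JAXP purely for convenience.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical demo class: resolves external entities from a local "catalog".
class EntityResolverDemo {

    // Maps public identifiers to locally held entity text (a stand-in for a real catalog).
    static class CatalogResolver implements EntityResolver {
        private final Map<String, String> localCopies;

        CatalogResolver(Map<String, String> localCopies) {
            this.localCopies = localCopies;
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            String text = (publicId == null) ? null : localCopies.get(publicId);
            if (text == null) {
                return null; // null asks the parser to fall back to the system ID
            }
            InputSource source = new InputSource(new StringReader(text));
            source.setPublicId(publicId);
            source.setSystemId(systemId); // keeps relative references resolvable
            return source;
        }
    }

    // Parses the document and returns its concatenated character data.
    static String text(String xml, Map<String, String> catalog) throws Exception {
        final StringBuilder text = new StringBuilder();
        Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();
        parser.setEntityResolver(new CatalogResolver(catalog));
        parser.setDocumentHandler(new HandlerBase() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        parser.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String publicId = "-//Example//DTD Doc//EN"; // illustrative identifier
        String xml = "<!DOCTYPE doc PUBLIC '" + publicId + "' 'http://example.invalid/doc.dtd'>"
                + "<doc>&greeting;</doc>";
        Map<String, String> catalog =
                Collections.singletonMap(publicId, "<!ENTITY greeting 'hello'>");
        System.out.println(text(xml, catalog));
    }
}
```

Because the resolver answers from the map, the remote system ID is never fetched; a production resolver would more likely read from files or Java resources and might set an encoding on the InputSource.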

Some XML documents refer to unparsed entities or to notations in their attributes. When working with such documents, you will need to provide a DTDHandler object in order to be notified of the notations and unparsed entities which were defined in the document type definition (DTD).
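The following sketch shows one way such notifications might be collected. The class name DtdHandlerDemo, the sample document, and the example.invalid URLs are hypothetical, and the parser is obtained through JAXP as an assumption to keep the example runnable without a configured driver.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

// Hypothetical demo class: records notation and unparsed entity declarations.
class DtdHandlerDemo {

    static List<String> declarations(String xml) throws Exception {
        final List<String> seen = new ArrayList<>();
        Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();

        // HandlerBase also implements DTDHandler, so only the two callbacks are overridden.
        parser.setDTDHandler(new HandlerBase() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                seen.add("notation " + name);
            }

            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                seen.add("unparsed " + name + " (" + notationName + ")");
            }
        });

        parser.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml =
              "<!DOCTYPE doc ["
            + "<!NOTATION gif SYSTEM 'http://example.invalid/gif-viewer'>"
            + "<!ENTITY logo SYSTEM 'http://example.invalid/logo.gif' NDATA gif>"
            + "<!ELEMENT doc EMPTY>"
            + "<!ATTLIST doc img ENTITY #IMPLIED>"
            + "]>"
            + "<doc img='logo'/>";
        for (String line : declarations(xml)) {
            System.out.println(line);
        }
    }
}
```

An application would typically store these declarations in a table keyed by entity name, so that when an ENTITY attribute value such as "logo" is seen in startElement it can look up the entity's system identifier and notation.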