How to Parse an XML Document using DOM

The information contained in an XML document becomes available for use only after the document has been parsed. Parsing is therefore an essential step for any organization that is going to use data stored in XML. Here we are going to explore the process of parsing an XML document using DOM. DOM stands for Document Object Model. It is a standard framework for manipulating an XML document. We will discuss the DOM parser, which provides functions to access, manipulate and read XML data. The first step performed by the parser is to load the XML document as an XML DOM object. The DOM parser uses five classes which are responsible for the navigation and parsing of the document.
We will go through these classes one by one. The first is the Node class. Any object in an XML document can be referred to as a node. A node can be an element (tag), an attribute, a notation or an entity, so Node is the superclass of the other classes. It has methods with which we can read the data of a node. getNodeName() returns the name of the node. getNodeType() returns the type of the node (attribute, element, etc.) and getNodeValue() returns the value of the node, for example the text of a text node or the value of an attribute.

The next class of the DOM parser is the Document class. It is derived from the Node class and contains functionality for creating and retrieving the elements of an XML document. Its methods include createAttribute(), createComment(), getDocumentElement() and getElementsByTagName().

The remaining classes manipulate the document at a finer level of detail. These are the Element class, the Attr class and the CharacterData class. The Element class and the Attr class deal with the manipulation of attributes. The common operations on attributes are changing the value of an attribute, removing an attribute and retrieving the value of an attribute. The Element class has the methods setAttribute(), removeAttribute() and getAttribute() for these tasks respectively. In addition, the Attr class has the methods setValue() and getValue(), which change and retrieve the value of an attribute respectively.

The CharacterData class deals with the text contained in the elements of an XML document. Its methods enable us to insert data into the text, delete data from it, calculate its length and append data to it. These methods are insertData(), deleteData(), getLength() and appendData() respectively. So these are the classes and methods of a DOM parser which are used in parsing an XML document.
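The methods described above can be tried out in a short sketch. It assumes the Java DOM binding (the org.w3c.dom and javax.xml.parsers packages of the JDK), since the method names in this article match it. The class name DomClassesSketch, the helper methods and the sample document are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.CharacterData;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative sketch only; the class and method names are not part of any library.
class DomClassesSketch {

    // Hypothetical sample document.
    static final String SAMPLE = "<book id=\"b1\" lang=\"en\"><title>XML</title></book>";

    // Loads an XML string into a DOM Document.
    static Document load(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Node: name, type and value are available on every node.
    static String describe(Node node) {
        return node.getNodeName() + " type=" + node.getNodeType() + " value=" + node.getNodeValue();
    }

    // Element, Attr and CharacterData: edit attributes and text in place.
    static String edit(Document doc) {
        Element root = doc.getDocumentElement();

        // Element: change one attribute and remove another.
        root.setAttribute("id", "b2");
        root.removeAttribute("lang");

        // Attr: read and change the value through the attribute node itself.
        Attr id = root.getAttributeNode("id");
        id.setValue(id.getValue() + "-rev");

        // CharacterData: the text inside <title> is a character data node.
        Element title = (Element) doc.getElementsByTagName("title").item(0);
        CharacterData text = (CharacterData) title.getFirstChild();
        text.insertData(0, "Learning ");   // insert at offset 0
        text.appendData(" with DOM");      // append at the end
        text.deleteData(0, 9);             // delete the 9 characters of "Learning "

        return root.getAttribute("id") + "|" + root.hasAttribute("lang")
                + "|" + text.getData() + "|" + text.getLength();
    }

    public static void main(String[] args) {
        Document doc = load(SAMPLE);
        System.out.println(describe(doc.getDocumentElement()));
        System.out.println(edit(doc));
    }
}
```

In this binding an element node reports no value of its own; its text lives in a child node, which is why the sketch casts the first child of title to CharacterData before editing it.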
The DOM parser is generally used for small XML documents, because it loads the whole document into memory. The technique involves the use of the parse() method, which reads the document and returns an instance of the Document object. After this step we have access to the whole document, and we move on to the retrieval of the root element of the document. This is accomplished by the getDocumentElement() method of the Document class described above. The other elements can be accessed with the getChildNodes() method. So the methods of all five DOM classes play a vital role in parsing the XML document.
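The sequence just described (parse(), then getDocumentElement(), then getChildNodes()) might look as follows. This is a minimal sketch that again assumes the Java DOM binding from the JDK; the class name DomParseSketch, the method elementNames() and the sample XML are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative sketch only; the class and method names are not part of any library.
class DomParseSketch {

    // Returns the name of the root element followed by the names of its child elements.
    static List<String> elementNames(String xml) {
        try {
            // Step 1: parse() loads the whole document and returns a Document.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // Step 2: getDocumentElement() gives the root element.
            Element root = doc.getDocumentElement();
            List<String> names = new ArrayList<>();
            names.add(root.getNodeName());

            // Step 3: getChildNodes() gives every child node, including
            // whitespace text nodes, so keep only the element nodes.
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
            return names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        String xml = "<library>\n  <book/>\n  <magazine/>\n</library>";
        System.out.println(elementNames(xml));
    }
}
```

The check on getNodeType() matters because the line breaks and indentation between tags are returned by getChildNodes() as text nodes alongside the elements.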