Common methods for updating XML documents in Java programs
This article briefly discusses four common methods for updating XML documents in Java programs and analyzes the advantages and disadvantages of each. It also discusses how a Java program can control the format of the XML document it outputs.
JAXP stands for Java API for XML Processing: a Java programming interface for processing XML documents. JAXP supports the DOM, SAX, and XSLT standards. To make JAXP more flexible, its designers gave it a Pluggability Layer. With the support of the Pluggability Layer, JAXP can work with any XML parser that implements the DOM API and SAX API (for example, Apache Xerces), and with any XSLT processor that implements the XSLT standard (for example, Apache Xalan). The benefit of the Pluggability Layer is that we only need to know the programming interfaces that JAXP defines; we do not need a deep understanding of the specific XML parser or XSLT processor in use. For example, suppose a Java program uses the Apache Crimson XML parser through JAXP to process XML documents, and we want to switch to another XML parser (such as Apache Xerces) to improve performance. The original code may need no changes at all and can be used as is; all you need to do is add the jar file containing the Apache Xerces code to the CLASSPATH environment variable and remove the jar file containing the Apache Crimson code from it.
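The pluggability idea can be illustrated with a small sketch. It assumes a modern JDK, where a JAXP implementation ships with the runtime; the class name PluggabilitySketch and its method are hypothetical names chosen for this illustration. The code refers only to JAXP interfaces and simply reports which parser implementation was located at run time.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

/** Hypothetical demo class: application code names only JAXP types, never a concrete parser. */
class PluggabilitySketch {

    /** Returns the class name of whichever DocumentBuilder implementation JAXP located. */
    static String parserImplementationName() throws ParserConfigurationException {
        // newInstance() looks up the configured implementation (system property, classpath, or default).
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.getClass().getName();
    }

    public static void main(String[] args) throws ParserConfigurationException {
        System.out.println("JAXP is using: " + parserImplementationName());
    }
}
```

Swapping the parser changes the printed class name, while the source code stays the same.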
JAXP is now in very wide use and can be regarded as the standard API for processing XML documents in Java. Beginners learning JAXP often ask this question: my program has updated the DOM tree, but after the program exits the original XML document has not changed and looks the same as before; how can I update the original XML document and the DOM tree together? At first glance JAXP does not seem to provide the corresponding interfaces, methods, or classes, and the question puzzles many beginners. The purpose of this article is to answer it by briefly introducing several common methods for updating the original XML document together with the DOM tree. To narrow the scope of the discussion, the only XML parsers covered are Apache Crimson and Apache Xerces, and the only XSLT processor is Apache Xalan.
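The beginner's problem described above can be reproduced in a few lines. This is a minimal sketch, assuming a modern JDK (it uses java.nio.file, which the JDK 1.3.1 environment of this article did not have); the class name InMemoryOnlySketch and the sample XML are made up for illustration. It edits the DOM tree in memory and then parses the file again to show that the file itself is untouched.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical demo class: a DOM edit changes the in-memory tree only, not the file on disk. */
class InMemoryOnlySketch {

    /** Returns {child count in memory after the edit, child count found when the file is re-read}. */
    static int[] countsAfterInMemoryEdit() throws Exception {
        // Write a small sample document to a temporary file (sample content is an assumption).
        File file = File.createTempFile("user", ".xml");
        file.deleteOnExit();
        Files.write(file.toPath(),
                "<users><user>alice</user></users>".getBytes(StandardCharsets.UTF_8));

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(file);

        // Update the DOM tree: this touches the in-memory tree only.
        doc.getDocumentElement().appendChild(doc.createElement("fancy"));
        int inMemory = doc.getDocumentElement().getChildNodes().getLength();

        // Parse the file again: the new element is not there, because nothing wrote it back.
        Document reread = builder.parse(file);
        int onDisk = reread.getDocumentElement().getChildNodes().getLength();
        return new int[] { inMemory, onDisk };
    }
}
```

The methods that follow are different ways of performing the missing write-back step.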
Method 1: Reading and writing the XML document directly
This is perhaps the clumsiest and most primitive approach. Once the program has obtained the DOM tree and updated it with the methods of the DOM Node interface, the next step is to update the original XML document. We can traverse the whole DOM tree, either recursively or with the TreeWalker class, and write each node or element of the tree to the original XML document, which has been opened for writing beforehand. When the traversal is complete, the DOM tree and the original XML document are in sync. In practice this method is rarely used, but it can still be useful if you want to write your own XML parser.
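A recursive traversal of this kind might look like the following sketch. It is deliberately incomplete: it handles only document, element, attribute, and text nodes and silently skips comments, processing instructions, CDATA sections, and the DOCTYPE, which is part of why this method is error-prone. The class name ManualWriterSketch and its helper methods are hypothetical, and the output goes to a StringBuilder rather than a file to keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical demo class: writes a DOM tree back out as text by walking it recursively. */
class ManualWriterSketch {

    /** Escapes the characters that must not appear literally in text or attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Appends the markup for one node and, recursively, for all of its children. */
    static void write(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.DOCUMENT_NODE) {
            out.append("<?xml version=\"1.0\"?>");
            writeChildren(node, out);
        } else if (type == Node.ELEMENT_NODE) {
            out.append('<').append(node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append(' ').append(attr.getNodeName())
                   .append("=\"").append(escape(attr.getNodeValue())).append('"');
            }
            out.append('>');
            writeChildren(node, out);
            out.append("</").append(node.getNodeName()).append('>');
        } else if (type == Node.TEXT_NODE) {
            out.append(escape(node.getNodeValue()));
        }
        // Other node types (comments, CDATA, and so on) are ignored in this sketch.
    }

    static void writeChildren(Node parent, StringBuilder out) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            write(c, out);
        }
    }

    /** Parses the given XML, appends an empty "fancy" element, and serializes the tree by hand. */
    static String updateAndSerialize(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().appendChild(doc.createElement("fancy"));
        StringBuilder out = new StringBuilder();
        write(doc, out);
        return out.toString();
    }
}
```

To update a real file, the resulting string would be written over the original document with an explicitly chosen encoding.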
Method 2: Using the XmlDocument class
Use the XmlDocument class? JAXP clearly has no such class. Has the author made a mistake? No: the method uses the XmlDocument class, or more precisely its write() method.
As mentioned above, JAXP can work with a variety of XML parsers; the one chosen here is Apache Crimson. XmlDocument (org.apache.crimson.tree.XmlDocument) is a class of Apache Crimson and is not part of standard JAXP, which is why the JAXP documentation does not mention it. So how do we use the XmlDocument class to update an XML document? The class provides the following three write() methods (according to the latest version of Crimson, Apache Crimson 1.1.3):
public void write(OutputStream out) throws IOException
public void write(Writer out) throws IOException
public void write(Writer out, String encoding) throws IOException
The main purpose of these three write() methods is to output the contents of the DOM tree to a specific output medium, such as a file output stream or the application console. How are they used? Look at the following Java code fragment:
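// Fragment; needs javax.xml.parsers.*, org.w3c.dom.*, java.io.* and org.apache.crimson.tree.XmlDocument.
String name = "fancy";
DocumentBuilder parser;
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try
{
    parser = factory.newDocumentBuilder();
    Document doc = parser.parse("user.xml");
    // Create the new element and append it as the last child of the root element.
    Element newlink = doc.createElement(name);
    doc.getDocumentElement().appendChild(newlink);
    // Cast to Crimson's XmlDocument to reach write(); this works only when Crimson is the parser.
    ((XmlDocument) doc).write(new FileOutputStream(new File("xuser1.xml")));
}
catch (Exception e)
{
    // to log it
}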
In the code above, we first create a Document object doc, which holds a complete DOM tree. We then call the appendChild() method of the Node interface to append a new node (fancy) at the end of the DOM tree, and finally call the write(OutputStream out) method of the XmlDocument class to output the contents of the DOM tree to xuser1.xml. (The output could equally go to user.xml, which would update the original XML document; it goes to a separate file here to make comparison easier.) Note that write() cannot be called directly on the Document object doc, because the JAXP Document interface does not define any write() method. doc must first be cast from Document to XmlDocument, and only then can write() be called. The code above uses write(OutputStream out), which outputs the contents of the DOM tree using the default UTF-8 encoding. If the DOM tree contains Chinese characters, the output may be garbled; this is the so-called "Chinese character problem". One solution is to use write(Writer out, String encoding) and specify the output encoding explicitly, for example by setting the second parameter to "GB2312". The problem then disappears and Chinese characters appear correctly in the output.
For a complete example, see the following files: AddRecord.java (see attachment) and user.xml (see attachment). The example was run on Windows XP Professional with JDK 1.3.1. To compile and run AddRecord.java, you need to download Apache Crimson from http://xml.apache.org/dist/crimson/ and add the resulting crimson.jar file to the CLASSPATH environment variable.
NOTE:
The predecessor of Apache Crimson is the Sun Project X Parser. For reasons unknown to the author, the X Parser later evolved into Apache Crimson, and much of Apache Crimson's code was ported directly from the X Parser. For example, the XmlDocument class used above is com.sun.xml.XmlDocument in the X Parser and became org.apache.crimson.tree.XmlDocument in Apache Crimson. Most of the code is in fact identical; probably only the package and import statements and the license text at the top of each file differ. Early versions of JAXP were bundled with the X Parser, so some old programs use the com.sun.xml package, and if you recompile them now they may fail to compile for exactly this reason. Later versions of JAXP, such as JAXP 1.1, were bundled with Apache Crimson, so if you use JAXP 1.1 you can compile and run the example above (AddRecord.java) without downloading Apache Crimson separately. The recent JAXP 1.2 EA (Early Access) changes course and uses the better-performing Apache Xalan and Apache Xerces as its XSLT processor and XML parser respectively, and it does not directly support Apache Crimson. So if your development environment uses JAXP 1.2 EA or the Java XML Pack (which includes JAXP 1.2 EA), you cannot compile and run the example above (AddRecord.java) directly; you need to download and install Apache Crimson separately.
Method 3: Using TransformerFactory and Transformer
The standard way that JAXP provides to update the original XML document is to call the XSLT engine, that is, to use the TransformerFactory and Transformer classes. Look at the following Java code fragment:
// First create a DOMSource object; the constructor argument can be a Document object.
// doc represents the modified DOM tree.
DOMSource doms = new DOMSource(doc);
// Create a File object representing the output medium for the data in the DOM tree; here it is an XML file.
File f = new File("XMLOutput.xml");
// Create a StreamResult object; the constructor argument can be a File object.
StreamResult sr = new StreamResult(f);
// Call the JAXP XSLT engine to output the data in the DOM tree to the XML file.
// The input of the XSLT engine is the DOMSource object and the output is the StreamResult object.
try
{
    // First create a TransformerFactory object, then use it to create a Transformer object. The Transformer
    // class is equivalent to an XSLT engine. Usually we use it to process XSL files, but here we
    // use it to output an XML document.
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();
    // The key step: call the transform() method of the Transformer object (the XSLT engine). The first
    // argument is the DOMSource object and the second is the StreamResult object.
    t.transform(doms, sr);
}
catch (TransformerConfigurationException tce)
{
    System.out.println("Transformer Configuration Exception\n-----");
    tce.printStackTrace();
}
catch (TransformerException te)
{
    System.out.println("Transformer Exception\n---------");
    te.printStackTrace();
}
In practice, we can obtain the DOM tree from an XML document with the usual DOM API, perform whatever operations on the tree we need, and end up with a final Document object. We then create a DOMSource object from this Document object, and the rest is simply a matter of copying the code above. After the program runs, XMLOutput.xml contains the result you need. (You can of course change the argument of the StreamResult constructor to specify a different output medium; it does not have to be an XML file.)
The greatest advantage of this method is that you can freely control the format in which the contents of the DOM tree are written to the output medium. The TransformerFactory and Transformer classes cannot do this alone, however; they need the help of the OutputKeys class. For a complete example, see the following files: AddRecord2.java (see attachment) and user.xml (see attachment). The example was run on Windows XP Professional with JDK 1.3.1. To compile and run AddRecord2.java, you need to download and install JAXP 1.1 or the Java XML Pack from http://java.sun.com (the Java XML Pack already includes JAXP).
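Put together, the whole cycle of Method 3 (parse, modify, write back) can be sketched in one self-contained class. This is an illustration under assumptions, not the AddRecord2.java attachment: the class name TransformerUpdateSketch is hypothetical, the input is an in-memory string rather than user.xml, and the result goes to a StringWriter so that it can be inspected.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo class: parse, append one element, and serialize with the JAXP Transformer. */
class TransformerUpdateSketch {

    static String addElementAndSerialize(String xml, String newElementName) throws Exception {
        // Obtain the DOM tree with the ordinary DOM API.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Update the tree: append a new, empty element to the root element.
        doc.getDocumentElement().appendChild(doc.createElement(newElementName));

        // A Transformer created without a stylesheet copies its input to its output unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

To update a file in place, the StreamResult would instead be built from a File object for the original document, as in the fragment above.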
The OutputKeys class
The javax.xml.transform.OutputKeys class, used together with java.util.Properties, controls the format of the XML documents that the JAXP XSLT engine (Transformer) outputs. See the following code fragment:
// First create a TransformerFactory object, then use it to create a Transformer object.
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
// Get the output properties of the Transformer object, that is, the default output properties of the
// XSLT engine. This is a java.util.Properties object.
Properties properties = t.getOutputProperties();
// Set a new output property: the output character encoding is GB2312, which supports Chinese characters, so
// XML documents output by the XSLT engine that contain Chinese characters display correctly, without the so-called "Chinese character problem".
// Note the string constant OutputKeys.ENCODING of the OutputKeys class.
properties.setProperty(OutputKeys.ENCODING, "GB2312");
// Update the output properties of the XSLT engine.
t.setOutputProperties(properties);
// Call the XSLT engine to output the contents of the DOM tree to the output medium according to the output properties set above.
t.transform(DOMSource_Object, StreamResult_Object);
From the code above it is easy to see that by setting the output properties of the XSLT engine (Transformer) we can control the output format of the DOM tree's contents, which is very helpful for customizing the output. Which output properties of the JAXP XSLT engine (Transformer) can be set? javax.xml.transform.OutputKeys defines a number of string constants, each of which names an output property that can be set. The common output properties are as follows:
public static final java.lang.String METHOD
Can be set to values such as "xml", "html", or "text".
public static final java.lang.String VERSION
The version of the standard being followed. If METHOD is set to "xml", this value should be "1.0"; if METHOD is set to "html", it should be "4.0"; if METHOD is set to "text", this output property is ignored.
public static final java.lang.String ENCODING
The encoding used for the output, such as "GB2312" or "UTF-8". Setting it to "GB2312" solves the so-called "Chinese character problem".
public static final java.lang.String OMIT_XML_DECLARATION
Whether to omit the XML declaration when outputting to an XML document, that is, a line similar to:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
Its possible values are "yes" and "no".
public static final java.lang.String INDENT
Whether the XSLT engine automatically adds extra whitespace when outputting an XML document. Its possible values are "yes" and "no".
public static final java.lang.String MEDIA_TYPE
The MIME type of the output document.
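The effect of some of these properties can be observed with a short sketch. It uses Transformer.setOutputProperty(), the single-key counterpart of setOutputProperties(); the class name OutputKeysSketch and the sample XML are assumptions made for illustration. The exact whitespace produced by INDENT varies between JAXP implementations and versions, so the sketch makes no claim about it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo class: the same DOM tree serialized with different output properties. */
class OutputKeysSketch {

    static String serialize(String xml, boolean omitDeclaration, boolean indent) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        // "yes" drops the <?xml ...?> line; "no" keeps it.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, omitDeclaration ? "yes" : "no");
        // "yes" lets the engine add line breaks and possibly indentation.
        transformer.setOutputProperty(OutputKeys.INDENT, indent ? "yes" : "no");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<users><user>alice</user></users>";
        System.out.println(serialize(xml, false, false)); // with declaration, compact
        System.out.println(serialize(xml, true, true));   // without declaration, indented
    }
}
```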
How do you set the output properties of the XSLT engine? To summarize:
First, obtain the default set of output properties of the XSLT engine (Transformer). This uses the getOutputProperties() method of the Transformer class, whose return value is a java.util.Properties object.
Properties properties = transformer.getOutputProperties();
Then set new output properties, for example:
properties.setProperty(OutputKeys.ENCODING, "GB2312");
properties.setProperty(OutputKeys.METHOD, "html");
properties.setProperty(OutputKeys.VERSION, "4.0");
......
Finally, update the set of output properties of the XSLT engine (Transformer). This uses the setOutputProperties() method of the Transformer class, whose parameter is a java.util.Properties object.
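The three steps above can be sketched as one runnable method. The class name EncodingPropertiesSketch is hypothetical, and the encoding is passed in as a parameter: "GB2312" can be used where the Java runtime supports that charset, while the usage line below assumes "ISO-8859-1" only because every Java runtime is required to support it.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo class: get the output properties, change the encoding, and set them back. */
class EncodingPropertiesSketch {

    static String serializeWithEncoding(String xml, String encoding) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Transformer transformer = TransformerFactory.newInstance().newTransformer();

        // Step 1: obtain the current (default) output properties.
        Properties properties = transformer.getOutputProperties();
        // Step 2: set the new output property.
        properties.setProperty(OutputKeys.ENCODING, encoding);
        // Step 3: hand the updated set back to the XSLT engine.
        transformer.setOutputProperties(properties);

        // Serialize to bytes so that the engine, not a Writer, applies the chosen encoding.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(bytes));
        return new String(bytes.toByteArray(), encoding);
    }
}
```

For example, serializeWithEncoding("<users/>", "ISO-8859-1") should return a document whose XML declaration names ISO-8859-1 as its encoding.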
We wrote a new program (AddRecord3.java) that uses the OutputKeys class to control the output properties of the XSLT engine. Its structure is roughly the same as that of the previous program, but its output is slightly different. For the complete code, see the following files: AddRecord3.java (see attachment) and user.xml (see attachment). The example was run on Windows XP Professional with JDK 1.3.1. To compile and run AddRecord3.java, you need to download and install JAXP 1.1 or the Java XML Pack from http://java.sun.com (the Java XML Pack includes JAXP).
Method 4: Using the Xalan XML Serializer
Method 4 is a variant of Method 3. It requires Apache Xalan and Apache Xerces to run. Example code:
// First create a DOMSource object; the constructor argument can be a Document object.
// doc represents the modified DOM tree.
DOMSource domSource = new DOMSource(doc);
// Create a DOMResult object to hold the output of the XSLT engine temporarily.
DOMResult domResult = new DOMResult();
// Call the JAXP XSLT engine to output the data in the DOM tree.
// The input of the XSLT engine is the DOMSource object and the output is the DOMResult object.
try
{
    // First create a TransformerFactory object, then use it to create a Transformer object. The Transformer
    // class is equivalent to an XSLT engine. Usually we use it to process XSL files, but here we
    // use it to output an XML document.
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer t = tf.newTransformer();
    // Set the output properties of the XSLT engine (required, otherwise the "Chinese character problem" occurs).
    Properties properties = t.getOutputProperties();
    properties.setProperty(OutputKeys.ENCODING, "GB2312");
    t.setOutputProperties(properties);
    // The key step: call the transform() method of the Transformer object (the XSLT engine). The first
    // argument is the DOMSource object and the second is the DOMResult object.
    t.transform(domSource, domResult);
    // Create a default Xalan XML Serializer. It writes the content held temporarily in the DOMResult
    // object (domResult) to the output medium as an output stream.
    Serializer serializer = SerializerFactory.getSerializer
        (OutputProperties.getDefaultMethodProperties("xml"));
    // Set the output properties of the Xalan XML Serializer. This step is required, otherwise
    // the so-called "Chinese character problem" may occur.
    Properties prop = serializer.getOutputFormat();
    prop.setProperty("encoding", "GB2312");
    serializer.setOutputFormat(prop);
    // Create a File object representing the output medium for the data in the DOM tree; here it is an XML file.
    File f = new File("xuser3.xml");
    // Create the file output stream object fos; note the constructor argument.
    FileOutputStream fos = new FileOutputStream(f);
    // Set the output stream of the Xalan XML Serializer.
    serializer.setOutputStream(fos);
    // Serialize the result.
    serializer.asDOMSerializer().serialize(domResult.getNode());
}
catch (Exception tce)
{
    tce.printStackTrace();
}
This method is not commonly used and seems somewhat superfluous, so it is not discussed further here. For a complete example, see the following files: AddRecord4.java (see attachment) and user.xml (see attachment). The example was run on Windows XP Professional with JDK 1.3.1. To compile and run AddRecord4.java, you need to download and install Apache Xalan and Apache Xerces from http://xml.apache.org/dist/,
or download and install the Java XML Pack from http://java.sun.com/xml/download.html, because the latest Java XML Pack (Winter 01 edition) includes Apache Xalan and Apache Xerces.
Conclusion:
This article has briefly discussed four methods for updating XML documents in Java programs. The first method reads and writes the XML document directly. It is tedious and error-prone and is rarely used; unless you need to develop your own XML parser, you are unlikely to use it. The second method uses the XmlDocument class of Apache Crimson. It is very simple to use, and if you have chosen Apache Crimson as your XML parser you may as well use it. However, it does not seem very efficient (a consequence of the low efficiency of Apache Crimson), and later versions of JAXP, the Java XML Pack, and JWSDP do not directly support Apache Crimson, so the method is not universally applicable. The third method uses the JAXP XSLT engine (Transformer) to output the XML document. It is arguably the standard method, it is very flexible, and in particular it makes the output format easy to control; this is the method we recommend. The fourth method is a variant of the third. It uses the Xalan XML Serializer and introduces serialization, which gives it an advantage when modifying or outputting a large number of documents. Unfortunately, it requires setting the output properties of both the XSLT engine and the XML Serializer, which is rather tedious, and it depends on Apache Xalan and Apache Xerces, which makes it somewhat less universal.
Besides the four methods discussed above, other APIs (such as JDOM, Castor, XML4J, and Oracle XML Parser V2) also offer many ways to update XML documents. Space is limited, so they are not discussed one by one here.
Tags: XML