How to Modify XML File in Java

1. Introduction

Let us learn how to modify an XML file to remove unwanted information.

One method to remove XML nodes is to use the XML DOM API to search the XML structure and remove unwanted nodes. While this sounds easy, using the DOM API is quite hard for anything more than trivial searches, as this article demonstrates.

An easier method to navigate and remove unwanted Nodes is to use XPath. Even complex search and removal is quite easy as we shall see.

See this article for details on parsing an XML file to obtain the XML Document.

2. Using removeChild() to remove Nodes

Once a particular node is identified for removal, it can be removed quite easily by invoking removeChild() on the parent Node.

static private void removeNode(Node node) {
  // a node without a parent is already detached from the document
  Node parent = node.getParentNode();
  if ( parent != null ) parent.removeChild(node);
}
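
To see removeChild() working end to end, here is a minimal sketch that parses a small inline document, removes one node, and counts what is left. The class name RemoveNodeSketch, the helper remainingAfterRemoval(), and the two-airport sample XML are illustrative assumptions and not part of the TSA data set.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class RemoveNodeSketch {
    // Detach a node from its parent, if it has one.
    static void removeNode(Node node) {
        Node parent = node.getParentNode();
        if (parent != null) parent.removeChild(node);
    }

    // Parse a tiny hypothetical document, remove the first <airport>,
    // and report how many <airport> elements remain.
    static int remainingAfterRemoval() {
        String xml = "<airports>"
            + "<airport><shortcode>MOB</shortcode></airport>"
            + "<airport><shortcode>PBI</shortcode></airport>"
            + "</airports>";
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            removeNode(document.getElementsByTagName("airport").item(0));
            return document.getElementsByTagName("airport").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("sample XML failed to parse", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(remainingAfterRemoval()); // prints 1
    }
}
```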

2.1 Saving the Modified XML Document

After the required modifications are done, the XML Document can be saved by using a Transformer.

Initialize the Transformer as shown:

Transformer tform = TransformerFactory.newInstance().newTransformer();
tform.setOutputProperty(OutputKeys.INDENT, "yes");
tform.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

Save the modified XML document quite easily using the transformer instance.

tform.transform(new DOMSource(document), new StreamResult(System.out));
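
The same Transformer call can target something other than System.out. The sketch below, under the assumption that you want the XML as a String, writes into a StringWriter. The class name SaveXmlSketch, the method names, and the one-element sample document are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SaveXmlSketch {
    // Serialize a DOM document into a String instead of printing it.
    static String toXmlString(Document document) {
        try {
            Transformer tform = TransformerFactory.newInstance().newTransformer();
            tform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tform.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    // Build a one-element document in memory (hypothetical sample data).
    static String demo() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = document.createElement("airports");
            root.setAttribute("count", "0");
            document.appendChild(root);
            return toXmlString(document);
        } catch (Exception e) {
            throw new IllegalStateException("could not build sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

To save to a file instead, a StreamResult can also be constructed from a java.io.File.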

3. Searching the XML Document

The XML data set we are using is the TSA airport and checkpoint data available here. We would like to search this data set for the airport in Mobile, AL (identified as <shortcode>MOB</shortcode> in the data set). The following code tests whether a node has a <shortcode> child whose text is MOB.

static private boolean testNode(Node node) {
    NodeList nlist = node.getChildNodes();
    for (int i = 0 ; i < nlist.getLength() ; i++) {
        Node n = nlist.item(i);
        // getLocalName() is null for non-element nodes, and for all nodes
        // unless the parser was configured to be namespace-aware
        String name = n.getLocalName();
        if ( name != null && name.equals("shortcode") ) {
            return n.getTextContent().equals("MOB");
        }
    }
    return false;
}

Collect the nodes to be removed by searching from the document root.

List<Node> nodes = new ArrayList<>();
NodeList nlist = document.getFirstChild().getChildNodes();
for (int i = 0 ; i < nlist.getLength() ; i++) {
    Node node = nlist.item(i);
    if ( testNode(node) ) nodes.add(node);
}

As you can see from the implementation of testNode(), complex XML search is hard using just the DOM API.

4. Using XPath to Find and Remove Nodes

XPath can be used to easily query for nodes within an XML document.

An initial setup process is required for using XPath to search.

XPathFactory xfact = XPathFactory.newInstance();
XPath xpath = xfact.newXPath();

Here is a method to query for nodes and remove them from the document.

static private void queryRemoveNodes(String xpathStr) throws XPathExpressionException {
    Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODESET);
    NodeList nlist = (NodeList)res;
    for (int i = 0 ; i < nlist.getLength() ; i++) {
        Node node = nlist.item(i);
        Node parent = node.getParentNode();
        if ( parent != null ) parent.removeChild(node);
    }
}

The previous example to remove the airport for Mobile, AL is written as:

queryRemoveNodes("/airports/airport[shortcode = 'MOB']");

The removed node is:

  <name>Mobile Regional</name>

Furthermore, to remove just the <checkpoints> element from the above node, use the following:

queryRemoveNodes("/airports/airport[shortcode = 'MOB']/checkpoints");

Easily remove a bunch of nodes matching an expression.

queryRemoveNodes("/airports/airport[latitude < 20]");
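
The following self-contained sketch puts the XPath setup, the query, and the removal loop together. The class name XPathRemoveSketch and the three-airport sample (with approximate, illustrative latitudes) are assumptions made for the example; only the XPath expression comes from the text above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XPathRemoveSketch {
    // Returns { number of nodes removed, number of airports remaining }.
    static int[] demo() {
        String xml = "<airports>"
            + "<airport><shortcode>MOB</shortcode><latitude>30.69</latitude></airport>"
            + "<airport><shortcode>PBI</shortcode><latitude>26.68</latitude></airport>"
            + "<airport><shortcode>GUM</shortcode><latitude>13.48</latitude></airport>"
            + "</airports>";
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nlist = (NodeList) xpath.evaluate(
                "/airports/airport[latitude < 20]", document, XPathConstants.NODESET);
            int removed = nlist.getLength();
            for (int i = 0; i < removed; i++) {
                Node node = nlist.item(i);
                Node parent = node.getParentNode();
                if (parent != null) parent.removeChild(node);
            }
            int remaining = document.getElementsByTagName("airport").getLength();
            return new int[] { removed, remaining };
        } catch (Exception e) {
            throw new IllegalStateException("XPath removal failed", e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " removed, " + r[1] + " remaining");
    }
}
```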


There are two ways of removing nodes from an XML document. The direct method is to search for nodes using the DOM API and remove them. An easier way is to use XPath to query and remove the nodes matching even complex queries.

How to Extract Data from XML in Java

1. Introduction

In a previous article, we looked into parsing an XML file and converting it to DOM (Document Object Model). The XML DOM object itself is not very useful in an application unless it can be used to extract required data. In this article, let us see how to extract data from XML in Java.

We demonstrate two approaches to extracting data from the XML document. One is a straightforward navigation of the DOM structure to extract fragments of data. Another way is to use XPath to describe and extract the exact information needed with an expression.

2. Accessing the XML Root Element

The most commonly used class in the DOM API is the Node class. All other types of XML artifacts are represented as a Node. These include elements, attributes, text within elements, CDATA, etc.

The most common type of Node we will be concerned with is the element. An element node has attributes, zero or more child elements, text nodes, etc.

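A Document is a special type of Node which is obtained as a result of parsing the XML. Use the getFirstChild() method of a Document to get the XML root element. Note that getFirstChild() returns the root element only when no comment, processing instruction, or DOCTYPE precedes it; getDocumentElement() always returns the root element.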

Node rootElement = document.getFirstChild();
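
As a small illustration of how the first child and the root element can differ, the sketch below parses a document that starts with a comment. The class name RootElementSketch and the sample XML are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class RootElementSketch {
    // Returns { name of the first child, name of the document element }.
    static String[] rootNames() {
        // Hypothetical input: a comment precedes the root element.
        String xml = "<!-- exported data --><airports/>";
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return new String[] {
                document.getFirstChild().getNodeName(),      // the comment node
                document.getDocumentElement().getNodeName()  // the root element
            };
        } catch (Exception e) {
            throw new IllegalStateException("sample XML failed to parse", e);
        }
    }

    public static void main(String[] args) {
        String[] names = rootNames();
        System.out.println(names[0] + " vs " + names[1]); // prints "#comment vs airports"
    }
}
```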

3. Accessing XML Element Children

Access the list of children of an element with the getChildNodes() method. It returns a list of child nodes including elements, text nodes, CDATA sections, comments, etc. The list can be processed like this:

NodeList nlist = node.getChildNodes();
for (int i = 0 ; i < nlist.getLength() ; i++) {
    Node child = nlist.item(i);
    // process the child node here
}

A child element and its text contents can be checked as follows. Here the child element <shortcode>PBI</shortcode> is selected for processing.

String name = child.getLocalName();
if ( name != null && name.equals("shortcode") ) {
    if ( child.getTextContent().equals("PBI") ) {
        // process element here
    }
}

4. Generating XML Output

Print the whole XML fragment from a node once it is selected. This includes all the child nodes, text, attributes, etc.

Create a Transformer object from the factory object:

Transformer tform = TransformerFactory.newInstance().newTransformer();

Pretty-printing the XML helps in visualizing the structure. You can enable pretty-printing as shown. Here an indentation of 2 spaces is being specified.

tform.setOutputProperty(OutputKeys.INDENT, "yes");
tform.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

And generate the XML output from a Node object for printing:

tform.transform(new DOMSource(node), new StreamResult(System.out));

5. A More Complex Example

Let us look at a more complex example of XML data extraction with some real-world data. The XML data set we are using is the publicly available TSA airport and checkpoint data available here (warning: large file download). This data includes airport information including GPS coordinates and checkpoints.

Let us search this XML data set for information within specified GPS coordinates: locate airports within latitudes range of (25, 30), longitude range of (-90, -80). We search for matching nodes from the root node of the XML.

List<Node> res = new ArrayList<>();
NodeList nlist = rootNode.getChildNodes();
for (int i = 0 ; i < nlist.getLength() ; i++) {
    Node node = nlist.item(i);
    NodeList children = node.getChildNodes();
    boolean foundLat = false, foundLong = false;
    for (int j = 0 ; j < children.getLength() ; j++) {
        Node child = children.item(j);
        String name = child.getLocalName();
        if ( name == null ) continue;  // skip text nodes, comments, etc.
        if ( name.equals("latitude") ) {
            float lat = Float.parseFloat(child.getTextContent());
            if ( lat > 25 && lat < 30 ) foundLat = true;
        } else if ( name.equals("longitude") ) {
            float lng = Float.parseFloat(child.getTextContent());
            if ( lng > -90 && lng < -80 ) foundLong = true;
        }
    }
    if ( foundLat && foundLong ) res.add(node);
}

The code above loops through all elements under the root node and selects those children which match the specified conditions: latitude between (25, 30) and longitude between (-90, -80).

As you can see, the code is quite complex and prone to errors. And this is just for finding nodes for some rather simple conditions.

6. Using XPath to Extract Information

Java provides an XPath API which can be used in conjunction with the XML DOM to extract information from XML in an easy manner. XPath is initialized as follows:

XPathFactory xfact = XPathFactory.newInstance();
XPath xpath = xfact.newXPath();

To extract possibly multiple nodes which match an XPath expression, the following method can be used.

Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODESET);

If you know that a single node will match the expression, you can use this method instead.

Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODE);

Maybe you are trying to extract application configuration information from XML? In that case, you might prefer fetching String values in a single call.

String value = xpath.evaluate(xpathStr, document);
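
The three evaluate() variants can be compared side by side in a short sketch. The class name XPathEvaluateSketch and the two-airport sample document are assumptions for illustration; the calls themselves are the ones described above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XPathEvaluateSketch {
    static String demo() {
        String xml = "<airports>"
            + "<airport><shortcode>PBI</shortcode><city>West Palm Beach</city></airport>"
            + "<airport><shortcode>MOB</shortcode><city>Mobile</city></airport>"
            + "</airports>";
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // NODESET: every matching node
            NodeList all = (NodeList) xpath.evaluate(
                "/airports/airport", document, XPathConstants.NODESET);
            // NODE: a single matching node
            Node one = (Node) xpath.evaluate(
                "/airports/airport[shortcode = 'PBI']", document, XPathConstants.NODE);
            // No return type: the string value of the first match
            String city = xpath.evaluate(
                "/airports/airport[shortcode = 'PBI']/city", document);

            return all.getLength() + " " + one.getNodeName() + " " + city;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "2 airport West Palm Beach"
    }
}
```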

7. Some Examples

To compare with the earlier examples, let us find the airport node where <shortcode> equals “PBI”:

String xpathStr = "/airports/airport[shortcode = 'PBI']";
Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODESET);

Results in the output shown below (partially):

  <city>West Palm Beach</city>

And here is the second example: find airports with latitude between (25, 30) and longitude between (-90, -80).

String xpathStr = "/airports/airport[latitude > 25 and latitude < 30 and longitude > -90 and longitude < -80]";
Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODESET);


This article demonstrated a couple of ways of extracting data from XML documents. A direct way is to navigate the DOM structure and perform the extraction. This is error prone and sensitive to changes in XML structure. An easier way is to use XPath expression search to extract required information.

How do I Create a Java String from the Contents of a File?

1. Introduction

Here we present a few ways to read the whole contents of a file into a String in Java.

2. Using java.nio.file in Java 7+

It is quite simple to load the whole file into a String using the java.nio.file package:

String str = new String(Files.readAllBytes(Paths.get(pathname)), StandardCharsets.UTF_8);

Here is how the code works. Read all the bytes into a byte buffer using the Files and Paths available in the java.nio.file package.

byte[] buf = Files.readAllBytes(Paths.get(pathname));

Convert the byte buffer into a String by specifying the character set.

String str = new String(buf, StandardCharsets.UTF_8);

3. Scan for end-of-input

Another way to read the whole file into a String is to use the Scanner class. Create a scanner with the file as input, set the appropriate delimiter and read the next token.

Note: the actual delimiter used in the code below is the beginning-of-input marker which will not match anywhere other than the beginning of input.

Scanner scanner = null;
try {
    scanner = new Scanner(new File(pathname), "UTF-8");
    return scanner.useDelimiter("\\A").next();
} finally {
    if ( scanner != null ) scanner.close();
}
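
One detail worth guarding against: next() throws NoSuchElementException when the input is empty. The sketch below shows the same delimiter trick with a hasNext() check. It reads from a String rather than a file to stay self-contained; the class name ScannerReadSketch and the method readAll() are hypothetical.

```java
import java.util.Scanner;

class ScannerReadSketch {
    // Read everything from the source as a single token.
    // hasNext() guards against empty input, where next() would throw
    // NoSuchElementException.
    static String readAll(String source) {
        try (Scanner scanner = new Scanner(source)) {
            scanner.useDelimiter("\\A");
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        System.out.println(readAll("line one\nline two"));
        System.out.println(readAll("").isEmpty()); // prints true
    }
}
```

For a file, the Scanner would be created from new File(pathname) with a charset name, as in the snippet above.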

4. Memory Mapped File Reading

This method maps the file contents directly into memory using the MappedByteBuffer class. Memory mapping the contents directly might lead one to expect enhanced performance. However this advantage is only available if the buffer is used directly. In our case, since we are creating a String from the contents of the file, the speed advantage of the memory mapped buffers is probably not visible.

static private String readFile3(String pathname) throws IOException {
    File f = new File(pathname);
    // try-with-resources closes the file once the contents are decoded
    try (RandomAccessFile file = new RandomAccessFile(f, "r")) {
        MappedByteBuffer buffer = file.getChannel().map(MapMode.READ_ONLY, 0, f.length());
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }
}

ByteBuffer provides a method asCharBuffer() which returns a “view” of the byte buffer as a character buffer. However, there is no way to specify the encoding for converting bytes to characters with this method (probably an oversight in the Java API). The correct way to convert a ByteBuffer to a CharBuffer is to use Charset.decode(ByteBuffer) with the appropriate Charset instance.
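
The difference between the two conversions is easy to see on a few bytes. In this sketch (class and method names are hypothetical), asCharBuffer() pairs up raw bytes into 16-bit chars, while Charset.decode() interprets them in the given encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class DecodeSketch {
    // Decode the bytes as UTF-8 text.
    static String viaDecode(byte[] bytes) {
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Reinterpret each pair of bytes as one 16-bit char; no charset
    // decoding takes place.
    static String viaAsCharBuffer(byte[] bytes) {
        return ByteBuffer.wrap(bytes).asCharBuffer().toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "hi!!".getBytes(StandardCharsets.UTF_8); // 4 bytes
        System.out.println(viaDecode(bytes));                // prints "hi!!"
        System.out.println(viaAsCharBuffer(bytes).length()); // prints 2
    }
}
```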

5. Simple Way Using FileReader

Of course, there is always the “old” way (pre-Java 1.7) of reading a whole file into a String: reading the characters in a loop and appending to a buffer.

static private String readFile4(String pathname) throws IOException {
    FileReader in = null;
    try {
        // FileReader(String) decodes using the platform default charset
        in = new FileReader(pathname);
        char[] buf = new char[2048];
        int len;
        StringBuilder sbuf = new StringBuilder();
        while ((len = in.read(buf, 0, buf.length)) != -1) {
            sbuf.append(buf, 0, len);
        }
        return sbuf.toString();
    } finally {
        if ( in != null ) in.close();
    }
}

6. Benchmarking Various Approaches

Since we have several methods of reading a whole file into a string, it is interesting to see how the methods stack up against one another in performance. To this end, we implemented a simple benchmarking method using System.currentTimeMillis(). The following is the average time for each run over 1000 runs of the method.

simple       235 ms for 1000 iters: 0.235000 ms/op
nio          213 ms for 1000 iters: 0.213000 ms/op
scanner      629 ms for 1000 iters: 0.629000 ms/op
mmap         285 ms for 1000 iters: 0.285000 ms/op

For the set of conditions under which the application ran, we can conclude that the NIO method is the fastest followed by the Simple method. Slowest is the Scanner method which is somewhat expected since a regular expression search is involved. A note of warning: do not use these benchmark numbers to pick the method to use. Rather use the method closest to your paradigm of problem solving.


You are now aware of various methods of reading the whole contents of a file into a String. Pick whatever suits you best and use it!

How to Convert UTF-16 Text File to UTF-8 in Java?

1. Introduction

In this article, we show how to convert a text file from UTF-16 encoding to UTF-8. Such a conversion might be required because certain tools can only read UTF-8 text. Furthermore, the conversion procedure demonstrated here can be applied to convert a text file from any supported encoding to another.

UTF-8 is a character encoding that can represent all characters (or code points) defined by Unicode. It is designed to be backward compatible with legacy encodings such as ASCII.

UTF-16 is another character encoding that encodes characters in one or two 16-bit code units whereas UTF-8 encodes characters in a variable number of 8-bit code units.

2. Supported Character Sets

You can find the character sets supported by the JVM using the class java.nio.charset.Charset as follows:

for (Map.Entry<String,Charset> e : Charset.availableCharsets().entrySet()) {
    System.out.println(e.getKey());
}

// prints the canonical name of each available charset, one per line

3. Conversion Using java.io Classes

Java provides the InputStreamReader class as a bridge from byte streams to character streams. Open the file using this class to be able to read character buffers in the specified encoding:

Reader in = new InputStreamReader(new FileInputStream(infile), "UTF-16");

Analogously, the OutputStreamWriter class acts as a bridge from character streams to byte streams. Create a Writer with this class to be able to write bytes to the file:

Writer out = new OutputStreamWriter(new FileOutputStream(outfile), "UTF-8");

With the Reader and Writer in place, it is trivial to copy data from the input file to the output file:

char cbuf[] = new char[2048];
int len;
while ((len = in.read(cbuf, 0, cbuf.length)) != -1) {
    out.write(cbuf, 0, len);
}
out.close();  // flushes any buffered characters to the file
in.close();

And that’s it! You have successfully read and converted data from UTF-16 to UTF-8. You can use this code to perform the conversion between any two character sets supported by your JVM.
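
Putting the pieces together, a complete conversion might look like the sketch below. It uses try-with-resources so both streams are closed even on failure, and Charset constants instead of charset names. The class name ConvertEncodingSketch, the method names, and the temporary files used in roundTrip() are assumptions for the example.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ConvertEncodingSketch {
    // Copy infile to outfile, decoding with 'from' and encoding with 'to'.
    static void convert(Path infile, Charset from, Path outfile, Charset to) {
        try (Reader in = new InputStreamReader(Files.newInputStream(infile), from);
             Writer out = new OutputStreamWriter(Files.newOutputStream(outfile), to)) {
            char[] cbuf = new char[2048];
            int len;
            while ((len = in.read(cbuf, 0, cbuf.length)) != -1) {
                out.write(cbuf, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Write text as UTF-16 to a temp file, convert it, read it back as UTF-8.
    static String roundTrip(String text) {
        try {
            Path in = Files.createTempFile("utf16-", ".txt");
            Path out = Files.createTempFile("utf8-", ".txt");
            Files.write(in, text.getBytes(StandardCharsets.UTF_16));
            convert(in, StandardCharsets.UTF_16, out, StandardCharsets.UTF_8);
            String result = new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
            Files.delete(in);
            Files.delete(out);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("caf\u00e9").equals("caf\u00e9")); // prints true
    }
}
```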

4. Using String for Converting Bytes

Sometimes, you may have a byte array which you need converted and output in a specific encoding. You can use the String class for these cases as shown below. First convert the byte array into a String:

String str = new String(bytes, 0, len, "UTF-16");

Next, obtain the bytes in the required encoding by using the String.getBytes(String) method:

byte[] outbytes = str.getBytes("UTF-8");

Write the byte array to an OutputStream:

OutputStream out = new FileOutputStream(outfile);
out.write(outbytes);
out.close();

Note that while you could use the String class as shown to convert bytes, you should prefer using Reader/Writer combination when possible to avoid problems with multi-byte characters. Specifically, the byte array you have read may contain an incomplete multi-byte character at the beginning or the end. This may lead to character encoding errors.
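
The multi-byte problem can be reproduced in a few lines. In this sketch (names are hypothetical), the UTF-8 bytes of a short word are decoded in two chunks; when the split falls inside the two-byte character, the decoder substitutes the replacement character U+FFFD.

```java
import java.nio.charset.StandardCharsets;

class SplitBytesSketch {
    // Decode the byte array as two separate chunks split at the given index.
    static String decodeInChunks(byte[] bytes, int split) {
        String head = new String(bytes, 0, split, StandardCharsets.UTF_8);
        String tail = new String(bytes, split, bytes.length - split, StandardCharsets.UTF_8);
        return head + tail;
    }

    public static void main(String[] args) {
        // "caf\u00e9" is 5 bytes in UTF-8: 63 61 66 C3 A9
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // Split on a character boundary: the text survives.
        System.out.println(decodeInChunks(bytes, 3).equals("caf\u00e9")); // prints true
        // Split inside the two-byte character: the text is corrupted.
        System.out.println(decodeInChunks(bytes, 4).equals("caf\u00e9")); // prints false
    }
}
```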


When you need to convert text from one character encoding to another in Java, you have several options:

  • Using InputStreamReader and OutputStreamWriter bridge classes for conversion.
  • Using the String class directly with specified encoding.
  • A more advanced option is to use the CharsetEncoder and CharsetDecoder classes (not presented in this article).

Difference Between HashMap and Hashtable in Java

1. Introduction

Java provides several ways of storing key-value maps (also known as dictionaries). The most common ones are java.util.HashMap and java.util.Hashtable. Let us explore the difference between these two classes.

2. Synchronization

Synchronization is a mechanism in Java for preventing multiple threads from interfering with each other and eliminating memory consistency errors.

When one variable (a resource) is visible to multiple threads at the same time, consistency issues arise when one thread attempts to modify the value while another thread is accessing it. To prevent these issues, some form of synchronization must be used.

While synchronization helps in eliminating consistency errors, it adds an overhead when used. In a single-threaded program (or when you can guarantee that a single thread will access the resource), you can use HashMap to eliminate this overhead. Create a HashMap as follows:

HashMap<String,Object> map = new HashMap<>();
map.put("currentTime", new Date());

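However, when accessing or modifying a dictionary shared between multiple threads, you need a synchronized map such as Hashtable (wrapping a HashMap with Collections.synchronizedMap(), shown in the next section, is an alternative). The following shows how to create a Hashtable: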

Hashtable<String,Integer> tbl = new Hashtable<>();
tbl.put("count", 32);

3. Using Null Keys or Values

When your dictionary needs to contain null keys or values, you cannot use Hashtable since this is not allowed. You must use HashMap in this case.

If you need multiple threads reading or writing the HashMap, you can wrap the HashMap using Collections.synchronizedMap() as follows:

Map<String,Object> map = Collections.synchronizedMap(new HashMap<>());

A HashMap can contain one null key and any number of nulls for values.
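
A quick way to confirm the difference is to attempt the insertion and watch for the NullPointerException. The class name NullKeySketch and its helper methods are illustrative.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeySketch {
    // True if the map accepts a null key.
    static boolean acceptsNullKey(Map<String, Object> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // True if the map accepts a null value.
    static boolean acceptsNullValue(Map<String, Object> map) {
        try {
            map.put("key", null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<>()));     // prints true
        System.out.println(acceptsNullKey(new Hashtable<>()));   // prints false
        System.out.println(acceptsNullValue(new HashMap<>()));   // prints true
        System.out.println(acceptsNullValue(new Hashtable<>())); // prints false
    }
}
```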

4. Predictable Iteration Order

A subclass of HashMap is LinkedHashMap which maintains a doubly-linked list of the entries in the Map. This allows traversal of the Map entries in a predictable order (in the order that the entries were inserted into the Map). If you need such a predictable ordering of the entries, then you can easily replace the HashMap with a LinkedHashMap as follows:

HashMap<String,Object> map = new LinkedHashMap<>();

When using a Hashtable, such a predictable iteration order is not possible. If this is required, use a LinkedHashMap with a Collections.synchronizedMap() wrapper as above.
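
A minimal sketch of that combination follows. The class name OrderedSyncMapSketch and the sample keys are hypothetical. Note that iteration over a synchronizedMap wrapper has to be done while holding the wrapper's lock, as the Collections.synchronizedMap() documentation requires.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

class OrderedSyncMapSketch {
    static String keysInOrder() {
        Map<String, Integer> map = Collections.synchronizedMap(new LinkedHashMap<>());
        map.put("zulu", 1);
        map.put("alpha", 2);
        map.put("mike", 3);
        // Individual calls are synchronized, but iteration must be guarded
        // manually by locking on the wrapper itself.
        synchronized (map) {
            return String.join(",", map.keySet());
        }
    }

    public static void main(String[] args) {
        System.out.println(keysInOrder()); // prints "zulu,alpha,mike"
    }
}
```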

5. Iterating using Enumeration

While both HashMap and Hashtable support iteration over the entries using entrySet(), Hashtable also provides an Enumeration of the values using the Hashtable.elements() method. In addition, the Hashtable.keys() method returns an Enumeration over the keys of the Hashtable.

Hashtable<String,Object> tbl = ...;
for (Enumeration<String> keys = tbl.keys() ; keys.hasMoreElements() ; ) {
    String key = keys.nextElement();
    // process the key here
}


Here is how you can decide when to use HashMap or Hashtable:

  • For using as a shared resource between multiple threads in a single program, a Hashtable is preferred.
  • When the dictionary needs to contain null keys or values, a HashMap must be used.
  • A HashMap can be used in a multi-threaded environment by wrapping it with Collections.synchronizedMap().

Converting String to Int in Java

1. Introduction

There are several ways of converting a String to an integer in Java. We present a few of them here with a discussion of the pros and cons of each.

2. Integer.parseInt()

Integer.parseInt(String) can parse a string and return a plain int value. The string value is parsed assuming a radix of 10. The string can contain “+” and “-” characters at the start to indicate a positive or negative number.

int value = Integer.parseInt("25"); // returns 25

int value = Integer.parseInt("-43"); // returns -43

int value = Integer.parseInt("+9061"); // returns 9061

Illegal characters within the string (including the period “.”) result in a NumberFormatException. Additionally, a string containing a number larger than Integer.MAX_VALUE (2^31 - 1) also results in a NumberFormatException.

// throws NumberFormatException -- contains "."
int value = Integer.parseInt("25.0");

// throws NumberFormatException -- contains text
int value = Integer.parseInt("93hello");

// throws NumberFormatException -- too large
int value = Integer.parseInt("2367423890");

To explicitly specify the radix, use Integer.parseInt(String,int) and pass the radix as the second argument.

// returns 443
int value = Integer.parseInt("673", 8);

// throws NumberFormatException -- contains character "9" not valid for radix 8
int value = Integer.parseInt("9061", 8);

// returns 70966758 -- "h" and "i" are valid characters for radix 20.
int value = Integer.parseInt("123aghi", 20);

3. Integer.valueOf()

The static method Integer.valueOf() works similarly to Integer.parseInt(), except that it returns an Integer object rather than an int value.

// returns 733
Integer value = Integer.valueOf("733");

Use this method when you need an Integer object rather than a bare int. This method invokes Integer.parseInt(String) and creates an Integer from the result.

4. Integer.decode()

For parsing an integer starting with these prefixes: “0” for octal, “0x”, “0X” and “#” for hex, you can use the method Integer.decode(String). An optional “+” or “-” sign can precede the number. All the following formats are supported by this method (“opt” marks the sign as optional):

Sign(opt) DecimalNumeral
Sign(opt) 0x HexDigits
Sign(opt) 0X HexDigits
Sign(opt) # HexDigits
Sign(opt) 0 OctalDigits

As with Integer.valueOf(), this method returns an Integer object rather than a plain int. Some examples follow:

// returns 53
Integer value = Integer.decode("0x35");

// returns 1194684
Integer value = Integer.decode("#123abc");

// throws NumberFormatException -- value too large
Integer value = Integer.decode("#123abcdef");

// returns -231
Integer value = Integer.decode("-0347");

5. Convert Large Values into Long

When the value being parsed does not fit in an int (that is, it exceeds 2^31 - 1), a NumberFormatException is thrown. In these cases, you can use the analogous methods of the Long class: Long.parseLong(String), Long.valueOf(String) and Long.decode(String). These work similarly to their Integer counterparts but return a Long object (or a long in the case of Long.parseLong(String)). The largest value a long can hold is Long.MAX_VALUE (defined to be 2^63 - 1).

// returns 378943640350
long value = Long.parseLong("378943640350");

// returns 3935157603823
Long value = Long.decode("0x39439abcdef");
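
When it is not known in advance whether a value fits in an int, one possible approach is to test with Integer.parseInt() and fall back to Long.parseLong(). The sketch below illustrates that idea; the class name ParseFallbackSketch and the helper fitsInInt() are hypothetical.

```java
class ParseFallbackSketch {
    // True when the text is a decimal integer that fits in an int.
    static boolean fitsInInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsInInt("2147483647"));      // prints true  (Integer.MAX_VALUE)
        System.out.println(fitsInInt("2147483648"));      // prints false (one too large)
        System.out.println(Long.parseLong("2147483648")); // prints 2147483648
    }
}
```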

6. Use BigInteger for Larger Numbers

Java provides another numeric type: java.math.BigInteger with an arbitrary precision. A disadvantage of using BigInteger is that common operations like adding, subtracting, etc. require method invocation and cannot be used with operators like “+”, “-”, etc.

To convert a String to a BigInteger, just use the constructor as follows:

BigInteger bi = new BigInteger("3489534895489358943");

To specify a radix different than 10, use a different constructor:

BigInteger bi = new BigInteger("324789045345498589", 12);
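
As a brief illustration of the method-based arithmetic mentioned above, this sketch adds one to Long.MAX_VALUE without overflow and parses a hex string with the radix constructor. The class name BigIntegerSketch and its methods are hypothetical.

```java
import java.math.BigInteger;

class BigIntegerSketch {
    // Long.MAX_VALUE + 1, which would overflow a long.
    static BigInteger beyondLong() {
        BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
        return max.add(BigInteger.ONE); // method call instead of the + operator
    }

    // Parse a string in the given radix.
    static BigInteger parse(String text, int radix) {
        return new BigInteger(text, radix);
    }

    public static void main(String[] args) {
        System.out.println(beyondLong());    // prints 9223372036854775808
        System.out.println(parse("ff", 16)); // prints 255
    }
}
```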

7. Parse for Numbers Within Text

To parse for numbers interspersed with arbitrary text, you can use the java.util.Scanner class as follows:

String str = "hello123: we have a 1000 worlds out there.";
Scanner scanner = new Scanner(str).useDelimiter("\\D+");
while (scanner.hasNextInt())
  System.out.printf("(%1$d) ", scanner.nextInt());
// prints "(123) (1000)"

This method offers a powerful way of parsing for numbers, although it comes with the expense of using a regular expression scanner.


To summarize, there are various methods of converting a string to an int in Java.

  • Integer.parseInt() is the simplest and returns an int.
  • Integer.valueOf() is similar but returns an Integer object.
  • Integer.decode() can parse numbers starting with “0x” and “0” as hex and octal respectively.
  • For larger numbers, use the corresponding methods in the Long class.
  • And for arbitrary precision integers, use the BigInteger class.
  • Finally, to parse arbitrary text for numbers, we can use the java.util.Scanner class with a regular expression.

InputStream to String Conversion in Java

1. Introduction

There are several ways of converting an InputStream to a String in Java. Maybe you want to read the data and write it to a log file or do further processing. Here we look at several ways of accomplishing this task.

2. With InputStreamReader

Here is a simple implementation which uses the InputStreamReader to convert from bytes to characters. The code uses the platform default charset to decode the bytes. It reads input in chunks and appends the converted string to a StringBuilder.

private static String inputStreamToString(InputStream in) throws IOException {
    BufferedReader br = null;
    try {
        // no charset given: bytes are decoded with the platform default
        InputStreamReader isr = new InputStreamReader(in);
        br = new BufferedReader(isr);
        char cbuf[] = new char[2048];
        int len;
        StringBuilder sbuf = new StringBuilder();
        while ((len = br.read(cbuf, 0, cbuf.length)) != -1)
            sbuf.append(cbuf, 0, len);
        return sbuf.toString();
    } finally {
        if ( br != null ) br.close();
    }
}

3. Character Set Conversion

When converting input from a character set that is different from the platform default, you must specify the character set as follows:

private static String inputStreamToString(InputStream in,String charsetName) throws IOException {
    BufferedReader br = null;
    try {
        InputStreamReader isr = new InputStreamReader(in, charsetName);
        br = new BufferedReader(isr);
        char cbuf[] = new char[2048];
        int len;
        StringBuilder sbuf = new StringBuilder();
        while ((len = br.read(cbuf, 0, cbuf.length)) != -1)
            sbuf.append(cbuf, 0, len);
        return sbuf.toString();
    } finally {
        if ( br != null ) br.close();
    }
}

4. Using try-with-resources

When using a JDK 1.7 or later, you can use the try-with-resources block to eliminate some boilerplate code for exception handling:

private static String inputStreamToString(InputStream in,String charsetName) throws IOException {
    try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charsetName))) {
        char cbuf[] = new char[2048];
        int len;
        StringBuilder sbuf = new StringBuilder();
        while ((len = br.read(cbuf, 0, cbuf.length)) != -1)
            sbuf.append(cbuf, 0, len);
        return sbuf.toString();
    }
}

The try-with-resources block is used to automatically close resources when the block exits (whether normally or due to an exception).

try (BufferedReader br =
      new BufferedReader(new InputStreamReader(in, charsetName))) {
    // use the resource br here
}
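
To check that the resource really is closed when the block exits with an exception, a small sketch with a stand-in resource can help. The class name AutoCloseSketch and the flag-setting resource are hypothetical and exist only to observe the close() call.

```java
class AutoCloseSketch {
    // Returns true if close() ran even though the try block threw.
    static boolean closedAfterException() {
        boolean[] closed = { false };
        // A stand-in resource whose close() just records that it was called.
        try (AutoCloseable res = () -> closed[0] = true) {
            throw new IllegalStateException("simulated failure");
        } catch (Exception e) {
            // close() has already run by the time this handler executes
        }
        return closed[0];
    }

    public static void main(String[] args) {
        System.out.println(closedAfterException()); // prints true
    }
}
```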

5. With ByteArrayOutputStream

Another option for converting InputStream to String uses the ByteArrayOutputStream. Here you can accumulate the bytes read from the InputStream and perform the final conversion to the desired character set.

private static String inputStreamToString(InputStream in,String charsetName) throws IOException {
    try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
        byte buf[] = new byte[2048];
        int len;
        while ((len = in.read(buf)) != -1) out.write(buf, 0, len);
        // decode once, after all bytes have been collected
        return out.toString(charsetName);
    }
}

6. Using Apache Commons IO

Converting an InputStream to String can be achieved in a single line by using Apache Commons IO:

private static String inputStreamToString(InputStream in,String charsetName) throws IOException {
    return IOUtils.toString(in, charsetName);
}

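If you are using Maven as your build system, add the commons-io artifact (groupId commons-io, artifactId commons-io) as a dependency in your pom.xml, choosing the version appropriate for your project.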



In this article, you learned several ways of converting an InputStream to String. Depending on your circumstances, you can pick the most appropriate one for your needs.