Java Concurrency and Executors

1. Introduction

Java provides the java.util.concurrent package, which includes many facilities to ease concurrent programming. In this article, we look at Java threads and Executors: what they are and how they work.

2. Runnable Task as a Lambda

Since Java 1.0, Java has provided a Runnable interface which must be implemented to run a task in a separate thread. A sample implementation of a Runnable task is shown below. Note that rather than creating a class implementing the Runnable interface, the code creates a Runnable lambda. The task is then executed in the main thread as well as in a second thread.

static private void example_1(String[] args) throws Exception
{
    Runnable task = () -> {
	String threadName = Thread.currentThread().getName();
	System.out.println("Hello " + threadName);
    };

    task.run();

    Thread thread = new Thread(task);
    thread.start();
    System.out.println("Done.");
}

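The output is shown below. In this run, the main thread prints "Done." before the second thread prints its greeting; the relative order of these two lines is not guaranteed. The JVM waits for the second thread to complete before the program terminates.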

Hello main
Done.
Hello Thread-0
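If the main thread must wait for the second thread before continuing, Thread.join() can be used. The following is a minimal sketch; the class name JoinExample, the method runAndJoin() and the thread name "worker" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class JoinExample {
    // Runs a task on a second thread, waits for it, and returns the
    // messages in the order they were produced.
    static List<String> runAndJoin() throws InterruptedException {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Runnable task = () -> log.add("Hello " + Thread.currentThread().getName());

        Thread thread = new Thread(task, "worker");
        thread.start();
        thread.join();      // block until the worker thread has finished
        log.add("Done.");
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        runAndJoin().forEach(System.out::println);
    }
}
```

With join() in place, "Done." is always recorded after the worker's greeting.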

3. Creating an Executor

The Executor framework, introduced in Java 5, provides an Executor interface with a single method execute(Runnable) for executing a Runnable task. As opposed to creating a Thread with a Runnable (as shown above), you can create an Executor and run one or more tasks as follows:

Executor executor = ...;
executor.execute(task1);
executor.execute(task2);

The advantage of this approach is that the Executor implementation takes care of thread creation and management. To this end, Java provides several implementations of this interface supporting various techniques such as executing tasks in a thread pool, executing tasks sequentially in a worker thread, etc.

In addition to the Executor interface, Java also provides an ExecutorService which is a sub-interface of Executor. This interface provides additional facilities over Executor including convenience methods to execute multiple tasks, submit tasks for execution, etc.

Executors is a convenience class providing factory methods to create various kinds of ExecutorService implementation objects. The method newSingleThreadExecutor() creates an ExecutorService backed by a single worker thread which executes pending tasks sequentially. A task is submitted for execution using the submit() method.

static private void example_2(String[] args) throws Exception
{
    ExecutorService esvc = Executors.newSingleThreadExecutor();
    Runnable task = () -> {
	try {
	    String threadName = Thread.currentThread().getName();
	    System.out.println("Thread " + threadName + " started");
	    TimeUnit.SECONDS.sleep(2);
	    System.out.println("Thread " + threadName + " ended");
	} catch(InterruptedException ex) {
	    System.err.println("Task interrupted.");
	}
    };

    esvc.submit(task);
}

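On running this example, the ExecutorService executes the submitted task and then keeps waiting for more tasks, so the program does not exit. The worker thread stays alive until the ExecutorService is shut down, which is why an ExecutorService should be shut down explicitly once it is no longer needed. Hit Control-C to halt the program.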
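To let the program exit on its own, the ExecutorService can be shut down once the work has been submitted. The sketch below uses shutdown() followed by awaitTermination(); the class and method names and the 5 second timeout are arbitrary choices for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShutdownExample {
    // Submits one task, requests an orderly shutdown and waits for completion.
    static boolean runAndShutdown() throws InterruptedException {
        ExecutorService esvc = Executors.newSingleThreadExecutor();
        esvc.submit(() ->
            System.out.println("Running in " + Thread.currentThread().getName()));

        esvc.shutdown();    // already submitted tasks still run; new ones are refused
        // Returns true if all tasks finished before the timeout elapsed.
        return esvc.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Terminated: " + runAndShutdown());
    }
}
```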

4. Serial Execution of Tasks

In addition to executing a Runnable, an ExecutorService can also run a Callable<?>. Unlike a Runnable, a Callable returns a value. The following class implements Callable<String>; when it is submitted to an ExecutorService, the caller receives a Future<String> from which the returned String can be obtained once the execution is completed.

static private class DelayTask implements Callable<String>
{
    private String name;
    private int secsDelay;

    public DelayTask(String name,int secsDelay) {
	this.name = name;
	this.secsDelay = secsDelay;
    }

    @Override
    public String call() {
	System.out.println(name + " started");
	try { TimeUnit.SECONDS.sleep(secsDelay); }
	catch(InterruptedException ex) {
	    System.err.println(name + ": interrupted");
	}
	System.out.println(name + " ended");
	return name;
    }
}
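The relationship between a Callable and its Future can be sketched as follows. This is a minimal illustration that uses a lambda instead of DelayTask; the names FutureExample and compute() are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class FutureExample {
    // Submits a Callable and retrieves its result through the Future.
    static String compute() throws Exception {
        ExecutorService esvc = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task =
                () -> "result from " + Thread.currentThread().getName();
            Future<String> future = esvc.submit(task);  // returns immediately
            return future.get(5, TimeUnit.SECONDS);     // blocks until the value is ready
        } finally {
            esvc.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compute());
    }
}
```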

Let us create a bunch of these tasks and execute them using an ExecutorService created from newSingleThreadExecutor(). This ExecutorService executes the tasks sequentially. At the end, we add another task to shut down the ExecutorService. The invokeAll() method blocks until all the tasks have completed and returns one Future per task, so each Future<?>.get() call returns its result without further waiting. The final task shuts down the ExecutorService cleanly, which releases its resources.

static private void example_3(String[] args) throws Exception
{
    ExecutorService esvc = Executors.newSingleThreadExecutor();
    List<Callable<String>> tasks =
	Arrays.asList(new DelayTask("task 1", 2),
		      new DelayTask("task 2", 3),
		      new DelayTask("task 3", 1),
		      () -> {
			  esvc.shutdown();
			  return "shutdown";
		      });
    esvc.invokeAll(tasks)
	.stream()
	.map(future -> {
		try { return future.get(); }
		catch(Exception ex) {
		    return "exception: " + ex.getMessage();
		}
	    })
	.forEach(System.out::println);
}

Here is the output from the code above. Notice that tasks are executed one after another followed by the shutdown task. This is because we used the newSingleThreadExecutor() method which creates an ExecutorService with that characteristic.

task 1 started
task 1 ended
task 2 started
task 2 ended
task 3 started
task 3 ended
task 1
task 2
task 3
shutdown

5. Executing Tasks with a Thread Pool

Let us now examine how a thread pool returned from newCachedThreadPool() behaves. The source code is shown below; it uses the DelayTask class defined above.

static private void example_4(String[] args) throws Exception
{
    ExecutorService esvc = Executors.newCachedThreadPool();
    List<Callable<String>> tasks =
	Arrays.asList(new DelayTask("task 1", 2),
		      new DelayTask("task 2", 3),
		      new DelayTask("task 3", 10),
		      () -> {
			  System.err.println("Requesting shutdown ..");
			  esvc.shutdown();
			  return "shutdown";
		      });
    esvc.invokeAll(tasks)
	.stream()
	.map(future -> {
		try { return future.get(); }
		catch(Exception ex) {
		    return "exception: " + ex.getMessage();
		}
	    })
	.forEach(System.out::println);
}

Here is the output from this code. Notice that a shutdown is requested of the ExecutorService after the tasks have been scheduled. The ExecutorService allows the tasks to complete execution before shutting down. It does not, however, allow any more tasks to be scheduled after shutdown() has been invoked.

task 1 started
task 2 started
task 3 started
Requesting shutdown ..
task 1 ended
task 2 ended
task 3 ended
task 1
task 2
task 3
shutdown
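What happens when a task is submitted after shutdown() can be checked with a short experiment. The sketch below relies on the default rejection policy of the pools created by the Executors factory methods, which throws RejectedExecutionException; the class and method names are made up.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

class RejectedExample {
    // Returns true if submitting after shutdown() is rejected.
    static boolean submitAfterShutdown() {
        ExecutorService esvc = Executors.newCachedThreadPool();
        esvc.shutdown();
        try {
            esvc.submit(() -> System.out.println("never runs"));
            return false;
        } catch (RejectedExecutionException ex) {
            return true;    // the pool refuses new work once shut down
        }
    }

    public static void main(String[] args) {
        System.out.println("Rejected: " + submitAfterShutdown());
    }
}
```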

6. Shutdown Thread Pool Immediately

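When it is necessary to shut down the thread pool immediately as opposed to waiting for all tasks to complete, we can use the shutdownNow() method. It attempts to stop the running tasks by interrupting their threads and discards the tasks that have not started yet.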

static private void example_5(String[] args) throws Exception
{
    ExecutorService esvc = Executors.newCachedThreadPool();
    List<Callable<String>> tasks =
	Arrays.asList(new DelayTask("task 1", 2),
		      new DelayTask("task 2", 3),
		      new DelayTask("task 3", 10),
		      () -> {
			  System.err.println("Requesting shutdown ..");
			  esvc.shutdownNow();
			  return "shutdown";
		      });
    esvc.invokeAll(tasks)
	.stream()
	.map(future -> {
		try { return future.get(); }
		catch(Exception ex) {
		    return "exception: " + ex.getMessage();
		}
	    })
	.forEach(System.out::println);
}

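As the output below shows, the tasks that have not yet completed are interrupted and the thread pool is terminated. Because DelayTask catches the InterruptedException and still returns its name, Future.get() yields a normal result for every task.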

task 1 started
task 2 started
task 3 started
Requesting shutdown ..
task 3: interrupted
task 3 ended
task 1: interrupted
task 1 ended
task 2: interrupted
task 2 ended
task 1
task 2
task 3
shutdown
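shutdownNow() also returns the list of tasks that were queued but never started. The following sketch, under the assumption of a single-threaded executor so that tasks actually queue up, counts them; a CountDownLatch makes sure the first task is running before the shutdown is requested. All names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShutdownNowExample {
    // Returns the number of queued tasks that shutdownNow() handed back.
    static int pendingAfterShutdownNow() throws InterruptedException {
        ExecutorService esvc = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        Runnable sleeper = () -> {
            started.countDown();
            try { TimeUnit.SECONDS.sleep(10); }
            catch (InterruptedException ex) { System.err.println("interrupted"); }
        };

        esvc.submit(sleeper);   // picked up by the single worker thread
        esvc.submit(sleeper);   // waits in the queue
        esvc.submit(sleeper);   // waits in the queue
        started.await();        // make sure the first task is running

        List<Runnable> pending = esvc.shutdownNow();  // interrupts the running task
        esvc.awaitTermination(5, TimeUnit.SECONDS);
        return pending.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Tasks that never started: " + pendingAfterShutdownNow());
    }
}
```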

Summary

This article provided an introduction to threading in Java as well as the Executor framework, which has been available since Java 5. The Executor framework simplifies a lot of plumbing code related to thread creation and management. It also brings capabilities such as thread pools and single worker threads that execute tasks sequentially. As the code above demonstrated, it is also possible to shut down the thread pool cleanly after all the tasks have completed.

How to Generate RSA Keys in Java

Learn how to generate RSA keys and digitally sign files in Java.

1. Introduction

Let us learn the basics of generating and using RSA keys in Java.

Java provides classes for the generation of RSA public and private key pairs in the package java.security. You can use RSA key pairs in public key cryptography.

Public key cryptography uses a pair of keys: a public key and a private key. Distribute the public key to whoever needs it, but keep the private key secret and safely stored.

Public key cryptography can be used in two modes:

Encryption: Only the private key can decrypt the data encrypted with the public key.

Authentication: Data encrypted with the private key can only be decrypted with the public key thus proving who the data came from.

2. Generating a Key Pair

The first step in creating an RSA key pair is to create a KeyPairGenerator from a factory method by specifying the algorithm (“RSA” in this instance):

KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");

Initialize the KeyPairGenerator with the key size. Use a key size of at least 2048 bits; 1024-bit RSA keys are no longer considered adequate for new applications. The currently recommended key size for SSL certificates used in e-commerce is 2048, so that is what we use here.

kpg.initialize(2048);
KeyPair kp = kpg.generateKeyPair();

From the KeyPair object, get the public key using getPublic() and the private key using getPrivate().

Key pub = kp.getPublic();
Key pvt = kp.getPrivate();
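To confirm that the generated key has the requested size, the public key can be cast to RSAPublicKey and its modulus inspected. This is a small sketch; the class name KeyInfoExample and the method modulusBits() are made up for illustration.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.interfaces.RSAPublicKey;

class KeyInfoExample {
    // Generates a 2048-bit RSA key pair and returns the modulus length in bits.
    static int modulusBits() throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair();

        // RSA public keys produced by the standard provider implement RSAPublicKey.
        RSAPublicKey pub = (RSAPublicKey) kp.getPublic();
        return pub.getModulus().bitLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Modulus bits: " + modulusBits());
    }
}
```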

3. Saving the Keys in Binary Format

Save the keys to hard disk once they are obtained. This allows re-using the keys for encryption, decryption and authentication.

String outFile = ...;

/* Write the private key bytes. */
OutputStream out = new FileOutputStream(outFile + ".key");
out.write(pvt.getEncoded());
out.close();

/* Write the public key bytes. */
out = new FileOutputStream(outFile + ".pub");
out.write(pub.getEncoded());
out.close();

What is the format of the saved files? The key information is encoded in different formats for different types of keys. Here is how you can find what format the key was saved in. On my machine, the private key was saved in PKCS#8 format and the public key in X.509 format. We need this information below to load the keys.

System.err.println("Private key format: " + pvt.getFormat());
// prints "Private key format: PKCS#8" on my machine

System.err.println("Public key format: " + pub.getFormat());
// prints "Public key format: X.509" on my machine

3.1 Load Private Key from File

After saving the private key to a file (or a database), you might need to load it at a later time. You can do that using the following code. Note that you need to know what format the data was saved in: PKCS#8 in our case.

/* Read all bytes from the private key file */
Path path = Paths.get(keyFile);
byte[] bytes = Files.readAllBytes(path);

/* Generate private key. */
PKCS8EncodedKeySpec ks = new PKCS8EncodedKeySpec(bytes);
KeyFactory kf = KeyFactory.getInstance("RSA");
PrivateKey pvt = kf.generatePrivate(ks);

3.2 Load Public Key from File

Load the public key from a file as follows. The public key has been saved in X.509 format so we use the X509EncodedKeySpec class to convert it.

/* Read all the public key bytes */
Path path = Paths.get(keyFile);
byte[] bytes = Files.readAllBytes(path);

/* Generate public key. */
X509EncodedKeySpec ks = new X509EncodedKeySpec(bytes);
KeyFactory kf = KeyFactory.getInstance("RSA");
PublicKey pub = kf.generatePublic(ks);
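Saving and loading can be checked together with a round trip. The sketch below writes the encoded keys to temporary files (a stand-in for real key files), reads them back, and compares the re-encoded bytes with the originals. The class name KeyRoundTrip and its helper methods are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.security.spec.X509EncodedKeySpec;
import java.util.Arrays;

class KeyRoundTrip {
    // Writes the bytes to a temporary file and reads them back.
    static byte[] viaTempFile(byte[] bytes) throws Exception {
        Path tmp = Files.createTempFile("rsa-demo", ".bin");
        try {
            Files.write(tmp, bytes);
            return Files.readAllBytes(tmp);
        } finally {
            Files.delete(tmp);
        }
    }

    // Returns true if both keys are unchanged after save and load.
    static boolean roundTrip() throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair();

        KeyFactory kf = KeyFactory.getInstance("RSA");
        PrivateKey pvt = kf.generatePrivate(
            new PKCS8EncodedKeySpec(viaTempFile(kp.getPrivate().getEncoded())));
        PublicKey pub = kf.generatePublic(
            new X509EncodedKeySpec(viaTempFile(kp.getPublic().getEncoded())));

        return Arrays.equals(pvt.getEncoded(), kp.getPrivate().getEncoded())
            && Arrays.equals(pub.getEncoded(), kp.getPublic().getEncoded());
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Keys survive the round trip: " + roundTrip());
    }
}
```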

4. Use Base64 for Saving Keys as Text

Save the keys in text format by encoding the data in Base64. Java 8 provides a Base64 class which can be used for the purpose. Save the private key between header and footer lines as follows. The key bytes are PKCS#8 encoded, so the conventional label is “PRIVATE KEY”; the label “RSA PRIVATE KEY” denotes the older PKCS#1 encoding.

Base64.Encoder encoder = Base64.getEncoder();

String outFile = ...;
Writer out = new FileWriter(outFile + ".key");
out.write("-----BEGIN PRIVATE KEY-----\n");
out.write(encoder.encodeToString(pvt.getEncoded()));
out.write("\n-----END PRIVATE KEY-----\n");
out.close();

And the public key too. It is X.509 encoded, so the label is “PUBLIC KEY”:

out = new FileWriter(outFile + ".pub");
out.write("-----BEGIN PUBLIC KEY-----\n");
out.write(encoder.encodeToString(pub.getEncoded()));
out.write("\n-----END PUBLIC KEY-----\n");
out.close();
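A key saved as text can be read back by stripping the header and footer lines and Base64-decoding the rest. The following sketch assumes header and footer lines of the form "-----BEGIN ...-----" and "-----END ...-----"; the helper names toPem() and fromPem() are made up. It uses the MIME encoder to wrap lines at 64 characters, which many tools expect, and the MIME decoder, which ignores line breaks.

```java
import java.util.Arrays;
import java.util.Base64;

class PemExample {
    // Wraps DER bytes in header/footer lines with 64-character Base64 lines.
    static String toPem(String label, byte[] der) {
        Base64.Encoder enc = Base64.getMimeEncoder(64, new byte[] {'\n'});
        return "-----BEGIN " + label + "-----\n"
             + enc.encodeToString(der)
             + "\n-----END " + label + "-----\n";
    }

    // Removes the header/footer lines and decodes the remaining Base64 text.
    static byte[] fromPem(String pem) {
        String body = pem.replaceAll("-----[A-Z ]+-----", "");
        return Base64.getMimeDecoder().decode(body);
    }

    public static void main(String[] args) {
        byte[] data = new byte[100];
        Arrays.fill(data, (byte) 7);
        String pem = toPem("PUBLIC KEY", data);
        System.out.print(pem);
        System.out.println("Round trip OK: " + Arrays.equals(data, fromPem(pem)));
    }
}
```

The decoded bytes can then be passed to PKCS8EncodedKeySpec or X509EncodedKeySpec as in the binary case.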

5. Generating a Digital Signature

As mentioned above, one of the purposes of public key cryptography is the digital signature: you compute a signature over the contents of a file using your private key and send the signature along with the file. The recipient can then use your public key to verify that the signature matches the file contents.

Here is how you can do it. Use the signature algorithm “SHA256withRSA” which is guaranteed to be supported on all JVMs. Use the private key (either generated or loaded from file as shown above) to initialize the Signature object for signing. It is then updated with contents from the data file and the signature is generated and written to the output file. This output file contains the digital signature and must be sent to the recipient for verification.

Signature sign = Signature.getInstance("SHA256withRSA");
sign.initSign(pvt);

InputStream in = null;
try {
    in = new FileInputStream(dataFile);
    byte[] buf = new byte[2048];
    int len;
    while ((len = in.read(buf)) != -1) {
        sign.update(buf, 0, len);
    }
} finally {
    if ( in != null ) in.close();
}

OutputStream out = null;
try {
    out = new FileOutputStream(signFile);
    byte[] signature = sign.sign();
    out.write(signature);
} finally {
    if ( out != null ) out.close();
}

6. Verifying the Digital Signature

The recipient uses the digital signature sent with a data file to verify that the data file has not been tampered with. Verification requires the sender’s public key, which can be loaded from a file if necessary as presented above.

The code below updates the Signature object with data from the data file. It then loads the signature from file and uses Signature.verify() to check if the signature is valid.

Signature sign = Signature.getInstance("SHA256withRSA");
sign.initVerify(pub);

InputStream in = null;
try {
    in = new FileInputStream(dataFile);
    byte[] buf = new byte[2048];
    int len;
    while ((len = in.read(buf)) != -1) {
        sign.update(buf, 0, len);
    }
} finally {
    if ( in != null ) in.close();
}

/* Read the signature bytes from file */
Path path = Paths.get(signFile);
byte[] bytes = Files.readAllBytes(path);
System.out.println(dataFile + ": Signature " +
   (sign.verify(bytes) ? "OK" : "Not OK"));

And that in a nutshell is how you can use RSA public and private keys for digital signature and verification.
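The whole cycle can also be tried in memory, without any files. The sketch below signs a byte array, verifies it, and then shows that verification fails once a single bit of the data is changed. The class name SignVerifyExample and the sample message are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

class SignVerifyExample {
    // Returns true if the signature matches the data under the given public key.
    static boolean verify(PublicKey pub, byte[] data, byte[] signature) throws Exception {
        Signature v = Signature.getInstance("SHA256withRSA");
        v.initVerify(pub);
        v.update(data);
        return v.verify(signature);
    }

    // Returns { result for the original data, result for the tampered data }.
    static boolean[] demo() throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair();
        byte[] data = "hello, world".getBytes(StandardCharsets.UTF_8);

        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(kp.getPrivate());
        signer.update(data);
        byte[] signature = signer.sign();

        boolean original = verify(kp.getPublic(), data, signature);
        data[0] ^= 1;   // flip one bit to simulate tampering
        boolean tampered = verify(kp.getPublic(), data, signature);
        return new boolean[] { original, tampered };
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = demo();
        System.out.println("Original: " + (r[0] ? "OK" : "Not OK"));
        System.out.println("Tampered: " + (r[1] ? "OK" : "Not OK"));
    }
}
```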

Java – Read File Line by Line Using Java 8 Streams

1. Introduction

Java 8 Streams provide a facility to apply functional-style operations to collections such as List and Set (and to the entry, key and value views of a Map). These functional-style operations are very expressive and eliminate much of the boilerplate code in processing pipelines which operate on the elements of these collections.

2. A Streams Example

An example is shown below. Here we process a list of airport names as follows:

1. Select airports that start with “B”

2. Convert the airport name to uppercase

3. Sort the list

4. And print out the elements

List<String> airports =
    Arrays.asList("Birmingham-Shuttlesworth International",
		  "Anchorage International",
		  "Deadhorse",
		  "Phoenix Sky Harbor International",
		  "Tucson International",
		  "Los Angeles International",
		  "San Francisco International",
		  "Burbank Bob Hope Airport",
		  "Long Beach Airport",
		  "Oakland International");

airports
    .stream()
    .filter(a -> a.startsWith("B"))
    .map(String::toUpperCase)
    .sorted()
    .forEach(System.out::println);

// prints the following
// BIRMINGHAM-SHUTTLESWORTH INTERNATIONAL
// BURBANK BOB HOPE AIRPORT

Wouldn’t it be nice to apply these processing pipelines to other sequences such as the lines of a file? In this article, we show how to do exactly that.

4. Reading a File Line By Line

To read a file line by line in Java, we use a BufferedReader instance and read in a loop till all the lines are exhausted.

try (BufferedReader in = new BufferedReader(new FileReader(textFile))) {
    String line;
    while ((line = in.readLine()) != null) {
        // process line here
    }
}

3. Implement a Spliterator Class

To turn a BufferedReader into a class capable of being used with the Java 8 Streams API, we need to provide an implementation of the Spliterator interface. Shown below is the LineReaderSpliterator class which implements Spliterator<String> and turns a BufferedReader into a stream of lines.

public class LineReaderSpliterator implements Spliterator<String>
{
    private final BufferedReader reader;
    private java.io.IOException exception;

    public LineReaderSpliterator(BufferedReader reader) {
	this.reader = reader;
    }

    public java.io.IOException ioException() { return exception; }

    public int characteristics() {
	return DISTINCT | NONNULL | IMMUTABLE;
    }

    public long estimateSize() {
	return Long.MAX_VALUE;
    }

    public boolean tryAdvance(Consumer<? super String> action) {
	try {
	    String line = reader.readLine();
	    if ( line != null ) {
		action.accept(line);
		return true;
	    } else return false;
	} catch(java.io.IOException ex) {
	    this.exception = ex;
	    return false;
	}
    }

    public Spliterator<String> trySplit() { return null; }
}

3.1 Constructor

The LineReaderSpliterator is initialized with an instance of BufferedReader which serves as the input line source.

public LineReaderSpliterator(BufferedReader reader) {
    this.reader = reader;
}

3.2 Characteristics

The characteristics of the Spliterator must be reported by the characteristics() method. The result is a bitwise OR of zero or more of the following values:

ORDERED: indicates that the order of elements is defined. The ordering is expected to be preserved in parallel computations.

DISTINCT: Each element is distinct from another element.

SORTED: The sequence is sorted. In our case, the lines may not be sorted so this bit is not set.

SIZED: This bit must be set to indicate that the estimate of size returned by estimateSize() is correct. For our case, we do not know the number of lines in a file so this bit is not set.

NONNULL: Elements are guaranteed to be non-null.

IMMUTABLE: Requires that element source should not be modified to add, replace or remove elements.

CONCURRENT: Indicates that element source can be modified concurrently with additions, replacements and removals from multiple threads.

SUBSIZED: If the spliterator can be split and child spliterators are SIZED and SUBSIZED.

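In our case, the spliterator is specified as DISTINCT, NONNULL and IMMUTABLE. Note that DISTINCT is accurate only if the file contains no duplicate lines; for arbitrary files, ORDERED together with NONNULL is the safer choice, since the stream implementation may rely on the reported characteristics.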

3.3 Size estimation

Our spliterator does not know the size of the collection since the number of lines in the BufferedReader is not known. If the size is not known or is unbounded, the method must return Long.MAX_VALUE.

public long estimateSize() {
    return Long.MAX_VALUE;
}

3.4 Process Next Element

The method to process the next element is the tryAdvance() method which accepts a functional interface Consumer<? super T>. Our implementation attempts to read a line and, if successful (EOF not reached), invokes action.accept(). If EOF is reached or an exception occurs, the method returns false to indicate end-of-sequence. An IOException is not propagated; it is stored and can be inspected afterwards through ioException().

public boolean tryAdvance(Consumer<? super String> action) {
    try {
	String line = reader.readLine();
	if ( line != null ) {
	    action.accept(line);
	    return true;
	} else return false;
    } catch(java.io.IOException ex) {
	this.exception = ex;
	return false;
    }
}

3.5 Can the Spliterator split?

If the spliterator can be partitioned to return separate ranges of elements, a new Spliterator must be returned. We cannot partition the input into separate sequences so we return null.

public Spliterator<String> trySplit() { return null; }

4. Using the Stream

We can now use the Spliterator implementation to convert a BufferedReader into a Java 8 stream as follows:

static private Stream<String> createStreamReader(BufferedReader reader)
{
    LineReaderSpliterator s = new LineReaderSpliterator(reader);
    return StreamSupport.stream(s, false);
}

The earlier streams example can now be written as shown.

BufferedReader reader = null;
try {
    reader = new BufferedReader(new FileReader(textFile));
    createStreamReader(reader)
	.filter(a -> a.startsWith("B"))
	.map(String::toUpperCase)
	.sorted()
	.forEach(System.out::println);
} finally {
    if ( reader != null ) reader.close();
}

Check the output shown below:

BALTIMORE-WASHINGTON INTERNATIONAL
BANGOR INTERNATIONAL
BIRMINGHAM-SHUTTLESWORTH INTERNATIONAL
BRADLEY INTERNATIONAL
BURBANK BOB HOPE AIRPORT
BURLINGTON INTERNATIONAL

The example does not store the names in a data structure. Rather, names are directly read from the file and the processing pipeline is applied to the sequence of elements.
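For comparison, the JDK also ships a ready-made equivalent: since Java 8, BufferedReader.lines() returns a Stream<String> of the remaining lines, and Files.lines(Path) does the same for a file. The sketch below applies the same pipeline through BufferedReader.lines(); it is an alternative to the custom Spliterator, not the approach developed in this article, and the class and method names are made up. A StringReader stands in for the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

class LinesExample {
    // Reads lines lazily from the source and applies the airport pipeline.
    static List<String> airportsStartingWithB(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.lines()
                .filter(a -> a.startsWith("B"))
                .map(String::toUpperCase)
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Reader sample = new StringReader(
            "Deadhorse\nBurbank Bob Hope Airport\nBangor International\n");
        airportsStartingWithB(sample).forEach(System.out::println);
    }
}
```

With lines(), an I/O error during reading surfaces as an UncheckedIOException from the stream operation instead of being stored for later inspection.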

Summary

We have demonstrated how to read lines from a file and process them using Java 8 streams. This requires implementation of a Spliterator class for delivering a “stream” view of any sequence. The advantage of such an approach is the ease of filtering and processing text files.

How to Modify XML File in Java

1. Introduction

Let us learn how to modify an XML file to remove unwanted information.

One method to remove XML nodes is to use the XML DOM API to search the XML structure and remove unwanted nodes. While this sounds easy, using the DOM API is quite hard, especially for anything more than trivial searches, as this article demonstrates.

An easier method to navigate and remove unwanted Nodes is to use XPath. Even complex search and removal is quite easy as we shall see.

See this article for details on parsing an XML file to obtain the XML Document.

2. Using removeChild() to remove Nodes

Once a particular node is identified for removal, it can be removed quite easily by invoking removeChild() on the parent Node.

static private void removeNode(Node node)
{
  Node parent = node.getParentNode();
  if ( parent != null ) parent.removeChild(node);
}

2.1 Saving the Modified XML Document

After the required modifications are done, the XML Document can be saved by using a Transformer.

Initialize the Transformer as shown:

Transformer tform = TransformerFactory.newInstance().newTransformer();
tform.setOutputProperty(OutputKeys.INDENT, "yes");
tform.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

Save the modified XML document quite easily using the transformer instance.

tform.transform(new DOMSource(document), new StreamResult(System.out));

3. Searching the XML Document

The XML data set we are using is the TSA airport and checkpoint data available here. We would like to search this data set for the airport in Mobile, AL (identified as <shortcode>MOB</shortcode> in the data set). The following code checks whether a given node matches the query.

static private boolean testNode(Node node)
{
    NodeList nlist = node.getChildNodes();
    for (int i = 0 ; i < nlist.getLength() ; i++) {
	Node n = nlist.item(i);
	String name = n.getLocalName();
	if ( name != null && name.equals("shortcode") ) {
	    return n.getTextContent().equals("MOB");
	}
    }
    return false;
}

Collect the nodes to be removed by searching from the document root.

List<Node> nodes = new ArrayList<>();
NodeList nlist = document.getFirstChild().getChildNodes();
for (int i = 0 ; i < nlist.getLength() ; i++) {
    Node node = nlist.item(i);
    if ( testNode(node) ) nodes.add(node);
}

As you can see from the implementation of testNode(), complex XML search is hard using just the DOM API.

4. Using XPath to Find and Remove Nodes

XPath can be used to easily query for nodes within an XML document.

An initial setup process is required for using XPath to search.

XPathFactory xfact = XPathFactory.newInstance();
XPath xpath = xfact.newXPath();

Here is a method to query for nodes and remove them from the document.

static private void queryRemoveNodes(String xpathStr)
    throws XPathExpressionException
{
    Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODESET);
    NodeList nlist = (NodeList)res;
    for (int i = 0 ; i < nlist.getLength() ; i++) {
        Node node = nlist.item(i);
        Node parent = node.getParentNode();
        if ( parent != null ) parent.removeChild(node);
    }
}

The previous example to remove the airport for Mobile, AL is written as:

queryRemoveNodes("/airports/airport[shortcode = 'MOB']");

The removed node is:

<airport>
  <name>Mobile Regional</name>
  <shortcode>MOB</shortcode>
  <city>Mobile</city>
  <state>AL</state>
  <latitude>30.6813</latitude>
  <longitude>-88.2443</longitude>
  <utc>-6</utc>
  <dst>True</dst>
  <precheck>true</precheck>
  <checkpoints>
    <checkpoint>
      <id>1</id>
      <longname>MOB-A</longname>
      <shortname>MOB-A</shortname>
    </checkpoint>
  </checkpoints>
</airport>

Furthermore, to remove just the <checkpoints> element from the above node, use the following:

queryRemoveNodes("/airports/airport[shortcode = 'MOB']/checkpoints");

Easily remove a bunch of nodes matching an expression.

queryRemoveNodes("/airports/airport[latitude < 20]");
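The XPath-based removal can be tried end to end on a small in-memory document. The sketch below uses a made-up two-airport sample in place of the TSA data set; the class name XPathRemoveExample and the method removeAndCount() are hypothetical. It returns the number of <airport> elements left after the removal.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathRemoveExample {
    // Made-up sample data with the same element names as the article's data set.
    static final String XML =
        "<airports>"
      + "<airport><shortcode>MOB</shortcode><latitude>30.6813</latitude></airport>"
      + "<airport><shortcode>PBI</shortcode><latitude>26.6831</latitude></airport>"
      + "</airports>";

    // Removes all nodes matching the expression; returns the remaining airport count.
    static int removeAndCount(String xpathStr) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        NodeList hits = (NodeList) xpath.evaluate(xpathStr, document, XPathConstants.NODESET);
        for (int i = 0; i < hits.getLength(); i++) {
            Node node = hits.item(i);
            Node parent = node.getParentNode();
            if (parent != null) parent.removeChild(node);
        }

        Number left = (Number) xpath.evaluate(
            "count(/airports/airport)", document, XPathConstants.NUMBER);
        return left.intValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(removeAndCount("/airports/airport[shortcode = 'MOB']"));
        System.out.println(removeAndCount("/airports/airport[latitude < 20]"));
    }
}
```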

Summary

There are two ways of removing nodes from an XML document. The direct method is to search for nodes using the DOM API and remove them. An easier way is to use XPath to query and remove the nodes matching even complex queries.

How to Extract Data from XML in Java

1. Introduction

In a previous article, we looked into parsing an XML file and converting it to DOM (Document Object Model). The XML DOM object itself is not very useful in an application unless it can be used to extract required data. In this article, let us see how to extract data from XML in Java.

We demonstrate two approaches to extracting data from the XML document. One is a straightforward navigation of the DOM structure to extract fragments of data. Another way is to use XPath to describe and extract the exact information needed with an expression.

2. Accessing the XML Root Element

The most commonly used type in the DOM API is the Node interface. All other types of XML artifacts are represented as a Node. These include elements, attributes, text within elements, CDATA, etc.

The most common type of Node we will be concerned with is the element. An element node has attributes, zero or more child elements, text nodes, etc.

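A Document is a special type of Node which is obtained as a result of parsing the XML. Use the getFirstChild() method of a Document to get the XML root element. This works when the root element is the first node in the document; if a comment or a DOCTYPE declaration precedes it, the getDocumentElement() method returns the root element directly.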

Node rootElement = document.getFirstChild();

3. Accessing XML Element Children

Access the list of children of an element with the getChildNodes() method. It returns a list of child nodes including elements, text nodes, CDATA, comments, etc. The list can be processed like this:

NodeList nlist = node.getChildNodes();
for (int i = 0 ; i < nlist.getLength() ; i++) {
    Node child = nlist.item(i);
    // process the child node here
}

A child element and its text contents can be checked as follows. Here, a child element <shortcode>PBI</shortcode> is selected for processing.

String name = child.getLocalName();
if ( name != null && name.equals("shortcode") ) {
    if ( child.getTextContent().equals("PBI") ) {
        // process element here
    }
}

4. Generating XML Output

Print the whole XML fragment from a node once it is selected. This includes all the child nodes, text, attributes, etc.

Create a Transformer object from the factory object:

Transformer tform = TransformerFactory.newInstance().newTransformer();

Pretty-printing the XML helps in visualizing the structure. You can enable pretty-printing as shown. Here an indentation of 2 spaces is being specified.

tform.setOutputProperty(OutputKeys.INDENT, "yes");
tform.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

And generate the XML output from a Node object for printing:

tform.transform(new DOMSource(node), new StreamResult(System.out));

5. A More Complex Example

Let us look at a more complex example of XML data extraction with some real-world data. The XML data set we are using is the publicly available TSA airport and checkpoint data available here (warning: large file download). This data includes airport information including GPS coordinates and checkpoints.

Let us search this XML data set for information within specified GPS coordinates: locate airports within latitudes range of (25, 30), longitude range of (-90, -80). We search for matching nodes from the root node of the XML.

List<Node> res = new ArrayList<>();
NodeList nlist = rootNode.getChildNodes();
for (int i = 0 ; i < nlist.getLength() ; i++) {
    Node node = nlist.item(i);
    NodeList children = node.getChildNodes();
    boolean foundLat = false, foundLong = false;
    for (int j = 0 ; j < children.getLength() ; j++) {
	Node child = children.item(j);
	String name = child.getLocalName();
	if ( name == null ) continue;
	if ( name.equals("latitude") ) {
	    float lat = Float.parseFloat(child.getTextContent());
	    if ( lat > 25 && lat < 30 ) foundLat = true;
	} else if ( name.equals("longitude") ) {
	    float lng = Float.parseFloat(child.getTextContent());
	    if ( lng > -90 && lng < -80 ) foundLong = true;
	}
    }
    if ( foundLat && foundLong ) res.add(node);
}

The code above loops through all elements under the root node and selects those children which match the specified conditions: latitude between (25, 30) and longitude between (-90, -80).

As you can see, the code is quite complex and prone to errors. And this is just for finding nodes for some rather simple conditions.

6. Using XPath to Extract Information

Java provides an XPath API which can be used in conjunction with the XML DOM to extract information from XML in an easy manner. XPath is initialized within the application as follows:

XPathFactory xfact = XPathFactory.newInstance();
XPath xpath = xfact.newXPath();

To extract possibly multiple nodes which match an XPath expression, the following method can be used.

Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODESET);

If you know that a single node will match the expression, you can use this method instead.

Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODE);

Maybe you are trying to extract application configuration information from XML? In that case, you might prefer fetching String values in a single call.

String value = xpath.evaluate(xpathStr, document);
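The String form of evaluate() can be illustrated with a small in-memory document. The sketch below uses a made-up one-airport sample; the class name XPathValueExample and the method valueOf() are hypothetical. When nothing matches, the two-argument evaluate() returns an empty string, not null.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathValueExample {
    // Made-up sample data with the same element names as the article's data set.
    static final String XML =
        "<airports><airport>"
      + "<shortcode>PBI</shortcode><city>West Palm Beach</city>"
      + "<latitude>26.6831606</latitude>"
      + "</airport></airports>";

    // Evaluates the expression against the sample and returns its string value.
    static String valueOf(String xpathStr) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(xpathStr, document);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(valueOf("/airports/airport[shortcode = 'PBI']/city"));
        System.out.println("[" + valueOf("/airports/airport[shortcode = 'XXX']/city") + "]");
    }
}
```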

7. Some Examples

To compare with the earlier examples, let us find the airport node where <shortcode> equals “PBI”:

String xpathStr = "/airports/airport[shortcode = 'PBI']";
Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODESET);
show((NodeList)res);

Results in the output shown below (partially):

...
  <shortcode>PBI</shortcode>
  <city>West Palm Beach</city>
  <state>FL</state>
  <latitude>26.6831606</latitude>
  <longitude>-80.0955892</longitude>
  <utc>-5</utc>
  <dst>True</dst>
...

And here is the second example: find airports with latitude between (25, 30) and longitude between (-90, -80).

String xpathStr = "/airports/airport[latitude > 25 and latitude < 30 and longitude > -90 and longitude < -80]";
Object res = xpath.evaluate(xpathStr, document, XPathConstants.NODESET);
show((NodeList)res);

Summary

This article demonstrated a couple of ways of extracting data from XML documents. A direct way is to navigate the DOM structure and perform the extraction. This is error prone and sensitive to changes in XML structure. An easier way is to use XPath expression search to extract required information.

How do I Create a Java String from the Contents of a File?

1. Introduction

Here we present a few ways to read the whole contents of a file into a String in Java.

2. Using java.nio.file in Java 7+

It is quite simple to load the whole file into a String using the java.nio.file package:

String str = new String(Files.readAllBytes(Paths.get(pathname)),
                        StandardCharsets.UTF_8);

Here is how the code works. Read all the bytes into a byte buffer using the Files and Paths available in the java.nio.file package.

byte[] buf = Files.readAllBytes(Paths.get(pathname));

Convert the byte buffer into a String by specifying the character set.

String str = new String(buf, StandardCharsets.UTF_8);
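To try the two steps end to end, here is a minimal sketch that writes some text to a temporary file and reads it back. The class ReadWholeFile and the roundTrip() helper are hypothetical names introduced only for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadWholeFile {
    // Reads all bytes of the file and decodes them as UTF-8.
    static String readFile(Path path) throws IOException {
        byte[] buf = Files.readAllBytes(path);
        return new String(buf, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: writes text to a temp file, then reads it back.
    static String roundTrip(String text) throws IOException {
        Path tmp = Files.createTempFile("readfile", ".txt");
        try {
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
            return readFile(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("first line\nsecond line"));
    }
}
```

Since the whole file is held in a byte array and then copied into a String, this approach suits files that comfortably fit in memory.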

3. Scan for end-of-input

Another way to read the whole file into a String is to use the Scanner class. Create a scanner with the file as input, set the appropriate delimiter and read the next token.

Note: the actual delimiter used in the code below is the beginning-of-input marker which will not match anywhere other than the beginning of input.

Scanner scanner = null;
try {
    scanner = new Scanner(new File(pathname), "UTF-8");
    return scanner.useDelimiter("\\A").next();
} finally {
    if ( scanner != null ) scanner.close();
}
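One caveat with this idiom: when the input is empty there is no token at all, so next() throws a NoSuchElementException. A possible guard, sketched here with a hypothetical helper readAll() that accepts any Scanner, is to check hasNext() first.

```java
import java.util.Scanner;

class ScannerReadAll {
    // Returns the entire remaining input of the scanner, or "" if it is empty.
    static String readAll(Scanner scanner) {
        try {
            scanner.useDelimiter("\\A");
            // hasNext() is false for empty input, where next() would throw.
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            scanner.close();
        }
    }

    public static void main(String[] args) {
        System.out.println(readAll(new Scanner("line one\nline two")));
    }
}
```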

4. Memory Mapped File Reading

This method maps the file contents directly into memory using the MappedByteBuffer class. Memory mapping the contents directly might lead one to expect enhanced performance. However this advantage is only available if the buffer is used directly. In our case, since we are creating a String from the contents of the file, the speed advantage of the memory mapped buffers is probably not visible.

static private String readFile3(String pathname)
    throws java.io.IOException
{
    // try-with-resources closes the file even if mapping or decoding fails;
    // the mapping itself stays valid after the file is closed.
    try (RandomAccessFile file = new RandomAccessFile(pathname, "r")) {
        MappedByteBuffer buffer = file.getChannel().map(MapMode.READ_ONLY,
                                                        0,
                                                        file.length());
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }
}

ByteBuffer provides a method asCharBuffer() which returns a “view” of the byte buffer as a character buffer. However, this method does not let you specify an encoding for converting bytes to characters: it simply treats each pair of bytes as one 16-bit char in the buffer's byte order. The correct way to convert a ByteBuffer to a CharBuffer is to use Charset.decode(ByteBuffer) with the appropriate Charset instance.
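The difference is easy to see on a short UTF-8 byte array. The following sketch uses hypothetical method names viaDecode() and viaAsCharBuffer() to contrast the two conversions.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class DecodeBuffer {
    // Decodes the bytes as UTF-8 text.
    static String viaDecode(byte[] bytes) {
        return StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Reinterprets each pair of bytes as one char; no charset is involved.
    static String viaAsCharBuffer(byte[] bytes) {
        return ByteBuffer.wrap(bytes).asCharBuffer().toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "Java".getBytes(StandardCharsets.UTF_8);
        System.out.println(viaDecode(bytes));                // the original text
        System.out.println(viaAsCharBuffer(bytes).length()); // half as many chars
    }
}
```

For the four bytes of "Java", the view contains only two chars, neither of which is a character from the original text.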

5. Simple Way Using java.io

Of course, there is always the “old” way (pre-Java 1.7) of reading a whole file into a String: reading the characters in a loop and appending to a buffer. Note that FileReader, as used here, decodes the file using the platform default charset.

static private String readFile4(String pathname)
    throws java.io.IOException
{
    FileReader in = null;
    try {
	in = new FileReader(pathname);
	char[] buf = new char[2048];
	int len;
	StringBuilder sbuf = new StringBuilder();
	while ((len = in.read(buf, 0, buf.length)) != -1) {
	    sbuf.append(buf, 0, len);
	}
	return sbuf.toString();
    } finally {
	if ( in != null ) in.close();
    }
}

6. Benchmarking Various Approaches

Since we have several methods of reading a whole file into a string, it is interesting to see how the methods stack up against one another in performance. To this end, we implemented a simple benchmarking method using System.currentTimeMillis(). The following is the average time for each run over 1000 runs of the method.

simple       235 ms for 1000 iters: 0.235000 ms/op
nio          213 ms for 1000 iters: 0.213000 ms/op
scanner      629 ms for 1000 iters: 0.629000 ms/op
mmap         285 ms for 1000 iters: 0.285000 ms/op

For the set of conditions under which the application ran, we can conclude that the NIO method is the fastest followed by the Simple method. Slowest is the Scanner method which is somewhat expected since a regular expression search is involved. A note of warning: do not use these benchmark numbers to pick the method to use. Rather use the method closest to your paradigm of problem solving.
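The benchmarking method itself is not shown in this article. A rough sketch of such a timing loop could look like the following; note that it uses System.nanoTime() rather than System.currentTimeMillis(), adds a warm-up pass, and that the class SimpleBench and the method millisPerOp() are hypothetical names.

```java
import java.util.concurrent.Callable;

class SimpleBench {
    // Runs the task iters times and returns the average milliseconds per call.
    static double millisPerOp(Callable<?> task, int iters) throws Exception {
        // Warm-up pass so JIT compilation is less likely to skew the timing.
        for (int i = 0; i < iters; i++) {
            task.call();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iters; i++) {
            task.call();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / iters;
    }

    public static void main(String[] args) throws Exception {
        double ms = millisPerOp(() -> new StringBuilder("abc").reverse().toString(), 1000);
        System.out.printf("%f ms/op%n", ms);
    }
}
```

A hand-rolled loop like this gives only a rough indication. Results vary with file size, disk caching and JVM state.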

Conclusion

You are now aware of various methods of reading the whole contents of a file into a String. Pick whatever suits you best and use it!

How to Convert UTF-16 Text File to UTF-8 in Java?

1. Introduction

In this article, we show how to convert a text file from UTF-16 encoding to UTF-8. Such a conversion might be required because certain tools can only read UTF-8 text. Furthermore, the conversion procedure demonstrated here can be applied to convert a text file from any supported encoding to another.

UTF-8 is a character encoding that can represent all characters (or code points) defined by Unicode. It is designed to be backward compatible with legacy encodings such as ASCII.

UTF-16 is another character encoding that encodes characters in one or two 16-bit code units whereas UTF-8 encodes characters in a variable number of 8-bit code units.

2. Supported Character Sets

You can find the character sets supported by the JVM using the class java.nio.charset.Charset as follows:

for (String name : Charset.availableCharsets().keySet()) {
    System.out.println(name);
}

// prints the following
Big5
Big5-HKSCS
CESU-8
IBM-Thai
...
US-ASCII
UTF-16
UTF-16BE
UTF-16LE
UTF-32
UTF-32BE
UTF-32LE
UTF-8
...

3. Conversion Using java.io Classes

Java provides the java.io.InputStreamReader class as a bridge from byte streams to character streams. Open the file using this class to be able to read character buffers in the specified encoding:

Reader in = new InputStreamReader(new FileInputStream(infile), "UTF-16");

Analogously, the class java.io.OutputStreamWriter acts as a bridge from character streams to byte streams. Create a Writer with this class to be able to write bytes to the file:

Writer out = new OutputStreamWriter(new FileOutputStream(outfile), "UTF-8");

With the Reader and Writer in place, it is trivial to copy data from the input file to the output file:

char cbuf[] = new char[2048];
int len;
while ((len = in.read(cbuf, 0, cbuf.length)) != -1) {
    out.write(cbuf, 0, len);
}
out.close(); // flushes any buffered bytes to the output file
in.close();

And that’s it! You have successfully read and converted data from UTF-16 to UTF-8. You can use this code to perform the conversion between any two character sets supported by your JVM.
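Putting the pieces together, a generic conversion routine might look like the sketch below. It works on streams so that it can be tried in memory; the class ConvertEncoding and the methods convert() and convertBytes() are hypothetical names, and try-with-resources is used to close (and thereby flush) both ends.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

class ConvertEncoding {
    // Copies characters from src (decoded as srcCharset) to dst (encoded as dstCharset).
    static void convert(InputStream src, String srcCharset,
                        OutputStream dst, String dstCharset) throws IOException {
        try (Reader in = new InputStreamReader(src, srcCharset);
             Writer out = new OutputStreamWriter(dst, dstCharset)) {
            char[] cbuf = new char[2048];
            int len;
            while ((len = in.read(cbuf, 0, cbuf.length)) != -1) {
                out.write(cbuf, 0, len);
            }
        }
    }

    // Convenience wrapper for in-memory data.
    static byte[] convertBytes(byte[] input, String from, String to) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        convert(new ByteArrayInputStream(input), from, out, to);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf16 = "hello".getBytes("UTF-16");
        byte[] utf8 = convertBytes(utf16, "UTF-16", "UTF-8");
        System.out.println(utf16.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```

To convert files, pass a FileInputStream and a FileOutputStream as the two streams.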

4. Using String for Converting Bytes

Sometimes, you may have a byte array which you need converted and output in a specific encoding. You can use the String class for these cases as shown below. First convert the byte array into a String:

String str = new String(bytes, 0, len, "UTF-16");

Next, obtain the bytes in the required encoding by using the String.getBytes(String) method:

byte[] outbytes = str.getBytes("UTF-8");

Write the byte array to an OutputStream:

OutputStream out = new FileOutputStream(outfile);
out.write(outbytes);
out.close();

Note that while you could use the String class as shown to convert bytes, you should prefer using Reader/Writer combination when possible to avoid problems with multi-byte characters. Specifically, the byte array you have read may contain an incomplete multi-byte character at the beginning or the end. This may lead to character encoding errors.
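The problem can be reproduced with a few lines of code. This sketch uses UTF-8 input for brevity (the same effect occurs with any multi-byte encoding) and a hypothetical helper decodeInChunks() that decodes two halves of a byte array independently.

```java
import java.nio.charset.StandardCharsets;

class SplitCharDemo {
    // Decodes bytes[0..split) and bytes[split..end) separately and joins the results.
    static String decodeInChunks(byte[] bytes, int split) {
        String first = new String(bytes, 0, split, StandardCharsets.UTF_8);
        String second = new String(bytes, split, bytes.length - split,
                                   StandardCharsets.UTF_8);
        return first + second;
    }

    public static void main(String[] args) {
        // "caf\u00e9" is 5 bytes in UTF-8; the last character occupies bytes 3 and 4.
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // Splitting on a character boundary preserves the text.
        System.out.println(decodeInChunks(bytes, 3).equals("caf\u00e9"));
        // Splitting inside the two-byte character corrupts it.
        System.out.println(decodeInChunks(bytes, 4).equals("caf\u00e9"));
    }
}
```

When the split falls inside the character, each half is malformed on its own and is decoded to the replacement character U+FFFD. A Reader avoids this because it keeps the trailing partial character until the remaining bytes arrive.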

Conclusion

When you need to convert text from one character encoding to another in Java, you have several options:

  • Using InputStreamReader and OutputStreamWriter bridge classes for conversion.
  • Using the String class directly with specified encoding.
  • A more advanced option is to use the CharsetEncoder and CharsetDecoder classes (not presented in this article).

Difference Between HashMap and Hashtable in Java

1. Introduction

Java provides several ways of storing key-value maps (also known as dictionaries). The most common ones are java.util.HashMap and java.util.Hashtable. Let us explore the difference between these two classes.

2. Synchronization

Synchronization is a mechanism in Java for preventing multiple threads from interfering with each other and eliminating memory consistency errors.

When one variable (a resource) is visible to multiple threads at the same time, consistency issues arise when one thread attempts to modify the value while another thread is accessing it. To prevent these issues, some form of synchronization must be used.

While synchronization helps in eliminating consistency errors, it adds an overhead when used. In a single-threaded program (or when you can guarantee that a single thread will access the resource), you can use HashMap to eliminate this overhead. Create a HashMap as follows:

HashMap<String,Object> map = new HashMap<>();
map.put("currentTime", new Date());

However, when a dictionary is shared between multiple threads that read and modify it, you need a synchronized implementation such as Hashtable, whose methods are all synchronized. The following shows how to create a Hashtable:

Hashtable<String,Integer> tbl = new Hashtable<>();
tbl.put("count", 32);

3. Using Null Keys or Values

When your dictionary needs to contain null keys or values, you cannot use Hashtable since this is not allowed. You must use HashMap in this case.

If you need multiple threads reading or writing the HashMap, you can wrap the HashMap using Collections.synchronizedMap() as follows:

Map<String,Object> map = Collections.synchronizedMap(new HashMap<String,Object>());
map.put(...);

A HashMap can contain one null key and any number of nulls for values.
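The difference in null handling can be checked directly. In this sketch the helper acceptsNullKey() is a hypothetical name; it simply reports whether a put() with a null key succeeds.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Returns true if the map accepts a null key, false if it rejects it.
    static boolean acceptsNullKey(Map<String, Object> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<String, Object>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<String, Object>())); // false
    }
}
```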

4. Predictable Iteration Order

A subclass of HashMap is LinkedHashMap which maintains a doubly-linked list of the entries in the Map. This allows traversal of the Map entries in a predictable order (in the order that the entries were inserted into the Map). If you need such a predictable ordering of the entries, then you can easily replace the HashMap with a LinkedHashMap as follows:

HashMap<String,Object> map = new LinkedHashMap<>();
map.put(...);

When using a Hashtable, such a predictable iteration order is not possible. If this is required, use a LinkedHashMap with a Collections.synchronizedMap() wrapper as above.
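As a small illustration of insertion order, the sketch below fills a map with keys that are deliberately not in alphabetical order and lists them back. The class OrderDemo and the method keysInOrder() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrderDemo {
    // Inserts three keys and returns them in the map's iteration order.
    static List<String> keysInOrder(Map<String, Integer> map) {
        map.put("zulu", 1);
        map.put("alpha", 2);
        map.put("mike", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Plain LinkedHashMap: insertion order is preserved.
        System.out.println(keysInOrder(new LinkedHashMap<String, Integer>()));
        // The synchronized wrapper keeps the same iteration order.
        System.out.println(keysInOrder(
            Collections.synchronizedMap(new LinkedHashMap<String, Integer>())));
    }
}
```

Both calls print [zulu, alpha, mike].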

5. Iterating using Enumeration

While both HashMap and Hashtable support iteration over the entries using entrySet(), Hashtable also provides an Enumeration of the values through the Hashtable.elements() method. In addition, the Hashtable.keys() method returns an Enumeration over the keys of the Hashtable.

Hashtable<String,Object> tbl = ...;
for(Enumeration<String> keys = tbl.keys() ; keys.hasMoreElements() ; ) {
  System.out.println(keys.nextElement());
}

Conclusion

Here is how you can decide when to use HashMap or Hashtable:

  • For using as a shared resource between multiple threads in a single program, a Hashtable is preferred.
  • When the dictionary needs to contain null keys or values, a HashMap must be used.
  • A HashMap can be used in a multi-threaded environment by wrapping it with Collections.synchronizedMap().

Converting String to Int in Java

1. Introduction

There are several ways of converting a String to an integer in Java. We present a few of them here with a discussion of the pros and cons of each.

2. Integer.parseInt()

Integer.parseInt(String) can parse a string and return a plain int value. The string value is parsed assuming a radix of 10. The string can contain “+” and “-” characters at the start to indicate a positive or negative number.

int value = Integer.parseInt("25"); // returns 25

int value = Integer.parseInt("-43"); // returns -43

int value = Integer.parseInt("+9061"); // returns 9061

Illegal characters within the string (including the period “.”) result in a NumberFormatException. Additionally, a string containing a number larger than Integer.MAX_VALUE (2^31 - 1) also results in a NumberFormatException.

// throws NumberFormatException -- contains "."
int value = Integer.parseInt("25.0");

// throws NumberFormatException -- contains text
int value = Integer.parseInt("93hello");

// throws NumberFormatException -- too large
int value = Integer.parseInt("2367423890");

To explicitly specify the radix, use Integer.parseInt(String,int) and pass the radix as the second argument.

// returns 443
int value = Integer.parseInt("673", 8);

// throws NumberFormatException -- contains character "9" not valid for radix 8
int value = Integer.parseInt("9061", 8);

// returns 70966758 -- "h" and "i" are valid characters for radix 20.
int value = Integer.parseInt("123aghi", 20);
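Since invalid input is reported through an exception, callers that deal with untrusted strings usually wrap the call. One possible shape, using a hypothetical helper tryParseInt() that returns an OptionalInt (Java 8+), is sketched below.

```java
import java.util.OptionalInt;

class SafeParse {
    // Returns the parsed value, or an empty OptionalInt if s is not a valid int.
    static OptionalInt tryParseInt(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            // Also reached for null input and for out-of-range values.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseInt("25"));         // OptionalInt[25]
        System.out.println(tryParseInt("25.0"));       // OptionalInt.empty
        System.out.println(tryParseInt("2367423890")); // OptionalInt.empty
    }
}
```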

3. Integer.valueOf()

The static method Integer.valueOf() works similarly to Integer.parseInt() with the difference that the method returns an Integer object rather than an int value.

// returns 733
Integer value = Integer.valueOf("733");

Use this method when you need an Integer object rather than a bare int. This method invokes Integer.parseInt(String) and creates an Integer from the result.

4. Integer.decode()

For parsing an integer starting with these prefixes: “0” for octal, “0x”, “0X” and “#” for hex, you can use the method Integer.decode(String). An optional “+” or “-” sign can precede the number. All the following formats are supported by this method:

[Sign] DecimalNumeral
[Sign] 0x HexDigits
[Sign] 0X HexDigits
[Sign] # HexDigits
[Sign] 0 OctalDigits

As with Integer.valueOf(), this method returns an Integer object rather than a plain int. Some examples follow:

// returns 53
Integer value = Integer.decode("0x35");

// returns 1194684
Integer value = Integer.decode("#123abc");

// throws NumberFormatException -- value too large
Integer value = Integer.decode("#123abcdef");

// returns -231
Integer value = Integer.decode("-0347");

5. Convert Large Values into Long

When the value being parsed does not fit in an int (it is larger than 2^31 - 1), a NumberFormatException is thrown. In these cases, you can use the analogous methods of the Long class: Long.parseLong(String), Long.valueOf(String) and Long.decode(String). These work similarly to their Integer counterparts but return a Long object (or a long in the case of Long.parseLong(String)). The upper limit of a long is Long.MAX_VALUE (defined to be 2^63 - 1).

// returns 378943640350
long value = Long.parseLong("378943640350");

// returns 3935157603823
Long value = Long.decode("0x39439abcdef");

6. Use BigInteger for Larger Numbers

Java provides another numeric type, java.math.BigInteger, with arbitrary precision. A disadvantage of using BigInteger is that common operations like adding, subtracting, etc. require method invocations; operators like “+” and “-” cannot be used.

To convert a String to a BigInteger, just use the constructor as follows:

BigInteger bi = new BigInteger("3489534895489358943");

To specify a radix other than 10, use a different constructor:

BigInteger bi = new BigInteger("324789045345498589", 12);
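To illustrate the method-based arithmetic mentioned above, here is a short sketch. The class BigIntegerOps and its two methods are hypothetical names chosen for this example.

```java
import java.math.BigInteger;

class BigIntegerOps {
    // Parses a decimal string and returns the value plus one.
    static BigInteger increment(String decimal) {
        return new BigInteger(decimal).add(BigInteger.ONE);
    }

    // Computes a*a + b*b using method calls instead of operators.
    static BigInteger sumOfSquares(String a, String b) {
        BigInteger x = new BigInteger(a);
        BigInteger y = new BigInteger(b);
        return x.multiply(x).add(y.multiply(y));
    }

    public static void main(String[] args) {
        // One more than Long.MAX_VALUE, which no long can hold.
        System.out.println(increment("9223372036854775807")); // 9223372036854775808
        System.out.println(sumOfSquares("3", "4"));           // 25
    }
}
```

BigInteger objects are immutable, so each operation returns a new instance rather than modifying the receiver.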

7. Parse for Numbers Within Text

To parse for numbers interspersed with arbitrary text, you can use the java.util.Scanner class as follows:

String str = "hello123: we have a 1000 worlds out there.";
Scanner s = new Scanner(str).useDelimiter("\\D+");
while (s.hasNextInt())
  System.out.printf("(%1$d) ", s.nextInt());
// prints "(123) (1000)"

This method offers a powerful way of parsing for numbers, although it comes with the expense of using a regular expression scanner.

Summary

To summarize, there are various methods of converting a string to an int in Java.

  • Integer.parseInt() is the simplest and returns an int.
  • Integer.valueOf() is similar but returns an Integer object.
  • Integer.decode() can parse numbers starting with “0x” and “0” as hex and octal respectively.
  • For larger numbers, use the corresponding methods in the Long class.
  • And for arbitrary precision integers, use the BigInteger class.
  • Finally, to parse arbitrary text for numbers, we can use the java.util.Scanner class with a regular expression.

InputStream to String Conversion in Java

1. Introduction

There are several ways of converting an InputStream to a String in Java. Maybe you want to read the data and write it to a log file or do further processing. Here we look at several ways of accomplishing this task.

2. With InputStreamReader

Here is a simple implementation which uses the InputStreamReader to convert from bytes to characters. The code uses the platform default charset to decode the bytes. It reads input in chunks and appends the converted string to a StringBuilder.

private static String inputStreamToString(InputStream in)
    throws java.io.IOException
{
    BufferedReader br = null;
    try {
	InputStreamReader isr = new InputStreamReader(in);
	br = new BufferedReader(isr);
	char cbuf[] = new char[2048];
	int len;
	StringBuilder sbuf = new StringBuilder();
	while ((len = br.read(cbuf, 0, cbuf.length)) != -1)
	    sbuf.append(cbuf, 0, len);
	return sbuf.toString();
    } finally {
	if ( br != null ) br.close();
    }
}

3. Character Set Conversion

When converting input from a character set that is different from the platform default, you must specify the character set as follows:

private static String inputStreamToString(InputStream in,String charsetName)
    throws java.io.IOException
{
    BufferedReader br = null;
    try {
	InputStreamReader isr = new InputStreamReader(in, charsetName);
	br = new BufferedReader(isr);
	char cbuf[] = new char[2048];
	int len;
	StringBuilder sbuf = new StringBuilder();
	while ((len = br.read(cbuf, 0, cbuf.length)) != -1)
	    sbuf.append(cbuf, 0, len);
	return sbuf.toString();
    } finally {
	if ( br != null ) br.close();
    }
}

4. Using try-with-resources

When using JDK 1.7 or later, you can use the try-with-resources block to eliminate some boilerplate code for exception handling:

private static String inputStreamToString(InputStream in,String charsetName)
    throws java.io.IOException
{
    try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charsetName))) {
	char cbuf[] = new char[2048];
	int len;
	StringBuilder sbuf = new StringBuilder();
	while ((len = br.read(cbuf, 0, cbuf.length)) != -1)
		sbuf.append(cbuf, 0, len);
	return sbuf.toString();
    }
}

The try-with-resources block is used to automatically close resources when the block exits (whether normally or due to an exception).

try (BufferedReader br =
         new BufferedReader(new InputStreamReader(in, charsetName))) {
    // use the resource br here
}
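That the resource really is closed on an exceptional exit can be verified with a tiny AutoCloseable that records whether close() was called. The classes CloseTracking and Tracked below are hypothetical and exist only for this demonstration.

```java
class CloseTracking {
    // A minimal resource that remembers whether it has been closed.
    static class Tracked implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Throws inside the try block and reports whether the resource was closed.
    static boolean closedAfterException() {
        Tracked tracked = new Tracked();
        try (Tracked resource = tracked) {
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            // close() has already run by the time the catch block executes.
        }
        return tracked.closed;
    }

    public static void main(String[] args) {
        System.out.println(closedAfterException()); // true
    }
}
```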

5. With ByteArrayOutputStream

Another option for converting InputStream to String uses the ByteArrayOutputStream. Here you can accumulate the bytes read from the InputStream and perform the final conversion to the desired character set.

private static String inputStreamToString(InputStream in,String charsetName)
    throws java.io.IOException
{
    try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
	byte buf[] = new byte[2048];
	int len;
	while ((len = in.read(buf)) != -1) out.write(buf, 0, len);
	return out.toString(charsetName);
    }
}
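On Java 9 and later, the copy loop can be replaced by InputStream.readAllBytes(), which performs the same accumulation internally. The sketch below assumes such a JDK; the class name ReadAllBytesDemo is hypothetical, and the method takes a Charset instead of a charset name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReadAllBytesDemo {
    // Requires Java 9+: reads the remaining bytes and decodes them in one step.
    static String inputStreamToString(InputStream in, Charset charset)
        throws IOException
    {
        return new String(in.readAllBytes(), charset);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
            "hello stream".getBytes(StandardCharsets.UTF_8));
        System.out.println(inputStreamToString(in, StandardCharsets.UTF_8));
    }
}
```

As with the ByteArrayOutputStream version, the caller remains responsible for closing the InputStream.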

6. Using Apache Commons IO

Converting an InputStream to String can be achieved in a single line by using Apache Commons IO:

private static String inputStreamToString(InputStream in,String charsetName)
    throws java.io.IOException
{
    return IOUtils.toString(in, charsetName);
}

If you are using Maven as your build system, you need the following dependency:

<dependencies>
  <dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.4</version>
  </dependency>
</dependencies>

Conclusion

In this article, you learned several ways of converting an InputStream to String. Depending on your circumstances, you can pick the most appropriate one for your needs.