Java – Read File Line by Line Using Java 8 Streams

1. Introduction

Java 8 Streams provide a facility for applying functional-style operations to collections such as List and Set (and to a Map through its keySet(), values() and entrySet() views). These operations are expressive and eliminate much of the boilerplate code in processing pipelines that operate on the elements of these collections.

2. A Streams Example

An example is shown below. Here we process a list of airport names as follows:

1. Select airports that start with “B”

2. Convert the airport name to uppercase

3. Sort the list

4. Print each element

List<String> airports =
    Arrays.asList("Birmingham-Shuttlesworth International",
		  "Anchorage International",
		  "Deadhorse",
		  "Phoenix Sky Harbor International",
		  "Tucson International",
		  "Los Angeles International",
		  "San Francisco International",
		  "Burbank Bob Hope Airport",
		  "Long Beach Airport",
		  "Oakland International");

airports
    .stream()
    .filter(a -> a.startsWith("B"))
    .map(String::toUpperCase)
    .sorted()
    .forEach(System.out::println);

// prints the following
// BIRMINGHAM-SHUTTLESWORTH INTERNATIONAL
// BURBANK BOB HOPE AIRPORT

Wouldn’t it be nice to apply these processing pipelines to other sequences such as the lines of a file? In this article, we show how to do exactly that.

2.1 Reading a File Line By Line

The conventional way to read a file line by line in Java is to wrap it in a BufferedReader and call readLine() in a loop until it returns null, which signals the end of the file.

try (BufferedReader in = new BufferedReader(new FileReader(textFile))) {
    String line;
    while ((line = in.readLine()) != null) {
        // process line here
    }
}
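
To make the contrast with the streams pipeline concrete, here is a minimal sketch of the same filter, uppercase and sort steps written around the readLine() loop. The class name ImperativeLineFilter, the helper upperCaseSortedWithPrefix and the in-memory StringReader input are illustrative assumptions and not part of the original example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ImperativeLineFilter {

    // Hypothetical helper: reads every line, keeps those starting with the
    // prefix, upper-cases them and returns them in sorted order.
    static List<String> upperCaseSortedWithPrefix(Reader source, String prefix) {
        List<String> result = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(prefix)) {
                    result.add(line.toUpperCase());
                }
            }
        } catch (IOException ex) {
            // Rethrown unchecked only to keep the sketch short.
            throw new UncheckedIOException(ex);
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a FileReader so the sketch runs as-is.
        String data = "Burbank Bob Hope Airport\n"
                    + "Deadhorse\n"
                    + "Birmingham-Shuttlesworth International\n";
        for (String name : upperCaseSortedWithPrefix(new StringReader(data), "B")) {
            System.out.println(name);
        }
    }
}
```

Every step of the pipeline is spelled out by hand here, which is the boilerplate the streams version avoids.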

3. Implement a Spliterator Class

To turn a BufferedReader into a class capable of being used with the Java 8 Streams API, we need to provide an implementation of the Spliterator interface. Shown below is the LineReaderSpliterator class, which implements Spliterator<String> and turns a BufferedReader into a stream of lines.

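import java.io.BufferedReader;
import java.io.IOException;
import java.util.Spliterator;
import java.util.function.Consumer;

public class LineReaderSpliterator implements Spliterator<String>
{
    private final BufferedReader reader;
    private IOException exception;

    public LineReaderSpliterator(BufferedReader reader) {
        this.reader = reader;
    }

    // The IOException that ended the stream early, or null if none occurred.
    public IOException ioException() { return exception; }

    @Override
    public int characteristics() {
        // Lines arrive in file order and are never null. DISTINCT is not
        // reported because a file may contain duplicate lines.
        return ORDERED | NONNULL | IMMUTABLE;
    }

    @Override
    public long estimateSize() {
        return Long.MAX_VALUE;
    }

    @Override
    public boolean tryAdvance(Consumer<? super String> action) {
        try {
            String line = reader.readLine();
            if (line != null) {
                action.accept(line);
                return true;
            }
            return false; // end of input
        } catch (IOException ex) {
            // Remember the failure and end the stream; see ioException().
            this.exception = ex;
            return false;
        }
    }

    @Override
    public Spliterator<String> trySplit() { return null; }
}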

3.1 Constructor

The LineReaderSpliterator is initialized with an instance of BufferedReader which serves as the input line source.

public LineReaderSpliterator(BufferedReader reader) {
    this.reader = reader;
}

3.2 Characteristics

The characteristics of the Spliterator are reported by the characteristics() method. The result is a bitwise OR of zero or more of the following values:

ORDERED: indicates that the order of elements is defined. The ordering is expected to be preserved in parallel computations.

DISTINCT: Indicates that no two elements are equal to each other.

SORTED: The sequence is sorted. In our case, the lines may not be sorted so this bit is not set.

SIZED: This bit must be set to indicate that the estimate of size returned by estimateSize() is correct. For our case, we do not know the number of lines in a file so this bit is not set.

NONNULL: Elements are guaranteed to be non-null.

IMMUTABLE: Requires that element source should not be modified to add, replace or remove elements.

CONCURRENT: Indicates that element source can be modified concurrently with additions, replacements and removals from multiple threads.

SUBSIZED: Indicates that all spliterators produced by trySplit() are themselves SIZED and SUBSIZED.

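For a line reader, the appropriate flags are ORDERED, NONNULL and IMMUTABLE. DISTINCT must not be reported: a file may contain duplicate lines, and a stream that claims DISTINCT allows Stream.distinct() to skip its de-duplication.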
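
The flags a spliterator reports can be inspected with hasCharacteristics(). The sketch below prints the flags reported by a few JDK collections; the class name CharacteristicsDemo and the helper describe are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

class CharacteristicsDemo {

    // Hypothetical helper: returns the names of the flags a spliterator reports.
    static List<String> describe(Spliterator<?> s) {
        int[] flags = {
            Spliterator.ORDERED, Spliterator.DISTINCT, Spliterator.SORTED,
            Spliterator.SIZED, Spliterator.NONNULL, Spliterator.IMMUTABLE,
            Spliterator.CONCURRENT, Spliterator.SUBSIZED
        };
        String[] names = {
            "ORDERED", "DISTINCT", "SORTED",
            "SIZED", "NONNULL", "IMMUTABLE",
            "CONCURRENT", "SUBSIZED"
        };
        List<String> reported = new ArrayList<>();
        for (int i = 0; i < flags.length; i++) {
            if (s.hasCharacteristics(flags[i])) {
                reported.add(names[i]);
            }
        }
        return reported;
    }

    public static void main(String[] args) {
        // A list keeps encounter order and may hold duplicates.
        System.out.println(describe(Arrays.asList("b", "a", "b").spliterator()));
        // A hash set holds distinct elements in no defined order.
        System.out.println(describe(new HashSet<String>(Arrays.asList("b", "a")).spliterator()));
        // A tree set is sorted, ordered and distinct.
        System.out.println(describe(new TreeSet<String>(Arrays.asList("b", "a")).spliterator()));
    }
}
```

The list with a repeated element reports ORDERED but not DISTINCT, which is the same situation a file of lines is in.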

3.3 Size estimation

Our spliterator does not know the size of the sequence, since the number of lines remaining in the BufferedReader is not known in advance. If the size is unknown, unbounded, or too expensive to compute, the method must return Long.MAX_VALUE.

public long estimateSize() {
    return Long.MAX_VALUE;
}
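
As a small illustration of the difference between a known and an unknown size, the sketch below compares a list-backed spliterator with one built from a plain Iterator using Spliterators.spliteratorUnknownSize(). The class name EstimateSizeDemo and its two helper methods are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;

class EstimateSizeDemo {

    // A list knows how many elements it holds, so the estimate is exact.
    static long knownSize() {
        return Arrays.asList("a", "b", "c").spliterator().estimateSize();
    }

    // A bare Iterator carries no size information, much like a BufferedReader.
    static Spliterator<String> unknownSize() {
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        return Spliterators.spliteratorUnknownSize(
                it, Spliterator.ORDERED | Spliterator.NONNULL);
    }

    public static void main(String[] args) {
        System.out.println(knownSize());                                        // 3
        Spliterator<String> s = unknownSize();
        System.out.println(s.estimateSize() == Long.MAX_VALUE);                 // true
        // getExactSizeIfKnown() returns -1 when SIZED is not reported.
        System.out.println(s.getExactSizeIfKnown());                            // -1
    }
}
```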

3.4 Process Next Element

The next element is processed by the tryAdvance() method, which accepts a Consumer<? super T> functional interface. Our implementation attempts to read a line and, if successful (EOF not reached), invokes action.accept() and returns true. If EOF is reached or an exception occurs, the method returns false to indicate end-of-sequence. Because an I/O error ends the stream silently, callers should check ioException() after the pipeline completes.

public boolean tryAdvance(Consumer<? super String> action) {
    try {
        String line = reader.readLine();
        if ( line != null ) {
            action.accept(line);
            return true;
        } else return false;
    } catch(java.io.IOException ex) {
        this.exception = ex;
        return false;
    }
}
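
To see the tryAdvance() contract from the caller's side, the following sketch drives a spliterator by hand until it reports end-of-sequence. The class name TryAdvanceDemo and the helper drain are hypothetical; a list-backed spliterator is used so the example runs without a file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class TryAdvanceDemo {

    // Hypothetical helper: calls tryAdvance() until it returns false and
    // collects every element that was handed to the action.
    static <T> List<T> drain(Spliterator<T> s) {
        List<T> seen = new ArrayList<>();
        while (s.tryAdvance(seen::add)) {
            // Each successful call delivers exactly one element to seen::add.
        }
        return seen;
    }

    public static void main(String[] args) {
        Spliterator<String> s = Arrays.asList("first", "second", "third").spliterator();
        System.out.println(drain(s));
        // Once exhausted, further calls keep returning false.
        System.out.println(s.tryAdvance(x -> { }));
    }
}
```

The streams framework performs essentially this loop on our behalf, so returning false from tryAdvance() is what ends the stream.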

3.5 Can the Spliterator split?

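If the spliterator can be partitioned, trySplit() returns a new Spliterator covering some of the elements, and this spliterator keeps the rest; this is what allows parallel processing. A BufferedReader can only be read sequentially, so we cannot partition the input and return null.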

public Spliterator<String> trySplit() { return null; }
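
For contrast, here is a sketch of what a splittable source looks like, using the spliterator of a JDK ArrayList. The class name TrySplitDemo and the helper splitOnce are illustrative; the exact split point is an implementation detail of ArrayList and not something the Spliterator contract guarantees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class TrySplitDemo {

    // Hypothetical helper: splits once and returns {prefix size, remaining size}.
    // The prefix size is 0 when the spliterator declines to split.
    static long[] splitOnce(List<Integer> source) {
        Spliterator<Integer> rest = source.spliterator();
        Spliterator<Integer> prefix = rest.trySplit();
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, rest.estimateSize() };
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        long[] sizes = splitOnce(numbers);
        // The two parts together still cover all eight elements.
        System.out.println("prefix=" + sizes[0] + ", rest=" + sizes[1]);

        // A source that cannot be divided any further returns null,
        // which is what our line reader does on every call.
        List<Integer> single = new ArrayList<>(Arrays.asList(42));
        System.out.println(single.spliterator().trySplit() == null);
    }
}
```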

4. Using the Stream

We can now use the Spliterator implementation to convert a BufferedReader into a Java 8 stream as follows:

private static Stream<String> createStreamReader(BufferedReader reader)
{
    LineReaderSpliterator s = new LineReaderSpliterator(reader);
    return StreamSupport.stream(s, false);
}
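
As an alternative sketch, and not the class developed in this article, the JDK's Spliterators.AbstractSpliterator base class supplies estimateSize(), characteristics() and a default trySplit(), leaving only tryAdvance() to write. The class name AbstractSpliteratorLines and the helpers lines and demo are assumptions for illustration. Unlike LineReaderSpliterator, this variant rethrows an I/O failure as an UncheckedIOException instead of recording it, and it reads from a StringReader so that it runs without a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class AbstractSpliteratorLines {

    // Hypothetical helper: wraps a reader in a sequential stream of its lines.
    static Stream<String> lines(BufferedReader reader) {
        Spliterator<String> s = new Spliterators.AbstractSpliterator<String>(
                Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
            @Override
            public boolean tryAdvance(Consumer<? super String> action) {
                try {
                    String line = reader.readLine();
                    if (line == null) {
                        return false; // end of input
                    }
                    action.accept(line);
                    return true;
                } catch (IOException ex) {
                    // Design choice of this sketch: fail loudly.
                    throw new UncheckedIOException(ex);
                }
            }
        };
        return StreamSupport.stream(s, false);
    }

    // Runs the article's pipeline over in-memory text and collects the result.
    static List<String> demo(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return lines(reader)
                    .filter(a -> a.startsWith("B"))
                    .map(String::toUpperCase)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("Burbank Bob Hope Airport\nDeadhorse\nBangor International\n"));
    }
}
```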

The earlier streams example can now be written as shown.

try (BufferedReader reader = new BufferedReader(new FileReader(textFile))) {
    createStreamReader(reader)
        .filter(a -> a.startsWith("B"))
        .map(String::toUpperCase)
        .sorted()
        .forEach(System.out::println);
}

Check the output shown below:

BALTIMORE-WASHINGTON INTERNATIONAL
BANGOR INTERNATIONAL
BIRMINGHAM-SHUTTLESWORTH INTERNATIONAL
BRADLEY INTERNATIONAL
BURBANK BOB HOPE AIRPORT
BURLINGTON INTERNATIONAL

The example does not first load the file into a collection such as a List. Lines are read from the file one at a time and fed through the processing pipeline; only sorted() needs to buffer the lines that pass the filter before it can emit them.
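
For comparison, and as an aside to the hand-written Spliterator, BufferedReader has offered a lines() method since Java 8 that returns a Stream<String> directly. The sketch below applies the same pipeline through it; the class name BuiltInLinesDemo, the helper upperCaseSortedWithPrefix and the in-memory input are illustrative assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.stream.Collectors;

class BuiltInLinesDemo {

    // Hypothetical helper: same pipeline as above, using BufferedReader.lines().
    static List<String> upperCaseSortedWithPrefix(String text, String prefix) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.lines()
                    .filter(a -> a.startsWith(prefix))
                    .map(String::toUpperCase)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException ex) {
            // close() may throw; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        String data = "Burlington International\nOakland International\nBradley International\n";
        upperCaseSortedWithPrefix(data, "B").forEach(System.out::println);
    }
}
```

Writing the Spliterator by hand remains useful for sources that have no ready-made stream view.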

Summary

We have demonstrated how to read lines from a file and process them using Java 8 streams. We did this by implementing a Spliterator, a technique that can deliver a “stream” view of any sequence. The advantage of such an approach is the ease of filtering and processing text files.
