1. Introduction
Java 8 Streams provide a convenient facility for applying functional-style operations to collections such as List and Set (and, through views like entrySet(), to Map). These operations are expressive and eliminate much of the boilerplate code otherwise needed for pipelines that process the elements of these collections.
2. A Streams Example
An example is shown below. Here we process a list of airport names as follows:
1. Select airports that start with “B”
2. Convert the airport name to uppercase
3. Sort the list
4. Print each element
    List<String> airports = Arrays.asList(
        "Birmingham-Shuttlesworth International",
        "Anchorage International",
        "Deadhorse",
        "Phoenix Sky Harbor International",
        "Tucson International",
        "Los Angeles International",
        "San Francisco International",
        "Burbank Bob Hope Airport",
        "Long Beach Airport",
        "Oakland International"
    );

    airports
        .stream()
        .filter(a -> a.startsWith("B"))
        .map(String::toUpperCase)
        .sorted()
        .forEach(System.out::println);

    // prints the following
    // BIRMINGHAM-SHUTTLESWORTH INTERNATIONAL
    // BURBANK BOB HOPE AIRPORT
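For comparison, here is a rough sketch of the same logic written with an explicit loop, which illustrates the boilerplate that the stream pipeline removes. The class name AirportLoopExample and the helper selectUpperSorted are hypothetical names chosen for this illustration, and the airport list is shortened.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AirportLoopExample {

    // Loop-based equivalent of the filter -> map -> sorted steps of the pipeline.
    static List<String> selectUpperSorted(List<String> names, String prefix) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith(prefix)) {          // filter
                result.add(name.toUpperCase());     // map
            }
        }
        Collections.sort(result);                   // sorted
        return result;
    }

    public static void main(String[] args) {
        List<String> airports = Arrays.asList(
            "Birmingham-Shuttlesworth International",
            "Anchorage International",
            "Burbank Bob Hope Airport",
            "Oakland International");
        for (String name : selectUpperSorted(airports, "B")) {
            System.out.println(name);               // forEach
        }
    }
}
```

The loop version needs an intermediate list and separate statements for each step, whereas the stream version expresses each step as one chained call.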
Wouldn’t it be nice to apply these processing pipelines to other sequences such as the lines of a file? In this article, we show how to do exactly that.
2.1 Reading a File Line By Line
The traditional way to read a file line by line in Java is to wrap it in a BufferedReader and call readLine() in a loop until it returns null, which signals that all lines have been read.
    try (BufferedReader in = new BufferedReader(new FileReader(textFile))) {
        String line;
        while ((line = in.readLine()) != null) {
            // process line here
        }
    }
3. Implement a Spliterator Class
To make a BufferedReader usable with the Java 8 Streams API, we need to provide an implementation of the Spliterator interface. Shown below is the LineReaderSpliterator class, which implements Spliterator<String> and turns a BufferedReader into a stream of lines.
    import java.io.BufferedReader;
    import java.util.Spliterator;
    import java.util.function.Consumer;

    public class LineReaderSpliterator implements Spliterator<String> {
        private final BufferedReader reader;
        private java.io.IOException exception;

        public LineReaderSpliterator(BufferedReader reader) {
            this.reader = reader;
        }

        // Returns the exception that ended reading early, or null if none occurred.
        public java.io.IOException ioException() { return exception; }

        public int characteristics() {
            return DISTINCT | NONNULL | IMMUTABLE;
        }

        public long estimateSize() { return Long.MAX_VALUE; }

        public boolean tryAdvance(Consumer<? super String> action) {
            try {
                String line = reader.readLine();
                if (line != null) {
                    action.accept(line);
                    return true;
                } else return false;
            } catch (java.io.IOException ex) {
                this.exception = ex;
                return false;
            }
        }

        public Spliterator<String> trySplit() { return null; }
    }
3.1 Constructor
The LineReaderSpliterator is initialized with an instance of BufferedReader which serves as the input line source.
    public LineReaderSpliterator(BufferedReader reader) {
        this.reader = reader;
    }
3.2 Characteristics
The characteristics of the Spliterator are reported by the characteristics() method. The result must be the bitwise OR of zero or more of the following values:
ORDERED: indicates that the order of elements is defined. The ordering is expected to be preserved in parallel computations.
DISTINCT: Each element is distinct from another element.
SORTED: The sequence is sorted. In our case, the lines may not be sorted so this bit is not set.
SIZED: Indicates that the value returned by estimateSize() is an exact count of the elements. In our case, we do not know the number of lines in a file in advance, so this bit is not set.
NONNULL: Elements are guaranteed to be non-null.
IMMUTABLE: Indicates that the element source cannot be structurally modified; that is, elements cannot be added, replaced or removed.
CONCURRENT: Indicates that element source can be modified concurrently with additions, replacements and removals from multiple threads.
SUBSIZED: Indicates that all spliterators returned by trySplit() are themselves SIZED and SUBSIZED.
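In our case, the spliterator is specified as DISTINCT, NONNULL and IMMUTABLE. Note that DISTINCT is accurate only if the file contains no duplicate lines, so it should be dropped for arbitrary input. ORDERED would also be a reasonable addition, since the lines are delivered in file order.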
3.3 Size estimation
Our spliterator does not know the size of the sequence, since the number of lines remaining in the BufferedReader is not known in advance. If the size is not known, is unbounded, or is too expensive to compute, the method must return Long.MAX_VALUE.
    public long estimateSize() {
        return Long.MAX_VALUE;
    }
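The characteristic bits and the size estimate can be inspected on any spliterator. The following is a minimal sketch that contrasts a list-backed spliterator (which knows its size) with one of unknown size built by the JDK helper Spliterators.spliteratorUnknownSize(). The class name CharacteristicsDemo, the helper unknownSize and the choice of ORDERED | NONNULL are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;

public class CharacteristicsDemo {

    // Wraps an iterator in a spliterator of unknown size, much like a line reader.
    static Spliterator<String> unknownSize(Iterator<String> it) {
        return Spliterators.spliteratorUnknownSize(
            it, Spliterator.ORDERED | Spliterator.NONNULL);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "a");

        // A list knows how many elements it holds, so SIZED is reported.
        Spliterator<String> sized = lines.spliterator();
        System.out.println(sized.hasCharacteristics(Spliterator.SIZED));     // true
        System.out.println(sized.estimateSize());                            // 3

        // An iterator-backed spliterator of unknown size does not report SIZED.
        Spliterator<String> unsized = unknownSize(lines.iterator());
        System.out.println(unsized.hasCharacteristics(Spliterator.SIZED));   // false
        System.out.println(unsized.hasCharacteristics(Spliterator.ORDERED)); // true
        System.out.println(unsized.estimateSize() == Long.MAX_VALUE);        // true
    }
}
```

hasCharacteristics() simply tests whether all of the given bits are present in the value returned by characteristics().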
3.4 Process Next Element
The next element is processed by the tryAdvance() method, which accepts the functional interface Consumer<? super T>. Our implementation attempts to read a line and, if successful (EOF not reached), invokes action.accept(). If EOF is reached or an exception occurs, the method returns false to indicate end-of-sequence. An exception is stored rather than thrown, so the caller should check ioException() afterwards to distinguish a read failure from a normal end of input.
    public boolean tryAdvance(Consumer<? super String> action) {
        try {
            String line = reader.readLine();
            if (line != null) {
                action.accept(line);
                return true;
            } else return false;
        } catch (java.io.IOException ex) {
            this.exception = ex;
            return false;
        }
    }
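To see how tryAdvance() behaves at end of input and on a read error, the sketch below drives a compact variant of the spliterator by hand. Everything in it is illustrative: the names TryAdvanceDemo, LineSpliterator, FailingReader and drain are hypothetical, the input comes from a StringReader instead of a file, and the ORDERED | NONNULL flags are an assumption that differs from the flags used in the article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;

public class TryAdvanceDemo {

    // Compact variant of the article's spliterator, repeated so this file compiles alone.
    static class LineSpliterator implements Spliterator<String> {
        private final BufferedReader reader;
        private IOException exception;

        LineSpliterator(BufferedReader reader) { this.reader = reader; }

        IOException ioException() { return exception; }

        @Override public boolean tryAdvance(Consumer<? super String> action) {
            try {
                String line = reader.readLine();
                if (line == null) return false;   // end of input
                action.accept(line);
                return true;
            } catch (IOException ex) {
                exception = ex;                   // remembered, not thrown
                return false;
            }
        }

        @Override public Spliterator<String> trySplit() { return null; }
        @Override public long estimateSize() { return Long.MAX_VALUE; }
        // Assumption for this sketch; the article uses DISTINCT | NONNULL | IMMUTABLE.
        @Override public int characteristics() { return ORDERED | NONNULL; }
    }

    // Hypothetical reader that fails on the first read, to simulate an I/O error.
    static class FailingReader extends Reader {
        @Override public int read(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated read failure");
        }
        @Override public void close() { }
    }

    // Drains a spliterator by calling tryAdvance until it returns false.
    static List<String> drain(LineSpliterator s) {
        List<String> out = new ArrayList<>();
        while (s.tryAdvance(out::add)) { }
        return out;
    }

    public static void main(String[] args) {
        LineSpliterator ok = new LineSpliterator(
            new BufferedReader(new StringReader("one\ntwo\n")));
        System.out.println(drain(ok));                 // [one, two]
        System.out.println(ok.ioException());          // null

        LineSpliterator bad = new LineSpliterator(
            new BufferedReader(new FailingReader()));
        System.out.println(drain(bad));                // []
        System.out.println(bad.ioException() != null); // true
    }
}
```

With this design, a failed read looks like an empty or truncated stream to the pipeline, so checking ioException() after the terminal operation is the only way to notice the failure. Rethrowing the error as java.io.UncheckedIOException would be one alternative design.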
3.5 Can the Spliterator split?
If the spliterator can be partitioned into separate ranges of elements, trySplit() must return a new Spliterator covering part of them. We cannot partition the input into separate sequences, so we return null.
    public Spliterator<String> trySplit() { return null; }
4. Using the Stream
We can now use the Spliterator implementation to convert a BufferedReader into a Java 8 stream as follows:
    private static Stream<String> createStreamReader(BufferedReader reader) {
        LineReaderSpliterator s = new LineReaderSpliterator(reader);
        return StreamSupport.stream(s, false);
    }
The earlier streams example can now be written as shown.
    try (BufferedReader reader = new BufferedReader(new FileReader(textFile))) {
        createStreamReader(reader)
            .filter(a -> a.startsWith("B"))
            .map(String::toUpperCase)
            .sorted()
            .forEach(System.out::println);
    }
Check the output shown below:
    BALTIMORE-WASHINGTON INTERNATIONAL
    BANGOR INTERNATIONAL
    BIRMINGHAM-SHUTTLESWORTH INTERNATIONAL
    BRADLEY INTERNATIONAL
    BURBANK BOB HOPE AIRPORT
    BURLINGTON INTERNATIONAL
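The example does not first load the file into a collection. Rather, names are read directly from the file and the processing pipeline is applied to the sequence of elements as they arrive. Note that sorted() is a stateful operation, so the names that pass the filter are buffered internally until the input is exhausted.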
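As a point of comparison, the JDK itself offers BufferedReader.lines() (added in Java 8), which returns a Stream<String> directly. The sketch below runs the same pipeline through it. The class name LinesComparisonDemo and the helper process are hypothetical, and a StringReader with made-up sample text stands in for the file so that the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

public class LinesComparisonDemo {

    // Same filter/map/sorted pipeline as the article, fed by BufferedReader.lines().
    static List<String> process(BufferedReader reader) {
        return reader.lines()
            .filter(a -> a.startsWith("B"))
            .map(String::toUpperCase)
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) throws Exception {
        String text = "Burbank Bob Hope Airport\nDeadhorse\nBangor International\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            // prints BANGOR INTERNATIONAL, then BURBANK BOB HOPE AIRPORT
            process(reader).forEach(System.out::println);
        }
    }
}
```

Writing a Spliterator by hand remains useful for sources that have no built-in stream view, which is the general technique this article demonstrates.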
Summary
We have demonstrated how to read lines from a file and process them using Java 8 streams. This requires implementing a Spliterator class, which can deliver a "stream" view of any sequence. The advantage of such an approach is the ease of filtering and processing text files.