Java Collections – Set

A Set is a container which does not allow duplicate elements. Common Set implementations include HashSet, LinkedHashSet and TreeSet.

“Quotation, n: The act of repeating erroneously the words of another.”
― Ambrose Bierce, The Unabridged Devil’s Dictionary

Check out the earlier parts of this Java Collections Series: Introduction and Part 1.

1. Introduction

A Set is a container which does not contain duplicate elements. The item method equals() is used to determine whether the object is already present in the Set. At most one null element is allowed in the Set.

Commonly used concrete implementations of the Set interface include HashSet, LinkedHashSet and TreeSet, which are explained in detail below.

2. HashSet Example

A HashSet is the most common of the Set family. Use it when you do not need any ordering of the Set items. Specifically, it does not provide any ordering of items when iterating over the items.

Set<String> names = new HashSet<>();
names.addAll(Arrays.asList("Mary",
                           "Annie",
                           "Anna",
                           "Margaret",
                           "Helen",
                           "Elsie",
                           "Lucy",
                           "Dorothy",
                           "Mary",
                           "Margaret"));
names.forEach(System.out::println);

// prints:
Margaret
Dorothy
Elsie
Annie
Lucy
Helen
Mary
Anna

Note that the order of items extracted from the Set demonstrates no predictable order. Also note the property of the Set: no duplicate elements; even though “Mary” and “Margaret” were inserted twice.

3. LinkedHashSet Example

Let us now look at the same example with a LinkedHashSet. When extracting using forEach() you can see that the order of items is the order they were inserted into the Set. The LinkedHashSet provides this capability by maintaining a linked list of items.

Set<String> names = new LinkedHashSet<>();
names.addAll(Arrays.asList("Mary",
                           "Annie",
                           "Anna",
                           "Margaret",
                           "Helen",
                           "Elsie",
                           "Lucy",
                           "Dorothy",
                           "Mary",
                           "Margaret"));
names.forEach(System.out::println);

// prints:
Mary
Annie
Anna
Margaret
Helen
Elsie
Lucy
Dorothy

4. TreeSet Example

A TreeSet provides an implementation of a Set where the items are ordered with compareTo(). Alternatively, a Comparator implementation can be provided while constructing the TreeSet.

4.1. Natural Ordering

The following example illustrates the natural ordering of elements with the TreeSet, which for a String is dictionary ordering.

Set<String> names = new TreeSet<>();
names.addAll(Arrays.asList("Mary",
                           "Annie",
                           "Anna",
                           "Margaret",
                           "Helen",
                           "Elsie",
                           "Lucy",
                           "Dorothy",
                           "Mary",
                           "Margaret"));
names.forEach(System.out::println);

// prints:
Anna
Annie
Dorothy
Elsie
Helen
Lucy
Margaret
Mary

Here is another example where we generate some random numbers in a stream and store it in a TreeSet. The numbers are ordered in the “natural ordering” which for numbers is ascending order.

Random random = new Random();
TreeSet<Long> linenos = random
    .longs(0, ntotal)
    .limit(nlines)
    .collect(TreeSet::new, Set::add, Set::addAll);
System.out.println();
linenos.stream().forEach(System.out::println);

// prints
372
575
1342
4362
5357
6436
12666
13882

4.2. Providing a Comparator

How about providing a Comparator to the TreeSet so the numbers are sorted in the reverse order?

Random random = new Random();
TreeSet<Long> linenos = random
    .longs(0, ntotal)
    .limit(nlines)
    .collect(() -> {
            return new TreeSet<Long>((x, y) -> y.compareTo(x));
        }, Set::add, Set::addAll);
System.out.println();
linenos.stream().forEach(System.out::println);

// prints:
26295
25479
17203
14047
10805
9201
7161
1090

By the way, the Comparator is provided in the above example as a Lambda. With pre-java-8, you would provide something like this:

Comparator<Long> cmp = new Comparator<Long>() {
        @Override
        public int compare(Long o1,Long o2) {
            return o2.compareTo(o1);
        }
    };
...
    .collect(() -> {
            return new TreeSet<Long>(cmp);
        }, Set::add, Set::addAll);
...

5. TreeSet Navigation

TreeSet stores the items added to it in a particular order — using the provided Comparator (or natural ordering if none provided). This means you can refer to an item relative to another item by using the navigational methods provided by TreeSet.

The class shown here is used in some of the following examples.

public class Player
{
    private int year;
    private String teamId;
    private String leagueId;
    private String playerId;
    private int salary;

    public Player(int year,
                  String teamId,
                  String leagueId,
                  String playerId,
                  int salary)
    {
        this.year = year;
        this.teamId = teamId;
        this.leagueId = leagueId;
        this.playerId = playerId;
        this.salary = salary;
    }

    // standard getters and setters.

    public String toString()
    {
        StringBuilder sbuf = new StringBuilder();
        sbuf.append(year).append(',')
            .append(teamId).append(',')
            .append(leagueId).append(',')
            .append(playerId).append(',')
            .append(salary);
        return sbuf.toString();
    }
};

5.1. Reversing a TreeSet

The method descendingSet() returns a view of the TreeSet with the items reversed.

TreeSet<Player> players = new TreeSet<>((x, y) -> {
    return Integer.compare(x.getYear(), y.getYear());
});
players.add(new Player(1986, "SLN", "NL", "lavalmi01",70000));
...
players.descendingSet().forEach(System.out::println);

// prints:
1986,SLN,NL,lavalmi01,70000
1988,KCA,AL,wilsowi02,1383712
1989,CHA,AL,manrifr01,192500
1992,CHA,AL,karkoro01,650000
1995,KCA,AL,gubicma01,750000
1997,MON,NL,stankan01,220000
2000,BAL,AL,hairsje02,205000
2004,SFN,NL,willije02,308000
2005,MIL,NL,leeca01,8000000
2006,ATL,NL,jonesan01,13500000
2007,ARI,NL,valvejo01,2000000
2008,SDN,NL,ledezwi01,620000
2010,CHA,AL,lucydo01,400000
2011,TEX,AL,morelmi01,426000
2012,BOS,AL,doubrfe01,484000

5.2. Relative Ceiling and Floor

To obtain a ceiling item — the “smallest” item greater than or equal to the given — use ceiling().

Player p = new Player(1997, "MON" , "NL", "stankan01", 220000);
...
System.out.println(players.ceiling(p));

// prints:
1997,MON,NL,stankan01,220000

The TreeSet also provides an analogous method floor() to extract the “largest” item greater than or equal to the given.

Player p = new Player(2008, "TOR", "AL", "coatsbu01", 392100);
...
System.out.println(players.floor(p));

// prints:
2008,SDN,NL,ledezwi01,620000

The given element need not be present in the Set. It is used only for comparison.

Player p = new Player(2006, "COL", "NL", "hollima01", 500000);
...
System.out.println(players.ceiling(p));

// prints:
2006,ATL,NL,jonesan01,13500000

5.3. Head Set and Tail Set

The head set is the set of all items in the Set less than the given item.

Player p = new Player(1997, "MON" , "NL", "stankan01", 220000);
...
players.headSet(p).forEach(System.out::println);

// prints:
1986,SLN,NL,lavalmi01,70000
1988,KCA,AL,wilsowi02,1383712
1989,CHA,AL,manrifr01,192500
1992,CHA,AL,karkoro01,650000
1995,KCA,AL,gubicma01,750000

And of course tailSet() returns all items greater than the given. Note that this includes the given element if it is present in the Set.

Player p = new Player(2007, "ARI", "NL", "valvejo01", 2000000);
...
players.tailSet(p).forEach(System.out::println);

// prints:
2007,ARI,NL,valvejo01,2000000
2008,SDN,NL,ledezwi01,620000
2010,CHA,AL,lucydo01,400000
2011,TEX,AL,morelmi01,426000
2012,BOS,AL,doubrfe01,484000

5.4. First and Last Item

Obtain the first and last items in the Set using first() and last(). These are the methods specified by SortedSet.

System.out.println("first: " + players.first());
System.out.println(" last: " + players.last());

// prints:
first: 1986,SLN,NL,lavalmi01,70000
 last: 2012,BOS,AL,doubrfe01,484000

Summary

Three common concrete implementations of the Set interface include a) HashSet which provides no ordering b) LinkedHashSet which maintains insertion order and c) TreeSet which orders items using a comparator while storing. In addition, the TreeSet provides methods for navigation, examples for which were presented.

See Also