Excel Pivot Table using Apache POI

Create an Excel Pivot table from Java using Apache POI.

“A foolish faith in authority is the worst enemy of truth.”
― Albert Einstein

1. Introduction

A Pivot Table is a tool used in Excel for summarizing data. It helps group data using user-selected criteria and compute group summaries using functions such as total, average, count, etc.

While Excel offers a GUI for creating a Pivot table over selected data, it is useful to learn how to do the same from Java. Maybe you need to extract data from a database and export it to a report in Excel. Wouldn’t it be nice to include a summary of the data with a pivot table? In this article, we show you how to create such a pivot table.

We use the Apache POI toolkit for working with Excel and creating the pivot table.

2. Sample Data

While you can export any type of tabular data to Excel, we use data from a CSV file. We create the needed spreadsheet from the CSV and then create a pivot table for summarizing the data.

We use the CSV parser module presented in a related article.

The data we are using is the annual salaries of baseball players and the teams and leagues they played for. A sample is shown below. The full data set contains 26,428 rows.

yearID,teamID,lgID,playerID,salary
1985,ATL,NL,barkele01,870000
1985,ATL,NL,bedrost01,550000
1985,ATL,NL,benedbr01,545000
...

3. Creating an Excel Spreadsheet

We have covered creating an Excel spreadsheet using Apache POI here and here. Check those articles for full details and the code. Here we present a brief overview of the code.

Creating the workbook and the spreadsheet:

Workbook wb = new XSSFWorkbook();
CreationHelper ch = wb.getCreationHelper();
String safeName = WorkbookUtil.createSafeSheetName("Sample Sheet");
XSSFSheet sheet = (XSSFSheet)wb.createSheet(safeName);

Reading the CSV file and writing the data. The first row is treated as column headers.

int rowNum = 0;
List<String> colNames = null;
try (InputStream in = new FileInputStream(csvFile);) {
    CSV csv = new CSV(true, ',', in);
    if ( csv.hasNext() ) {
	colNames = new ArrayList<String>(csv.next());
	Row row = sheet.createRow((short)0);
	for (int i = 0 ; i < colNames.size() ; i++) {
	    String name = colNames.get(i);
	    row.createCell(i).setCellValue(name);
	}
    }

    while (csv.hasNext()) {
	List<String> fields = csv.next();
	rowNum++;
	Row row = sheet.createRow((short)rowNum);
	for (int i = 0 ; i < fields.size() ; i++) {
	    /* Attempt to set as double. If that fails, set as
	     * text. */
	    try {
		double value = Double.parseDouble(fields.get(i));
		row.createCell(i).setCellValue(value);
	    } catch(NumberFormatException ex) {
		String value = fields.get(i);
		row.createCell(i).setCellValue(value);
	    }
	}
    }
}

4. Creating the Pivot Table

Let us now create the pivot table. We want the whole of data included in the pivot table. So we use the following ranges to create cell references.

int firstRow = sheet.getFirstRowNum();
int lastRow = sheet.getLastRowNum();
int firstCol = sheet.getRow(0).getFirstCellNum();
int lastCol = sheet.getRow(0).getLastCellNum();

The cell references specify the top left and the bottom right of the table data.

CellReference topLeft = new CellReference(firstRow, firstCol);
CellReference botRight = new CellReference(lastRow, lastCol - 1);

And the area reference which marks out the data table.

AreaReference aref = new AreaReference(topLeft, botRight);

We insert the pivot table at this location; a couple of rows offset from the top of the sheet and to the right of the table.

CellReference pos = new CellReference(firstRow + 4, lastCol + 1);

Finally we create the pivot table from the area reference and the cell position.

XSSFPivotTable pivotTable = sheet.createPivotTable(aref, pos);

Configure the pivot table to add Row Labels. Use the summarizing functions as needed for your application.

In this example, we are grouping data by yearID and teamID. These are columns 0 and 1 respectively. And for the summary, we choose to sum the salaries over these columns.

pivotTable.addRowLabel(0);
pivotTable.addRowLabel(1);
pivotTable.addColumnLabel(DataConsolidateFunction.SUM,
			  4, "Sum of " + colNames.get(4));

Finally, save the Excel spreadsheet.

FileOutputStream fileOut = new FileOutputStream(xlsFile);
wb.write(fileOut);
fileOut.close();

And here is the spreadsheet showing the various components of the pivot table.

Excel pivot table created from Java

5. A Few Gotchas

You may have noticed a couple of shortcomings in the code and the pivot table above.

First is: we added columns to the “Row Labels” section of the pivot table, but none to the “Column Labels” section. There is a reason for this. Apache POI does provide an addDataColumn() method which supposedly provides this functionality. But as of version 3.15 of Apache POI, it is not working as advertised.

The second issue is this: the method addColumnLabel() does NOT add a column to the “Column Labels” section of the pivot table. It adds a summary column to the “Values” section. The method is misnamed. It should be (and probably will be) renamed to addValues().

Summary

This article presented code to export a pivot table using Apache POI. We used CSV sample data to export to Excel and created the pivot table from the data.

One thought on “Excel Pivot Table using Apache POI”

Leave a Reply

Your email address will not be published. Required fields are marked *