Python Pandas Tutorial – DataFrame

Learn the basics of creating a DataFrame in this tutorial series on pandas.

1. Introduction

This is the next part of the pandas tutorial. In a previous article, we covered the pandas Series class. Today we are getting started with the main pandas data structure, the DataFrame.

Continue reading “Python Pandas Tutorial – DataFrame”

Java – Pivot Table using Streams

Implement a Pivot Table in Java using Java 8 Streams and Collections.

“Money may not buy happiness, but I’d rather cry in a Jaguar than on a bus.”
― Françoise Sagan

1. Introduction

Today let us see how we can implement a pivot table using Java 8 Streams. Raw data by itself does not deliver much insight to humans. We need some kind of data aggregation to discern patterns in it. A pivot table is one such instrument. Other, more visual methods of aggregation include graphs and charts.
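As a rough illustration of the idea, here is a minimal sketch of a pivot built with nested Collectors.groupingBy calls. The Sale class, its fields, and the sample figures are hypothetical and exist only for this example; the full article may structure its data differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class PivotSketch {
    // Hypothetical row type, invented for this illustration.
    static final class Sale {
        private final String region;
        private final String product;
        private final int amount;

        Sale(String region, String product, int amount) {
            this.region = region;
            this.product = product;
            this.amount = amount;
        }

        String getRegion() { return region; }
        String getProduct() { return product; }
        int getAmount() { return amount; }
    }

    // Rows of the pivot are regions, columns are products,
    // and each cell holds the summed amount for that pair.
    static Map<String, Map<String, Integer>> pivot(List<Sale> sales) {
        return sales.stream().collect(
            Collectors.groupingBy(Sale::getRegion, TreeMap::new,
                Collectors.groupingBy(Sale::getProduct, TreeMap::new,
                    Collectors.summingInt(Sale::getAmount))));
    }

    public static void main(String[] args) {
        // Made-up sample data.
        List<Sale> sales = Arrays.asList(
            new Sale("East", "Pens", 10),
            new Sale("East", "Pens", 5),
            new Sale("East", "Ink", 7),
            new Sale("West", "Pens", 3));

        pivot(sales).forEach((region, byProduct) ->
            System.out.println(region + " -> " + byProduct));
    }
}
```

The TreeMap supplier is only there to keep the regions and products in sorted order when printed; the two-argument groupingBy would work just as well if ordering does not matter.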

Continue reading “Java – Pivot Table using Streams”

Sort Large CSV File using SQLite

Sorting a large CSV file by loading it into SQLite, which makes the data much faster and easier to process.

“When you’re at the end of your rope, tie a knot and hold on.”
― Theodore Roosevelt

1. Review

We are trying to sort a large CSV file. The file contains a couple of million rows – not large by “big-data” standards, but large enough to cause problems when working with it.

Continue reading “Sort Large CSV File using SQLite”

Sorting a Large CSV File

Large CSV files present a challenge when the need arises to sort them. Learn how to do that using a database.

“All of life is a constant education.”
― Eleanor Roosevelt, The Wisdom of Eleanor Roosevelt

1. Introduction

Let us explore some ways of sorting large data sets.

By large, I don’t mean typical “big-data” sizes, which might run to billions of rows; data sets of that scale are not what we are exploring today. Instead, I am talking about sorting a rather large CSV file – maybe a couple of million rows.

Continue reading “Sorting a Large CSV File”

Excel Pivot Table using Apache POI

Create an Excel Pivot table from Java using Apache POI.

“A foolish faith in authority is the worst enemy of truth.”
― Albert Einstein

1. Introduction

A Pivot Table is a tool used in Excel for summarizing data. It helps group data using user-selected criteria and compute group summaries using functions such as total, average, count, etc.

Continue reading “Excel Pivot Table using Apache POI”

Convert Excel to CSV (UTF-8)

Export data from an Excel spreadsheet to CSV using Java, with proper handling of Unicode data in the spreadsheet.

1. Introduction

Let us look into how to convert Excel to CSV.

CSV stands for Comma-Separated Values and is a very common format for exchanging data between diverse applications. While the Excel spreadsheet file format is complex (since it has to accommodate a lot more!), CSV is a simpler format representing just tabular data.

Continue reading “Convert Excel to CSV (UTF-8)”

Excel Color Coding Cells

Conditional Formatting in Excel is very useful to highlight cells based on cell value. In this easy tutorial, learn the basics of how to do so.

1. Introduction

Color coding of cells based on a condition is very useful in Excel for highlighting areas and data points. It is one of the top tools in the arsenal of an Excel expert for making a spreadsheet look snazzy and conveying crucial information. In this article, we show a simple example of how to apply conditional formatting to cells based on their values.

Continue reading “Excel Color Coding Cells”

Apache POI Excel Example – Part 2

More formatting options using Java with Apache POI for Microsoft Excel spreadsheets.

1. Introduction

In Part 1 of this Apache POI Excel guide, we examined how to create an Excel spreadsheet and add data to it. We also looked at properly storing data in cells to avoid “Number Stored as Text” errors.

In this chapter, let us look at some more options for formatting data within an Excel spreadsheet.

Continue reading “Apache POI Excel Example – Part 2”

How to Read CSV File in Java

Reading a CSV file in Java, including handling the BOM (Byte Order Mark), quoted fields, multi-line fields and more.

“The reason I talk to myself is because I’m the only one whose answers I accept.”
― George Carlin

1. Introduction

CSV files are extensively used in data interchange between applications. They are especially useful when the only structure to the data being exchanged is rows and columns. The format is particularly popular because the data can be imported into Microsoft Excel and used for charts and visualization.
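To make the features named in the summary concrete, here is a minimal sketch of a hand-written reader that skips a leading BOM and handles quoted fields, doubled quotes, and line breaks inside quotes. The class and method names are hypothetical, and this is not necessarily the approach the full article takes; a dedicated CSV library would usually be the more robust choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class CsvBomSketch {
    // Parses comma-separated text into rows of fields.
    // Assumes the Reader already decodes the bytes (e.g. as UTF-8).
    static List<List<String>> parse(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);

        // Skip a leading BOM (U+FEFF) if present; otherwise rewind one char.
        reader.mark(1);
        if (reader.read() != '\uFEFF') {
            reader.reset();
        }

        List<List<String>> rows = new ArrayList<>();
        List<String> row = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        boolean rowStarted = false;
        int c;
        while ((c = reader.read()) != -1) {
            char ch = (char) c;
            if (inQuotes) {
                if (ch == '"') {
                    // Peek ahead: a doubled quote is a literal quote,
                    // anything else closes the quoted field.
                    reader.mark(1);
                    int next = reader.read();
                    if (next == '"') {
                        field.append('"');
                    } else {
                        inQuotes = false;
                        if (next != -1) {
                            reader.reset();
                        }
                    }
                } else {
                    field.append(ch); // keeps line breaks inside quotes
                }
            } else if (ch == '"') {
                inQuotes = true;
                rowStarted = true;
            } else if (ch == ',') {
                row.add(field.toString());
                field.setLength(0);
                rowStarted = true;
            } else if (ch == '\n') {
                row.add(field.toString());
                field.setLength(0);
                rows.add(row);
                row = new ArrayList<>();
                rowStarted = false;
            } else if (ch != '\r') {
                field.append(ch);
                rowStarted = true;
            }
        }
        // Flush a final row that has no trailing newline.
        if (rowStarted) {
            row.add(field.toString());
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample input with a BOM, a multi-line field and doubled quotes.
        String csv = "\uFEFFname,note\n\"Ann\",\"line1\nline2\"\n\"Bob\",\"said \"\"hi\"\"\"\n";
        for (List<String> r : parse(new StringReader(csv))) {
            System.out.println(r);
        }
    }
}
```

When reading from a file, the Reader would typically be an InputStreamReader created with an explicit charset such as StandardCharsets.UTF_8, since the BOM check above operates on decoded characters rather than raw bytes.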

Continue reading “How to Read CSV File in Java”