Skip non-CSV head
Some CSV files contain one or more lines of text before the actual CSV data starts. For example, it could look like this:
This is an example of a CSV file that contains
three lines before the actual CSV records.

header 1,header 2
value 1,value 2
Strictly speaking, such a file is not a valid CSV file as defined by the CSV specification (RFC 4180).
The main problems with such files are:
- An exception would be thrown unless the options ignoreDifferentFieldCount() and skipEmptyLines() are set.
- When working with named fields, the very first line (This is an example of a CSV file that contains) would be interpreted as the actual header line.
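To see why the field counts disagree, the following JDK-only sketch counts the comma-separated fields on each line of the example file above. It is a naive illustration, not real CSV parsing and not how FastCSV works internally; the class name FieldCountSketch and the helper fieldCount are hypothetical, and treating an empty line as a record with one empty field is an assumption of the sketch.

```java
import java.util.List;

// Naive illustration only: splitting on commas is not real CSV parsing.
final class FieldCountSketch {

    /** Counts fields by splitting on commas; an empty line yields one (empty) field. */
    static int fieldCount(final String line) {
        // The limit of -1 keeps trailing empty fields, e.g. "a," has two fields.
        return line.split(",", -1).length;
    }

    public static void main(final String[] args) {
        final List<String> lines = List.of(
            "This is an example of a CSV file that contains",
            "three lines before the actual CSV records.",
            "",
            "header 1,header 2",
            "value 1,value 2");

        // The first three lines have one field each, the CSV part has two.
        for (final String line : lines) {
            System.out.println(fieldCount(line) + " field(s): \"" + line + "\"");
        }
    }
}
```

The first three lines each count as a single field while the actual records have two, which is the mismatch behind the first problem, and the first line is the one a header-based reader would pick up as the header.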
FastCSV comes with two features to handle such files:
- skipLines(int lineCount): Skip a specific number of lines (lineCount) regardless of their content.
- skipLines(Predicate<String> predicate, int maxLines): Skip lines until a specific line (e.g., the header) is found. Stop skipping after a specific number of lines (maxLines).
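The idea behind both strategies can be sketched with plain JDK classes. This is a conceptual illustration under stated assumptions, not FastCSV's implementation: the class SkipLinesSketch and the helpers skipFixed and skipUntil are hypothetical names, the matching line is assumed to stay unread so it can serve as the header, and lines are assumed to be shorter than 8192 characters so that mark/reset works.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.function.Predicate;

// JDK-only illustration of the two skipping strategies; not FastCSV code.
final class SkipLinesSketch {

    // Assumption: no line is longer than this, so reset() stays valid.
    private static final int MAX_LINE_LENGTH = 8192;

    /** Skips exactly lineCount lines, regardless of their content. */
    static void skipFixed(final BufferedReader reader, final int lineCount) throws IOException {
        for (int i = 0; i < lineCount; i++) {
            if (reader.readLine() == null) {
                throw new IOException("Only " + i + " line(s) available, cannot skip " + lineCount);
            }
        }
    }

    /**
     * Skips lines until the predicate matches and leaves the matching line unread.
     * Returns the number of skipped lines; fails if no match is found within maxLines.
     */
    static int skipUntil(final BufferedReader reader, final Predicate<String> predicate,
                         final int maxLines) throws IOException {
        for (int skipped = 0; skipped < maxLines; skipped++) {
            reader.mark(MAX_LINE_LENGTH);
            final String line = reader.readLine();
            if (line == null) {
                break; // end of input before any line matched
            }
            if (predicate.test(line)) {
                reader.reset(); // rewind so the caller reads the matching line next
                return skipped;
            }
        }
        throw new IOException("No matching line found within " + maxLines + " line(s)");
    }

    public static void main(final String[] args) throws IOException {
        final String data = "note line 1\nnote line 2\n\nheader1,header2\nfoo,bar\n";
        try (var reader = new BufferedReader(new StringReader(data))) {
            final int skipped = skipUntil(reader, line -> line.contains("header1"), 10);
            // prints: Skipped 3 line(s); next line: header1,header2
            System.out.println("Skipped " + skipped + " line(s); next line: " + reader.readLine());
        }
    }
}
```

The fixed-count variant is the simpler choice when the preamble length is known in advance. The predicate variant tolerates a preamble of varying length, and the maxLines bound keeps it from scanning an entire file that never contains the expected line.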
Example
This example demonstrates how to skip non-CSV head lines when reading such a CSV file with FastCSV.
package example;

import java.io.IOException;
import java.util.function.Predicate;

import de.siegmar.fastcsv.reader.CsvReader;

/// Example for reading CSV data with non-CSV data before the actual CSV header.
public class ExampleCsvReaderWithNonCsvAtStart {

    private static final String DATA = """
        Your CSV file contains some non-CSV data before the actual CSV header?
        And you don't want to (mis)interpret them as CSV header? No problem!

        header1,header2
        foo,bar
        """;

    public static void main(final String[] args) throws IOException {
        alternative1();
        alternative2();
    }

    private static void alternative1() throws IOException {
        System.out.println("Alternative 1 - ignore specific number of lines");

        try (var csv = CsvReader.builder().ofNamedCsvRecord(DATA)) {
            // Skip the first 3 lines
            System.out.println("Skipping the first 3 lines");
            csv.skipLines(3);

            // Read the CSV data
            csv.forEach(System.out::println);
        }
    }

    private static void alternative2() throws IOException {
        System.out.println("Alternative 2 - wait for a specific line");

        final Predicate<String> isHeader = line -> line.contains("header1");

        try (var csv = CsvReader.builder().ofNamedCsvRecord(DATA)) {
            // Skip until the header line is found, but not more than 10 lines
            final int actualSkipped = csv.skipLines(isHeader, 10);
            System.out.println("Found header line after skipping " + actualSkipped + " lines");

            // Read the CSV data
            csv.forEach(System.out::println);
        }
    }

}
You can also find this source code example in the FastCSV GitHub repository.