Java CSV Parser

Welcome to the Java CSV Parser tutorial. CSV is one of the most widely used formats for passing data from one system to another. Since CSV files are supported in Microsoft Excel, non-technical users can work with them easily too.

Java CSV Parser

Unfortunately, the Java standard library does not include a built-in CSV parser.

If the CSV file is really simple and doesn't contain any special characters (such as quoted fields with embedded commas), we can use the Java Scanner class to parse it, but most of the time that's not the case. Rather than writing complicated parsing logic ourselves, it's better to use the open-source tools available for parsing and writing CSV files.
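To make the limitation concrete, here is a minimal sketch of the naive Scanner approach. The class name SimpleScannerCsv is made up for this illustration, and the input is passed as a string rather than read from a file. It handles a plain row correctly but splits a quoted field at its embedded comma.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper for illustration only; it is not part of any CSV library.
class SimpleScannerCsv {

    // Splits every line on a plain comma; it has no notion of quoted fields.
    static List<String[]> parse(String csv) {
        List<String[]> rows = new ArrayList<>();
        try (Scanner scanner = new Scanner(csv)) {
            while (scanner.hasNextLine()) {
                // A limit of -1 keeps trailing empty fields.
                rows.add(scanner.nextLine().split(",", -1));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> rows = parse("2,Lisa,Manager,500USD\n1,Pankaj Kumar,CEO,\"5,000USD\"");
        System.out.println(rows.get(0).length); // 4 fields, as expected
        System.out.println(rows.get(1).length); // 5 fields: the quoted salary was split at its comma
    }
}
```

The second row shows why a real parser is needed: handling quotes, escaped quotes and multi-line fields by hand quickly becomes error-prone.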

We will cover three open-source APIs for working with CSV.

  1. OpenCSV
  2. Apache Commons CSV
  3. Super CSV

We will look into these Java CSV parsers one by one.

Suppose we have the following CSV file:

employees.csv

ID,Name,Role,Salary
1,Pankaj Kumar,CEO,"5,000USD"
2,Lisa,Manager,500USD
3,David,,1000USD

and we want to parse it into a list of Employee objects.

package com.journaldev.parser.csv;

public class Employee {

	private String id;
	private String name;
	private String role;
	private String salary;
	
	public String getId() {
		return id;
	}
	public void setId(String id) {
		this.id = id;
	}
	public String getName() {
		return name;
	}
	public void setName(String name) {
		this.name = name;
	}
	public String getRole() {
		return role;
	}
	public void setRole(String role) {
		this.role = role;
	}
	public String getSalary() {
		return salary;
	}
	public void setSalary(String salary) {
		this.salary = salary;
	}
	
	@Override
	public String toString(){
		return "ID="+id+",Name="+name+",Role="+role+",Salary="+salary+"\n";
	}
}

1. OpenCSV

We will see how to use the OpenCSV Java parser to read a CSV file into Java objects and then write CSV from Java objects. Download the OpenCSV library from the SourceForge website and include it in the classpath.

If you are using Maven, include it with the dependency below.


<dependency>
    <groupId>com.opencsv</groupId>
    <artifactId>opencsv</artifactId>
    <version>3.8</version>
</dependency>

To parse a CSV file, we can use CSVReader to read each row and turn it into an object. CSVReader also provides an option to read all the data at once and then parse it.

OpenCSV provides the CsvToBean class, which we can use with a HeaderColumnNameTranslateMappingStrategy object to automatically map the CSV to a list of objects.

To write CSV data, we need to create a List of String arrays and then use the CSVWriter class to write it to a file or any other Writer object.

package com.journaldev.parser.csv;

import java.io.FileReader;
import java.io.IOException;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// the com.opencsv:opencsv 3.x artifact uses the com.opencsv package
import com.opencsv.CSVReader;
import com.opencsv.CSVWriter;
import com.opencsv.bean.CsvToBean;
import com.opencsv.bean.HeaderColumnNameTranslateMappingStrategy;

public class OpenCSVParserExample {

	public static void main(String[] args) throws IOException {

		List<Employee> emps = parseCSVFileLineByLine();
		System.out.println("**********");
		parseCSVFileAsList();
		System.out.println("**********");
		parseCSVToBeanList();
		System.out.println("**********");
		writeCSVData(emps);
	}

	private static void parseCSVToBeanList() throws IOException {
		
		HeaderColumnNameTranslateMappingStrategy<Employee> beanStrategy = new HeaderColumnNameTranslateMappingStrategy<Employee>();
		beanStrategy.setType(Employee.class);
		
		//maps CSV header names to Employee property names
		Map<String, String> columnMapping = new HashMap<String, String>();
		columnMapping.put("ID", "id");
		columnMapping.put("Name", "name");
		columnMapping.put("Role", "role");
		//columnMapping.put("Salary", "salary");
		
		beanStrategy.setColumnMapping(columnMapping);
		
		CsvToBean<Employee> csvToBean = new CsvToBean<Employee>();
		CSVReader reader = new CSVReader(new FileReader("employees.csv"));
		List<Employee> emps = csvToBean.parse(beanStrategy, reader);
		reader.close();
		System.out.println(emps);
	}

	private static void writeCSVData(List<Employee> emps) throws IOException {
		StringWriter writer = new StringWriter();
		//use '#' as the separator instead of the default comma
		CSVWriter csvWriter = new CSVWriter(writer,'#');
		List<String[]> data  = toStringArray(emps);
		csvWriter.writeAll(data);
		csvWriter.close();
		System.out.println(writer);
	}

	private static List<String[]> toStringArray(List<Employee> emps) {
		List<String[]> records = new ArrayList<String[]>();
		//add header record
		records.add(new String[]{"ID","Name","Role","Salary"});
		Iterator<Employee> it = emps.iterator();
		while(it.hasNext()){
			Employee emp = it.next();
			records.add(new String[]{emp.getId(),emp.getName(),emp.getRole(),emp.getSalary()});
		}
		return records;
	}

	private static List<Employee> parseCSVFileLineByLine() throws IOException {
		//create CSVReader object
		CSVReader reader = new CSVReader(new FileReader("employees.csv"), ',');
		
		List<Employee> emps = new ArrayList<Employee>();
		//read line by line
		String[] record = null;
		//skip header row
		reader.readNext();
		
		while((record = reader.readNext()) != null){
			Employee emp = new Employee();
			emp.setId(record[0]);
			emp.setName(record[1]);
			emp.setRole(record[2]);
			emp.setSalary(record[3]);
			emps.add(emp);
		}
		
		reader.close();
		
		System.out.println(emps);
		return emps;
	}
	
	private static void parseCSVFileAsList() throws IOException {
		//create CSVReader object
		CSVReader reader = new CSVReader(new FileReader("employees.csv"), ',');

		List<Employee> emps = new ArrayList<Employee>();
		//read all lines at once
		List<String[]> records = reader.readAll();
		
		Iterator<String[]> iterator = records.iterator();
		//skip header row
		iterator.next();
		
		while(iterator.hasNext()){
			String[] record = iterator.next();
			Employee emp = new Employee();
			emp.setId(record[0]);
			emp.setName(record[1]);
			emp.setRole(record[2]);
			emp.setSalary(record[3]);
			emps.add(emp);
		}
		
		reader.close();
		
		System.out.println(emps);
	}

}

When we run the above OpenCSV example program, we get the following output.

[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=5,000USD
, ID=2,Name=Lisa,Role=Manager,Salary=500USD
, ID=3,Name=David,Role=,Salary=1000USD
]
**********
[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=5,000USD
, ID=2,Name=Lisa,Role=Manager,Salary=500USD
, ID=3,Name=David,Role=,Salary=1000USD
]
**********
[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=null
, ID=2,Name=Lisa,Role=Manager,Salary=null
, ID=3,Name=David,Role=,Salary=null
]
**********
"ID"#"Name"#"Role"#"Salary"
"1"#"Pankaj Kumar"#"CEO"#"5,000USD"
"2"#"Lisa"#"Manager"#"500USD"
"3"#"David"#""#"1000USD"

In the third block, Salary is null because the Salary column mapping is commented out in parseCSVToBeanList(). As the last block shows, we can also set the delimiter character when parsing or writing CSV data with the OpenCSV Java parser.
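To show what writing with a custom delimiter involves, here is a hand-written sketch that produces lines in the same shape as the output above. It is not OpenCSV's implementation: the class DelimitedLineWriter, its toLine method, and the choice to write a null field as an empty quoted string are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; it is not OpenCSV code.
class DelimitedLineWriter {

    // Joins fields with the given separator, wrapping each field in double quotes
    // and doubling any embedded quote character.
    static String toLine(String[] fields, char separator) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(separator);
            }
            // Assumption of this sketch: a null field is written as an empty string.
            String value = fields[i] == null ? "" : fields[i];
            line.append('"').append(value.replace("\"", "\"\"")).append('"');
        }
        return line.toString();
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
                new String[] {"ID", "Name", "Role", "Salary"},
                new String[] {"1", "Pankaj Kumar", "CEO", "5,000USD"});
        for (String[] record : records) {
            System.out.println(toLine(record, '#'));
        }
    }
}
```

Because every field is quoted, the comma inside 5,000USD stays safe whatever separator is chosen.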

2. Apache Commons CSV

You can download the Apache Commons CSV binaries or include the dependency using Maven as shown below.


<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.3</version>
</dependency>

The Apache Commons CSV parser is simple to use: the CSVParser class parses the CSV data and the CSVPrinter class writes it.

Example code to parse the above CSV file into a list of Employee objects is given below.

package com.journaldev.parser.csv;

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVPrinter;
import org.apache.commons.csv.CSVRecord;

public class ApacheCommonsCSVParserExample {

	public static void main(String[] args) throws FileNotFoundException, IOException {
		
		//Create the CSVFormat object
		CSVFormat format = CSVFormat.RFC4180.withHeader().withDelimiter(',');
		
		//initialize the CSVParser object
		CSVParser parser = new CSVParser(new FileReader("employees.csv"), format);
		
		List<Employee> emps = new ArrayList<Employee>();
		for(CSVRecord record : parser){
			Employee emp = new Employee();
			emp.setId(record.get("ID"));
			emp.setName(record.get("Name"));
			emp.setRole(record.get("Role"));
			emp.setSalary(record.get("Salary"));
			emps.add(emp);
		}
		//close the parser
		parser.close();
		
		System.out.println(emps);
		
		//CSV Write Example using CSVPrinter
		CSVPrinter printer = new CSVPrinter(System.out, format.withDelimiter('#'));
		System.out.println("********");
		printer.printRecord("ID","Name","Role","Salary");
		for(Employee emp : emps){
			List<String> empData = new ArrayList<String>();
			empData.add(emp.getId());
			empData.add(emp.getName());
			empData.add(emp.getRole());
			empData.add(emp.getSalary());
			printer.printRecord(empData);
		}
		//close the printer
		printer.close();
	}

}

When we run the above program, we get the following output.


[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=5,000USD
, ID=2,Name=Lisa,Role=Manager,Salary=500USD
, ID=3,Name=David,Role=,Salary=1000USD
]
********
ID#Name#Role#Salary
1#Pankaj Kumar#CEO#5,000USD
2#Lisa#Manager#500USD
3#David##1000USD
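The Commons CSV example above reads each field by header name, for example record.get("ID"), rather than by position. As a rough, library-free picture of that idea (not how Commons CSV is implemented), the sketch below remembers the position of each header and looks fields up through it. The class HeaderIndex and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; it is not part of Apache Commons CSV.
class HeaderIndex {

    private final Map<String, Integer> positions = new HashMap<>();

    // Remembers the column position of every header name.
    HeaderIndex(String[] header) {
        for (int i = 0; i < header.length; i++) {
            positions.put(header[i], i);
        }
    }

    // Returns the field stored under the given header name.
    String get(String[] record, String name) {
        Integer index = positions.get(name);
        if (index == null) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        return record[index];
    }

    public static void main(String[] args) {
        HeaderIndex index = new HeaderIndex(new String[] {"ID", "Name", "Role", "Salary"});
        String[] row = {"2", "Lisa", "Manager", "500USD"};
        System.out.println(index.get(row, "Name") + " earns " + index.get(row, "Salary"));
    }
}
```

Looking fields up by name keeps the mapping code working even if the column order in the file changes.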

3. Super CSV

While searching for good CSV parsers, I saw many developers recommending Super CSV on Stack Overflow, so I decided to give it a try. Download the Super CSV library from the SourceForge website and include the jar file in the project build path.

If you are using Maven, just add the dependency below.


<dependency>
    <groupId>net.sf.supercsv</groupId>
    <artifactId>super-csv</artifactId>
    <version>2.4.0</version>
</dependency>

To parse a CSV file into a list of objects, we need to create an instance of CsvBeanReader. We can set cell-specific rules using a CellProcessor array, with one processor per column. This lets us read directly from a CSV file into a Java bean and vice versa.

Writing CSV data is similar; for that we use the CsvBeanWriter class.

package com.journaldev.parser.csv;

import java.io.FileReader;
import java.io.IOException;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.constraint.UniqueHashCode;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.io.CsvBeanWriter;
import org.supercsv.io.ICsvBeanReader;
import org.supercsv.io.ICsvBeanWriter;
import org.supercsv.prefs.CsvPreference;

public class SuperCSVParserExample {

	public static void main(String[] args) throws IOException {

		List<Employee> emps = readCSVToBean();
		System.out.println(emps);
		System.out.println("******");
		writeCSVData(emps);
	}

	private static void writeCSVData(List<Employee> emps) throws IOException {
		ICsvBeanWriter beanWriter = null;
		StringWriter writer = new StringWriter();
		try{
			beanWriter = new CsvBeanWriter(writer, CsvPreference.STANDARD_PREFERENCE);
			final String[] header = new String[]{"id","name","role","salary"};
			final CellProcessor[] processors = getProcessors();
            
			// write the header
            beanWriter.writeHeader(header);
            
            //write the bean's data
            for(Employee emp: emps){
            	beanWriter.write(emp, header, processors);
            }
		}finally{
			if( beanWriter != null ) {
                beanWriter.close();
			}
		}
		System.out.println("CSV Data\n"+writer.toString());
	}

	private static List<Employee> readCSVToBean() throws IOException {
		ICsvBeanReader beanReader = null;
		List<Employee> emps = new ArrayList<Employee>();
		try {
			beanReader = new CsvBeanReader(new FileReader("employees.csv"),
					CsvPreference.STANDARD_PREFERENCE);

			// the name mapping provides the basis for the bean setters
			final String[] nameMapping = new String[]{"id","name","role","salary"};
			// just read the header, so that it doesn't get mapped to an Employee object
			final String[] header = beanReader.getHeader(true);
			final CellProcessor[] processors = getProcessors();

			Employee emp;
			
			while ((emp = beanReader.read(Employee.class, nameMapping,
					processors)) != null) {
				emps.add(emp);
			}

		} finally {
			if (beanReader != null) {
				beanReader.close();
			}
		}
		return emps;
	}

	private static CellProcessor[] getProcessors() {
		
		final CellProcessor[] processors = new CellProcessor[] { 
                new UniqueHashCode(), // ID (must be unique)
                new NotNull(), // Name
                new Optional(), // Role
                new NotNull() // Salary
        };
		return processors;
	}

}

When we run the above Super CSV example program, we get the output below.

[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=5,000USD
, ID=2,Name=Lisa,Role=Manager,Salary=500USD
, ID=3,Name=David,Role=null,Salary=1000USD
]
******
CSV Data
id,name,role,salary
1,Pankaj Kumar,CEO,"5,000USD"
2,Lisa,Manager,500USD
3,David,,1000USD

As you can see, the Role field is set as Optional because it's empty in the third row. If we change that to NotNull, we get the following exception.

Exception in thread "main" org.supercsv.exception.SuperCsvConstraintViolationException: null value encountered
processor=org.supercsv.cellprocessor.constraint.NotNull
context={lineNo=4, rowNo=4, columnNo=3, rowSource=[3, David, null, 1000USD]}
	at org.supercsv.cellprocessor.constraint.NotNull.execute(NotNull.java:71)
	at org.supercsv.util.Util.executeCellProcessors(Util.java:93)
	at org.supercsv.io.AbstractCsvReader.executeProcessors(AbstractCsvReader.java:203)
	at org.supercsv.io.CsvBeanReader.read(CsvBeanReader.java:206)
	at com.journaldev.parser.csv.SuperCSVParserExample.readCSVToBean(SuperCSVParserExample.java:66)
	at com.journaldev.parser.csv.SuperCSVParserExample.main(SuperCSVParserExample.java:23)

So Super CSV lets us attach validation rules to individual fields, something the other two parsers do not offer in the examples above. It's easy to use and the learning curve is small.
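The idea of per-column rules can be pictured with a small library-free sketch. This is not Super CSV's API: the CellRule interface, the REQUIRED and OPTIONAL constants, and the validate method are hypothetical names chosen for illustration, and the error message is made up rather than copied from the library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of per-column rules; it does not use Super CSV classes.
class CellRuleSketch {

    // A rule returns the accepted value or throws if the value is not allowed.
    interface CellRule {
        String apply(String value, int column);
    }

    // Rejects missing values, similar in spirit to a "not null" constraint.
    static final CellRule REQUIRED = (value, column) -> {
        if (value == null) {
            throw new IllegalArgumentException("null value in column " + column);
        }
        return value;
    };

    // Accepts anything, including a missing value.
    static final CellRule OPTIONAL = (value, column) -> value;

    // Applies the rule at position i to the field at position i (columns are 1-based in messages).
    static String[] validate(String[] row, List<CellRule> rules) {
        String[] result = new String[row.length];
        for (int i = 0; i < row.length; i++) {
            result[i] = rules.get(i).apply(row[i], i + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] david = {"3", "David", null, "1000USD"};
        // Role is optional, so the row passes.
        validate(david, Arrays.asList(REQUIRED, REQUIRED, OPTIONAL, REQUIRED));
        try {
            // Role is required here, so the empty third column is rejected.
            validate(david, Arrays.asList(REQUIRED, REQUIRED, REQUIRED, REQUIRED));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // null value in column 3
        }
    }
}
```

As in the Super CSV example, the rule list has one entry per column, so changing a single entry changes how that one field is checked.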

That's all for the Java CSV parser example tutorial. Whether to use OpenCSV, Apache Commons CSV or Super CSV depends on your requirements; all three are easy to use.

Translated from: https://www.journaldev.com/2544/java-csv-parser
