Working with CSV files in pandas

Reading and writing CSV files with pandas

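Introduction

CSV (comma-separated values) is a plain-text format that stores tabular data (numbers or text), using "," as the field separator. It is a simple, widely supported file format.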

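1. Saving a pd.DataFrame object to a CSV file

pd.DataFrame.to_csv()
"""
	Parameters:
		1. path_or_buf: str or file handle, default None; the file path or file object to write to. If None, the CSV text is returned as a string;
		2. sep: str, default ','; the field separator;
		3. na_rep: str, default ''; the string used to represent missing data;
		4. float_format: str, default None; the format string for floating-point numbers;
		5. columns: sequence, optional; the columns to write;
		6. header: bool or list of str, default True; whether to write the column labels. If a list of strings is given, they are written in place of the column labels;
		7. index: bool, default True; whether to write the row index;
		8. index_label: str or sequence, or False, default None; the label(s) for the index column(s);
		9. mode: str, default 'w'; the Python file mode used for writing;
		10. encoding: str, optional; the encoding of the output file, default 'utf-8';
"""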

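2. Reading a CSV file

pd.read_csv()
"""
	Parameters:
		1. filepath_or_buffer: various; the file path or file object to read from;
		2. sep: str, default ','; the field separator;
		3. header: int, list of int, default 'infer'; the row number(s) to use as the column names. With the default 'infer', the first row is used, unless names is passed, in which case no row is treated as a header;
		4. names: array-like, optional; the list of column names to use;
		5. index_col: int, str, sequence of int / str, or False, default ``None``; the column(s) to use as the row labels;
		6. usecols: list-like or callable, optional; the subset of columns to return, given as column names or positional indices;
		7. na_values: scalar, str, list-like, or dict, default None; additional values to recognize as NaN;
		8. na_filter: boolean, default True; whether to detect missing-value markers and convert them to NaN;
"""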

Example:

# Define a DataFrame object
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
...                          'foo', 'bar', 'foo', 'foo'],
...                    'B': ['one', 'one', 'two', 'three',
...                          'two', 'two', 'one', 'three'],
...                    'C': np.random.randn(8),
...                    'D': np.random.randn(8)})
>>> df
	A	B	C	D
0	foo	one	-1.437858	0.155025
1	bar	one	1.150565	-0.614996
2	foo	two	0.296236	0.538160
3	bar	three	-1.355619	1.229465
4	foo	two	-0.411405	-1.167204
5	bar	two	-0.178302	-0.451726
6	foo	one	1.127362	-0.407458
7	foo	three	-1.608615	-1.025847

# Save df to a CSV file (index=False omits the row index, matching the output below)
>>> df.to_csv('test.csv', index=False)
>>> with open('test.csv') as fp:
...     print(fp.read())
A,B,C,D
foo,one,-1.4378575583783093,0.15502459897313522
bar,one,1.1505651678375877,-0.6149963246704199
foo,two,0.2962358799876369,0.5381601328949968
bar,three,-1.3556193106958445,1.22946535082023
foo,two,-0.4114051795979103,-1.167204338223104
bar,two,-0.17830153868950416,-0.4517260094135266
foo,one,1.1273617862319971,-0.4074578794708698
foo,three,-1.6086147075218953,-1.025847234571291

# Read the CSV file back
>>> pd.read_csv('test.csv')
	A	B	C	D
0	foo	one	-1.437858	0.155025
1	bar	one	1.150565	-0.614996
2	foo	two	0.296236	0.538160
3	bar	three	-1.355619	1.229465
4	foo	two	-0.411405	-1.167204
5	bar	two	-0.178302	-0.451726
6	foo	one	1.127362	-0.407458
7	foo	three	-1.608615	-1.025847
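Since `index` defaults to True in `to_csv()`, a file written without `index=False` carries the row index as an extra, unlabeled first column. The sketch below uses a small hypothetical DataFrame to show what `read_csv()` does with that column and how `index_col=0` restores it as the index.

```python
import io

import pandas as pd

# Hypothetical two-row frame, small enough to inspect by eye.
df = pd.DataFrame({"A": ["foo", "bar"], "B": [1, 2]})

# With path_or_buf left as None, to_csv() returns the CSV text as a string.
# index=True is the default, so the row index is written as the first column.
text = df.to_csv()

# Read as-is: the unlabeled index column becomes an ordinary data column.
plain = pd.read_csv(io.StringIO(text))

# Read with index_col=0: the first column is used as the row index again.
restored = pd.read_csv(io.StringIO(text), index_col=0)

print(list(plain.columns))
print(list(restored.columns))
```

Writing with `index=False`, or reading with `index_col=0`, avoids the stray `Unnamed: 0` column.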
