The Data Analysis Process (Preparation 2)

CSV format
CSV: comma-separated values
Like a spreadsheet with no formulas
Easy to process with code

CSV in Python

In Python, a CSV file is usually represented as a list of rows
1. Each row is a list
The overall data structure is a list of lists

csv=[['A1','A2','A3'],
     ['B1','B2','B3']]
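
As a minimal sketch of how such a list of lists comes about, the snippet below parses a small in-memory CSV string with the standard library's csv.reader. It assumes Python 3 and the built-in csv module rather than unicodecsv, and raw_text is made-up sample data.

```python
import csv
import io

# Hypothetical CSV text: two rows, three fields each, no header row.
raw_text = "A1,A2,A3\nB1,B2,B3\n"

# csv.reader yields one list of strings per row of the input.
rows = list(csv.reader(io.StringIO(raw_text)))

print(rows)        # [['A1', 'A2', 'A3'], ['B1', 'B2', 'B3']]
print(rows[1][2])  # B3
```

Fields are addressed by position, so rows[1][2] means "second row, third column".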

2. The CSV file has a header row
Each row is a dictionary
The keys of each dictionary are the column names and the values are the fields
The overall data structure is a list of dictionaries

csv=[{'name1':'A1','name2':'A2','name3':'A3'},
     {'name1':'B1','name2':'B2','name3':'B3'}]
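
A small sketch of the header-row case, again assuming Python 3 and the standard csv module instead of unicodecsv. The header names name1, name2, name3 are the placeholders from the structure above, and raw_text is made-up sample data.

```python
import csv
import io

# Hypothetical CSV text whose first line is the header row.
raw_text = "name1,name2,name3\nA1,A2,A3\nB1,B2,B3\n"

# DictReader takes the first row as column names and
# yields one mapping per remaining row.
reader = csv.DictReader(io.StringIO(raw_text))
records = [dict(row) for row in reader]

print(reader.fieldnames)    # ['name1', 'name2', 'name3']
print(records[0]['name2'])  # A2
```

Fields are looked up by column name rather than by position, which tends to survive column reordering better than index-based access.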

Reading a CSV with unicodecsv

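import unicodecsv
enrollments = []
f = open('enrollments.csv', 'rb')   # unicodecsv expects the file to be opened in binary mode
reader = unicodecsv.DictReader(f)   # reader is not a list of rows but an iterator; you can loop over it to get each element, but only once

for row in reader:
    enrollments.append(row)

for row in reader:           # this second loop prints nothing, because an iterator can only be looped over once
    print row
f.close()
enrollments[0]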
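
The one-pass behaviour of the reader can be checked in isolation. This is a sketch assuming Python 3 and the standard csv module (not unicodecsv), with made-up in-memory data in place of enrollments.csv.

```python
import csv
import io

# Hypothetical data: a header row plus two data rows.
raw_text = "name1,name2\nA1,A2\nB1,B2\n"
reader = csv.DictReader(io.StringIO(raw_text))

# The first pass consumes every row from the iterator.
first_pass = [dict(row) for row in reader]

# The second pass finds the iterator already exhausted.
second_pass = [dict(row) for row in reader]

print(len(first_pass))   # 2
print(len(second_pass))  # 0
```

Storing the rows in a list, as first_pass does, is what makes them available for repeated use.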

Improved version:

import unicodecsv
with open('enrollments.csv','rb') as f:     # the with statement closes the file automatically, so no explicit close() is needed
    reader = unicodecsv.DictReader(f)
    enrollments = list(reader)              # convert the iterator into a list
enrollments[0]
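
For comparison, here is a rough Python 3 counterpart of the same pattern using only the standard csv module. It is a sketch under assumptions: the file sample.csv and its columns account_key and status are invented so the snippet can run on its own, and UTF-8 encoding is assumed.

```python
import csv
import os
import tempfile

# Create a small hypothetical CSV file so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'sample.csv')
with open(path, 'w', newline='', encoding='utf-8') as f:
    f.write('account_key,status\n448,canceled\n700,current\n')

# In Python 3 the csv module reads text, so the file is opened in text mode
# with newline='' (as the csv documentation recommends) instead of 'rb'.
with open(path, newline='', encoding='utf-8') as f:
    sample_rows = list(csv.DictReader(f))  # materialize the iterator

print(sample_rows[0]['account_key'])  # 448
print(f.closed)                       # True
```

Every field comes back as a string, so numeric columns still need converting before analysis.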

Read the data from these three files and print the first row of each:
enrollments.csv
daily_engagement.csv
project_submissions.csv

import unicodecsv
def read_csv(filename):
    with open(filename,'rb') as f:
        reader = unicodecsv.DictReader(f)
        return list(reader)
enrollments = read_csv('enrollments.csv')
daily_engagement = read_csv('daily_engagement.csv')
project_submissions = read_csv('project_submissions.csv')
print enrollments[0]
print daily_engagement[0]
print project_submissions[0]
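
When several files are loaded through the same helper, a quick sanity check on row counts and column names can catch a wrong filename or an empty file early. The sketch below assumes Python 3 and the standard csv module. The two stand-in files and their columns are invented for illustration and are not the real course data.

```python
import csv
import os
import tempfile

def read_csv(filename):
    """Return all rows of a CSV file as a list of dictionaries."""
    with open(filename, newline='', encoding='utf-8') as f:
        return list(csv.DictReader(f))

# Hypothetical stand-in files written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
samples = {
    'enrollments.csv': 'account_key,status\n448,canceled\n',
    'daily_engagement.csv': 'acct,total_minutes_visited\n0,11.68\n1,37.28\n',
}
for name, text in samples.items():
    with open(os.path.join(tmp_dir, name), 'w', newline='', encoding='utf-8') as f:
        f.write(text)

# Load every file with the same helper, keyed by filename.
tables = {name: read_csv(os.path.join(tmp_dir, name)) for name in samples}

# Report the row count and column names of each table.
for name, rows in tables.items():
    print(name, len(rows), sorted(rows[0].keys()))
```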
