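Pandas: Creating Data

Creating Data

Random Data

Create a Series; when no index is given, pandas generates a default integer index. The snippets below assume numpy has been imported as np and pandas as pd.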

s = pd.Series([1,3,5,np.nan,6,8])
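As a minimal sketch of what the default index looks like, the snippet below rebuilds the same Series and inspects it. The second Series and its labels 'a', 'b', 'c' are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

# Same values as above; no index is passed, so pandas assigns 0..5.
s = pd.Series([1, 3, 5, np.nan, 6, 8])

print(s.index)  # a RangeIndex running from 0 up to (not including) 6
print(s.dtype)  # float64, because NaN forces the integers to float

# An explicit index can be supplied instead (labels are illustrative).
labeled = pd.Series([1, 3, 5], index=['a', 'b', 'c'])
print(labeled['b'])
```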

Create a DataFrame from a NumPy array, with a date index and labeled columns

dates = pd.date_range('20161010', periods=6)
df = pd.DataFrame(np.random.randn(6,4), index=dates, columns=list('ABCD'))

df
Out[4]: 
                   A         B         C         D
2016-10-10  0.630275  1.081899 -1.594402 -2.571683
2016-10-11 -0.211379 -0.166089 -0.480015 -0.346706
2016-10-12 -0.416171 -0.640860  0.944614 -0.756651
2016-10-13  0.652248  0.186364  0.943509  0.053282
2016-10-14 -0.430867 -0.494919 -0.280717 -1.327491
2016-10-15  0.306519 -2.103769 -0.019832  0.035211

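Here, np.random.randn(6, 4) returns a 6x4 array of samples drawn from the standard normal distribution, so the values differ on every run.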
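Because the values change on every run, a seeded generator can make the example repeatable. The sketch below is one way to do that; the use of np.random.default_rng and the seed value 0 are assumptions added for illustration and do not appear in the original.

```python
import numpy as np
import pandas as pd

# randn(rows, cols) draws samples from the standard normal distribution.
values = np.random.randn(6, 4)
print(values.shape)  # (6, 4)

# Assumption: a seeded Generator is used only to make runs repeatable.
rng = np.random.default_rng(0)
seeded = rng.standard_normal((6, 4))

# Six consecutive days as the row index, four labeled columns.
dates = pd.date_range('20161010', periods=6)
df = pd.DataFrame(seeded, index=dates, columns=list('ABCD'))
print(df.shape)
```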

Creating from a dict

df2 = pd.DataFrame({ 'A' : 1.,
                     'B' : pd.Timestamp('20130102'),
                     'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'D' : np.array([3] * 4,dtype='int32'),
                     'E' : pd.Categorical(["test","train","test","train"]),
                     'F' : 'foo' })

df2
Out[20]: 
     A          B    C  D      E    F
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo
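The dict example relies on pandas repeating scalar values to match the length of the array-like columns, while each column keeps its own dtype. A minimal sketch of that behavior, using a reduced version of the dict above (the variable name frame is an assumption):

```python
import numpy as np
import pandas as pd

# Scalars ('A', 'F') are repeated to match the four-row columns.
frame = pd.DataFrame({
    'A': 1.0,
    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
    'D': np.array([3] * 4, dtype='int32'),
    'E': pd.Categorical(['test', 'train', 'test', 'train']),
    'F': 'foo',
})

print(frame.shape)   # (4, 5)
print(frame.dtypes)  # each column keeps its own dtype
```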

Creating from a NumPy array

data = np.array([[2000, 1, 2],
                 [2001, 1, 3]])

df = pd.DataFrame(data,
        index=['one','two'],
        columns=['year','state','pop'])
        
        
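# A DataFrame can also be built from a transposed array.
# data_real_np and ydz_np are assumed to be equal-length 1-D NumPy arrays defined elsewhere.
out = np.array([data_real_np, ydz_np]).T
df = pd.DataFrame(out)
df.to_csv('final.csv', encoding='utf-8', index=False, header=False)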
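The transposed construction can be sketched end to end as follows. The two arrays and their values are hypothetical stand-ins for data_real_np and ydz_np, which the original does not define. The CSV is returned as a string here rather than written to final.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: two 1-D arrays of equal length.
data_real_np = np.array([1.0, 2.0, 3.0])
ydz_np = np.array([10.0, 20.0, 30.0])

# Stacking gives shape (2, 3); transposing turns each input array into a column.
out = np.array([data_real_np, ydz_np]).T
df = pd.DataFrame(out)
print(df.shape)  # (3, 2)

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False, header=False)
print(csv_text)
```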

Creating a Timestamp

There are several ways to construct a Timestamp object

  • pd.Timestamp
import pandas as pd
from datetime import datetime as dt
p1=pd.Timestamp(2017,6,19)
p2=pd.Timestamp(dt(2017,6,19,hour=9,minute=13,second=45))
p3=pd.Timestamp("2017-6-19 9:13:45")

print("type of p1:",type(p1))
print(p1)
print("type of p2:",type(p2))
print(p2)
print("type of p3:",type(p3))
print(p3)


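type of p1: <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2017-06-19 00:00:00
type of p2: <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2017-06-19 09:13:45
type of p3: <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2017-06-19 09:13:45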
  • to_datetime()
import pandas as pd
from datetime import datetime as dt

p4=pd.to_datetime("2017-6-19 9:13:45")
p5=pd.to_datetime(dt(2017,6,19,hour=9,minute=13,second=45))

print("type of p4:",type(p4))
print(p4)
print("type of p5:",type(p5))
print(p5)

type of p4: <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2017-06-19 09:13:45
type of p5: <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2017-06-19 09:13:45
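
Both routes produce the same kind of object, which the short sketch below checks by comparing the results directly. It also shows that to_datetime returns a DatetimeIndex when given a list; the list input is an added illustration and is not part of the original examples.

```python
from datetime import datetime as dt

import pandas as pd

# The three constructions name the same instant, so they compare equal.
a = pd.Timestamp("2017-6-19 9:13:45")
b = pd.Timestamp(dt(2017, 6, 19, hour=9, minute=13, second=45))
c = pd.to_datetime("2017-6-19 9:13:45")
print(a == b == c)  # True

# A scalar input yields a Timestamp; a list yields a DatetimeIndex.
many = pd.to_datetime(["2017-06-19", "2017-06-20"])
print(type(many).__name__)  # DatetimeIndex
```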
