Adding new columns with pandas in Python

1. Import the module
>>> import pandas as pd 
2. Fix truncated row and column display for a DataFrame
>>> pd.set_option('display.max_rows', 100,'display.max_columns', 1000,"display.max_colwidth",1000,'display.width',1000)
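
The call above changes the display options for the rest of the session. As a minimal sketch, assuming the wide output is only needed temporarily, pd.option_context applies the same kind of options inside a with block and restores the previous values afterwards. The frame named demo is made up for illustration and is not the air quality data.

```python
import pandas as pd

# Hypothetical demo frame, not the air quality data used in this post.
demo = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

before = pd.get_option("display.max_columns")

# Options passed here only apply inside the with block.
with pd.option_context("display.max_columns", 1000, "display.width", 1000):
    inside = pd.get_option("display.max_columns")
    print(demo)

# Outside the block the earlier value is back in effect.
after = pd.get_option("display.max_columns")
```
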
3. Load the data table
>>> air_quality = pd.read_csv(r"C:\Users\Administrator\Desktop\air_quality_no2.csv", index_col=0, parse_dates=True)
>>> air_quality.head()
                     station_antwerp  station_paris  station_london
datetime                                                           
2019-05-07 02:00:00              NaN            NaN            23.0
2019-05-07 03:00:00             50.5           25.0            19.0
2019-05-07 04:00:00             45.0           27.7            19.0
2019-05-07 05:00:00              NaN           50.4            16.0
2019-05-07 06:00:00              NaN           61.9             NaN
4. Divide one column by another to create the new column "ratio_paris_antwerp"
>>> air_quality["ratio_paris_antwerp"] = air_quality["station_paris"] / air_quality["station_antwerp"]
>>> air_quality.head()
                     station_antwerp  station_paris  station_london  ratio_paris_antwerp
datetime                                                                                
2019-05-07 02:00:00              NaN            NaN            23.0                  NaN
2019-05-07 03:00:00             50.5           25.0            19.0             0.495050
2019-05-07 04:00:00             45.0           27.7            19.0             0.615556
2019-05-07 05:00:00              NaN           50.4            16.0                  NaN
2019-05-07 06:00:00              NaN           61.9             NaN                  NaN
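
The NaN entries in the ratio column follow from how the division works: it is applied row by row, and a missing value on either side gives a missing result. The sketch below reproduces this on a small hand-built frame, so it runs without the CSV file. The name sample and its values (copied from the first rows shown above) are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the first three rows shown above.
sample = pd.DataFrame(
    {
        "station_antwerp": [np.nan, 50.5, 45.0],
        "station_paris": [np.nan, 25.0, 27.7],
    }
)

# Element-wise division; a missing value on either side gives NaN.
sample["ratio_paris_antwerp"] = sample["station_paris"] / sample["station_antwerp"]

# Series.div is the method form of the / operator.
ratio_method = sample["station_paris"].div(sample["station_antwerp"])

print(sample)
```
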
5. Multiply a single column by a constant to create the new column "london_mg_per_cubic"
>>> air_quality["london_mg_per_cubic"] = air_quality["station_london"] * 1.882
>>> air_quality.head()
                     station_antwerp  station_paris  station_london  ratio_paris_antwerp  london_mg_per_cubic
datetime                                                                                                     
2019-05-07 02:00:00              NaN            NaN            23.0                  NaN               43.286
2019-05-07 03:00:00             50.5           25.0            19.0             0.495050               35.758
2019-05-07 04:00:00             45.0           27.7            19.0             0.615556               35.758
2019-05-07 05:00:00              NaN           50.4            16.0                  NaN               30.112
2019-05-07 06:00:00              NaN           61.9             NaN                  NaN                  NaN
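
Assigning with square brackets modifies air_quality in place. If the original frame should stay untouched, one alternative (not used in this post) is DataFrame.assign, which returns a new frame with the extra column. A minimal sketch on a made-up frame, reusing the factor 1.882 from the step above:

```python
import pandas as pd

# Hypothetical sample with the London values from the first rows above.
sample = pd.DataFrame({"station_london": [23.0, 19.0, 16.0]})

# assign returns a copy with the new column; sample itself is unchanged.
converted = sample.assign(
    london_mg_per_cubic=sample["station_london"] * 1.882
)

print(converted)
```
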

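Arithmetic operators (+, -, *, /) and comparison operators (<, >, ==, …) are applied element-wise to DataFrame columns. The boolean result of a comparison can also be used as a conditional expression to filter the rows of the table.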
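
As a small illustration of filtering rows with a conditional expression, the sketch below builds a boolean mask with > and uses it to select rows. The frame sample and the threshold 30 are assumptions chosen for the example, not part of the original walkthrough.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with the Paris values from the first rows above.
sample = pd.DataFrame({"station_paris": [np.nan, 25.0, 27.7, 50.4, 61.9]})

# The comparison is evaluated per element; comparing NaN gives False.
mask = sample["station_paris"] > 30

# Indexing with the boolean mask keeps only the rows where it is True.
high = sample[mask]

print(high)
```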

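6. Add a new column directly by assigning a single value (here the string "NaN", which is copied to every row)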
>>> air_quality["last_colum"] = "NaN"
>>> air_quality.head()
                     station_antwerp  station_paris  station_london  ratio_paris_antwerp  london_mg_per_cubic last_colum
datetime                                                                                                                
2019-05-07 02:00:00              NaN            NaN            23.0                  NaN               43.286        NaN
2019-05-07 03:00:00             50.5           25.0            19.0             0.495050               35.758        NaN
2019-05-07 04:00:00             45.0           27.7            19.0             0.615556               35.758        NaN
2019-05-07 05:00:00              NaN           50.4            16.0                  NaN               30.112        NaN
2019-05-07 06:00:00              NaN           61.9             NaN                  NaN                  NaN        NaN
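
Note that the column created above holds the text "NaN", which prints the same way as a missing value but is not treated as one. The sketch below contrasts the two on a made-up frame. The column names as_text and as_missing are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row sample.
sample = pd.DataFrame({"station_london": [23.0, 19.0]})

sample["as_text"] = "NaN"      # the string "NaN", copied to every row
sample["as_missing"] = np.nan  # a real missing value (float NaN)

# isna() only counts real missing values, not the text "NaN".
text_missing = sample["as_text"].isna().sum()
real_missing = sample["as_missing"].isna().sum()

print(sample.dtypes)
print(text_missing, real_missing)
```

If the intent is an empty numeric column to fill in later, np.nan is usually the better choice, since functions such as isna() and dropna() will then recognize the values as missing.
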
7. Rename specific columns
>>> air_quality_renamed = air_quality.rename(
...     columns={"station_antwerp": "BETR801",
...              "station_paris": "FR04014",
...              "station_london": "London Westminster"})
>>> air_quality_renamed.head()
                     BETR801  FR04014  London Westminster  ratio_paris_antwerp  london_mg_per_cubic
datetime                                                                                           
2019-05-07 02:00:00      NaN      NaN                23.0                  NaN               43.286
2019-05-07 03:00:00     50.5     25.0                19.0             0.495050               35.758
2019-05-07 04:00:00     45.0     27.7                19.0             0.615556               35.758
2019-05-07 05:00:00      NaN     50.4                16.0                  NaN               30.112
2019-05-07 06:00:00      NaN     61.9                 NaN                  NaN                  NaN
8. Use a function to convert the column names to lowercase
>>> air_quality_renamed = air_quality_renamed.rename(columns=str.lower)
>>> air_quality_renamed.head()
                     betr801  fr04014  london westminster  ratio_paris_antwerp  london_mg_per_cubic
datetime                                                                                           
2019-05-07 02:00:00      NaN      NaN                23.0                  NaN               43.286
2019-05-07 03:00:00     50.5     25.0                19.0             0.495050               35.758
2019-05-07 04:00:00     45.0     27.7                19.0             0.615556               35.758
2019-05-07 05:00:00      NaN     50.4                16.0                  NaN               30.112
2019-05-07 06:00:00      NaN     61.9                 NaN                  NaN                  NaN

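rename() works on both row labels and column labels. Pass a dictionary whose keys are the current names and whose values are the new names to update the corresponding labels. A function can be passed instead of a dictionary, in which case it is applied to every label.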
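
The steps above only rename columns. As a minimal sketch of the row-label case, the example below renames one index label with a dictionary and lowercases the column names with a function in the same call. The frame sample and its labels are invented for the example.

```python
import pandas as pd

# Hypothetical sample with string row labels.
sample = pd.DataFrame(
    {"Station_Paris": [25.0, 27.7]},
    index=["Row_A", "Row_B"],
)

# A dict maps old labels to new ones; labels not in the dict are kept.
# A function (here str.lower) is applied to every label on that axis.
renamed = sample.rename(index={"Row_A": "first"}, columns=str.lower)

print(renamed)
```

rename() returns a new DataFrame by default, so sample keeps its original labels unless the result is assigned back.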
