Pandas Study Notes: How to Handle SettingWithCopyWarning in Pandas

What is SettingWithCopyWarning?

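To handle this warning, you first need to understand what it means and why it appears.
When you filter a DataFrame, a slicing or indexing operation may return either a view or a copy, depending on pandas' internal implementation details. A view, as the name suggests, is a window onto the original data, so modifying a view may also modify the original. A copy, on the other hand, is an independent duplicate of the original data, so modifying it has no effect on the original. SettingWithCopyWarning appears when pandas cannot tell which of the two an assignment is writing to, which means the assignment may silently fail to update the data you intended.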
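The view/copy distinction can be seen most directly in NumPy, which pandas builds on. The following is a minimal sketch rather than a description of pandas internals; the array and the variable names are made up for illustration.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

# Basic slicing returns a view: it shares memory with arr.
view = arr[1:3]
view[0] = -1          # this write goes through to arr
print(arr)

# Indexing with a list (fancy indexing) returns an independent copy.
picked = arr[[1, 2]]
picked[0] = 99        # arr is not affected by this write
print(arr)

print(np.shares_memory(arr, view))    # does the view share memory with arr?
print(np.shares_memory(arr, picked))  # does the copy share memory with arr?
```

Whether a given pandas operation behaves like the first case or the second is not always obvious from the syntax, which is exactly the ambiguity the warning points at.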
Case 1:

import pandas as pd

df = pd.DataFrame({'A': 'aaa bbb ccc ddd eee aaa bbb ccc'.split(),
    'B': 'one one one two two two two two'.split(),
    'C': [2,35,5,6,8,56,44,72], 'D': [23,36,55,78,81,65,57,99]})
df
     A    B   C   D
0  aaa  one   2  23
1  bbb  one  35  36
2  ccc  one   5  55
3  ddd  two   6  78
4  eee  two   8  81
5  aaa  two  56  65
6  bbb  two  44  57
7  ccc  two  72  99
df[df['A'] == 'aaa']['B'] = 'three'
:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

Solution: use .loc to select the rows and the column in a single indexing operation:

df.loc[df['A'] == 'aaa','B'] = 'three'
df
     A      B   C   D
0  aaa  three   2  23
1  bbb    one  35  36
2  ccc    one   5  55
3  ddd    two   6  78
4  eee    two   8  81
5  aaa  three  56  65
6  bbb    two  44  57
7  ccc    two  72  99
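To make the difference concrete, here is a small self-contained sketch comparing the chained form with the .loc form. The three-row frame is a made-up example rather than the data above, and the warning is silenced only so that the demo runs quietly.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"A": ["aaa", "bbb", "aaa"],
                   "B": ["one", "one", "two"]})
mask = df["A"] == "aaa"

# Chained indexing: df[mask] builds a temporary object first, and the
# assignment is applied to that temporary, not to df itself.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the warning for this demo only
    df[mask]["B"] = "three"
after_chained = df["B"].tolist()
print(after_chained)

# A single .loc call selects rows and column at once and writes into df.
df.loc[mask, "B"] = "three"
after_loc = df["B"].tolist()
print(after_loc)
```

With a boolean mask the chained form leaves df untouched, so the warning here signals a real bug, not just a style issue.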


Case 2:

df1 = df[df['B'].str.contains('w')]
df1.loc[df1['A']=='bbb','C'] = 111
D:\pycharm\lib\site-packages\pandas\core\indexing.py:543: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s

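The cause is that df1 was produced by filtering df for the rows whose 'B' column contains the character 'w'. pandas tracks df1 as derived from df and cannot guarantee whether it is a view or a copy, so assigning into it triggers the warning even though .loc is used. Explicitly requesting a copy with .copy() resolves it: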

df1 = df[df['B'].str.contains('w')].copy()
df1.loc[df1['A']=='bbb','C'] = 111
df1
     A    B    C   D
3  ddd  two    6  78
4  eee  two    8  81
6  bbb  two  111  57
7  ccc  two   72  99
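The effect of .copy() can be checked with a short sketch that records any warnings raised during the assignment. The small frame below is a hypothetical stand-in for df, and matching the warning by its class name is just a convenience for this demo.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"A": ["bbb", "ccc", "bbb"],
                   "B": ["one", "two", "two"],
                   "C": [35, 72, 44]})

# Take an explicit, independent copy of the filtered rows.
df1 = df[df["B"].str.contains("w")].copy()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning that is raised
    df1.loc[df1["A"] == "bbb", "C"] = 111

# Keep only warnings whose class is named SettingWithCopyWarning.
scw = [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]
print(len(scw))           # number of SettingWithCopyWarning instances recorded
print(df1["C"].tolist())  # values in the filtered copy after the assignment
print(df["C"].tolist())   # values in the original frame
```

Because df1 is an independent copy, the assignment changes df1 only. If the goal is to update df itself, apply .loc to df directly as in Case 1.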

 
