Tags (space-separated): python data-analysis
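While processing data recently, I needed to filter a DataFrame. Consider the following example (house price data; the GrLivArea and SalePrice columns come from the Ames housing dataset):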
I needed to drop some outliers (rows where GrLivArea is greater than 4000 and SalePrice is below 300000).
My original attempt:
data_df[(data_df['GrLivArea'] > 4000) and (data_df['SalePrice'] < 300000)]
Written this way, it raises an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
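The error comes from how `and` works: Python calls `bool()` on the left operand, and pandas refuses to collapse a whole Series of True/False values into a single truth value. A minimal sketch that reproduces this, assuming a small made-up DataFrame (only the column names come from the example above; the values are illustrative):

```python
import pandas as pd

# Hypothetical toy data; the values are invented for illustration.
data_df = pd.DataFrame({
    "GrLivArea": [1500, 4500, 5000, 4200],
    "SalePrice": [200000, 180000, 700000, 250000],
})

try:
    # `and` asks for the truth value of the left-hand Series,
    # which pandas rejects with a ValueError.
    data_df[(data_df["GrLivArea"] > 4000) and (data_df["SalePrice"] < 300000)]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```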
The correct way is to use the element-wise & operator instead of and:
data_df[(data_df.GrLivArea > 4000) & (data_df.SalePrice < 300000)]
Or, equivalently, with bracket-style column access:
data_df[(data_df['GrLivArea'] > 4000) & (data_df['SalePrice'] < 300000)]
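Note: each condition must be wrapped in parentheses, because & binds more tightly than comparison operators such as > and <.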
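Note that the `&` expression selects the rows that match both conditions, that is, the outliers themselves. To actually drop them, the mask can be negated with `~`. A minimal sketch, assuming a small made-up DataFrame (the values, and the names `outlier_mask`, `outliers`, and `cleaned_df`, are introduced here for illustration only):

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the example.
data_df = pd.DataFrame({
    "GrLivArea": [1500, 4500, 5000, 4200],
    "SalePrice": [200000, 180000, 700000, 250000],
})

# Element-wise AND; each comparison is parenthesized because `&`
# has higher precedence than `>` and `<`.
outlier_mask = (data_df["GrLivArea"] > 4000) & (data_df["SalePrice"] < 300000)

outliers = data_df[outlier_mask]      # rows matching both conditions
cleaned_df = data_df[~outlier_mask]   # all remaining rows

print(outliers.index.tolist())
print(cleaned_df.index.tolist())
```

Building the mask once and reusing it keeps the selection and its negation consistent with each other.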
Reference:
https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o