I hit an error while grouping a dataset with pandas groupby. The call was:
data.groupby('性别')
It failed with:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In [11], line 1
----> 1 data.groupby('性别')
File d:\Anaconda\envs\PyTorch\lib\site-packages\pandas\core\frame.py:7721, in DataFrame.groupby(self, by, axis, level, as_index, sort, group_keys, squeeze, observed, dropna)
7716 axis = self._get_axis_number(axis)
7718 # https://github.com/python/mypy/issues/7642
7719 # error: Argument "squeeze" to "DataFrameGroupBy" has incompatible type
7720 # "Union[bool, NoDefault]"; expected "bool"
-> 7721 return DataFrameGroupBy(
7722 obj=self,
7723 keys=by,
7724 axis=axis,
7725 level=level,
7726 as_index=as_index,
7727 sort=sort,
7728 group_keys=group_keys,
7729 squeeze=squeeze, # type: ignore[arg-type]
7730 observed=observed,
7731 dropna=dropna,
7732 )
File d:\Anaconda\envs\PyTorch\lib\site-packages\pandas\core\groupby\groupby.py:882, in GroupBy.__init__(self, obj, keys, axis, level, grouper, exclusions, selection, as_index, sort, group_keys, squeeze, observed, mutated, dropna)
879 if grouper is None:
...
--> 877 raise ValueError(f"Grouper for '{name}' not 1-dimensional")
878 exclusions.add(name)
879 elif obj._is_level_reference(gpr, axis=axis):
ValueError: Grouper for '性别' not 1-dimensional
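There are two situations that raise this error: the column names are duplicated, or the column index is not one-dimensional.
For example, with duplicated column names: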
import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(6).reshape(3,2),columns=['bar','bar'])
df
bar bar
0 0 1
1 2 3
2 4 5
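Why duplicated names trip up groupby can be seen directly: selecting a duplicated name returns every matching column, so the grouping key is a DataFrame rather than a Series. The following is a minimal sketch of that behaviour, rebuilding the same frame; the variable names `selected` and `error` are just for illustration.

```python
import numpy as np
import pandas as pd

# Same frame as above: both columns are named 'bar'.
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=['bar', 'bar'])

# Selecting a duplicated name returns all matching columns, i.e. a DataFrame.
selected = df['bar']
print(type(selected).__name__, selected.ndim)

# groupby needs a 1-dimensional key, so it rejects the 2-dimensional selection.
try:
    df.groupby('bar')
    error = None
except ValueError as exc:
    error = exc
print(error)
```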
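This case is easy to fix: rename the columns so that every name is unique. How the new names are assigned matters, though. Here they are passed as a nested list: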
df.columns = [['A','B']]
df
A B
0 0 1
1 2 3
2 4 5
Calling groupby on this DataFrame also raises the error:
df.groupby('A')
ValueError: Grouper for 'A' not 1-dimensional
Inspecting df.columns shows why:
df.columns
MultiIndex(levels=[['A', 'B']],labels=[[0, 1]])
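The columns are a MultiIndex, and labels=[[0, 1]] is two-dimensional (a list of lists). Reassigning the columns from a flat, one-dimensional list fixes it: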
df.columns = ['A','B']
df.groupby('A')
<pandas.core.groupby.groupby.DataFrameGroupBy object at 0x000002B3A74F2828>
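Both causes can be checked before calling groupby. The sketch below uses a hypothetical helper, `describe_columns`, which is not part of pandas; it relies only on `Index.is_unique`, an `isinstance` check against `pd.MultiIndex`, and `MultiIndex.get_level_values` to flatten a single-level MultiIndex. It assumes the MultiIndex has one level, as in the example above.

```python
import numpy as np
import pandas as pd

def describe_columns(frame):
    """Hypothetical helper: list column problems that can break groupby."""
    problems = []
    if isinstance(frame.columns, pd.MultiIndex):
        problems.append("columns are a MultiIndex")
    if not frame.columns.is_unique:
        problems.append("column names are duplicated")
    return problems

# Case 1: duplicated names.
dup = pd.DataFrame(np.arange(6).reshape(3, 2), columns=['bar', 'bar'])

# Case 2: a nested list turns the columns into a MultiIndex.
nested = pd.DataFrame(np.arange(6).reshape(3, 2))
nested.columns = [['A', 'B']]

# Flatten the single level back into an ordinary Index.
flat = nested.copy()
flat.columns = nested.columns.get_level_values(0)

print(describe_columns(dup))
print(describe_columns(nested))
print(describe_columns(flat))

# With flat, unique names the grouping works.
sums = flat.groupby('A')['B'].sum()
print(sums)
```

Taking level 0 only makes sense when the MultiIndex has a single level; with several levels, the names would need to be joined or selected deliberately.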
So for our data, we change the column names to a one-dimensional list:
lst2 = ['姓名', '性别', '班级', '身高']
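The flat list then has to be assigned back to the DataFrame's columns. The original dataset is not shown, so the sketch below uses made-up rows and assumes the columns had been set from a nested list, as in the second case above; only the four column names come from the post.

```python
import pandas as pd

# Hypothetical rows standing in for the real dataset.
data = pd.DataFrame([
    ['张三', '男', '一班', 175],
    ['李四', '女', '一班', 162],
    ['王五', '男', '二班', 181],
    ['赵六', '女', '二班', 168],
])

# Assumed cause: the names were assigned as a nested list (MultiIndex).
data.columns = [['姓名', '性别', '班级', '身高']]
was_multiindex = isinstance(data.columns, pd.MultiIndex)

# Fix: assign a flat, one-dimensional list of names.
lst2 = ['姓名', '性别', '班级', '身高']
data.columns = lst2

# groupby on '性别' now works, e.g. mean height per group.
mean_height = data.groupby('性别')['身高'].mean()
print(mean_height)
```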