Iterate over DataFrame rows as (index, Series) pairs.
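iterrows walks over the rows of a DataFrame and yields one (index, Series) pair per row, so the whole DataFrame can be traversed row by row.
Example from the official documentation: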
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])
>>> row = next(df.iterrows())[1]
>>> row
int      1.0
float    1.5
Name: 0, dtype: float64
>>> print(row['int'].dtype)
float64
>>> print(df['int'].dtype)
int64
Following the official documentation, I tested the code below:
>>> df = pd.DataFrame([[1, 2.1, 3], [4, 5, 6], [7, 8, 9]],
...                   index=['1', '2', '3'], columns=['A', 'B', 'C'])
>>> df
   A    B  C
1  1  2.1  3
2  4  5.0  6
3  7  8.0  9
>>> a = df.iterrows()
>>> next(a)
('1', A    1.0
B    2.1
C    3.0
Name: 1, dtype: float64)
>>> next(a)[1]
A    4.0
B    5.0
C    6.0
Name: 2, dtype: float64
>>> next(a)[1]
A    7.0
B    8.0
C    9.0
Name: 3, dtype: float64
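iterrows produces an iterator of (index, Series) pairs. The first call to next(a) returns the first row, with index = '1' and Series =
A    1.0
B    2.1
C    3.0
Because the 2.1 makes column B float64, every row mixes int64 and float64 values, so each row Series is upcast to float64 as a whole.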
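The upcast can be checked directly by comparing the dtypes of the columns with the dtypes of the rows that iterrows hands back. This is only a minimal sketch that reuses the test DataFrame; the names column_dtypes and row_dtypes are my own, chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2.1, 3], [4, 5, 6], [7, 8, 9]],
                  index=['1', '2', '3'], columns=['A', 'B', 'C'])

# dtypes of the DataFrame's own columns: A and C hold integers, B holds floats.
column_dtypes = {name: str(dtype) for name, dtype in df.dtypes.items()}

# Each row comes back as a single Series, so all its values share one dtype.
row_dtypes = {index: str(row.dtype) for index, row in df.iterrows()}

print(column_dtypes)
print(row_dtypes)
```

If column B held only integers, every row would be all-int64 and no upcast would be needed.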
Iterator over (column name, Series) pairs.
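Like iterrows, iteritems traverses the whole DataFrame, but column by column: it produces an iterator of (column name, Series) pairs. Note that iteritems was deprecated in pandas 1.5 and removed in pandas 2.0; on current versions DataFrame.items() does the same job.
Test code: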
>>> df = pd.DataFrame([[1, 2.1, 3], [4, 5, 6], [7, 8, 9]],
...                   index=['1', '2', '3'], columns=['A', 'B', 'C'])
>>> df
   A    B  C
1  1  2.1  3
2  4  5.0  6
3  7  8.0  9
>>> b = df.iteritems()
>>> next(b)
('A', 1    1
2    4
3    7
Name: A, dtype: int64)
>>> next(b)[1]
1    2.1
2    5.0
3    8.0
Name: B, dtype: float64
>>> next(b)[1]
1    3
2    6
3    9
Name: C, dtype: int64
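The official documentation notes: “Because iterrows returns a Series for each row, it does not preserve dtypes across the rows (dtypes are preserved across columns for DataFrames).”
In other words, values are coerced to one common dtype along each row, while the dtype of each column is left unchanged.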
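To see the difference on a single cell, the same value can be read once through a row and once through its column. A small sketch under the assumption that DataFrame.items() is available (it stands in for iteritems on pandas 2.0 and later); the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2.1, 3], [4, 5, 6], [7, 8, 9]],
                  index=['1', '2', '3'], columns=['A', 'B', 'C'])

# items() yields (column name, Series); each Series keeps its column's dtype.
per_column = {name: str(col.dtype) for name, col in df.items()}

# The top-left cell, read through the first row and through column A.
first_row = next(df.iterrows())[1]
value_from_row = first_row['A']        # taken from a float64 row Series
value_from_column = df['A'].iloc[0]    # taken from the int64 column

print(per_column)
print(type(value_from_row).__name__, type(value_from_column).__name__)
```

The two reads compare equal as numbers; only the type differs, which matters when a later step expects integers.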
Iterate over DataFrame rows as namedtuples, with index value as first element of the tuple.
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]},
...                   index=['a', 'b'])
>>> df
   col1  col2
a     1   0.1
b     2   0.2
>>> for row in df.itertuples():
...     print(row)
...
Pandas(Index='a', col1=1, col2=0.10000000000000001)
Pandas(Index='b', col1=2, col2=0.20000000000000001)
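Since itertuples builds each row from the individual column values rather than from one Series, col1 should stay an integer here instead of being upcast. The sketch below tries that out, along with the index and name parameters of itertuples; the variable names are mine and only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.1, 0.2]}, index=['a', 'b'])

# Default call: namedtuples called "Pandas", index first, fields readable by name.
rows = list(df.itertuples())
first = rows[0]
print(first.Index, first.col1, first.col2)

# Check whether col1 was turned into a float, as iterrows would have done.
col1_is_float = isinstance(first.col1, float)

# index=False drops the index field; name=None yields plain tuples.
plain = list(df.itertuples(index=False, name=None))

print(col1_is_float)
print(plain)
```

Attribute access such as first.col1 only works when the column names are valid Python identifiers; otherwise the fields are renamed positionally.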