Pandas 11: Duplicate Data in Excel

Sample Excel data

Requirements:

1. Using Name as the key, remove the duplicate rows
2. Using Name as the key, output the duplicate rows

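import pandas as pd

# Load the workbook (path as used in this example)
employee = pd.read_excel("D:\\python_pandas\\sample\\demo13\\Students_Duplicates.xlsx")

# 1. Drop rows whose Name has already appeared; keep the first occurrence and renumber the index
table = employee.drop_duplicates(subset="Name", ignore_index=True)
print(table.head(30))
print("*" * 40)

# 2. Boolean mask: True for each repeat of a Name after its first occurrence
condition = employee.duplicated(subset="Name")
print(employee[condition])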
Output:
    ID         Name  Test_1  Test_2  Test_3
0    1  Student_001      62      86      83
1    2  Student_002      77      97      78
2    3  Student_003      57      96      46
3    4  Student_004      57      87      80
4    5  Student_005      95      59      87
5    6  Student_006      56      97      61
6    7  Student_007      64      91      67
7    8  Student_008      96      70      48
8    9  Student_009      77      73      48
9   10  Student_010      90      94      67
10  11  Student_011      62      55      63
11  12  Student_012      83      76      81
12  13  Student_013      68      60      90
13  14  Student_014      82      68      98
14  15  Student_015      61      67      91
15  16  Student_016      59      63      46
16  17  Student_017      62      83      93
17  18  Student_018      90      75      80
18  19  Student_019     100      95      55
19  20  Student_020      61      87     100
****************************************
    ID         Name  Test_1  Test_2  Test_3
20  21  Student_001      62      86      83
21  22  Student_002      77      97      78
22  23  Student_003      57      96      46
23  24  Student_004      57      87      80
24  25  Student_005      95      59      87
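
Both `drop_duplicates` and `duplicated` accept a `keep` parameter, which defaults to `"first"`. That default is why the second table above lists only rows 20 to 24 and not the original rows 0 to 4. The sketch below shows the other settings on a small in-memory DataFrame. The data here is hypothetical and only stands in for the Excel file, and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical data standing in for Students_Duplicates.xlsx
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Name": ["Student_001", "Student_002", "Student_003",
             "Student_001", "Student_002"],
    "Test_1": [62, 77, 57, 62, 77],
})

# keep="first" (the default): only the later repeats are flagged
later_repeats = df[df.duplicated(subset="Name")]

# keep=False: every row whose Name occurs more than once is flagged,
# including the first occurrence
all_duplicates = df[df.duplicated(subset="Name", keep=False)]

# keep="last": drop earlier occurrences and keep the last one
last_kept = df.drop_duplicates(subset="Name", keep="last", ignore_index=True)

print(later_repeats)
print(all_duplicates)
print(last_kept)
```

`keep=False` is the one to reach for when the duplicated rows need to be reviewed side by side before deciding which copy to delete.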

Recommended video link
Official documentation
