Big Data Analysis and Practice: A Simple Data Analysis of the UCI Heart Disease Dataset in Python

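Contents:

  • Model introduction
  • Tasks:
    • Preprocessing the data
    • 1. Compute the mean, median, and mode of the heart disease patients' ages, and use the results to analyze the relationship between age and heart disease
    • 2. The normal range for cholesterol is 0-200 mg/dL. Separate the people with normal cholesterol from those with abnormal cholesterol, and use percentiles to analyze the relationship between age and cholesterol (which age range has more abnormal cholesterol; compare the two groups)
    • 3. Compute the range and interquartile range of the heart disease patients' cholesterol, and explain what the results indicate
    • 4. Does the heart disease patients' cholesterol follow a normal distribution?
    • 5. Use correlation coefficients or the chi-square test to measure how the 12 attributes relate to having heart disease, and analyze which factors matter most for a diagnosis.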

Model introduction

  1. Dataset:
63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,6.0,0
67.0,1.0,4.0,160.0,286.0,0.0,2.0,108.0,1.0,1.5,2.0,3.0,3.0,2
67.0,1.0,4.0,120.0,229.0,0.0,2.0,129.0,1.0,2.6,2.0,2.0,7.0,1
37.0,1.0,3.0,130.0,250.0,0.0,0.0,187.0,0.0,3.5,3.0,0.0,3.0,0
41.0,0.0,2.0,130.0,204.0,0.0,2.0,172.0,0.0,1.4,1.0,0.0,3.0,0
56.0,1.0,2.0,120.0,236.0,0.0,0.0,178.0,0.0,0.8,1.0,0.0,3.0,0
62.0,0.0,4.0,140.0,268.0,0.0,2.0,160.0,0.0,3.6,3.0,2.0,3.0,3
57.0,0.0,4.0,120.0,354.0,0.0,0.0,163.0,1.0,0.6,1.0,0.0,3.0,0
63.0,1.0,4.0,130.0,254.0,0.0,2.0,147.0,0.0,1.4,2.0,1.0,7.0,2
53.0,1.0,4.0,140.0,203.0,1.0,2.0,155.0,1.0,3.1,3.0,0.0,7.0,1
57.0,1.0,4.0,140.0,192.0,0.0,0.0,148.0,0.0,0.4,2.0,0.0,6.0,0
56.0,0.0,2.0,140.0,294.0,0.0,2.0,153.0,0.0,1.3,2.0,0.0,3.0,0
56.0,1.0,3.0,130.0,256.0,1.0,2.0,142.0,1.0,0.6,2.0,1.0,6.0,2
44.0,1.0,2.0,120.0,263.0,0.0,0.0,173.0,0.0,0.0,1.0,0.0,7.0,0
52.0,1.0,3.0,172.0,199.0,1.0,0.0,162.0,0.0,0.5,1.0,0.0,7.0,0
57.0,1.0,3.0,150.0,168.0,0.0,0.0,174.0,0.0,1.6,1.0,0.0,3.0,0
48.0,1.0,2.0,110.0,229.0,0.0,0.0,168.0,0.0,1.0,3.0,0.0,7.0,1
54.0,1.0,4.0,140.0,239.0,0.0,0.0,160.0,0.0,1.2,1.0,0.0,3.0,0
48.0,0.0,3.0,130.0,275.0,0.0,0.0,139.0,0.0,0.2,1.0,0.0,3.0,0
49.0,1.0,2.0,130.0,266.0,0.0,0.0,171.0,0.0,0.6,1.0,0.0,3.0,0
64.0,1.0,1.0,110.0,211.0,0.0,2.0,144.0,1.0,1.8,2.0,0.0,3.0,0
58.0,0.0,1.0,150.0,283.0,1.0,2.0,162.0,0.0,1.0,1.0,0.0,3.0,0
58.0,1.0,2.0,120.0,284.0,0.0,2.0,160.0,0.0,1.8,2.0,0.0,3.0,1
58.0,1.0,3.0,132.0,224.0,0.0,2.0,173.0,0.0,3.2,1.0,2.0,7.0,3
60.0,1.0,4.0,130.0,206.0,0.0,2.0,132.0,1.0,2.4,2.0,2.0,7.0,4
50.0,0.0,3.0,120.0,219.0,0.0,0.0,158.0,0.0,1.6,2.0,0.0,3.0,0
58.0,0.0,3.0,120.0,340.0,0.0,0.0,172.0,0.0,0.0,1.0,0.0,3.0,0
66.0,0.0,1.0,150.0,226.0,0.0,0.0,114.0,0.0,2.6,3.0,0.0,3.0,0
43.0,1.0,4.0,150.0,247.0,0.0,0.0,171.0,0.0,1.5,1.0,0.0,3.0,0
40.0,1.0,4.0,110.0,167.0,0.0,2.0,114.0,1.0,2.0,2.0,0.0,7.0,3
69.0,0.0,1.0,140.0,239.0,0.0,0.0,151.0,0.0,1.8,1.0,2.0,3.0,0
60.0,1.0,4.0,117.0,230.0,1.0,0.0,160.0,1.0,1.4,1.0,2.0,7.0,2
64.0,1.0,3.0,140.0,335.0,0.0,0.0,158.0,0.0,0.0,1.0,0.0,3.0,1
59.0,1.0,4.0,135.0,234.0,0.0,0.0,161.0,0.0,0.5,2.0,0.0,7.0,0
44.0,1.0,3.0,130.0,233.0,0.0,0.0,179.0,1.0,0.4,1.0,0.0,3.0,0
42.0,1.0,4.0,140.0,226.0,0.0,0.0,178.0,0.0,0.0,1.0,0.0,3.0,0
43.0,1.0,4.0,120.0,177.0,0.0,2.0,120.0,1.0,2.5,2.0,0.0,7.0,3
57.0,1.0,4.0,150.0,276.0,0.0,2.0,112.0,1.0,0.6,2.0,1.0,6.0,1
55.0,1.0,4.0,132.0,353.0,0.0,0.0,132.0,1.0,1.2,2.0,1.0,7.0,3
61.0,1.0,3.0,150.0,243.0,1.0,0.0,137.0,1.0,1.0,2.0,0.0,3.0,0
65.0,0.0,4.0,150.0,225.0,0.0,2.0,114.0,0.0,1.0,2.0,3.0,7.0,4
40.0,1.0,1.0,140.0,199.0,0.0,0.0,178.0,1.0,1.4,1.0,0.0,7.0,0
71.0,0.0,2.0,160.0,302.0,0.0,0.0,162.0,0.0,0.4,1.0,2.0,3.0,0
59.0,1.0,3.0,150.0,212.0,1.0,0.0,157.0,0.0,1.6,1.0,0.0,3.0,0
61.0,0.0,4.0,130.0,330.0,0.0,2.0,169.0,0.0,0.0,1.0,0.0,3.0,1
58.0,1.0,3.0,112.0,230.0,0.0,2.0,165.0,0.0,2.5,2.0,1.0,7.0,4
51.0,1.0,3.0,110.0,175.0,0.0,0.0,123.0,0.0,0.6,1.0,0.0,3.0,0
50.0,1.0,4.0,150.0,243.0,0.0,2.0,128.0,0.0,2.6,2.0,0.0,7.0,4
65.0,0.0,3.0,140.0,417.0,1.0,2.0,157.0,0.0,0.8,1.0,1.0,3.0,0
53.0,1.0,3.0,130.0,197.0,1.0,2.0,152.0,0.0,1.2,3.0,0.0,3.0,0
41.0,0.0,2.0,105.0,198.0,0.0,0.0,168.0,0.0,0.0,1.0,1.0,3.0,0
65.0,1.0,4.0,120.0,177.0,0.0,0.0,140.0,0.0,0.4,1.0,0.0,7.0,0
44.0,1.0,4.0,112.0,290.0,0.0,2.0,153.0,0.0,0.0,1.0,1.0,3.0,2
44.0,1.0,2.0,130.0,219.0,0.0,2.0,188.0,0.0,0.0,1.0,0.0,3.0,0
60.0,1.0,4.0,130.0,253.0,0.0,0.0,144.0,1.0,1.4,1.0,1.0,7.0,1
54.0,1.0,4.0,124.0,266.0,0.0,2.0,109.0,1.0,2.2,2.0,1.0,7.0,1
50.0,1.0,3.0,140.0,233.0,0.0,0.0,163.0,0.0,0.6,2.0,1.0,7.0,1
41.0,1.0,4.0,110.0,172.0,0.0,2.0,158.0,0.0,0.0,1.0,0.0,7.0,1
54.0,1.0,3.0,125.0,273.0,0.0,2.0,152.0,0.0,0.5,3.0,1.0,3.0,0
51.0,1.0,1.0,125.0,213.0,0.0,2.0,125.0,1.0,1.4,1.0,1.0,3.0,0
51.0,0.0,4.0,130.0,305.0,0.0,0.0,142.0,1.0,1.2,2.0,0.0,7.0,2
46.0,0.0,3.0,142.0,177.0,0.0,2.0,160.0,1.0,1.4,3.0,0.0,3.0,0
58.0,1.0,4.0,128.0,216.0,0.0,2.0,131.0,1.0,2.2,2.0,3.0,7.0,1
54.0,0.0,3.0,135.0,304.0,1.0,0.0,170.0,0.0,0.0,1.0,0.0,3.0,0
54.0,1.0,4.0,120.0,188.0,0.0,0.0,113.0,0.0,1.4,2.0,1.0,7.0,2
60.0,1.0,4.0,145.0,282.0,0.0,2.0,142.0,1.0,2.8,2.0,2.0,7.0,2
60.0,1.0,3.0,140.0,185.0,0.0,2.0,155.0,0.0,3.0,2.0,0.0,3.0,1
54.0,1.0,3.0,150.0,232.0,0.0,2.0,165.0,0.0,1.6,1.0,0.0,7.0,0
59.0,1.0,4.0,170.0,326.0,0.0,2.0,140.0,1.0,3.4,3.0,0.0,7.0,2
46.0,1.0,3.0,150.0,231.0,0.0,0.0,147.0,0.0,3.6,2.0,0.0,3.0,1
65.0,0.0,3.0,155.0,269.0,0.0,0.0,148.0,0.0,0.8,1.0,0.0,3.0,0
67.0,1.0,4.0,125.0,254.0,1.0,0.0,163.0,0.0,0.2,2.0,2.0,7.0,3
62.0,1.0,4.0,120.0,267.0,0.0,0.0,99.0,1.0,1.8,2.0,2.0,7.0,1
65.0,1.0,4.0,110.0,248.0,0.0,2.0,158.0,0.0,0.6,1.0,2.0,6.0,1
44.0,1.0,4.0,110.0,197.0,0.0,2.0,177.0,0.0,0.0,1.0,1.0,3.0,1
65.0,0.0,3.0,160.0,360.0,0.0,2.0,151.0,0.0,0.8,1.0,0.0,3.0,0
60.0,1.0,4.0,125.0,258.0,0.0,2.0,141.0,1.0,2.8,2.0,1.0,7.0,1
51.0,0.0,3.0,140.0,308.0,0.0,2.0,142.0,0.0,1.5,1.0,1.0,3.0,0
48.0,1.0,2.0,130.0,245.0,0.0,2.0,180.0,0.0,0.2,2.0,0.0,3.0,0
58.0,1.0,4.0,150.0,270.0,0.0,2.0,111.0,1.0,0.8,1.0,0.0,7.0,3
45.0,1.0,4.0,104.0,208.0,0.0,2.0,148.0,1.0,3.0,2.0,0.0,3.0,0
53.0,0.0,4.0,130.0,264.0,0.0,2.0,143.0,0.0,0.4,2.0,0.0,3.0,0
39.0,1.0,3.0,140.0,321.0,0.0,2.0,182.0,0.0,0.0,1.0,0.0,3.0,0
68.0,1.0,3.0,180.0,274.0,1.0,2.0,150.0,1.0,1.6,2.0,0.0,7.0,3
52.0,1.0,2.0,120.0,325.0,0.0,0.0,172.0,0.0,0.2,1.0,0.0,3.0,0
44.0,1.0,3.0,140.0,235.0,0.0,2.0,180.0,0.0,0.0,1.0,0.0,3.0,0
47.0,1.0,3.0,138.0,257.0,0.0,2.0,156.0,0.0,0.0,1.0,0.0,3.0,0
53.0,0.0,3.0,128.0,216.0,0.0,2.0,115.0,0.0,0.0,1.0,0.0,?,0
53.0,0.0,4.0,138.0,234.0,0.0,2.0,160.0,0.0,0.0,1.0,0.0,3.0,0
51.0,0.0,3.0,130.0,256.0,0.0,2.0,149.0,0.0,0.5,1.0,0.0,3.0,0
66.0,1.0,4.0,120.0,302.0,0.0,2.0,151.0,0.0,0.4,2.0,0.0,3.0,0
62.0,0.0,4.0,160.0,164.0,0.0,2.0,145.0,0.0,6.2,3.0,3.0,7.0,3
62.0,1.0,3.0,130.0,231.0,0.0,0.0,146.0,0.0,1.8,2.0,3.0,7.0,0
44.0,0.0,3.0,108.0,141.0,0.0,0.0,175.0,0.0,0.6,2.0,0.0,3.0,0
63.0,0.0,3.0,135.0,252.0,0.0,2.0,172.0,0.0,0.0,1.0,0.0,3.0,0
52.0,1.0,4.0,128.0,255.0,0.0,0.0,161.0,1.0,0.0,1.0,1.0,7.0,1
59.0,1.0,4.0,110.0,239.0,0.0,2.0,142.0,1.0,1.2,2.0,1.0,7.0,2
60.0,0.0,4.0,150.0,258.0,0.0,2.0,157.0,0.0,2.6,2.0,2.0,7.0,3
52.0,1.0,2.0,134.0,201.0,0.0,0.0,158.0,0.0,0.8,1.0,1.0,3.0,0
48.0,1.0,4.0,122.0,222.0,0.0,2.0,186.0,0.0,0.0,1.0,0.0,3.0,0
45.0,1.0,4.0,115.0,260.0,0.0,2.0,185.0,0.0,0.0,1.0,0.0,3.0,0
34.0,1.0,1.0,118.0,182.0,0.0,2.0,174.0,0.0,0.0,1.0,0.0,3.0,0
57.0,0.0,4.0,128.0,303.0,0.0,2.0,159.0,0.0,0.0,1.0,1.0,3.0,0
71.0,0.0,3.0,110.0,265.0,1.0,2.0,130.0,0.0,0.0,1.0,1.0,3.0,0
49.0,1.0,3.0,120.0,188.0,0.0,0.0,139.0,0.0,2.0,2.0,3.0,7.0,3
54.0,1.0,2.0,108.0,309.0,0.0,0.0,156.0,0.0,0.0,1.0,0.0,7.0,0
59.0,1.0,4.0,140.0,177.0,0.0,0.0,162.0,1.0,0.0,1.0,1.0,7.0,2
57.0,1.0,3.0,128.0,229.0,0.0,2.0,150.0,0.0,0.4,2.0,1.0,7.0,1
61.0,1.0,4.0,120.0,260.0,0.0,0.0,140.0,1.0,3.6,2.0,1.0,7.0,2
39.0,1.0,4.0,118.0,219.0,0.0,0.0,140.0,0.0,1.2,2.0,0.0,7.0,3
61.0,0.0,4.0,145.0,307.0,0.0,2.0,146.0,1.0,1.0,2.0,0.0,7.0,1
56.0,1.0,4.0,125.0,249.0,1.0,2.0,144.0,1.0,1.2,2.0,1.0,3.0,1
52.0,1.0,1.0,118.0,186.0,0.0,2.0,190.0,0.0,0.0,2.0,0.0,6.0,0
43.0,0.0,4.0,132.0,341.0,1.0,2.0,136.0,1.0,3.0,2.0,0.0,7.0,2
62.0,0.0,3.0,130.0,263.0,0.0,0.0,97.0,0.0,1.2,2.0,1.0,7.0,2
41.0,1.0,2.0,135.0,203.0,0.0,0.0,132.0,0.0,0.0,2.0,0.0,6.0,0
58.0,1.0,3.0,140.0,211.0,1.0,2.0,165.0,0.0,0.0,1.0,0.0,3.0,0
35.0,0.0,4.0,138.0,183.0,0.0,0.0,182.0,0.0,1.4,1.0,0.0,3.0,0
63.0,1.0,4.0,130.0,330.0,1.0,2.0,132.0,1.0,1.8,1.0,3.0,7.0,3
65.0,1.0,4.0,135.0,254.0,0.0,2.0,127.0,0.0,2.8,2.0,1.0,7.0,2
48.0,1.0,4.0,130.0,256.0,1.0,2.0,150.0,1.0,0.0,1.0,2.0,7.0,3
63.0,0.0,4.0,150.0,407.0,0.0,2.0,154.0,0.0,4.0,2.0,3.0,7.0,4
51.0,1.0,3.0,100.0,222.0,0.0,0.0,143.0,1.0,1.2,2.0,0.0,3.0,0
55.0,1.0,4.0,140.0,217.0,0.0,0.0,111.0,1.0,5.6,3.0,0.0,7.0,3
65.0,1.0,1.0,138.0,282.0,1.0,2.0,174.0,0.0,1.4,2.0,1.0,3.0,1
45.0,0.0,2.0,130.0,234.0,0.0,2.0,175.0,0.0,0.6,2.0,0.0,3.0,0
56.0,0.0,4.0,200.0,288.0,1.0,2.0,133.0,1.0,4.0,3.0,2.0,7.0,3
54.0,1.0,4.0,110.0,239.0,0.0,0.0,126.0,1.0,2.8,2.0,1.0,7.0,3
44.0,1.0,2.0,120.0,220.0,0.0,0.0,170.0,0.0,0.0,1.0,0.0,3.0,0
62.0,0.0,4.0,124.0,209.0,0.0,0.0,163.0,0.0,0.0,1.0,0.0,3.0,0
54.0,1.0,3.0,120.0,258.0,0.0,2.0,147.0,0.0,0.4,2.0,0.0,7.0,0
51.0,1.0,3.0,94.0,227.0,0.0,0.0,154.0,1.0,0.0,1.0,1.0,7.0,0
29.0,1.0,2.0,130.0,204.0,0.0,2.0,202.0,0.0,0.0,1.0,0.0,3.0,0
51.0,1.0,4.0,140.0,261.0,0.0,2.0,186.0,1.0,0.0,1.0,0.0,3.0,0
43.0,0.0,3.0,122.0,213.0,0.0,0.0,165.0,0.0,0.2,2.0,0.0,3.0,0
55.0,0.0,2.0,135.0,250.0,0.0,2.0,161.0,0.0,1.4,2.0,0.0,3.0,0
70.0,1.0,4.0,145.0,174.0,0.0,0.0,125.0,1.0,2.6,3.0,0.0,7.0,4
62.0,1.0,2.0,120.0,281.0,0.0,2.0,103.0,0.0,1.4,2.0,1.0,7.0,3
35.0,1.0,4.0,120.0,198.0,0.0,0.0,130.0,1.0,1.6,2.0,0.0,7.0,1
51.0,1.0,3.0,125.0,245.0,1.0,2.0,166.0,0.0,2.4,2.0,0.0,3.0,0
59.0,1.0,2.0,140.0,221.0,0.0,0.0,164.0,1.0,0.0,1.0,0.0,3.0,0
59.0,1.0,1.0,170.0,288.0,0.0,2.0,159.0,0.0,0.2,2.0,0.0,7.0,1
52.0,1.0,2.0,128.0,205.0,1.0,0.0,184.0,0.0,0.0,1.0,0.0,3.0,0
64.0,1.0,3.0,125.0,309.0,0.0,0.0,131.0,1.0,1.8,2.0,0.0,7.0,1
58.0,1.0,3.0,105.0,240.0,0.0,2.0,154.0,1.0,0.6,2.0,0.0,7.0,0
47.0,1.0,3.0,108.0,243.0,0.0,0.0,152.0,0.0,0.0,1.0,0.0,3.0,1
57.0,1.0,4.0,165.0,289.0,1.0,2.0,124.0,0.0,1.0,2.0,3.0,7.0,4
41.0,1.0,3.0,112.0,250.0,0.0,0.0,179.0,0.0,0.0,1.0,0.0,3.0,0
45.0,1.0,2.0,128.0,308.0,0.0,2.0,170.0,0.0,0.0,1.0,0.0,3.0,0
60.0,0.0,3.0,102.0,318.0,0.0,0.0,160.0,0.0,0.0,1.0,1.0,3.0,0
52.0,1.0,1.0,152.0,298.0,1.0,0.0,178.0,0.0,1.2,2.0,0.0,7.0,0
42.0,0.0,4.0,102.0,265.0,0.0,2.0,122.0,0.0,0.6,2.0,0.0,3.0,0
67.0,0.0,3.0,115.0,564.0,0.0,2.0,160.0,0.0,1.6,2.0,0.0,7.0,0
55.0,1.0,4.0,160.0,289.0,0.0,2.0,145.0,1.0,0.8,2.0,1.0,7.0,4
64.0,1.0,4.0,120.0,246.0,0.0,2.0,96.0,1.0,2.2,3.0,1.0,3.0,3
70.0,1.0,4.0,130.0,322.0,0.0,2.0,109.0,0.0,2.4,2.0,3.0,3.0,1
51.0,1.0,4.0,140.0,299.0,0.0,0.0,173.0,1.0,1.6,1.0,0.0,7.0,1
58.0,1.0,4.0,125.0,300.0,0.0,2.0,171.0,0.0,0.0,1.0,2.0,7.0,1
60.0,1.0,4.0,140.0,293.0,0.0,2.0,170.0,0.0,1.2,2.0,2.0,7.0,2
68.0,1.0,3.0,118.0,277.0,0.0,0.0,151.0,0.0,1.0,1.0,1.0,7.0,0
46.0,1.0,2.0,101.0,197.0,1.0,0.0,156.0,0.0,0.0,1.0,0.0,7.0,0
77.0,1.0,4.0,125.0,304.0,0.0,2.0,162.0,1.0,0.0,1.0,3.0,3.0,4
54.0,0.0,3.0,110.0,214.0,0.0,0.0,158.0,0.0,1.6,2.0,0.0,3.0,0
58.0,0.0,4.0,100.0,248.0,0.0,2.0,122.0,0.0,1.0,2.0,0.0,3.0,0
48.0,1.0,3.0,124.0,255.0,1.0,0.0,175.0,0.0,0.0,1.0,2.0,3.0,0
57.0,1.0,4.0,132.0,207.0,0.0,0.0,168.0,1.0,0.0,1.0,0.0,7.0,0
52.0,1.0,3.0,138.0,223.0,0.0,0.0,169.0,0.0,0.0,1.0,?,3.0,0
54.0,0.0,2.0,132.0,288.0,1.0,2.0,159.0,1.0,0.0,1.0,1.0,3.0,0
35.0,1.0,4.0,126.0,282.0,0.0,2.0,156.0,1.0,0.0,1.0,0.0,7.0,1
45.0,0.0,2.0,112.0,160.0,0.0,0.0,138.0,0.0,0.0,2.0,0.0,3.0,0
70.0,1.0,3.0,160.0,269.0,0.0,0.0,112.0,1.0,2.9,2.0,1.0,7.0,3
53.0,1.0,4.0,142.0,226.0,0.0,2.0,111.0,1.0,0.0,1.0,0.0,7.0,0
59.0,0.0,4.0,174.0,249.0,0.0,0.0,143.0,1.0,0.0,2.0,0.0,3.0,1
62.0,0.0,4.0,140.0,394.0,0.0,2.0,157.0,0.0,1.2,2.0,0.0,3.0,0
64.0,1.0,4.0,145.0,212.0,0.0,2.0,132.0,0.0,2.0,2.0,2.0,6.0,4
57.0,1.0,4.0,152.0,274.0,0.0,0.0,88.0,1.0,1.2,2.0,1.0,7.0,1
52.0,1.0,4.0,108.0,233.0,1.0,0.0,147.0,0.0,0.1,1.0,3.0,7.0,0
56.0,1.0,4.0,132.0,184.0,0.0,2.0,105.0,1.0,2.1,2.0,1.0,6.0,1
43.0,1.0,3.0,130.0,315.0,0.0,0.0,162.0,0.0,1.9,1.0,1.0,3.0,0
53.0,1.0,3.0,130.0,246.0,1.0,2.0,173.0,0.0,0.0,1.0,3.0,3.0,0
48.0,1.0,4.0,124.0,274.0,0.0,2.0,166.0,0.0,0.5,2.0,0.0,7.0,3
56.0,0.0,4.0,134.0,409.0,0.0,2.0,150.0,1.0,1.9,2.0,2.0,7.0,2
42.0,1.0,1.0,148.0,244.0,0.0,2.0,178.0,0.0,0.8,1.0,2.0,3.0,0
59.0,1.0,1.0,178.0,270.0,0.0,2.0,145.0,0.0,4.2,3.0,0.0,7.0,0
60.0,0.0,4.0,158.0,305.0,0.0,2.0,161.0,0.0,0.0,1.0,0.0,3.0,1
63.0,0.0,2.0,140.0,195.0,0.0,0.0,179.0,0.0,0.0,1.0,2.0,3.0,0
42.0,1.0,3.0,120.0,240.0,1.0,0.0,194.0,0.0,0.8,3.0,0.0,7.0,0
66.0,1.0,2.0,160.0,246.0,0.0,0.0,120.0,1.0,0.0,2.0,3.0,6.0,2
54.0,1.0,2.0,192.0,283.0,0.0,2.0,195.0,0.0,0.0,1.0,1.0,7.0,1
69.0,1.0,3.0,140.0,254.0,0.0,2.0,146.0,0.0,2.0,2.0,3.0,7.0,2
50.0,1.0,3.0,129.0,196.0,0.0,0.0,163.0,0.0,0.0,1.0,0.0,3.0,0
51.0,1.0,4.0,140.0,298.0,0.0,0.0,122.0,1.0,4.2,2.0,3.0,7.0,3
43.0,1.0,4.0,132.0,247.0,1.0,2.0,143.0,1.0,0.1,2.0,?,7.0,1
62.0,0.0,4.0,138.0,294.0,1.0,0.0,106.0,0.0,1.9,2.0,3.0,3.0,2
68.0,0.0,3.0,120.0,211.0,0.0,2.0,115.0,0.0,1.5,2.0,0.0,3.0,0
67.0,1.0,4.0,100.0,299.0,0.0,2.0,125.0,1.0,0.9,2.0,2.0,3.0,3
69.0,1.0,1.0,160.0,234.0,1.0,2.0,131.0,0.0,0.1,2.0,1.0,3.0,0
45.0,0.0,4.0,138.0,236.0,0.0,2.0,152.0,1.0,0.2,2.0,0.0,3.0,0
50.0,0.0,2.0,120.0,244.0,0.0,0.0,162.0,0.0,1.1,1.0,0.0,3.0,0
59.0,1.0,1.0,160.0,273.0,0.0,2.0,125.0,0.0,0.0,1.0,0.0,3.0,1
50.0,0.0,4.0,110.0,254.0,0.0,2.0,159.0,0.0,0.0,1.0,0.0,3.0,0
64.0,0.0,4.0,180.0,325.0,0.0,0.0,154.0,1.0,0.0,1.0,0.0,3.0,0
57.0,1.0,3.0,150.0,126.0,1.0,0.0,173.0,0.0,0.2,1.0,1.0,7.0,0
64.0,0.0,3.0,140.0,313.0,0.0,0.0,133.0,0.0,0.2,1.0,0.0,7.0,0
43.0,1.0,4.0,110.0,211.0,0.0,0.0,161.0,0.0,0.0,1.0,0.0,7.0,0
45.0,1.0,4.0,142.0,309.0,0.0,2.0,147.0,1.0,0.0,2.0,3.0,7.0,3
58.0,1.0,4.0,128.0,259.0,0.0,2.0,130.0,1.0,3.0,2.0,2.0,7.0,3
50.0,1.0,4.0,144.0,200.0,0.0,2.0,126.0,1.0,0.9,2.0,0.0,7.0,3
55.0,1.0,2.0,130.0,262.0,0.0,0.0,155.0,0.0,0.0,1.0,0.0,3.0,0
62.0,0.0,4.0,150.0,244.0,0.0,0.0,154.0,1.0,1.4,2.0,0.0,3.0,1
37.0,0.0,3.0,120.0,215.0,0.0,0.0,170.0,0.0,0.0,1.0,0.0,3.0,0
38.0,1.0,1.0,120.0,231.0,0.0,0.0,182.0,1.0,3.8,2.0,0.0,7.0,4
41.0,1.0,3.0,130.0,214.0,0.0,2.0,168.0,0.0,2.0,2.0,0.0,3.0,0
66.0,0.0,4.0,178.0,228.0,1.0,0.0,165.0,1.0,1.0,2.0,2.0,7.0,3
52.0,1.0,4.0,112.0,230.0,0.0,0.0,160.0,0.0,0.0,1.0,1.0,3.0,1
56.0,1.0,1.0,120.0,193.0,0.0,2.0,162.0,0.0,1.9,2.0,0.0,7.0,0
46.0,0.0,2.0,105.0,204.0,0.0,0.0,172.0,0.0,0.0,1.0,0.0,3.0,0
46.0,0.0,4.0,138.0,243.0,0.0,2.0,152.0,1.0,0.0,2.0,0.0,3.0,0
64.0,0.0,4.0,130.0,303.0,0.0,0.0,122.0,0.0,2.0,2.0,2.0,3.0,0
59.0,1.0,4.0,138.0,271.0,0.0,2.0,182.0,0.0,0.0,1.0,0.0,3.0,0
41.0,0.0,3.0,112.0,268.0,0.0,2.0,172.0,1.0,0.0,1.0,0.0,3.0,0
54.0,0.0,3.0,108.0,267.0,0.0,2.0,167.0,0.0,0.0,1.0,0.0,3.0,0
39.0,0.0,3.0,94.0,199.0,0.0,0.0,179.0,0.0,0.0,1.0,0.0,3.0,0
53.0,1.0,4.0,123.0,282.0,0.0,0.0,95.0,1.0,2.0,2.0,2.0,7.0,3
63.0,0.0,4.0,108.0,269.0,0.0,0.0,169.0,1.0,1.8,2.0,2.0,3.0,1
34.0,0.0,2.0,118.0,210.0,0.0,0.0,192.0,0.0,0.7,1.0,0.0,3.0,0
47.0,1.0,4.0,112.0,204.0,0.0,0.0,143.0,0.0,0.1,1.0,0.0,3.0,0
67.0,0.0,3.0,152.0,277.0,0.0,0.0,172.0,0.0,0.0,1.0,1.0,3.0,0
54.0,1.0,4.0,110.0,206.0,0.0,2.0,108.0,1.0,0.0,2.0,1.0,3.0,3
66.0,1.0,4.0,112.0,212.0,0.0,2.0,132.0,1.0,0.1,1.0,1.0,3.0,2
52.0,0.0,3.0,136.0,196.0,0.0,2.0,169.0,0.0,0.1,2.0,0.0,3.0,0
55.0,0.0,4.0,180.0,327.0,0.0,1.0,117.0,1.0,3.4,2.0,0.0,3.0,2
49.0,1.0,3.0,118.0,149.0,0.0,2.0,126.0,0.0,0.8,1.0,3.0,3.0,1
74.0,0.0,2.0,120.0,269.0,0.0,2.0,121.0,1.0,0.2,1.0,1.0,3.0,0
54.0,0.0,3.0,160.0,201.0,0.0,0.0,163.0,0.0,0.0,1.0,1.0,3.0,0
54.0,1.0,4.0,122.0,286.0,0.0,2.0,116.0,1.0,3.2,2.0,2.0,3.0,3
56.0,1.0,4.0,130.0,283.0,1.0,2.0,103.0,1.0,1.6,3.0,0.0,7.0,2
46.0,1.0,4.0,120.0,249.0,0.0,2.0,144.0,0.0,0.8,1.0,0.0,7.0,1
49.0,0.0,2.0,134.0,271.0,0.0,0.0,162.0,0.0,0.0,2.0,0.0,3.0,0
42.0,1.0,2.0,120.0,295.0,0.0,0.0,162.0,0.0,0.0,1.0,0.0,3.0,0
41.0,1.0,2.0,110.0,235.0,0.0,0.0,153.0,0.0,0.0,1.0,0.0,3.0,0
41.0,0.0,2.0,126.0,306.0,0.0,0.0,163.0,0.0,0.0,1.0,0.0,3.0,0
49.0,0.0,4.0,130.0,269.0,0.0,0.0,163.0,0.0,0.0,1.0,0.0,3.0,0
61.0,1.0,1.0,134.0,234.0,0.0,0.0,145.0,0.0,2.6,2.0,2.0,3.0,2
60.0,0.0,3.0,120.0,178.0,1.0,0.0,96.0,0.0,0.0,1.0,0.0,3.0,0
67.0,1.0,4.0,120.0,237.0,0.0,0.0,71.0,0.0,1.0,2.0,0.0,3.0,2
58.0,1.0,4.0,100.0,234.0,0.0,0.0,156.0,0.0,0.1,1.0,1.0,7.0,2
47.0,1.0,4.0,110.0,275.0,0.0,2.0,118.0,1.0,1.0,2.0,1.0,3.0,1
52.0,1.0,4.0,125.0,212.0,0.0,0.0,168.0,0.0,1.0,1.0,2.0,7.0,3
62.0,1.0,2.0,128.0,208.0,1.0,2.0,140.0,0.0,0.0,1.0,0.0,3.0,0
57.0,1.0,4.0,110.0,201.0,0.0,0.0,126.0,1.0,1.5,2.0,0.0,6.0,0
58.0,1.0,4.0,146.0,218.0,0.0,0.0,105.0,0.0,2.0,2.0,1.0,7.0,1
64.0,1.0,4.0,128.0,263.0,0.0,0.0,105.0,1.0,0.2,2.0,1.0,7.0,0
51.0,0.0,3.0,120.0,295.0,0.0,2.0,157.0,0.0,0.6,1.0,0.0,3.0,0
43.0,1.0,4.0,115.0,303.0,0.0,0.0,181.0,0.0,1.2,2.0,0.0,3.0,0
42.0,0.0,3.0,120.0,209.0,0.0,0.0,173.0,0.0,0.0,2.0,0.0,3.0,0
67.0,0.0,4.0,106.0,223.0,0.0,0.0,142.0,0.0,0.3,1.0,2.0,3.0,0
76.0,0.0,3.0,140.0,197.0,0.0,1.0,116.0,0.0,1.1,2.0,0.0,3.0,0
70.0,1.0,2.0,156.0,245.0,0.0,2.0,143.0,0.0,0.0,1.0,0.0,3.0,0
57.0,1.0,2.0,124.0,261.0,0.0,0.0,141.0,0.0,0.3,1.0,0.0,7.0,1
44.0,0.0,3.0,118.0,242.0,0.0,0.0,149.0,0.0,0.3,2.0,1.0,3.0,0
58.0,0.0,2.0,136.0,319.0,1.0,2.0,152.0,0.0,0.0,1.0,2.0,3.0,3
60.0,0.0,1.0,150.0,240.0,0.0,0.0,171.0,0.0,0.9,1.0,0.0,3.0,0
44.0,1.0,3.0,120.0,226.0,0.0,0.0,169.0,0.0,0.0,1.0,0.0,3.0,0
61.0,1.0,4.0,138.0,166.0,0.0,2.0,125.0,1.0,3.6,2.0,1.0,3.0,4
42.0,1.0,4.0,136.0,315.0,0.0,0.0,125.0,1.0,1.8,2.0,0.0,6.0,2
52.0,1.0,4.0,128.0,204.0,1.0,0.0,156.0,1.0,1.0,2.0,0.0,?,2
59.0,1.0,3.0,126.0,218.0,1.0,0.0,134.0,0.0,2.2,2.0,1.0,6.0,2
40.0,1.0,4.0,152.0,223.0,0.0,0.0,181.0,0.0,0.0,1.0,0.0,7.0,1
42.0,1.0,3.0,130.0,180.0,0.0,0.0,150.0,0.0,0.0,1.0,0.0,3.0,0
61.0,1.0,4.0,140.0,207.0,0.0,2.0,138.0,1.0,1.9,1.0,1.0,7.0,1
66.0,1.0,4.0,160.0,228.0,0.0,2.0,138.0,0.0,2.3,1.0,0.0,6.0,0
46.0,1.0,4.0,140.0,311.0,0.0,0.0,120.0,1.0,1.8,2.0,2.0,7.0,2
71.0,0.0,4.0,112.0,149.0,0.0,0.0,125.0,0.0,1.6,2.0,0.0,3.0,0
59.0,1.0,1.0,134.0,204.0,0.0,0.0,162.0,0.0,0.8,1.0,2.0,3.0,1
64.0,1.0,1.0,170.0,227.0,0.0,2.0,155.0,0.0,0.6,2.0,0.0,7.0,0
66.0,0.0,3.0,146.0,278.0,0.0,2.0,152.0,0.0,0.0,2.0,1.0,3.0,0
39.0,0.0,3.0,138.0,220.0,0.0,0.0,152.0,0.0,0.0,2.0,0.0,3.0,0
57.0,1.0,2.0,154.0,232.0,0.0,2.0,164.0,0.0,0.0,1.0,1.0,3.0,1
58.0,0.0,4.0,130.0,197.0,0.0,0.0,131.0,0.0,0.6,2.0,0.0,3.0,0
57.0,1.0,4.0,110.0,335.0,0.0,0.0,143.0,1.0,3.0,2.0,1.0,7.0,2
47.0,1.0,3.0,130.0,253.0,0.0,0.0,179.0,0.0,0.0,1.0,0.0,3.0,0
55.0,0.0,4.0,128.0,205.0,0.0,1.0,130.0,1.0,2.0,2.0,1.0,7.0,3
35.0,1.0,2.0,122.0,192.0,0.0,0.0,174.0,0.0,0.0,1.0,0.0,3.0,0
61.0,1.0,4.0,148.0,203.0,0.0,0.0,161.0,0.0,0.0,1.0,1.0,7.0,2
58.0,1.0,4.0,114.0,318.0,0.0,1.0,140.0,0.0,4.4,3.0,3.0,6.0,4
58.0,0.0,4.0,170.0,225.0,1.0,2.0,146.0,1.0,2.8,2.0,2.0,6.0,2
58.0,1.0,2.0,125.0,220.0,0.0,0.0,144.0,0.0,0.4,2.0,?,7.0,0
56.0,1.0,2.0,130.0,221.0,0.0,2.0,163.0,0.0,0.0,1.0,0.0,7.0,0
56.0,1.0,2.0,120.0,240.0,0.0,0.0,169.0,0.0,0.0,3.0,0.0,3.0,0
67.0,1.0,3.0,152.0,212.0,0.0,2.0,150.0,0.0,0.8,2.0,0.0,7.0,1
55.0,0.0,2.0,132.0,342.0,0.0,0.0,166.0,0.0,1.2,1.0,0.0,3.0,0
44.0,1.0,4.0,120.0,169.0,0.0,0.0,144.0,1.0,2.8,3.0,0.0,6.0,2
63.0,1.0,4.0,140.0,187.0,0.0,2.0,144.0,1.0,4.0,1.0,2.0,7.0,2
63.0,0.0,4.0,124.0,197.0,0.0,0.0,136.0,1.0,0.0,2.0,0.0,3.0,1
41.0,1.0,2.0,120.0,157.0,0.0,0.0,182.0,0.0,0.0,1.0,0.0,3.0,0
59.0,1.0,4.0,164.0,176.0,1.0,2.0,90.0,0.0,1.0,2.0,2.0,6.0,3
57.0,0.0,4.0,140.0,241.0,0.0,0.0,123.0,1.0,0.2,2.0,0.0,7.0,1
45.0,1.0,1.0,110.0,264.0,0.0,0.0,132.0,0.0,1.2,2.0,0.0,7.0,1
68.0,1.0,4.0,144.0,193.0,1.0,0.0,141.0,0.0,3.4,2.0,2.0,7.0,2
57.0,1.0,4.0,130.0,131.0,0.0,0.0,115.0,1.0,1.2,2.0,1.0,7.0,3
57.0,0.0,2.0,130.0,236.0,0.0,2.0,174.0,0.0,0.0,2.0,1.0,3.0,1
38.0,1.0,3.0,138.0,175.0,0.0,0.0,173.0,0.0,0.0,1.0,?,3.0,0

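  2. Column names
Dataset: UCI heart disease dataset
Attribute descriptions:
age: the person's age
sex: the person's sex (1 = male, 0 = female)
cp: type of chest pain experienced (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic)
trestbps: resting blood pressure (mm Hg on admission to hospital)
chol: cholesterol measurement, in mg/dL
fbs: fasting blood sugar > 120 mg/dL (1 = true; 0 = false)
restecg: resting electrocardiogram result (0 = normal, 1 = ST-T wave abnormality, 2 = probable or definite left ventricular hypertrophy by Estes' criteria)
thalach: maximum heart rate achieved
exang: exercise-induced angina (1 = yes; 0 = no)
oldpeak: ST depression induced by exercise relative to rest ("ST" refers to positions on the ECG trace)
slope: slope of the peak exercise ST segment (1 = upsloping, 2 = flat, 3 = downsloping)
ca: number of major vessels colored by fluoroscopy (0-3)
thal: a blood disorder called thalassemia (3 = normal; 6 = fixed defect; 7 = reversible defect)
target: heart disease diagnosis (0 = no; 1-4 = yes, as in the data above)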
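
The coded columns are easier to read once the numeric codes are mapped back to labels. The following is a minimal sketch of that idea, assuming the code tables listed above; the names SEX_LABELS, CP_LABELS, and sample are illustrative and not part of the original post, and the three rows are copied from the start of the dataset.

```python
import pandas as pd

# Code-to-label tables copied from the attribute descriptions above.
SEX_LABELS = {0: "female", 1: "male"}
CP_LABELS = {
    1: "typical angina",
    2: "atypical angina",
    3: "non-anginal pain",
    4: "asymptomatic",
}

# Three illustrative rows (age, sex, cp) taken from the start of the dataset.
sample = pd.DataFrame({
    "age": [63.0, 67.0, 41.0],
    "sex": [1.0, 1.0, 0.0],
    "cp": [1.0, 4.0, 2.0],
})

# The file stores codes as floats, so cast to int before the dictionary lookup.
decoded = sample.assign(
    sex_label=sample["sex"].astype(int).map(SEX_LABELS),
    cp_label=sample["cp"].astype(int).map(CP_LABELS),
)
print(decoded)
```

Keeping the original numeric columns alongside the labels means later numeric analysis is unaffected.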

Tasks:

Preprocessing the data

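import numpy as np  # used below for np.nan, np.median, and np.bincount
import pandas as pd

# Columns: age, sex, chest pain type, resting blood pressure, cholesterol, fasting blood sugar, resting ECG, max heart rate, exercise-induced angina, ST depression, slope of peak exercise ST segment, number of major vessels, thal, heart disease
name = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal', 'target']
# Read the data
data = pd.read_csv('processed.cleveland.csv', names=name)  # attach the column labels
# Alternatively: data.columns = name
print(data)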

Output:

      age  sex   cp  trestbps   chol  fbs  restecg  thalach  exang  oldpeak  slope   ca thal  target
0    63.0  1.0  1.0     145.0  233.0  1.0      2.0    150.0    0.0      2.3    3.0  0.0  6.0       0
1    67.0  1.0  4.0     160.0  286.0  0.0      2.0    108.0    1.0      1.5    2.0  3.0  3.0       2
2    67.0  1.0  4.0     120.0  229.0  0.0      2.0    129.0    1.0      2.6    2.0  2.0  7.0       1
3    37.0  1.0  3.0     130.0  250.0  0.0      0.0    187.0    0.0      3.5    3.0  0.0  3.0       0
4    41.0  0.0  2.0     130.0  204.0  0.0      2.0    172.0    0.0      1.4    1.0  0.0  3.0       0
5    56.0  1.0  2.0     120.0  236.0  0.0      0.0    178.0    0.0      0.8    1.0  0.0  3.0       0
6    62.0  0.0  4.0     140.0  268.0  0.0      2.0    160.0    0.0      3.6    3.0  2.0  3.0       3
7    57.0  0.0  4.0     120.0  354.0  0.0      0.0    163.0    1.0      0.6    1.0  0.0  3.0       0
8    63.0  1.0  4.0     130.0  254.0  0.0      2.0    147.0    0.0      1.4    2.0  1.0  7.0       2
9    53.0  1.0  4.0     140.0  203.0  1.0      2.0    155.0    1.0      3.1    3.0  0.0  7.0       1
10   57.0  1.0  4.0     140.0  192.0  0.0      0.0    148.0    0.0      0.4    2.0  0.0  6.0       0
11   56.0  0.0  2.0     140.0  294.0  0.0      2.0    153.0    0.0      1.3    2.0  0.0  3.0       0
12   56.0  1.0  3.0     130.0  256.0  1.0      2.0    142.0    1.0      0.6    2.0  1.0  6.0       2
13   44.0  1.0  2.0     120.0  263.0  0.0      0.0    173.0    0.0      0.0    1.0  0.0  7.0       0
14   52.0  1.0  3.0     172.0  199.0  1.0      0.0    162.0    0.0      0.5    1.0  0.0  7.0       0
15   57.0  1.0  3.0     150.0  168.0  0.0      0.0    174.0    0.0      1.6    1.0  0.0  3.0       0
16   48.0  1.0  2.0     110.0  229.0  0.0      0.0    168.0    0.0      1.0    3.0  0.0  7.0       1
17   54.0  1.0  4.0     140.0  239.0  0.0      0.0    160.0    0.0      1.2    1.0  0.0  3.0       0
18   48.0  0.0  3.0     130.0  275.0  0.0      0.0    139.0    0.0      0.2    1.0  0.0  3.0       0
19   49.0  1.0  2.0     130.0  266.0  0.0      0.0    171.0    0.0      0.6    1.0  0.0  3.0       0
20   64.0  1.0  1.0     110.0  211.0  0.0      2.0    144.0    1.0      1.8    2.0  0.0  3.0       0
21   58.0  0.0  1.0     150.0  283.0  1.0      2.0    162.0    0.0      1.0    1.0  0.0  3.0       0
22   58.0  1.0  2.0     120.0  284.0  0.0      2.0    160.0    0.0      1.8    2.0  0.0  3.0       1
23   58.0  1.0  3.0     132.0  224.0  0.0      2.0    173.0    0.0      3.2    1.0  2.0  7.0       3
24   60.0  1.0  4.0     130.0  206.0  0.0      2.0    132.0    1.0      2.4    2.0  2.0  7.0       4
25   50.0  0.0  3.0     120.0  219.0  0.0      0.0    158.0    0.0      1.6    2.0  0.0  3.0       0
26   58.0  0.0  3.0     120.0  340.0  0.0      0.0    172.0    0.0      0.0    1.0  0.0  3.0       0
27   66.0  0.0  1.0     150.0  226.0  0.0      0.0    114.0    0.0      2.6    3.0  0.0  3.0       0
28   43.0  1.0  4.0     150.0  247.0  0.0      0.0    171.0    0.0      1.5    1.0  0.0  3.0       0
29   40.0  1.0  4.0     110.0  167.0  0.0      2.0    114.0    1.0      2.0    2.0  0.0  7.0       3
..    ...  ...  ...       ...    ...  ...      ...      ...    ...      ...    ...  ...  ...     ...
273  71.0  0.0  4.0     112.0  149.0  0.0      0.0    125.0    0.0      1.6    2.0  0.0  3.0       0
274  59.0  1.0  1.0     134.0  204.0  0.0      0.0    162.0    0.0      0.8    1.0  2.0  3.0       1
275  64.0  1.0  1.0     170.0  227.0  0.0      2.0    155.0    0.0      0.6    2.0  0.0  7.0       0
276  66.0  0.0  3.0     146.0  278.0  0.0      2.0    152.0    0.0      0.0    2.0  1.0  3.0       0
277  39.0  0.0  3.0     138.0  220.0  0.0      0.0    152.0    0.0      0.0    2.0  0.0  3.0       0
278  57.0  1.0  2.0     154.0  232.0  0.0      2.0    164.0    0.0      0.0    1.0  1.0  3.0       1
279  58.0  0.0  4.0     130.0  197.0  0.0      0.0    131.0    0.0      0.6    2.0  0.0  3.0       0
280  57.0  1.0  4.0     110.0  335.0  0.0      0.0    143.0    1.0      3.0    2.0  1.0  7.0       2
281  47.0  1.0  3.0     130.0  253.0  0.0      0.0    179.0    0.0      0.0    1.0  0.0  3.0       0
282  55.0  0.0  4.0     128.0  205.0  0.0      1.0    130.0    1.0      2.0    2.0  1.0  7.0       3
283  35.0  1.0  2.0     122.0  192.0  0.0      0.0    174.0    0.0      0.0    1.0  0.0  3.0       0
284  61.0  1.0  4.0     148.0  203.0  0.0      0.0    161.0    0.0      0.0    1.0  1.0  7.0       2
285  58.0  1.0  4.0     114.0  318.0  0.0      1.0    140.0    0.0      4.4    3.0  3.0  6.0       4
286  58.0  0.0  4.0     170.0  225.0  1.0      2.0    146.0    1.0      2.8    2.0  2.0  6.0       2
287  58.0  1.0  2.0     125.0  220.0  0.0      0.0    144.0    0.0      0.4    2.0    ?  7.0       0
288  56.0  1.0  2.0     130.0  221.0  0.0      2.0    163.0    0.0      0.0    1.0  0.0  7.0       0
289  56.0  1.0  2.0     120.0  240.0  0.0      0.0    169.0    0.0      0.0    3.0  0.0  3.0       0
290  67.0  1.0  3.0     152.0  212.0  0.0      2.0    150.0    0.0      0.8    2.0  0.0  7.0       1
291  55.0  0.0  2.0     132.0  342.0  0.0      0.0    166.0    0.0      1.2    1.0  0.0  3.0       0
292  44.0  1.0  4.0     120.0  169.0  0.0      0.0    144.0    1.0      2.8    3.0  0.0  6.0       2
293  63.0  1.0  4.0     140.0  187.0  0.0      2.0    144.0    1.0      4.0    1.0  2.0  7.0       2
294  63.0  0.0  4.0     124.0  197.0  0.0      0.0    136.0    1.0      0.0    2.0  0.0  3.0       1
295  41.0  1.0  2.0     120.0  157.0  0.0      0.0    182.0    0.0      0.0    1.0  0.0  3.0       0
296  59.0  1.0  4.0     164.0  176.0  1.0      2.0     90.0    0.0      1.0    2.0  2.0  6.0       3
297  57.0  0.0  4.0     140.0  241.0  0.0      0.0    123.0    1.0      0.2    2.0  0.0  7.0       1
298  45.0  1.0  1.0     110.0  264.0  0.0      0.0    132.0    0.0      1.2    2.0  0.0  7.0       1
299  68.0  1.0  4.0     144.0  193.0  1.0      0.0    141.0    0.0      3.4    2.0  2.0  7.0       2
300  57.0  1.0  4.0     130.0  131.0  0.0      0.0    115.0    1.0      1.2    2.0  1.0  7.0       3
301  57.0  0.0  2.0     130.0  236.0  0.0      2.0    174.0    0.0      0.0    2.0  1.0  3.0       1
302  38.0  1.0  3.0     138.0  175.0  0.0      0.0    173.0    0.0      0.0    1.0    ?  3.0       0

[303 rows x 14 columns]

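The data contains "?" entries, so we handle the missing values next.

data = data.replace('?', np.nan)
# Count the missing values in each column
print(data.isna().sum())

There are only a few missing values, so the affected rows can simply be dropped. Note that when counting missing values, any placeholder that is not NaN must first be converted to NaN with replace(); otherwise pd.isnull() will not detect it.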

Output:

age         0
sex         0
cp          0
trestbps    0
chol        0
fbs         0
restecg     0
thalach     0
exang       0
oldpeak     0
slope       0
ca          4
thal        2
target      0
dtype: int64
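
As an alternative to calling replace() after loading, the "?" placeholder can be declared as missing at read time. The sketch below assumes the same column list as above and, to stay self-contained, reads three rows copied from the dataset out of an in-memory string instead of processed.cleveland.csv; the names raw, sample, and missing are illustrative.

```python
import io
import pandas as pd

name = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg',
        'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal', 'target']

# Three rows copied from the dataset above; two of them contain '?'.
raw = io.StringIO(
    "63.0,1.0,1.0,145.0,233.0,1.0,2.0,150.0,0.0,2.3,3.0,0.0,6.0,0\n"
    "53.0,0.0,3.0,128.0,216.0,0.0,2.0,115.0,0.0,0.0,1.0,0.0,?,0\n"
    "52.0,1.0,3.0,138.0,223.0,0.0,0.0,169.0,0.0,0.0,1.0,?,3.0,0\n"
)

# na_values tells the parser to treat '?' as missing while reading.
sample = pd.read_csv(raw, names=name, na_values='?')

missing = sample.isna().sum()
print(missing[missing > 0])           # only the columns that have gaps
print(sample[['ca', 'thal']].dtypes)  # parsed as numbers, not text
```

A side effect of this approach is that ca and thal are parsed as floating-point columns straight away, because the parser never sees them as text.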

We drop the rows that contain missing values:

data = data.dropna()
print(data)
print(data.isna().sum())

Output:

      age  sex   cp  trestbps   chol  fbs  restecg  thalach  exang  oldpeak  slope   ca thal  target
0    63.0  1.0  1.0     145.0  233.0  1.0      2.0    150.0    0.0      2.3    3.0  0.0  6.0       0
1    67.0  1.0  4.0     160.0  286.0  0.0      2.0    108.0    1.0      1.5    2.0  3.0  3.0       2
2    67.0  1.0  4.0     120.0  229.0  0.0      2.0    129.0    1.0      2.6    2.0  2.0  7.0       1
3    37.0  1.0  3.0     130.0  250.0  0.0      0.0    187.0    0.0      3.5    3.0  0.0  3.0       0
4    41.0  0.0  2.0     130.0  204.0  0.0      2.0    172.0    0.0      1.4    1.0  0.0  3.0       0
5    56.0  1.0  2.0     120.0  236.0  0.0      0.0    178.0    0.0      0.8    1.0  0.0  3.0       0
6    62.0  0.0  4.0     140.0  268.0  0.0      2.0    160.0    0.0      3.6    3.0  2.0  3.0       3
7    57.0  0.0  4.0     120.0  354.0  0.0      0.0    163.0    1.0      0.6    1.0  0.0  3.0       0
8    63.0  1.0  4.0     130.0  254.0  0.0      2.0    147.0    0.0      1.4    2.0  1.0  7.0       2
9    53.0  1.0  4.0     140.0  203.0  1.0      2.0    155.0    1.0      3.1    3.0  0.0  7.0       1
10   57.0  1.0  4.0     140.0  192.0  0.0      0.0    148.0    0.0      0.4    2.0  0.0  6.0       0
11   56.0  0.0  2.0     140.0  294.0  0.0      2.0    153.0    0.0      1.3    2.0  0.0  3.0       0
12   56.0  1.0  3.0     130.0  256.0  1.0      2.0    142.0    1.0      0.6    2.0  1.0  6.0       2
13   44.0  1.0  2.0     120.0  263.0  0.0      0.0    173.0    0.0      0.0    1.0  0.0  7.0       0
14   52.0  1.0  3.0     172.0  199.0  1.0      0.0    162.0    0.0      0.5    1.0  0.0  7.0       0
15   57.0  1.0  3.0     150.0  168.0  0.0      0.0    174.0    0.0      1.6    1.0  0.0  3.0       0
16   48.0  1.0  2.0     110.0  229.0  0.0      0.0    168.0    0.0      1.0    3.0  0.0  7.0       1
17   54.0  1.0  4.0     140.0  239.0  0.0      0.0    160.0    0.0      1.2    1.0  0.0  3.0       0
18   48.0  0.0  3.0     130.0  275.0  0.0      0.0    139.0    0.0      0.2    1.0  0.0  3.0       0
19   49.0  1.0  2.0     130.0  266.0  0.0      0.0    171.0    0.0      0.6    1.0  0.0  3.0       0
20   64.0  1.0  1.0     110.0  211.0  0.0      2.0    144.0    1.0      1.8    2.0  0.0  3.0       0
21   58.0  0.0  1.0     150.0  283.0  1.0      2.0    162.0    0.0      1.0    1.0  0.0  3.0       0
22   58.0  1.0  2.0     120.0  284.0  0.0      2.0    160.0    0.0      1.8    2.0  0.0  3.0       1
23   58.0  1.0  3.0     132.0  224.0  0.0      2.0    173.0    0.0      3.2    1.0  2.0  7.0       3
24   60.0  1.0  4.0     130.0  206.0  0.0      2.0    132.0    1.0      2.4    2.0  2.0  7.0       4
25   50.0  0.0  3.0     120.0  219.0  0.0      0.0    158.0    0.0      1.6    2.0  0.0  3.0       0
26   58.0  0.0  3.0     120.0  340.0  0.0      0.0    172.0    0.0      0.0    1.0  0.0  3.0       0
27   66.0  0.0  1.0     150.0  226.0  0.0      0.0    114.0    0.0      2.6    3.0  0.0  3.0       0
28   43.0  1.0  4.0     150.0  247.0  0.0      0.0    171.0    0.0      1.5    1.0  0.0  3.0       0
29   40.0  1.0  4.0     110.0  167.0  0.0      2.0    114.0    1.0      2.0    2.0  0.0  7.0       3
..    ...  ...  ...       ...    ...  ...      ...      ...    ...      ...    ...  ...  ...     ...
271  66.0  1.0  4.0     160.0  228.0  0.0      2.0    138.0    0.0      2.3    1.0  0.0  6.0       0
272  46.0  1.0  4.0     140.0  311.0  0.0      0.0    120.0    1.0      1.8    2.0  2.0  7.0       2
273  71.0  0.0  4.0     112.0  149.0  0.0      0.0    125.0    0.0      1.6    2.0  0.0  3.0       0
274  59.0  1.0  1.0     134.0  204.0  0.0      0.0    162.0    0.0      0.8    1.0  2.0  3.0       1
275  64.0  1.0  1.0     170.0  227.0  0.0      2.0    155.0    0.0      0.6    2.0  0.0  7.0       0
276  66.0  0.0  3.0     146.0  278.0  0.0      2.0    152.0    0.0      0.0    2.0  1.0  3.0       0
277  39.0  0.0  3.0     138.0  220.0  0.0      0.0    152.0    0.0      0.0    2.0  0.0  3.0       0
278  57.0  1.0  2.0     154.0  232.0  0.0      2.0    164.0    0.0      0.0    1.0  1.0  3.0       1
279  58.0  0.0  4.0     130.0  197.0  0.0      0.0    131.0    0.0      0.6    2.0  0.0  3.0       0
280  57.0  1.0  4.0     110.0  335.0  0.0      0.0    143.0    1.0      3.0    2.0  1.0  7.0       2
281  47.0  1.0  3.0     130.0  253.0  0.0      0.0    179.0    0.0      0.0    1.0  0.0  3.0       0
282  55.0  0.0  4.0     128.0  205.0  0.0      1.0    130.0    1.0      2.0    2.0  1.0  7.0       3
283  35.0  1.0  2.0     122.0  192.0  0.0      0.0    174.0    0.0      0.0    1.0  0.0  3.0       0
284  61.0  1.0  4.0     148.0  203.0  0.0      0.0    161.0    0.0      0.0    1.0  1.0  7.0       2
285  58.0  1.0  4.0     114.0  318.0  0.0      1.0    140.0    0.0      4.4    3.0  3.0  6.0       4
286  58.0  0.0  4.0     170.0  225.0  1.0      2.0    146.0    1.0      2.8    2.0  2.0  6.0       2
288  56.0  1.0  2.0     130.0  221.0  0.0      2.0    163.0    0.0      0.0    1.0  0.0  7.0       0
289  56.0  1.0  2.0     120.0  240.0  0.0      0.0    169.0    0.0      0.0    3.0  0.0  3.0       0
290  67.0  1.0  3.0     152.0  212.0  0.0      2.0    150.0    0.0      0.8    2.0  0.0  7.0       1
291  55.0  0.0  2.0     132.0  342.0  0.0      0.0    166.0    0.0      1.2    1.0  0.0  3.0       0
292  44.0  1.0  4.0     120.0  169.0  0.0      0.0    144.0    1.0      2.8    3.0  0.0  6.0       2
293  63.0  1.0  4.0     140.0  187.0  0.0      2.0    144.0    1.0      4.0    1.0  2.0  7.0       2
294  63.0  0.0  4.0     124.0  197.0  0.0      0.0    136.0    1.0      0.0    2.0  0.0  3.0       1
295  41.0  1.0  2.0     120.0  157.0  0.0      0.0    182.0    0.0      0.0    1.0  0.0  3.0       0
296  59.0  1.0  4.0     164.0  176.0  1.0      2.0     90.0    0.0      1.0    2.0  2.0  6.0       3
297  57.0  0.0  4.0     140.0  241.0  0.0      0.0    123.0    1.0      0.2    2.0  0.0  7.0       1
298  45.0  1.0  1.0     110.0  264.0  0.0      0.0    132.0    0.0      1.2    2.0  0.0  7.0       1
299  68.0  1.0  4.0     144.0  193.0  1.0      0.0    141.0    0.0      3.4    2.0  2.0  7.0       2
300  57.0  1.0  4.0     130.0  131.0  0.0      0.0    115.0    1.0      1.2    2.0  1.0  7.0       3
301  57.0  0.0  2.0     130.0  236.0  0.0      2.0    174.0    0.0      0.0    2.0  1.0  3.0       1

[297 rows x 14 columns]
age         0
sex         0
cp          0
trestbps    0
chol        0
fbs         0
restecg     0
thalach     0
exang       0
oldpeak     0
slope       0
ca          0
thal        0
target      0
dtype: int64

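1. Find the mean, median, and mode of the heart disease patients' ages, and use the results to analyze the relationship between age and heart disease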

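Code:

def Q(x, name):  # x: 1-D integer array, name: label used in the printout
    print("mean of {}: {}".format(name, x.mean()))
    print("median of {}: {}".format(name, np.median(x)))
    # np.bincount counts each non-negative integer; argmax picks the most frequent one
    print("mode of {}: {}".format(name, np.argmax(np.bincount(x))))

# 1. Compute the mean, median, and mode of age
age = data['age'].to_numpy(dtype='int')  # first column (age) as an integer array
print(age)
Q(age, 'age')

Output:

mean of age: 54.43894389438944
median of age: 56.0
mode of age: 58

All three statistics fall between 54 and 58, which suggests that the mid-to-late fifties is the peak age range for heart disease in this data. Strictly speaking, age here covers every record rather than only the rows with a positive diagnosis, so the figures describe the sample as a whole.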
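Since the task asks about patients specifically, one possible refinement is to filter on target before computing the statistics. The sketch below does this on a small made-up frame (the values in sample are illustrative and not taken from the data set) and assumes that any target above 0 counts as a patient. It uses Series.mode(), which returns every tied value, whereas np.argmax(np.bincount(x)) returns only the smallest of the tied values.

```python
import pandas as pd

# Hypothetical mini-sample with the same column names as the article's data
sample = pd.DataFrame({
    "age":    [63, 67, 67, 37, 41, 58, 58, 62],
    "target": [0,  2,  1,  0,  0,  1,  3,  3],
})

# Assumption: any target above 0 counts as a heart disease patient
patients = sample.loc[sample["target"] > 0, "age"]

age_mean = patients.mean()
age_median = patients.median()
age_modes = patients.mode().tolist()  # every value tied for most frequent, sorted

print("mean:", age_mean)
print("median:", age_median)
print("mode(s):", age_modes)
```

In this toy sample 58 and 67 both appear twice, so both are reported as modes.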

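2. The normal cholesterol range is 0-200 mg/dL. Separate the people with normal and abnormal cholesterol, and use percentiles to analyze the relationship between age and cholesterol (which age group has more abnormal readings; compare the two groups)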

  1. Get the cholesterol data
chol = data['chol']  # select the chol (cholesterol) column
chol0 = chol.to_numpy(dtype='int')  # convert to an integer array
print("chol0:", chol0)

Output:

chol0: [233 286 229 250 204 236 268 354 254 203 192 294 256 263 199 168 229 239
 275 266 211 283 284 224 206 219 340 226 247 167 239 230 335 234 233 226
 177 276 353 243 225 199 302 212 330 230 175 243 417 197 198 177 290 219
 253 266 233 172 273 213 305 177 216 304 188 282 185 232 326 231 269 254
 267 248 197 360 258 308 245 270 208 264 321 274 325 235 257 234 256 302
 164 231 141 252 255 239 258 201 222 260 182 303 265 188 309 177 229 260
 219 307 249 186 341 263 203 211 183 330 254 256 407 222 217 282 234 288
 239 220 209 258 227 204 261 213 250 174 281 198 245 221 288 205 309 240
 243 289 250 308 318 298 265 564 289 246 322 299 300 293 277 197 304 214
 248 255 207 288 282 160 269 226 249 394 212 274 233 184 315 246 274 409
 244 270 305 195 240 246 283 254 196 298 294 211 299 234 236 244 273 254
 325 126 313 211 309 259 200 262 244 215 231 214 228 230 193 204 243 303
 271 268 267 199 282 269 210 204 277 206 212 196 327 149 269 201 286 283
 249 271 295 235 306 269 234 178 237 234 275 212 208 201 218 263 295 303
 209 223 197 245 261 242 319 240 226 166 315 218 223 180 207 228 311 149
 204 227 278 220 232 197 335 253 205 192 203 318 225 221 240 212 342 169
 187 197 157 176 241 264 193 131 236]
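  2. Get the ages of people with abnormal cholesterol and plot a histogram (the histogram is there to tell the two groups apart, as the task asks)
# Ages of people with abnormal cholesterol (above 200 mg/dL)
age1 = data.loc[data['chol'] > 200, 'age']
# Histogram of those ages; density=True scales the bars so their total area is 1
plt.hist(age1, bins=10, edgecolor='black', density=True)
plt.show()

Output:
[Figure 1: histogram of ages for people with cholesterol above 200 mg/dL]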
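  3. Get the ages of people with normal cholesterol and plot a histogram

# Ages of people with normal cholesterol.
# Note: the strict < leaves out a reading of exactly 200, which the stated 0-200 range counts as normal; <= 200 would include it.
age2 = data.loc[data['chol'] < 200, 'age']
# Histogram of those ages
plt.hist(age2, bins=10, edgecolor='black', density=True)
plt.show()

Output:
[Figure 2: histogram of ages for people with cholesterol below 200 mg/dL]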
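  4. Percentiles

# 25th and 75th percentiles of age in each group (linear interpolation is NumPy's default)
age1_25 = np.percentile(age1.values, 25)
age1_75 = np.percentile(age1.values, 75)
print('Abnormal cholesterol: the middle 50% of ages lie between', age1_25, 'and', age1_75)
age2_25 = np.percentile(age2.values, 25)
age2_75 = np.percentile(age2.values, 75)
print('Normal cholesterol: the middle 50% of ages lie between', age2_25, 'and', age2_75)

Output:

Abnormal cholesterol: the middle 50% of ages lie between 48.75 and 61.25
Normal cholesterol: the middle 50% of ages lie between 43.75 and 59.25

Both quartiles are higher in the abnormal group (48.75 and 61.25 against 43.75 and 59.25), so in this data the people with abnormal cholesterol tend to be older.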
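The task also asks which age group has more abnormal readings. One way to answer that directly is to bin the ages and compute the share of abnormal readings per bin. The following is a minimal sketch on a made-up frame; the sample values and the bin edges 30, 45, 60, 75 are assumptions chosen for illustration, not part of the original analysis.

```python
import pandas as pd

# Hypothetical mini-sample; the real analysis would use the article's data frame
sample = pd.DataFrame({
    "age":  [37, 41, 44, 52, 56, 57, 58, 63, 66, 67],
    "chol": [250, 204, 169, 199, 236, 192, 340, 187, 226, 286],
})

# Flag readings above the 200 mg/dL limit used in the article
sample["abnormal"] = sample["chol"] > 200

# Assumed age bands: (30, 45], (45, 60], (60, 75]
bands = pd.cut(sample["age"], bins=[30, 45, 60, 75])

# The mean of a boolean column is the share of True values in each band
share = sample.groupby(bands, observed=False)["abnormal"].mean()
print(share)
```

On the full data set the same three lines (flag, cut, groupby) would give one percentage per age band, which can be read alongside the quartiles.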

3. Find the range and interquartile range of the heart disease patients' cholesterol, and analyze what the results show

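  1. To compute the range and interquartile range of the patients' cholesterol, first select the rows with heart disease
# 3. Range and interquartile range of the patients' cholesterol
# Note: target == 1 keeps only the records labeled 1; in this data set the labels 2, 3, and 4 also mark heart disease, and data['target'] > 0 would select them all
tarChol = data.loc[data['target'] == 1, 'chol']
JC = tarChol.max() - tarChol.min()  # range
print("max:{},min:{}".format(tarChol.max(), tarChol.min()))
SFW = np.percentile(tarChol, 75) - np.percentile(tarChol, 25)  # interquartile range, Q3 - Q1
print("range:", JC)
print("interquartile range:", SFW)

Output:

max:335.0,min:149.0
range: 186.0
interquartile range: 51.25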
  2. Draw a box plot
# Summary statistics, then the box plot
print(tarChol.describe())
tarChol.plot.box(title="Box Chart")
plt.grid(linestyle="--")
plt.show()

Output:

count     54.000000
mean     249.148148
std       41.132738
min      149.000000
25%      224.500000
50%      249.000000
75%      275.750000
max      335.000000
Name: chol, dtype: float64

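[Figure 3: box plot of cholesterol for the selected patients, titled "Box Chart"]
The first quartile (224.5) is already above the 200 mg/dL limit, so at least three quarters of these patients have abnormal cholesterol.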
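The interquartile range also determines which points a box plot would mark as outliers. As a worked example, the sketch below applies the common 1.5 × IQR rule (the default whisker setting in matplotlib box plots) to the quartiles and extremes printed by describe() above. The variable names are illustrative.

```python
# Quartiles and extremes copied from the describe() output above
q1, q3 = 224.5, 275.75
lowest, highest = 149.0, 335.0

iqr = q3 - q1                  # interquartile range
lower_fence = q1 - 1.5 * iqr   # points below this would be drawn as outliers
upper_fence = q3 + 1.5 * iqr   # points above this would be drawn as outliers

has_outliers = lowest < lower_fence or highest > upper_fence
print("IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("any outliers:", has_outliers)
```

The fences come out at 147.625 and 352.625, so both the minimum (149) and the maximum (335) lie inside them and this rule flags no outliers in the group.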

4. Does the cholesterol of the heart disease patients follow a normal distribution?

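  • To judge whether the data are normally distributed, we check the skewness and kurtosis of the sample.
  • This is a descriptive check rather than the Shapiro-Wilk (SW) test. That test reports a W statistic with a p-value, and the letters S and W are its authors' initials, not skewness and kurtosis.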
# tarChol is already a Series; pd.Series() simply wraps it again
s = pd.Series(tarChol)
print(s)

Output:

2      229.0
9      203.0
16     229.0
22     284.0
32     335.0
37     276.0
44     330.0
54     253.0
55     266.0
56     233.0
57     172.0
62     216.0
66     185.0
69     231.0
72     267.0
73     248.0
74     197.0
76     258.0
95     255.0
107    229.0
110    307.0
111    249.0
124    282.0
138    198.0
141    288.0
143    309.0
145    243.0
155    322.0
156    299.0
157    300.0
168    282.0
172    249.0
175    274.0
177    184.0
184    305.0
188    283.0
199    273.0
209    244.0
214    230.0
224    269.0
232    149.0
237    249.0
247    275.0
251    218.0
259    261.0
268    223.0
270    207.0
274    204.0
278    232.0
290    212.0
294    197.0
297    241.0
298    264.0
301    236.0
Name: chol, dtype: float64

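Compute the skewness and kurtosis

print('skewness:', s.skew())  # sample skewness from pandas
print('kurtosis:', s.kurt())  # excess kurtosis from pandas (0 for a normal distribution)

Output:

skewness: -0.05249449524863929
kurtosis: -0.2896560208524841

Both absolute values are below 0.5, so by this rule of thumb the cholesterol of these patients can be treated as approximately normally distributed.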
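For a formal check, the Shapiro-Wilk test can be run on the same 54 values. The sketch below is one way to do it, assuming SciPy is installed (the article itself only uses NumPy and pandas). The values are copied from the printed Series above; the 0.05 threshold is a conventional choice, not something the article specifies.

```python
from scipy import stats

# The 54 cholesterol values printed in the Series above
chol_values = [
    229, 203, 229, 284, 335, 276, 330, 253, 266, 233, 172, 216, 185, 231,
    267, 248, 197, 258, 255, 229, 307, 249, 282, 198, 288, 309, 243, 322,
    299, 300, 282, 249, 274, 184, 305, 283, 273, 244, 230, 269, 149, 249,
    275, 218, 261, 223, 207, 204, 232, 212, 197, 241, 264, 236,
]

# Null hypothesis: the sample comes from a normal distribution
w_stat, p_value = stats.shapiro(chol_values)
print("W:", w_stat)
print("p-value:", p_value)

# Assumed significance level of 0.05
if p_value > 0.05:
    print("No evidence against normality at the 5% level")
else:
    print("Normality is rejected at the 5% level")
```

A p-value above the chosen threshold means the test finds no evidence against normality; it does not prove that the data are normal.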

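5. Use correlation coefficients or the chi-square test to measure how the 12 attributes relate to heart disease, and analyze which factors matter most for the diagnosis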

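Code:

# Pearson correlation of each numeric column with target
# numeric_only=True (available from pandas 1.5) restricts the calculation to numeric columns, which older versions did silently
print(data.corr(numeric_only=True)['target'])

Output: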

age         0.222156
sex         0.226797
cp          0.404248
trestbps    0.159620
chol        0.066448
fbs         0.049040
restecg     0.184136
thalach    -0.420639
exang       0.391613
oldpeak     0.501461
slope       0.374689
target      1.000000
Name: target, dtype: float64

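By absolute value, oldpeak (0.50), thalach (-0.42), cp (0.40), exang (0.39), and slope (0.37) are the attributes most strongly correlated with the heart disease label. thalach is the only one of these with a negative correlation: a higher maximum heart rate goes with a lower target value.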
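The task also mentions the chi-square test, which suits categorical attributes such as sex, fbs, or exang better than a correlation coefficient. The sketch below works through the arithmetic by hand on a made-up sample; the sample values and the binary disease column are assumptions for illustration, and the statistic is the plain Pearson chi-square without continuity correction.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-sample: exang (exercise-induced angina, 0/1) and a binary disease flag
sample = pd.DataFrame({
    "exang":   [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "disease": [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0],
})

# Observed counts for every (exang, disease) combination
observed = pd.crosstab(sample["exang"], sample["disease"]).to_numpy()

# Expected counts if the two variables were independent:
# (row total * column total) / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Pearson chi-square statistic and its degrees of freedom
chi2 = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
print("chi2:", chi2, "dof:", dof)
```

Here the statistic is about 3.09 with 1 degree of freedom, below the 5% critical value of 3.841, so this toy sample shows no significant association. With counts this small the chi-square approximation is unreliable; the example only illustrates the calculation.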


