pandas Data Preprocessing Examples

  1. Sorting (ascending by default)

    # Assumes food_info is a DataFrame that has already been loaded.
    # By default, sort_values() sorts by the given column in ascending order and returns a new DataFrame.
    # Passing inplace=True sorts food_info itself instead of returning a new DataFrame.
    food_info.sort_values("Sodium_(mg)", inplace=True)
    print(food_info["Sodium_(mg)"])
    # Passing ascending=False sorts in descending order instead.
    food_info.sort_values("Sodium_(mg)", inplace=True, ascending=False)
    print(food_info["Sodium_(mg)"])
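
To make the difference between the returned copy and the in-place sort easier to see, here is a minimal sketch on a tiny hand-made table. The `foods` DataFrame and its values are hypothetical stand-ins for `food_info`, not data from the original dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for food_info.
foods = pd.DataFrame({
    "Name": ["cheese", "butter", "water", "salt"],
    "Sodium_(mg)": [1146.0, 643.0, np.nan, 38758.0],
})

# Without inplace=True, sort_values() returns a new DataFrame
# and leaves the original row order untouched.
ascending = foods.sort_values("Sodium_(mg)")
descending = foods.sort_values("Sodium_(mg)", ascending=False)

# Rows with NaN are placed last in both directions,
# because na_position defaults to "last".
print(ascending["Name"].tolist())
print(descending["Name"].tolist())

# The original index labels travel with the rows; reset_index(drop=True)
# renumbers them from 0 if positional labels are wanted after sorting.
ranked = descending.reset_index(drop=True)
print(ranked)
```

Keeping the sorted result in a new variable, as here, avoids surprising later code that still expects the original order.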
    


  2. Opening a CSV file

    import pandas as pd
    import numpy as np
    titanic_survival = pd.read_csv("titanic_train.csv")
    titanic_survival.head()
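
If `titanic_train.csv` is not at hand, the same call can be tried on in-memory text. This is a sketch only: the column names and rows below are made up for illustration and are not the real Titanic data.

```python
import io

import pandas as pd

# Hypothetical CSV content; the last row has an empty Age field.
csv_text = """PassengerId,Survived,Pclass,Age
1,0,3,22
2,1,1,38
3,1,3,
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

# head(n) returns the first n rows (5 when n is omitted).
print(sample.head(2))

# shape is (number of rows, number of columns).
print(sample.shape)

# The empty Age field is read as NaN, which makes the column float64.
print(sample.dtypes)
```

Outside a notebook, `head()` has to be wrapped in `print()` for its result to be displayed.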
    


  3. Counting missing values

    # pandas uses NaN ("not a number") to mark a missing value.
    # pd.isnull() takes a Series and returns a boolean Series that is True wherever the value is missing.
    age = titanic_survival["Age"]
    # print(age.loc[0:10])
    age_is_null = pd.isnull(age)
    # print(age_is_null)
    # Keep only the entries whose age is missing, then count them.
    age_null_true = age[age_is_null]
    # print(age_null_true)
    age_null_count = len(age_null_true)
    print(age_null_count)
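
The mask-then-`len` approach can also be written more compactly, because `True` counts as 1 when a boolean Series is summed. The sketch below uses a small hypothetical Series and DataFrame rather than the Titanic data.

```python
import numpy as np
import pandas as pd

# Hypothetical ages with two missing entries.
age = pd.Series([22.0, np.nan, 38.0, np.nan, 35.0], name="Age")

# Approach from the text: build a boolean mask, filter, then count.
mask = pd.isnull(age)
count_via_len = len(age[mask])

# Shorter equivalent: summing the mask counts the True values.
count_via_sum = int(age.isnull().sum())
print(count_via_len, count_via_sum)

# On a DataFrame, isnull().sum() gives the missing count per column.
frame = pd.DataFrame({"Age": age, "Fare": [7.25, 71.28, np.nan, 8.05, 53.1]})
per_column = frame.isnull().sum()
print(per_column)
```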
    

