http://pan.baidu.com/s/1nvNmzkH
Split df into two parts, df_sample and df_reset
df_sample = df.sample(frac = 0.7)
df_reset = df.loc[~df.index.isin(df_sample.index)]
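A minimal sketch of the same split on a toy frame, to check that the two parts are disjoint and together cover df. The toy data and the random_state argument are assumptions added here so the result is repeatable; df.drop(df_sample.index) is an equivalent way to take the remainder.

```python
import pandas as pd

# Hypothetical toy frame standing in for df; only the DiagGDM column name comes from the notes.
df = pd.DataFrame({"x": range(10), "DiagGDM": [0, 1] * 5})

# random_state is an added assumption so the same rows are drawn on every run.
df_sample = df.sample(frac=0.7, random_state=0)

# Everything not drawn into df_sample; same result as the ~isin(...) filter.
df_reset = df.drop(df_sample.index)

# The two parts share no index labels and together account for every row.
assert df_sample.index.intersection(df_reset.index).empty
assert len(df_sample) + len(df_reset) == len(df)
```

Both forms rely on the index labels of df being unique; with a duplicated index, call df.reset_index(drop=True) first.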
dia_num = len(df[df['DiagGDM'] == 1])
total_num = len(df)
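The two counts are presumably meant for the share of positive DiagGDM rows; that use is an assumption, as is the toy data below. A short sketch:

```python
import pandas as pd

# Hypothetical data; DiagGDM == 1 marks a positive row, as in the notes.
df = pd.DataFrame({"DiagGDM": [1, 0, 0, 1, 0]})

# Summing a boolean mask counts the True values without building a filtered copy.
dia_num = int((df["DiagGDM"] == 1).sum())
total_num = len(df)

# Fraction of positive rows (dia_ratio is a name introduced here).
dia_ratio = dia_num / total_num
print(dia_num, total_num, dia_ratio)
```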
import pandas as pd
a = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]
df = pd.DataFrame(a, columns=['one', 'two', 'three'])
df[['two', 'three']] = df[['two', 'three']].astype(float)
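A runnable version of the conversion above, checking the resulting dtypes. astype(float) raises on any string that is not a number; pd.to_numeric with errors="coerce" is an alternative that turns such values into NaN. The messy series is made up for illustration.

```python
import pandas as pd

a = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']]
df = pd.DataFrame(a, columns=['one', 'two', 'three'])

# Convert the two string columns to float64 in one assignment.
df[['two', 'three']] = df[['two', 'three']].astype(float)
print(df.dtypes)

# Hypothetical messy column: 'n/a' cannot be parsed, so it becomes NaN instead of raising.
messy = pd.Series(['1.2', 'n/a', '5'])
cleaned = pd.to_numeric(messy, errors='coerce')
print(cleaned)
```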
import numpy as np
# Shuffles rows in place (along the first axis) and returns None.
np.random.shuffle(train_data)
np.random.shuffle(test_data)
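One caveat worth a sketch: if features and labels live in separate arrays, shuffling each one independently breaks the row-to-label pairing. Indexing both with a single permutation keeps them aligned. The arrays X and y and the seed are assumptions for illustration; the notes only show train_data and test_data.

```python
import numpy as np

# Seeded generator (assumption) so the permutation is repeatable.
rng = np.random.default_rng(0)

# Toy features; each label is derived from its row so alignment can be verified.
X = np.arange(10).reshape(5, 2)
y = X[:, 0] // 2

# One permutation applied to both arrays keeps every row paired with its label.
perm = rng.permutation(len(X))
X_shuf, y_shuf = X[perm], y[perm]

assert np.array_equal(y_shuf, X_shuf[:, 0] // 2)
```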
Reference: "10 minutes to pandas" (introductory tutorial in the pandas documentation)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)
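A self-contained sketch of the same call on toy arrays, assuming scikit-learn is installed. X and y here are made up. train_test_split shuffles before splitting by default, and random_state=0 fixes that shuffle so the split is the same on every run.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with 2 features each, and binary labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# test_size=.5 puts half of the samples in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=.5, random_state=0
)

print(X_train.shape, X_test.shape)
```

For imbalanced labels, passing stratify=y keeps the class proportions similar in both halves.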