I came across the auto_ml automated machine learning framework a while ago, but never found the time to write up a short summary for later study. I believe that as machine learning continues to spread and mature, automated machine learning will play an increasingly important role: a large share of the time in machine learning and deep learning work goes into feature engineering, model selection, ensembling, and hyperparameter tuning, and auto_ml offers a nice way to automate much of that. There are quite a few AutoML frameworks available today, and studying them all thoroughly would take considerable time, so here I will simply record my earlier experience with auto_ml.
Since I am not free to share the dataset I originally worked with, I will simply walk through the official demo here; switching to your own dataset later only requires normalizing it into the expected format, roughly as in the sketch below.
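For reference, here is a minimal sketch of what that normalization usually amounts to. The CSV path and column names below are hypothetical; the key point is that auto_ml takes a pandas DataFrame plus a column_descriptions dict marking the output column and any categorical columns:

import pandas as pd
from auto_ml import Predictor

# Hypothetical file and column names -- replace with your own data.
df = pd.read_csv('my_data.csv')

# Mark the label column as 'output' and non-numeric columns as 'categorical';
# columns not listed here are treated as ordinary numeric features.
column_descriptions = {
    'label': 'output',
    'city': 'categorical',
}

ml_predictor = Predictor(type_of_estimator='regressor',
                         column_descriptions=column_descriptions)
ml_predictor.train(df)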
Taking the Boston housing price data as an example, here is a simple minimal example:
def bostonSimpleFunc():
    '''
    A simple example on the Boston housing price data
    '''
    train_data, test_data = get_boston_dataset()
    column_descriptions = {
        'MEDV': 'output',
        'CHAS': 'categorical'
    }
    ml_predictor = Predictor(type_of_estimator='regressor', column_descriptions=column_descriptions)
    ml_predictor.train(train_data)
    ml_predictor.score(test_data, test_data.MEDV)
The output is as follows (note that the holdout scores printed during training are negated error metrics, following scikit-learn's higher-is-better convention, so values closer to zero are better):
Welcome to auto_ml! We're about to go through and make sense of your data using machine learning, and give you a production-ready pipeline to get predictions with.
If you have any issues, or new feature ideas, let us know at http://auto.ml
You are running on version 2.9.10
Now using the model training_params that you passed in:
{}
After overwriting our defaults with your values, here are the final params that will be used to initialize the model:
{'presort': False, 'warm_start': True, 'learning_rate': 0.1}
Running basic data cleaning
Fitting DataFrameVectorizer
Now using the model training_params that you passed in:
{}
After overwriting our defaults with your values, here are the final params that will be used to initialize the model:
{'presort': False, 'warm_start': True, 'learning_rate': 0.1}
********************************************************************************************
About to fit the pipeline for the model GradientBoostingRegressor to predict MEDV
Started at:
2019-06-12 09:14:59
[1] random_holdout_set_from_training_data's score is: -9.82
[2] random_holdout_set_from_training_data's score is: -9.054
[3] random_holdout_set_from_training_data's score is: -8.48
[4] random_holdout_set_from_training_data's score is: -7.925
[5] random_holdout_set_from_training_data's score is: -7.424
[6] random_holdout_set_from_training_data's score is: -7.051
[7] random_holdout_set_from_training_data's score is: -6.608
[8] random_holdout_set_from_training_data's score is: -6.315
[9] random_holdout_set_from_training_data's score is: -6.0
[10] random_holdout_set_from_training_data's score is: -5.728
[11] random_holdout_set_from_training_data's score is: -5.499
[12] random_holdout_set_from_training_data's score is: -5.288
[13] random_holdout_set_from_training_data's score is: -5.126
[14] random_holdout_set_from_training_data's score is: -4.918
[15] random_holdout_set_from_training_data's score is: -4.775
[16] random_holdout_set_from_training_data's score is: -4.625
[17] random_holdout_set_from_training_data's score is: -4.513
[18] random_holdout_set_from_training_data's score is: -4.365
[19] random_holdout_set_from_training_data's score is: -4.281
[20] random_holdout_set_from_training_data's score is: -4.196
[21] random_holdout_set_from_training_data's score is: -4.133
[22] random_holdout_set_from_training_data's score is: -4.033
[23] random_holdout_set_from_training_data's score is: -4.004
[24] random_holdout_set_from_training_data's score is: -3.945
[25] random_holdout_set_from_training_data's score is: -3.913
[26] random_holdout_set_from_training_data's score is: -3.852
[27] random_holdout_set_from_training_data's score is: -3.844
[28] random_holdout_set_from_training_data's score is: -3.795
[29] random_holdout_set_from_training_data's score is: -3.824
[30] random_holdout_set_from_training_data's score is: -3.795
[31] random_holdout_set_from_training_data's score is: -3.778
[32] random_holdout_set_from_training_data's score is: -3.748
[33] random_holdout_set_from_training_data's score is: -3.739
[34] random_holdout_set_from_training_data's score is: -3.72
[35] random_holdout_set_from_training_data's score is: -3.721
[36] random_holdout_set_from_training_data's score is: -3.671
[37] random_holdout_set_from_training_data's score is: -3.644
[38] random_holdout_set_from_training_data's score is: -3.639
[39] random_holdout_set_from_training_data's score is: -3.617
[40] random_holdout_set_from_training_data's score is: -3.62
[41] random_holdout_set_from_training_data's score is: -3.614
[42] random_holdout_set_from_training_data's score is: -3.643
[43] random_holdout_set_from_training_data's score is: -3.647
[44] random_holdout_set_from_training_data's score is: -3.624
[45] random_holdout_set_from_training_data's score is: -3.589
[46] random_holdout_set_from_training_data's score is: -3.578
[47] random_holdout_set_from_training_data's score is: -3.565
[48] random_holdout_set_from_training_data's score is: -3.555
[49] random_holdout_set_from_training_data's score is: -3.549
[50] random_holdout_set_from_training_data's score is: -3.539
[52] random_holdout_set_from_training_data's score is: -3.571
[54] random_holdout_set_from_training_data's score is: -3.545
[56] random_holdout_set_from_training_data's score is: -3.588
[58] random_holdout_set_from_training_data's score is: -3.587
[60] random_holdout_set_from_training_data's score is: -3.584
[62] random_holdout_set_from_training_data's score is: -3.585
[64] random_holdout_set_from_training_data's score is: -3.589
[66] random_holdout_set_from_training_data's score is: -3.59
[68] random_holdout_set_from_training_data's score is: -3.558
[70] random_holdout_set_from_training_data's score is: -3.587
[72] random_holdout_set_from_training_data's score is: -3.583
[74] random_holdout_set_from_training_data's score is: -3.58
[76] random_holdout_set_from_training_data's score is: -3.578
[78] random_holdout_set_from_training_data's score is: -3.577
[80] random_holdout_set_from_training_data's score is: -3.591
[82] random_holdout_set_from_training_data's score is: -3.592
[84] random_holdout_set_from_training_data's score is: -3.586
[86] random_holdout_set_from_training_data's score is: -3.58
[88] random_holdout_set_from_training_data's score is: -3.562
[90] random_holdout_set_from_training_data's score is: -3.561
The number of estimators that were the best for this training dataset: 50
The best score on the holdout set: -3.539421497275334
Finished training the pipeline!
Total training time:
0:00:01
Here are the results from our GradientBoostingRegressor
predicting MEDV
Calculating feature responses, for advanced analytics.
The printed list will only contain at most the top 100 features.
+----+----------------+--------------+----------+-------------------+-------------------+-----------+-----------+-----------+-----------+
| | Feature Name | Importance | Delta | FR_Decrementing | FR_Incrementing | FRD_abs | FRI_abs | FRD_MAD | FRI_MAD |
|----+----------------+--------------+----------+-------------------+-------------------+-----------+-----------+-----------+-----------|
| 1 | ZN | 0.0001 | 11.5619 | -0.0027 | 0.0050 | 0.0027 | 0.0050 | 0.0000 | 0.0000 |
| 13 | CHAS=1.0 | 0.0011 | nan | nan | nan | nan | nan | nan | nan |
| 12 | CHAS=0.0 | 0.0012 | nan | nan | nan | nan | nan | nan | nan |
| 2 | INDUS | 0.0013 | 3.4430 | 0.0070 | -0.0539 | 0.0070 | 0.0539 | 0.0000 | 0.0000 |
| 7 | RAD | 0.0029 | 4.2895 | -0.7198 | 0.0463 | 0.7198 | 0.0463 | 0.3296 | 0.0000 |
| 5 | AGE | 0.0145 | 13.9801 | 0.0757 | -0.0292 | 0.2862 | 0.2393 | 0.0000 | 0.0000 |
| 8 | TAX | 0.0160 | 82.9834 | 0.9411 | -0.3538 | 0.9691 | 0.3538 | 0.0398 | 0.0000 |
| 10 | B | 0.0171 | 45.7266 | -0.1144 | 0.0896 | 0.1746 | 0.1200 | 0.1503 | 0.0000 |
| 3 | NOX | 0.0193 | 0.0588 | 0.1792 | -0.1584 | 0.1996 | 0.2047 | 0.0000 | 0.0000 |
| 9 | PTRATIO | 0.0247 | 1.1130 | 0.5625 | -0.2905 | 0.5991 | 0.2957 | 0.4072 | 0.1155 |
| 0 | CRIM | 0.0252 | 4.4320 | -0.0986 | -0.4012 | 0.3789 | 0.4623 | 0.0900 | 0.0900 |
| 6 | DIS | 0.0655 | 1.0643 | 3.4743 | -0.2346 | 3.5259 | 0.5256 | 0.5473 | 0.2233 |
| 11 | LSTAT | 0.3086 | 3.5508 | 1.5328 | -1.6693 | 1.5554 | 1.6703 | 1.3641 | 1.6349 |
| 4 | RM | 0.5026 | 0.3543 | -1.1450 | 1.7191 | 1.1982 | 1.8376 | 0.4338 | 0.8010 |
+----+----------------+--------------+----------+-------------------+-------------------+-----------+-----------+-----------+-----------+
*******
Legend:
Importance = Feature Importance
Explanation: A weighted measure of how much of the variance the model is able to explain is due to this column
FR_delta = Feature Response Delta Amount
Explanation: Amount this column was incremented or decremented by to calculate the feature reponses
FR_Decrementing = Feature Response From Decrementing Values In This Column By One FR_delta
Explanation: Represents how much the predicted output values respond to subtracting one FR_delta amount from every value in this column
FR_Incrementing = Feature Response From Incrementing Values In This Column By One FR_delta
Explanation: Represents how much the predicted output values respond to adding one FR_delta amount to every value in this column
FRD_MAD = Feature Response From Decrementing- Median Absolute Delta
Explanation: Takes the absolute value of all changes in predictions, then takes the median of those. Useful for seeing if decrementing this feature provokes strong changes that are both positive and negative
FRI_MAD = Feature Response From Incrementing- Median Absolute Delta
Explanation: Takes the absolute value of all changes in predictions, then takes the median of those. Useful for seeing if incrementing this feature provokes strong changes that are both positive and negative
FRD_abs = Feature Response From Decrementing Avg Absolute Change
Explanation: What is the average absolute change in predicted output values to subtracting one FR_delta amount to every value in this column. Useful for seeing if output is sensitive to a feature, but not in a uniformly positive or negative way
FRI_abs = Feature Response From Incrementing Avg Absolute Change
Explanation: What is the average absolute change in predicted output values to adding one FR_delta amount to every value in this column. Useful for seeing if output is sensitive to a feature, but not in a uniformly positive or negative way
*******
None
***********************************************
Advanced scoring metrics for the trained regression model on this particular dataset:
Here is the overall RMSE for these predictions:
2.9415706036925924
Here is the average of the predictions:
21.3944468736
Here is the average actual value on this validation set:
21.4882352941
Here is the median prediction:
20.688959488015513
Here is the median actual value:
20.15
Here is the mean absolute error:
2.011340247445387
Here is the median absolute error (robust to outliers):
1.4717184675805761
Here is the explained variance:
0.8821274319123865
Here is the R-squared value:
0.882007483541501
Count of positive differences (prediction > actual):
51
Count of negative differences:
51
Average positive difference:
1.91755182694
Average negative difference:
-2.10512866795
***********************************************
[Finished in 2.8s]
As the author notes, auto_ml was built with production use in mind and provides a fairly complete workflow. Here I use the Boston housing data for a more complete walk-through covering train/test splitting, model training, model persistence, model loading, and prediction:
def bostonWholeFunc():
    '''
    A more complete example on the Boston housing price data,
    covering train/test splitting, model training, model persistence,
    model loading, and prediction
    '''
    train_data, test_data = get_boston_dataset()
    column_descriptions = {
        'MEDV': 'output',
        'CHAS': 'categorical'
    }
    ml_predictor = Predictor(type_of_estimator='regressor', column_descriptions=column_descriptions)
    ml_predictor.train(train_data)
    test_score = ml_predictor.score(test_data, test_data.MEDV)
    file_name = ml_predictor.save()
    trained_model = load_ml_model(file_name)
    predictions = trained_model.predict(test_data)
    print('=====================predictions===========================')
    print(predictions)
    predictions = trained_model.predict_proba(test_data)
    print('=====================predictions===========================')
    print(predictions)
The output is as follows:
Welcome to auto_ml! We're about to go through and make sense of your data using machine learning, and give you a production-ready pipeline to get predictions with.
If you have any issues, or new feature ideas, let us know at http://auto.ml
You are running on version 2.9.10
Now using the model training_params that you passed in:
{}
After overwriting our defaults with your values, here are the final params that will be used to initialize the model:
{'presort': False, 'warm_start': True, 'learning_rate': 0.1}
Running basic data cleaning
Fitting DataFrameVectorizer
Now using the model training_params that you passed in:
{}
After overwriting our defaults with your values, here are the final params that will be used to initialize the model:
{'presort': False, 'warm_start': True, 'learning_rate': 0.1}
********************************************************************************************
About to fit the pipeline for the model GradientBoostingRegressor to predict MEDV
Started at:
2019-06-12 09:21:21
[1] random_holdout_set_from_training_data's score is: -9.93
[2] random_holdout_set_from_training_data's score is: -9.281
[3] random_holdout_set_from_training_data's score is: -8.683
[4] random_holdout_set_from_training_data's score is: -8.03
[5] random_holdout_set_from_training_data's score is: -7.494
[6] random_holdout_set_from_training_data's score is: -7.074
[7] random_holdout_set_from_training_data's score is: -6.649
[8] random_holdout_set_from_training_data's score is: -6.374
[9] random_holdout_set_from_training_data's score is: -6.115
[10] random_holdout_set_from_training_data's score is: -5.877
[11] random_holdout_set_from_training_data's score is: -5.566
[12] random_holdout_set_from_training_data's score is: -5.391
[13] random_holdout_set_from_training_data's score is: -5.088
[14] random_holdout_set_from_training_data's score is: -4.911
[15] random_holdout_set_from_training_data's score is: -4.692
[16] random_holdout_set_from_training_data's score is: -4.566
[17] random_holdout_set_from_training_data's score is: -4.379
[18] random_holdout_set_from_training_data's score is: -4.296
[19] random_holdout_set_from_training_data's score is: -4.14
[20] random_holdout_set_from_training_data's score is: -4.009
[21] random_holdout_set_from_training_data's score is: -3.92
[22] random_holdout_set_from_training_data's score is: -3.856
[23] random_holdout_set_from_training_data's score is: -3.81
[24] random_holdout_set_from_training_data's score is: -3.72
[25] random_holdout_set_from_training_data's score is: -3.632
[26] random_holdout_set_from_training_data's score is: -3.601
[27] random_holdout_set_from_training_data's score is: -3.538
[28] random_holdout_set_from_training_data's score is: -3.487
[29] random_holdout_set_from_training_data's score is: -3.459
[30] random_holdout_set_from_training_data's score is: -3.458
[31] random_holdout_set_from_training_data's score is: -3.422
[32] random_holdout_set_from_training_data's score is: -3.408
[33] random_holdout_set_from_training_data's score is: -3.356
[34] random_holdout_set_from_training_data's score is: -3.335
[35] random_holdout_set_from_training_data's score is: -3.323
[36] random_holdout_set_from_training_data's score is: -3.313
[37] random_holdout_set_from_training_data's score is: -3.262
[38] random_holdout_set_from_training_data's score is: -3.236
[39] random_holdout_set_from_training_data's score is: -3.207
[40] random_holdout_set_from_training_data's score is: -3.214
[41] random_holdout_set_from_training_data's score is: -3.198
[42] random_holdout_set_from_training_data's score is: -3.188
[43] random_holdout_set_from_training_data's score is: -3.174
[44] random_holdout_set_from_training_data's score is: -3.164
[45] random_holdout_set_from_training_data's score is: -3.122
[46] random_holdout_set_from_training_data's score is: -3.122
[47] random_holdout_set_from_training_data's score is: -3.109
[48] random_holdout_set_from_training_data's score is: -3.11
[49] random_holdout_set_from_training_data's score is: -3.119
[50] random_holdout_set_from_training_data's score is: -3.113
[52] random_holdout_set_from_training_data's score is: -3.113
[54] random_holdout_set_from_training_data's score is: -3.099
[56] random_holdout_set_from_training_data's score is: -3.102
[58] random_holdout_set_from_training_data's score is: -3.097
[60] random_holdout_set_from_training_data's score is: -3.069
[62] random_holdout_set_from_training_data's score is: -3.061
[64] random_holdout_set_from_training_data's score is: -3.024
[66] random_holdout_set_from_training_data's score is: -2.999
[68] random_holdout_set_from_training_data's score is: -2.999
[70] random_holdout_set_from_training_data's score is: -2.984
[72] random_holdout_set_from_training_data's score is: -2.978
[74] random_holdout_set_from_training_data's score is: -2.96
[76] random_holdout_set_from_training_data's score is: -2.943
[78] random_holdout_set_from_training_data's score is: -2.947
[80] random_holdout_set_from_training_data's score is: -2.938
[82] random_holdout_set_from_training_data's score is: -2.921
[84] random_holdout_set_from_training_data's score is: -2.914
[86] random_holdout_set_from_training_data's score is: -2.91
[88] random_holdout_set_from_training_data's score is: -2.901
[90] random_holdout_set_from_training_data's score is: -2.906
[92] random_holdout_set_from_training_data's score is: -2.892
[94] random_holdout_set_from_training_data's score is: -2.885
[96] random_holdout_set_from_training_data's score is: -2.884
[98] random_holdout_set_from_training_data's score is: -2.894
[100] random_holdout_set_from_training_data's score is: -2.88
[103] random_holdout_set_from_training_data's score is: -2.893
[106] random_holdout_set_from_training_data's score is: -2.889
[109] random_holdout_set_from_training_data's score is: -2.886
[112] random_holdout_set_from_training_data's score is: -2.869
[115] random_holdout_set_from_training_data's score is: -2.875
[118] random_holdout_set_from_training_data's score is: -2.852
[121] random_holdout_set_from_training_data's score is: -2.855
[124] random_holdout_set_from_training_data's score is: -2.848
[127] random_holdout_set_from_training_data's score is: -2.854
[130] random_holdout_set_from_training_data's score is: -2.86
[133] random_holdout_set_from_training_data's score is: -2.857
[136] random_holdout_set_from_training_data's score is: -2.854
[139] random_holdout_set_from_training_data's score is: -2.856
[142] random_holdout_set_from_training_data's score is: -2.854
[145] random_holdout_set_from_training_data's score is: -2.845
[148] random_holdout_set_from_training_data's score is: -2.84
[151] random_holdout_set_from_training_data's score is: -2.838
[154] random_holdout_set_from_training_data's score is: -2.838
[157] random_holdout_set_from_training_data's score is: -2.839
[160] random_holdout_set_from_training_data's score is: -2.837
[163] random_holdout_set_from_training_data's score is: -2.838
[166] random_holdout_set_from_training_data's score is: -2.838
[169] random_holdout_set_from_training_data's score is: -2.84
[172] random_holdout_set_from_training_data's score is: -2.828
[175] random_holdout_set_from_training_data's score is: -2.836
[178] random_holdout_set_from_training_data's score is: -2.834
[181] random_holdout_set_from_training_data's score is: -2.836
[184] random_holdout_set_from_training_data's score is: -2.837
[187] random_holdout_set_from_training_data's score is: -2.86
[190] random_holdout_set_from_training_data's score is: -2.862
[193] random_holdout_set_from_training_data's score is: -2.856
[196] random_holdout_set_from_training_data's score is: -2.855
[199] random_holdout_set_from_training_data's score is: -2.857
[202] random_holdout_set_from_training_data's score is: -2.856
[205] random_holdout_set_from_training_data's score is: -2.86
[208] random_holdout_set_from_training_data's score is: -2.859
[211] random_holdout_set_from_training_data's score is: -2.857
[214] random_holdout_set_from_training_data's score is: -2.855
[217] random_holdout_set_from_training_data's score is: -2.852
[220] random_holdout_set_from_training_data's score is: -2.849
[223] random_holdout_set_from_training_data's score is: -2.853
[226] random_holdout_set_from_training_data's score is: -2.845
[229] random_holdout_set_from_training_data's score is: -2.846
[232] random_holdout_set_from_training_data's score is: -2.849
The number of estimators that were the best for this training dataset: 172
The best score on the holdout set: -2.827876248876794
Finished training the pipeline!
Total training time:
0:00:01
Here are the results from our GradientBoostingRegressor
predicting MEDV
Calculating feature responses, for advanced analytics.
The printed list will only contain at most the top 100 features.
+----+----------------+--------------+----------+-------------------+-------------------+-----------+-----------+-----------+-----------+
| | Feature Name | Importance | Delta | FR_Decrementing | FR_Incrementing | FRD_abs | FRI_abs | FRD_MAD | FRI_MAD |
|----+----------------+--------------+----------+-------------------+-------------------+-----------+-----------+-----------+-----------|
| 12 | CHAS=0.0 | 0.0000 | nan | nan | nan | nan | nan | nan | nan |
| 1 | ZN | 0.0004 | 11.5619 | -0.0194 | 0.0204 | 0.0205 | 0.0230 | 0.0000 | 0.0000 |
| 13 | CHAS=1.0 | 0.0005 | nan | nan | nan | nan | nan | nan | nan |
| 2 | INDUS | 0.0031 | 3.4430 | 0.1103 | 0.0494 | 0.1565 | 0.1543 | 0.0597 | 0.0000 |
| 7 | RAD | 0.0059 | 4.2895 | -0.3558 | 0.0537 | 0.3620 | 0.1431 | 0.3727 | 0.0000 |
| 5 | AGE | 0.0105 | 13.9801 | 0.2805 | -0.3050 | 0.5735 | 0.4734 | 0.3615 | 0.2435 |
| 10 | B | 0.0118 | 45.7266 | -0.1885 | 0.1507 | 0.3139 | 0.2903 | 0.1688 | 0.0582 |
| 8 | TAX | 0.0167 | 82.9834 | 1.1477 | -0.4399 | 1.2920 | 0.4563 | 0.2671 | 0.2617 |
| 9 | PTRATIO | 0.0247 | 1.1130 | 0.5095 | -0.2323 | 0.5599 | 0.4590 | 0.2984 | 0.3357 |
| 0 | CRIM | 0.0284 | 4.4320 | -0.4701 | -0.2061 | 0.7788 | 0.4938 | 0.5027 | 0.2806 |
| 3 | NOX | 0.0298 | 0.0588 | 0.3083 | -0.1691 | 0.4285 | 0.3968 | 0.0745 | 0.0745 |
| 6 | DIS | 0.0608 | 1.0643 | 3.4966 | -0.3628 | 3.5823 | 0.8045 | 0.9935 | 0.3655 |
| 4 | RM | 0.3571 | 0.3543 | -1.2174 | 1.4995 | 1.3628 | 1.7090 | 0.7740 | 1.0375 |
| 11 | LSTAT | 0.4504 | 3.5508 | 1.9849 | -1.8635 | 2.0343 | 1.9289 | 1.8354 | 1.5375 |
+----+----------------+--------------+----------+-------------------+-------------------+-----------+-----------+-----------+-----------+
*******
Legend:
Importance = Feature Importance
Explanation: A weighted measure of how much of the variance the model is able to explain is due to this column
FR_delta = Feature Response Delta Amount
Explanation: Amount this column was incremented or decremented by to calculate the feature reponses
FR_Decrementing = Feature Response From Decrementing Values In This Column By One FR_delta
Explanation: Represents how much the predicted output values respond to subtracting one FR_delta amount from every value in this column
FR_Incrementing = Feature Response From Incrementing Values In This Column By One FR_delta
Explanation: Represents how much the predicted output values respond to adding one FR_delta amount to every value in this column
FRD_MAD = Feature Response From Decrementing- Median Absolute Delta
Explanation: Takes the absolute value of all changes in predictions, then takes the median of those. Useful for seeing if decrementing this feature provokes strong changes that are both positive and negative
FRI_MAD = Feature Response From Incrementing- Median Absolute Delta
Explanation: Takes the absolute value of all changes in predictions, then takes the median of those. Useful for seeing if incrementing this feature provokes strong changes that are both positive and negative
FRD_abs = Feature Response From Decrementing Avg Absolute Change
Explanation: What is the average absolute change in predicted output values to subtracting one FR_delta amount to every value in this column. Useful for seeing if output is sensitive to a feature, but not in a uniformly positive or negative way
FRI_abs = Feature Response From Incrementing Avg Absolute Change
Explanation: What is the average absolute change in predicted output values to adding one FR_delta amount to every value in this column. Useful for seeing if output is sensitive to a feature, but not in a uniformly positive or negative way
*******
None
***********************************************
Advanced scoring metrics for the trained regression model on this particular dataset:
Here is the overall RMSE for these predictions:
2.4474947386663786
Here is the average of the predictions:
21.2925792927
Here is the average actual value on this validation set:
21.4882352941
Here is the median prediction:
20.457423442279662
Here is the median actual value:
20.15
Here is the mean absolute error:
1.844793596155306
Here is the median absolute error (robust to outliers):
1.3340192567295777
Here is the explained variance:
0.9188375538746201
Here is the R-squared value:
0.9183155397464807
Count of positive differences (prediction > actual):
51
Count of negative differences:
51
Average positive difference:
1.64913759477
Average negative difference:
-2.04044959754
***********************************************
We have saved the trained pipeline to a filed called "auto_ml_saved_pipeline.dill"
It is saved in the directory:
C:\Users\18706\Desktop\myBlogs\auto_ml_use
To use it to get predictions, please follow the following flow (adjusting for your own uses as necessary:
`from auto_ml.utils_models import load_ml_model
`trained_ml_pipeline = load_ml_model("auto_ml_saved_pipeline.dill")
`trained_ml_pipeline.predict(data)`
Note that this pickle/dill file can only be loaded in an environment with the same modules installed, and running the same Python version.
This version of Python is:
sys.version_info(major=2, minor=7, micro=13, releaselevel='final', serial=0)
When passing in new data to get predictions on, columns that were not present (or were not found to be useful) in the training data will be silently ignored.
It is worthwhile to make sure that you feed in all the most useful data points though, to make sure you can get the highest quality predictions.
=====================predictions===========================
[23.503099796820333, 32.63486484873551, 17.607843570794248, 22.96364141712182, 18.037259790025, 22.154154350077157, 18.157171399351753, 14.490724400217747, 20.91569106207268, 21.371745165599958, 19.978460029298827, 17.617959317911595, 6.657480263073871, 21.259425283809687, 19.30470390603625, 23.54754498054679, 20.616057833202493, 8.569816325663448, 45.01902942229479, 15.319975928505148, 23.84765254861352, 24.49050663723932, 12.344561585629016, 23.24874551694055, 15.137348894013865, 15.067038653704085, 21.674735923166942, 12.88017013620315, 19.43339890697579, 20.933210490656045, 20.235546222120107, 22.99264652948031, 20.45638944287541, 20.50831821637611, 14.026411558432988, 17.14000803427353, 34.322736768893236, 19.82116882409099, 20.757084718131125, 23.523990773770624, 17.92101235838185, 30.745980540024213, 45.09505946725109, 18.76719301853909, 23.69250732281568, 14.627546717865679, 15.404318347865019, 23.856332667077602, 18.597560915078148, 28.295069087679007, 20.335783749261154, 35.49551328178157, 17.049478769941757, 27.36240739278428, 49.168123673644864, 21.919364008618228, 16.431621230418827, 32.50614954154076, 22.60486571683311, 17.190717714534216, 24.86659240393153, 34.726632201151446, 32.56154963374883, 17.991423510542266, 23.19139847589728, 16.3827778391806, 13.763406903575234, 23.041746542718485, 28.897952087920405, 15.16115409656009, 20.54704218671605, 27.630784534960636, 9.265217126500687, 20.218468086624206, 22.678130640115423, 3.978712919679104, 20.458457441683915, 44.47945990229906, 12.603336785642627, 11.482102006681343, 21.066151218556975, 13.559181962607349, 21.19973222974325, 10.447704116792627, 20.110776756244167, 28.928923567731772, 15.527462244687818, 23.24725371877329, 25.743821297087276, 18.04671684265537, 22.950747524482065, 9.088864852661203, 19.075035374223955, 18.42257896844079, 23.564483816162195, 19.647455910849818, 44.12778583727594, 11.427374611849514, 12.040264853009598, 16.998049081305517, 20.25692214075818, 22.80453061159547]
=====================predictions===========================
[[1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0], [1, 0]]
[Finished in 3.3s]
On top of the official example, I added an extra predict_proba call to the output. As the second block of predictions above shows, predict_proba on a regressor just returns degenerate [1, 0] pairs; it only yields meaningful class probabilities for a classifier.
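To see predict_proba produce real probabilities, train a classifier instead. Below is a minimal sketch under stated assumptions: the DataFrame, its column names, and the synthetic label rule are all made up for illustration; only Predictor, train, and predict_proba are the actual auto_ml API.

import numpy as np
import pandas as pd
from auto_ml import Predictor

# Synthetic toy data (hypothetical columns) -- replace with a real dataset.
rng = np.random.RandomState(0)
n = 500
df_train = pd.DataFrame({
    'age': rng.randint(18, 80, n),
    'sex': rng.choice(['male', 'female'], n),
})
# Make the label loosely depend on age so there is something to learn.
df_train['survived'] = (df_train['age'] < 40).astype(int)

column_descriptions = {
    'survived': 'output',
    'sex': 'categorical',
}

ml_predictor = Predictor(type_of_estimator='classifier',
                         column_descriptions=column_descriptions)
ml_predictor.train(df_train)

# For a classifier, predict_proba returns one [P(class 0), P(class 1)] pair per row.
print(ml_predictor.predict_proba(df_train.drop('survived', axis=1).head()))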
The complete program is as follows:
#!/usr/bin/env python
# encoding:utf-8
'''
__Author__: 沂水寒城
Purpose: learning and practicing auto_ml
GitHub:
    https://github.com/yishuihanhan/auto_ml
Official docs:
    https://auto-ml.readthedocs.io/en/latest/formatting_data.html
'''
from __future__ import division

from auto_ml import Predictor
from auto_ml.utils import get_boston_dataset
from auto_ml.utils_models import load_ml_model


def bostonSimpleFunc():
    '''
    A simple example on the Boston housing price data
    '''
    train_data, test_data = get_boston_dataset()
    column_descriptions = {
        'MEDV': 'output',
        'CHAS': 'categorical'
    }
    ml_predictor = Predictor(type_of_estimator='regressor', column_descriptions=column_descriptions)
    ml_predictor.train(train_data)
    ml_predictor.score(test_data, test_data.MEDV)


def bostonWholeFunc():
    '''
    A more complete example on the Boston housing price data,
    covering train/test splitting, model training, model persistence,
    model loading, and prediction
    '''
    train_data, test_data = get_boston_dataset()
    column_descriptions = {
        'MEDV': 'output',
        'CHAS': 'categorical'
    }
    ml_predictor = Predictor(type_of_estimator='regressor', column_descriptions=column_descriptions)
    ml_predictor.train(train_data)
    test_score = ml_predictor.score(test_data, test_data.MEDV)
    file_name = ml_predictor.save()
    trained_model = load_ml_model(file_name)
    predictions = trained_model.predict(test_data)
    print('=====================predictions===========================')
    print(predictions)
    predictions = trained_model.predict_proba(test_data)
    print('=====================predictions===========================')
    print(predictions)


if __name__ == '__main__':
    bostonSimpleFunc()
    bostonWholeFunc()
The corresponding GitHub repository and official documentation links are given at the top of the code; if you are interested, go take a look. Recorded here for future study!
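As a possible next step, auto_ml's train method also accepts a model_names argument for trying other estimators, following the pattern shown in the official README. A small sketch; the XGBRegressor choice below assumes the xgboost package is installed:

from auto_ml import Predictor
from auto_ml.utils import get_boston_dataset

train_data, test_data = get_boston_dataset()
column_descriptions = {
    'MEDV': 'output',
    'CHAS': 'categorical'
}
ml_predictor = Predictor(type_of_estimator='regressor',
                         column_descriptions=column_descriptions)
# Swap in XGBoost in place of the default GradientBoostingRegressor.
ml_predictor.train(train_data, model_names=['XGBRegressor'])
ml_predictor.score(test_data, test_data.MEDV)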