
CatBoost: Complete Parameter List

CatBoost official English documentation: https://catboost.ai/docs/concepts/python-reference_parameters-list.html

Training parameters

Python package training parameters

Several parameters have aliases. For example, the iterations parameter has the following synonyms: num_boost_round, n_estimators, num_trees. Simultaneous usage of different names of one parameter raises an error.
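
The alias rule above can be illustrated with a small, library-independent sketch. The `ALIASES` mapping and the `normalize_params` helper are hypothetical names introduced here for illustration; they are not part of the CatBoost API, and the alias groups are copied from the table in this document rather than from an exhaustive list. The sketch only mimics the documented behaviour of rejecting two names for one parameter.

```python
# Alias groups copied from the parameter table in this document (not exhaustive).
ALIASES = {
    "iterations": ("num_boost_round", "n_estimators", "num_trees"),
    "learning_rate": ("eta",),
    "random_seed": ("random_state",),
    "l2_leaf_reg": ("reg_lambda",),
    "depth": ("max_depth",),
    "rsm": ("colsample_bylevel",),
    "border_count": ("max_bin",),
    "loss_function": ("objective",),
    "verbose": ("verbose_eval",),
}


def normalize_params(params):
    """Map aliases to canonical names; reject two names for one parameter."""
    canonical_of = {
        alias: name for name, group in ALIASES.items() for alias in group
    }
    result = {}
    used_name = {}  # canonical name -> the name the caller actually used
    for key, value in params.items():
        name = canonical_of.get(key, key)
        if name in used_name:
            raise ValueError(
                f"'{key}' and '{used_name[name]}' both set parameter '{name}'"
            )
        used_name[name] = key
        result[name] = value
    return result


print(normalize_params({"n_estimators": 500, "eta": 0.1}))
# → {'iterations': 500, 'learning_rate': 0.1}
```

Passing both `iterations` and `n_estimators` to this helper raises a `ValueError`, which mirrors the error described above for simultaneous use of different names of one parameter.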

Training on GPU requires NVIDIA Driver of version 390.xx or higher.


Parameter	Type	Description	Default value	Supported processing units
Common parameters
loss_function Alias: objective	string object	The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric). Format: `<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]` Supported metrics: RMSE, Logloss, MAE, CrossEntropy, Quantile, LogLinQuantile, Lq, MultiClass, MultiClassOneVsAll, MAPE, Poisson, PairLogit, PairLogitPairwise, QueryRMSE, QuerySoftMax, YetiRank, YetiRankPairwise. A custom Python object can also be set as the value of this parameter (see an example). For example, use the following construction to calculate the value of Quantile with the coefficient alpha = 0.1: `Quantile:alpha=0.1`	Depends on the class	CPU and GPU
custom_metric	string list of strings	Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric). Format: `<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]` Supported metrics: RMSE, Logloss, MAE, CrossEntropy, Quantile, LogLinQuantile, Lq, MultiClass, MultiClassOneVsAll, MAPE, Poisson, PairLogit, PairLogitPairwise, QueryRMSE, QuerySoftMax, SMAPE, Recall, Precision, F1, TotalF1, Accuracy, BalancedAccuracy, BalancedErrorRate, Kappa, WKappa, LogLikelihoodOfPrediction, AUC, R2, NumErrors, MCC, BrierScore, HingeLoss, HammingLoss, ZeroOneLoss, MSLE, MedianAbsoluteError, Huber, PairAccuracy, AverageGain, PFound, NDCG, PrecisionAt, RecallAt, MAP, CtrFactor. Examples: Calculate the value of CrossEntropy: `CrossEntropy`. Calculate the value of Quantile with the coefficient alpha = 0.1: `Quantile:alpha=0.1`. Calculate the values of Logloss and AUC: `['Logloss', 'AUC']`. Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv respectively). The directory for these files is specified in the --train-dir (train_dir) parameter. Use the visualization tools to see a live chart with the dynamics of the specified metrics.	None	CPU and GPU
eval_metric	string object	The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric). Format: `<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]` Supported metrics: RMSE, Logloss, MAE, CrossEntropy, Quantile, LogLinQuantile, Lq, MultiClass, MultiClassOneVsAll, MAPE, Poisson, PairLogit, PairLogitPairwise, QueryRMSE, QuerySoftMax, SMAPE, Recall, Precision, F1, TotalF1, Accuracy, BalancedAccuracy, BalancedErrorRate, Kappa, WKappa, LogLikelihoodOfPrediction, AUC, R2, NumErrors, MCC, BrierScore, HingeLoss, HammingLoss, ZeroOneLoss, MSLE, MedianAbsoluteError, Huber, PairAccuracy, AverageGain, PFound, NDCG, PrecisionAt, RecallAt, MAP. A user-defined function can also be set as the value (see an example). Examples: `R2`	Optimized objective is used	CPU and GPU
iterations Aliases: num_boost_round n_estimators num_trees	int	The maximum number of trees that can be built when solving machine learning problems. When using other parameters that limit the number of iterations, the final number of trees may be less than the number specified in this parameter.	1000	CPU and GPU
learning_rate Alias: eta	float	The learning rate. Used for reducing the gradient step.	The default value is defined automatically for binary classification based on the dataset properties and the number of iterations if none of these parameters is set. In this case, the selected learning rate is printed to stdout and saved in the model. In other cases, the default value is 0.03.	CPU and GPU
random_seed Alias: random_state	int	The random seed used for training.	None (0)	CPU and GPU
l2_leaf_reg Alias: reg_lambda	float	Coefficient at the L2 regularization term of the cost function. Any positive value is allowed.	3.0	CPU and GPU
bootstrap_type	string	Bootstrap type. Defines the method for sampling the weights of objects. Supported methods: Bayesian, Bernoulli, MVS, Poisson (supported for GPU only), No	Bayesian	CPU and GPU
bagging_temperature	float	Defines the settings of the Bayesian bootstrap. It is used by default in classification and regression modes. Use the Bayesian bootstrap to assign random weights to objects. The weights are sampled from exponential distribution if the value of this parameter is set to “1”. All weights are equal to 1 if the value of this parameter is set to “0”. Possible values are in the range [0; inf). The higher the value, the more aggressive the bagging is. This parameter can be used if the selected bootstrap type is Bayesian.	1	CPU and GPU
subsample	float	Sample rate for bagging. This parameter can be used if one of the following bootstrap types is selected: Poisson Bernoulli	0.66	CPU and GPU
sampling_frequency	string	Frequency to sample weights and objects when building trees. Supported values: PerTree — Before constructing each new tree PerTreeLevel — Before choosing each new split of a tree	PerTreeLevel	CPU and GPU
sampling_unit	string	The sampling scheme. Possible values: Object: The weight of the i-th object is used for sampling the corresponding object. Group: The weight of the group is used for sampling each object from the group.	Object	CPU and GPU
mvs_head_fraction	float	Controls the fraction of the highest by absolute value gradients taken for the minimal variance sampling. Possible values are in the range . This parameter can be used if the selected bootstrap type is MVS.	1.0	CPU
random_strength	float	The amount of randomness to use for scoring splits when the tree structure is selected. Use this parameter to avoid overfitting the model. The value of this parameter is used when selecting splits. On every iteration each possible split gets a score (for example, the score indicates how much adding this split will improve the loss function for the training dataset). The split with the highest score is selected. The scores have no randomness. A normally distributed random variable is added to the score of the feature. It has a zero mean and a variance that decreases during the training. The value of this parameter is the multiplier of the variance. Note. This parameter is not supported for the following loss functions: QueryCrossEntropy, YetiRankPairwise, PairLogitPairwise	1	CPU
use_best_model	bool	If this parameter is set, the number of trees that are saved in the resulting model is defined as follows: Build the number of trees defined by the training parameters. Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric). No trees are saved after this iteration. This option requires a validation dataset to be provided.	True if a validation set is input (the eval_set parameter is defined) and at least one of the label values of objects in this set differs from the others. False otherwise.	CPU and GPU
best_model_min_trees	int	The minimal number of trees that the best model should have. If set, the output model contains at least the given number of trees even if the best model is located within these trees. Should be used with the use_best_model parameter.	None (The minimal number of trees for the best model is not set)	CPU and GPU
depth Alias: max_depth	int	Depth of the tree. The range of supported values depends on the processing unit type and the type of the selected loss function: CPU (any integer up to 16); GPU (any integer up to 8 for pairwise modes (YetiRank, PairLogitPairwise and QueryCrossEntropy) and up to 16 for all other loss functions).	6 (16 if the growing policy is set to Lossguide)	CPU and GPU
grow_policy	string	The tree growing policy. Defines how to perform greedy tree construction. Possible values: SymmetricTree: A tree is built level by level until the specified depth is reached. On each iteration, all leaves from the last tree level are split with the same condition. The resulting tree structure is always symmetric. Depthwise: A tree is built level by level until the specified depth is reached. On each iteration, all non-terminal leaves from the last tree level are split. Each leaf is split by the condition with the best loss improvement. Lossguide: A tree is built leaf by leaf until the specified maximum number of leaves is reached. On each iteration, the non-terminal leaf with the best loss improvement is split. Note. The Depthwise and Lossguide growing policies are currently supported only in training and prediction modes. They are not supported for model analysis (such as Feature importance and ShapValues) and exporting to different model formats (such as AppleCoreML, onnx and json).	SymmetricTree	GPU
min_data_in_leaf	int	The minimum number of training samples in a leaf. CatBoost does not search for new splits in leaves with a sample count less than the specified value. Can be used only with the Lossguide and Depthwise growing policies.	1	GPU
max_leaves	int	The maximum number of leaves in the resulting tree. Can be used only with the Lossguide growing policy. Tip. It is not recommended to use values greater than 64, since it can significantly slow down the training process.	31	GPU
ignored_features	list	Feature indices or names to exclude from the training. It is assumed that all passed values are feature names if at least one of the passed values can not be converted to a number or a range of numbers. Otherwise, it is assumed that all passed values are feature indices. Specifics: Non-negative indices that do not match any features are successfully ignored. For example, if five features are defined for the objects in the dataset and this parameter is set to “42”, the corresponding non-existing feature is successfully ignored. The identifier corresponds to the feature's index. Feature indices used in train and feature importance are numbered from 0 to featureCount – 1. If a file is used as input data then any non-feature column types are ignored when calculating these indices. For example, each row in the input file contains data in the following order: `cat feature<\t>label value<\t>num feature`. So for the row `rock<\t>0<\t>42`, the identifier for the “rock” feature is 0, and for the “42” feature it's 1. The addition of a non-existing feature name raises an error. For example, use the following construction if features indexed 1, 2, 7, 42, 43, 44, 45, should be ignored: `[1,2,7,42,43,44,45]`	None (use all features)	CPU and GPU
one_hot_max_size	int	Use one-hot encoding for all categorical features with a number of different values less than or equal to the given parameter value. Ctrs are not calculated for such features. See details.	The default value depends on various conditions: N/A if training is performed on CPU in Pairwise scoring mode; 255 if training is performed on GPU and the selected Ctr types require target data that is not available during the training; 10 if training is performed in Ranking mode; 2 if none of the conditions above is met	CPU and GPU
has_time	bool	Use the order of objects in the input data (do not perform random permutations during the Transforming categorical features to numerical features and Choosing the tree structure stages). The Timestamp column type is used to determine the order of objects if specified in the input data.	False (not used; generates random permutations)	CPU and GPU
rsm Alias: colsample_bylevel	float (0;1]	Random subspace method. The percentage of features to use at each split selection, when features are selected over again at random. The value must be in the range (0;1].	None (set to 1)	CPU
nan_mode	string	The method for processing missing values in the input dataset. Possible values: “Forbidden” — Missing values are not supported, their presence is interpreted as an error. “Min” — Missing values are processed as the minimum value (less than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees. “Max” — Missing values are processed as the maximum value (greater than all other values) for the feature. It is guaranteed that a split that separates missing values from all other values is considered when selecting trees. Using the Min or Max value of this parameter guarantees that a split between missing values and other values is considered when selecting a new split in the tree. Note. The method for processing missing values can be set individually for each feature in the Custom quantization borders and missing value modes input file. Such values override the ones specified in this parameter.	Min	CPU and GPU
input_borders	string	Load Custom quantization borders and missing value modes from a file (do not generate them). Borders are automatically generated before training if this parameter is not set.	None	CPU and GPU
output_borders	string	Save quantization borders for the current dataset to a file. Refer to the file format description.	None	CPU and GPU
fold_permutation_block	int	Objects in the dataset are grouped in blocks before the random permutations. This parameter defines the size of the blocks. The smaller the value, the slower the training. Large values may result in quality degradation.	1	CPU and GPU
leaf_estimation_method	string	The method used to calculate the values in leaves. Possible values: Newton Gradient	Gradient	CPU and GPU
leaf_estimation_iterations	int	The number of gradient steps when calculating the values in leaves.	None (Depends on the training objective)	CPU and GPU
leaf_estimation_backtracking	string	The type of backtracking to use during the gradient descent. Possible values: No (do not use backtracking; supported on CPU and GPU), AnyImprovement (reduce the descent step up to the point when the loss function value is smaller than it was on the previous step; supported on CPU and GPU), Armijo (reduce the descent step until the Armijo condition is met; supported only on GPU).	AnyImprovement	Depends on the selected value
fold_len_multiplier	float	Coefficient for changing the length of folds. The value must be greater than 1. The best validation result is achieved with minimum values. With values close to 1 (for example, 1 + ε), each iteration takes a quadratic amount of memory and time for the number of objects in the iteration. Thus, low values are possible only when there is a small number of objects.	2	CPU and GPU
approx_on_full_history	bool	The principles for calculating the approximated values. Possible values: “False”: Use only a fraction of the fold for calculating the approximated values. The size of the fraction is calculated as 1/X, where X is the specified coefficient for changing the length of folds. This mode is faster and in rare cases slightly less accurate. “True”: Use all the preceding rows in the fold for calculating the approximated values. This mode is slower and in rare cases slightly more accurate.	False	CPU
class_weights	list	Class weights. The values are used as multipliers for the object weights. This parameter can be used for solving classification and multiclassification problems. Tip. For imbalanced datasets with binary classification the weight multiplier can be set to 1 for class 0 and to (sum_negative / sum_positive) for class 1. For example, `class_weights=[0.1, 4]` multiplies the weights of objects from class 0 by 0.1 and the weights of objects from class 1 by 4.	None (the weight for all classes is set to 1)	CPU and GPU
scale_pos_weight	float	The weight for class 1 in binary classification. The value is used as a multiplier for the weights of objects from class 1. Tip. For imbalanced datasets, the weight multiplier can be set to (sum_negative / sum_positive).	1.0	CPU and GPU
boosting_type	string	Boosting scheme. Possible values: Ordered: Usually provides better quality on small datasets, but it may be slower than the Plain scheme. Plain: The classic gradient boosting scheme.	Depends on the number of objects in the training dataset and the selected learning mode	CPU and GPU. Only the Plain mode is supported for the MultiClass loss on GPU.
allow_const_label	bool	Use it to train models with datasets that have equal label values for all objects.	False	CPU and GPU
score_function	string	The score type used to select the next split during the tree construction. Possible values: Cosine (do not use this score type with the Lossguide tree growing policy) L2 LOOL2 NewtonCosine (do not use this score type with the Lossguide tree growing policy) NewtonL2 SatL2 SolarL2	Correlation (NewtonL2 if the growing policy is set to Lossguide)	GPU
Overfitting detection settings
early_stopping_rounds	int	Set the overfitting detector type to Iter and stop the training after the specified number of iterations since the iteration with the optimal metric value.	False	CPU and GPU
od_type	string	The type of the overfitting detector to use. Possible values: IncToDec Iter	IncToDec	CPU and GPU
od_pval	float	The threshold for the IncToDec overfitting detector type. The training is stopped when the specified value is reached. Requires that a validation dataset was input. For best results, it is recommended to set a value in the range [10^-10; 10^-2]. The larger the value, the earlier overfitting is detected. Restriction. Do not use this parameter with the Iter overfitting detector type.	0 (the overfitting detection is turned off)	CPU and GPU
od_wait	int	The number of iterations to continue the training after the iteration with the optimal metric value. The purpose of this parameter differs depending on the selected overfitting detector type: IncToDec: Ignore the overfitting detector when the threshold is reached and continue learning for the specified number of iterations after the iteration with the optimal metric value. Iter: Consider the model overfitted and stop training after the specified number of iterations since the iteration with the optimal metric value.	20	CPU and GPU
Quantization settings
target_border	float	If set, defines the border for converting target values to 0 and 1. Depending on the specified value: if target_value <= border_value, the target is converted to 0; if target_value > border_value, the target is converted to 1.	None	CPU and GPU
border_count Alias: max_bin	int	The number of splits for numerical features. Allowed values depend on the processing unit type: CPU — integers from 1 to 65535 inclusively. GPU — integers from 1 to 255 inclusively.	254 (if training is performed on CPU) or 128 (if training is performed on GPU)	CPU and GPU
feature_border_type	string	The quantization mode for numerical features. Possible values: Median Uniform UniformAndQuantiles MaxLogSum MinEntropy GreedyLogSum	GreedyLogSum	CPU and GPU
Multiclassification settings
classes_count	int	The upper limit for the numeric class label. Defines the number of classes for multiclassification. Only non-negative integers can be specified. The given integer should be greater than any of the label values. If this parameter is specified the labels for all classes in the input dataset should be smaller than the given value	None. Calculation principles	CPU and GPU
Performance settings
thread_count	int	The number of threads to use during training. For CPU: Optimizes the speed of execution. This parameter doesn't affect results. For GPU: The given value is used for reading the data from the hard drive and does not affect the training. During the training one main thread and one thread for each GPU are used.	-1 (the number of threads is equal to the number of processor cores)	CPU and GPU
used_ram_limit	int	Attempt to limit the amount of used CPU RAM. Restriction. This option affects only the CTR calculation memory usage. In some cases it is impossible to limit the amount of CPU RAM used in accordance with the specified value. Format: `<size><measure of information>` Supported measures of information (not case-sensitive): MB, KB, GB. For example: `2gb`	None (memory usage is not limited)	CPU
gpu_ram_part	float	How much of the GPU RAM to use for training.	0.95	GPU
pinned_memory_size	int	How much pinned (page-locked) CPU RAM to use per GPU.	1073741824	GPU
gpu_cat_features_storage	string	The method for storing the categorical features' values. Possible values: CpuPinnedMemory GpuRam Tip. Use the CpuPinnedMemory value if feature combinations are used and the available GPU RAM is not sufficient.	None (set to GpuRam)	GPU
data_partition	string	The method for splitting the input dataset between multiple workers. Possible values: FeatureParallel — Split the input dataset by features and calculate the value of each of these features on a certain GPU. For example: GPU0 is used to calculate the values of features indexed 0, 1, 2 GPU1 is used to calculate the values of features indexed 3, 4, 5, etc. DocParallel — Split the input dataset by objects and calculate all features for each of these objects on a certain GPU. It is recommended to use powers of two as the value for optimal performance. For example: GPU0 is used to calculate all features for objects indexed `object_1`, `object_2` GPU1 is used to calculate all features for objects indexed `object_3`, `object_4`, etc.	Depends on the learning mode and the input dataset	GPU
Processing unit settings
task_type	string	The processing unit type to use for training. Possible values: CPU GPU	CPU	CPU and GPU
devices	string	IDs of the GPU devices to use for training (indices are zero-based). Format: `<unsigned integer>` for one device (for example, `3`); `<unsigned integer>:<unsigned integer>:..:<unsigned integer>` for multiple devices (for example, `devices='0:1:3'`); `<unsigned integer>-<unsigned integer>` for a range of devices (for example, `devices='0-3'`).	NULL (all GPU devices are used if the corresponding processing unit type is selected)	GPU
Visualization settings
name	string	The experiment name to display in visualization tools.	experiment	CPU and GPU
Output settings
logging_level	string	The logging level to output to stdout. Possible values: Silent: Do not output any logging information to stdout. Verbose: Output the following data to stdout: optimized metric, elapsed time of training, remaining time of training. Info: Output additional information and the number of trees. Debug: Output debugging information.	None (corresponds to the Verbose logging level)	CPU and GPU
metric_period	int	The frequency of iterations to calculate the values of objectives and metrics. The value should be a positive integer. The usage of this parameter speeds up the training. Note. It is recommended to increase the value of this parameter to maintain training speed if a GPU processing unit type is used.	1	CPU and GPU
verbose Alias: verbose_eval	bool int	The purpose of this parameter depends on the type of the given value: bool — Defines the logging level: “True” corresponds to the Verbose logging level “False” corresponds to the Silent logging level int — Use the Verbose logging level and set the logging period to the value of this parameter. Restriction. Do not use this parameter with the logging_level parameter.	1	CPU and GPU
train_dir	string	The directory for storing the files generated during training.	catboost_info	CPU and GPU
model_size_reg	float	The model size regularization coefficient. The larger the value, the smaller the model size. Refer to the Model size regularization coefficient section for details. Possible values are in the range [0; inf). This regularization is needed only for models with categorical features (other models are small). Models with categorical features might weigh tens of gigabytes or more if categorical features have a lot of values. If the value of the regularizer differs from zero, then the usage of categorical features or feature combinations with a lot of values has a penalty, so fewer of them are used in the resulting model. Note that the resulting quality of the model can be affected. Set the value to 0 to turn off the model size optimization option.	None (Turned on and set to 0.5 on CPU and turned off for GPU)	CPU
allow_writing_files	bool	Allow writing analytical and snapshot files during training. If set to “False”, the snapshot and data visualization tools are unavailable.	True	CPU and GPU
save_snapshot	bool	Enable snapshotting for restoring the training progress after an interruption. If enabled, the default period for making snapshots is 600 seconds. Use the snapshot_interval parameter to change this period. Note. This parameter is not supported in the params parameter of the cv function.	None	CPU and GPU
snapshot_file	string	The name of the file to save the training progress information in. This file is used for recovering training after an interruption. Depending on whether the specified file exists in the file system: Missing — Write information about training progress to the specified file. Exists — Load data from the specified file and continue training from where it left off. Note. This parameter is not supported in the params parameter of the cv function.	experiment...	CPU and GPU
snapshot_interval	int	The interval between saving snapshots in seconds. The first snapshot is taken after the specified number of seconds since the start of training. Every subsequent snapshot is taken after the specified number of seconds since the previous one. The last snapshot is taken at the end of the training. Note. This parameter is not supported in the params parameter of the cv function.	600	CPU and GPU
roc_file	string	The name of the output file to save the ROC curve points to. This parameter can only be set in cross-validation mode if the Logloss loss function is selected. The ROC curve points are calculated for the test fold. The output file is saved to the catboost_info directory.	None (the file is not saved)	CPU and GPU
CTR settings
simple_ctr	string	Quantization settings for simple categorical features. Use this parameter to specify the principles for defining the class of the object for regression tasks. By default, it is considered that an object belongs to the positive class if its label value is greater than the median of all label values of the dataset. Format: `['CtrType[:TargetBorderCount=BorderCount][:TargetBorderType=BorderType][:CtrBorderCount=Count][:CtrBorderType=Type][:Prior=num_1/denum_1]..[:Prior=num_N/denum_N]', 'CtrType[:TargetBorderCount=BorderCount][:TargetBorderType=BorderType][:CtrBorderCount=Count][:CtrBorderType=Type][:Prior=num_1/denum_1]..[:Prior=num_N/denum_N]', ...]` Components: `CtrType`: The method for transforming categorical features to numerical features. Supported methods for training on CPU: Borders, Buckets, BinarizedTargetMeanValue, Counter. Supported methods for training on GPU: Borders, Buckets, FeatureFreq, FloatTargetMeanValue. `TargetBorderCount`: The number of borders for label value quantization. Only used for regression problems. Allowed values are integers from 1 to 255 inclusively. The default value is 1. This option is available for training on CPU only. `TargetBorderType`: The quantization type for the label value. Only used for regression problems. Possible values: Median, Uniform, UniformAndQuantiles, MaxLogSum, MinEntropy, GreedyLogSum. By default, MinEntropy. This option is available for training on CPU only. `CtrBorderCount`: The number of splits for categorical features. Allowed values are integers from 1 to 255 inclusively. `CtrBorderType`: The quantization type for categorical features. Supported values for training on CPU: Uniform. Supported values for training on GPU: Median, Uniform, UniformAndQuantiles, MaxLogSum, MinEntropy, GreedyLogSum. `Prior`: Use the specified priors during training (several values can be specified). Possible formats: One number: Adds the value to the numerator. Two slash-delimited numbers (for GPU only): Use this format to set a fraction. The first number is added to the numerator and the second is added to the denominator. Examples: `simple_ctr='Borders:TargetBorderCount=2'` Two new features with differing quantization settings are generated. The first one concludes that an object belongs to the positive class when the label value exceeds the first border. The second one concludes that an object belongs to the positive class when the label value exceeds the second border. For example, if the label takes three different values (0, 1, 2), the first border is 0.5 while the second one is 1.5. `simple_ctr='Buckets:TargetBorderCount=2'` The number of features depends on the number of different labels. For example, three new features are generated if the label takes three different values (0, 1, 2). In this case, the first one concludes that an object belongs to the positive class when the value of the feature is equal to 0 or belongs to the bucket indexed 0. The second one concludes that an object belongs to the positive class when the value of the feature is equal to 1 or belongs to the bucket indexed 1, and so on.		CPU and GPU
combinations_ctr	string	Quantization settings for combinations of categorical features. Format: `['CtrType[:TargetBorderCount=BorderCount][:TargetBorderType=BorderType][:CtrBorderCount=Count][:CtrBorderType=Type][:Prior=num_1/denum_1]..[:Prior=num_N/denum_N]', 'CtrType[:TargetBorderCount=BorderCount][:TargetBorderType=BorderType][:CtrBorderCount=Count][:CtrBorderType=Type][:Prior=num_1/denum_1]..[:Prior=num_N/denum_N]', ...]` Components: `CtrType`: The method for transforming categorical features to numerical features. Supported methods for training on CPU: Borders, Buckets, BinarizedTargetMeanValue, Counter. Supported methods for training on GPU: Borders, Buckets, FeatureFreq, FloatTargetMeanValue. `TargetBorderCount`: The number of borders for label value quantization. Only used for regression problems. Allowed values are integers from 1 to 255 inclusively. The default value is 1. This option is available for training on CPU only. `TargetBorderType`: The quantization type for the label value. Only used for regression problems. Possible values: Median, Uniform, UniformAndQuantiles, MaxLogSum, MinEntropy, GreedyLogSum. By default, MinEntropy. This option is available for training on CPU only. `CtrBorderCount`: The number of splits for categorical features. Allowed values are integers from 1 to 255 inclusively. `CtrBorderType`: The quantization type for categorical features. Supported values for training on CPU: Uniform. Supported values for training on GPU: Uniform, Median. `Prior`: Use the specified priors during training (several values can be specified). Possible formats: One number: Adds the value to the numerator. Two slash-delimited numbers (for GPU only): Use this format to set a fraction. The first number is added to the numerator and the second is added to the denominator.		CPU and GPU
per_feature_ctr	string	Per-feature quantization settings for categorical features. Format: `['FeatureId:CtrType:[:TargetBorderCount=BorderCount][:TargetBorderType=BorderType][:CtrBorderCount=Count][:CtrBorderType=Type][:Prior=num_1/denum_1]..[:Prior=num_N/denum_N]', 'FeatureId:CtrType:[:TargetBorderCount=BorderCount][:TargetBorderType=BorderType][:CtrBorderCount=Count][:CtrBorderType=Type][:Prior=num_1/denum_1]..[:Prior=num_N/denum_N]', ...]` Components: `FeatureId`: A zero-based feature identifier. `CtrType`: The method for transforming categorical features to numerical features. Supported methods for training on CPU: Borders, Buckets, BinarizedTargetMeanValue, Counter. Supported methods for training on GPU: Borders, Buckets, FeatureFreq, FloatTargetMeanValue. `TargetBorderCount`: The number of borders for label value quantization. Only used for regression problems. Allowed values are integers from 1 to 255 inclusively. The default value is 1. This option is available for training on CPU only. `TargetBorderType`: The quantization type for the label value. Only used for regression problems. Possible values: Median, Uniform, UniformAndQuantiles, MaxLogSum, MinEntropy, GreedyLogSum. By default, MinEntropy. This option is available for training on CPU only. `CtrBorderCount`: The number of splits for categorical features. Allowed values are integers from 1 to 255 inclusively. `CtrBorderType`: The quantization type for categorical features. Supported values for training on CPU: Uniform. Supported values for training on GPU: Median, Uniform, UniformAndQuantiles, MaxLogSum, MinEntropy, GreedyLogSum. `Prior`: Use the specified priors during training (several values can be specified). Possible formats: One number: Adds the value to the numerator. Two slash-delimited numbers (for GPU only): Use this format to set a fraction. The first number is added to the numerator and the second is added to the denominator.		CPU and GPU
ctr_target_border_count	int	The maximum number of borders to use in target quantization for categorical features that need it. Allowed values are integers from 1 to 255 inclusive. The value of the `TargetBorderCount` component overrides this parameter if it is specified for one of the following parameters: simple_ctr, combinations_ctr, per_feature_ctr.	Number_of_classes - 1 for multiclassification problems when training on CPU, 1 otherwise	CPU and GPU
counter_calc_method	string	The method for calculating the Counter CTR type. Possible values: SkipTest: objects from the validation dataset are not considered at all. Full: all objects from both the learn and validation datasets are considered.	None (Full is used)	CPU and GPU
max_ctr_complexity	int	The maximum number of features that can be combined. Each resulting combination consists of one or more categorical features and can optionally contain binary features in the following form: “numeric feature > value”.	4	CPU and GPU
ctr_leaf_count_limit	int	The maximum number of leaves with categorical features. If the number of leaves exceeds the specified value, some of the leaves are discarded. The leaves to be discarded are selected as follows: (1) the leaves are sorted by the frequency of the values; (2) the top N leaves are selected, where N is the value specified in the parameter; (3) all leaves starting from N+1 are discarded. This option reduces the resulting model size and the amount of memory required for training. Note that the resulting quality of the model can be affected.	None (the number of different category values is not limited)	CPU
store_all_simple_ctr	bool	Ignore categorical features that are not used in feature combinations when choosing candidates for exclusion. Use this parameter only together with ctr_leaf_count_limit.	None (set to False). Both simple features and feature combinations are taken into account when limiting the number of leaves with categorical features	CPU
final_ctr_computation_mode	string	Final CTR computation mode. Possible values: Default: compute final CTRs for the learn and validation datasets. Skip: do not compute final CTRs for the learn and validation datasets. In this case, the resulting model cannot be applied. This mode decreases the size of the resulting model. It can be useful for research purposes when only the metric values have to be calculated.	Default	CPU and GPU
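
The CTR description strings in the table above are easy to mistype, so it may help to assemble them programmatically. The following is a minimal sketch, not part of CatBoost itself: `build_ctr_spec` is a hypothetical helper that joins the components in the order given in the `combinations_ctr` row, and the values in `params` are illustrative only. It has not been validated against any particular CatBoost version.

```python
def build_ctr_spec(ctr_type, target_border_count=None, target_border_type=None,
                   ctr_border_count=None, ctr_border_type=None, priors=()):
    """Hypothetical helper: build one 'CtrType[:Key=Value]...' description string."""
    parts = [ctr_type]
    # Both border counts are documented as integers from 1 to 255 inclusive.
    for name, value in (("TargetBorderCount", target_border_count),
                        ("CtrBorderCount", ctr_border_count)):
        if value is not None and not 1 <= value <= 255:
            raise ValueError(f"{name} must be between 1 and 255, got {value}")
    if target_border_count is not None:
        parts.append(f"TargetBorderCount={target_border_count}")
    if target_border_type is not None:
        parts.append(f"TargetBorderType={target_border_type}")
    if ctr_border_count is not None:
        parts.append(f"CtrBorderCount={ctr_border_count}")
    if ctr_border_type is not None:
        parts.append(f"CtrBorderType={ctr_border_type}")
    # Several priors may be given; each becomes its own Prior=... component.
    for prior in priors:
        parts.append(f"Prior={prior}")
    return ":".join(parts)


# Illustrative parameter set for CPU training (values are assumptions).
params = {
    "combinations_ctr": [
        build_ctr_spec("Borders", target_border_count=2, priors=["0.5"]),
        build_ctr_spec("Counter", ctr_border_count=15),
    ],
    "max_ctr_complexity": 2,          # combine at most two features
    "ctr_leaf_count_limit": 1000,     # CPU only, per the table
    "store_all_simple_ctr": True,     # only meaningful with ctr_leaf_count_limit
    "counter_calc_method": "SkipTest",
    "final_ctr_computation_mode": "Default",
}

print(params["combinations_ctr"])
```

Assuming the `catboost` package is installed, the dictionary would be passed to a model constructor as keyword arguments, for example `CatBoostRegressor(**params)`.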
