Z-score vs. min-max scaling: pros and cons

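Min-max: all features end up on exactly the same scale, but it does not handle outliers well.
Z-score: the opposite of min-max. It handles outliers, but the transformed features do not share exactly the same scale.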
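
To make the trade-off concrete, here is a minimal pure-Python sketch of both transforms. The function names and the two sample features are made up for illustration, and the z-score uses the population standard deviation (an assumption; the sample version works equally well).

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Map the smallest value to 0 and the largest to 1, linearly."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined when all values are equal")
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("z-score is undefined when all values are equal")
    return [(v - mu) / sigma for v in values]

# Hypothetical features: one contains an outlier (1000), the other does not.
feature_a = [1, 2, 3, 4, 5, 1000]
feature_b = [10, 20, 30, 40, 50, 60]

scaled_a = min_max_scale(feature_a)
scaled_b = min_max_scale(feature_b)
standardized_a = z_score(feature_a)
standardized_b = z_score(feature_b)

# Min-max: both features span exactly [0, 1], but the outlier squeezes
# every other value of feature_a below 0.01.
print(scaled_a)
print(scaled_b)

# Z-score: both features get mean 0 and standard deviation 1, but their
# minimum and maximum differ, so the scale is not exactly shared.
print(standardized_a)
print(standardized_b)
```

Note that the mean and standard deviation are themselves pulled by the outlier, so z-scores soften the effect rather than remove it.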

More opinions (from ResearchGate):
– If you have a PHYSICALLY NECESSARY MAXIMUM (such as the number of votes for different parties in an election, which cannot exceed the total number of voters), the best normalization method is dividing by the MAX so that your data span the 0-1 interval. But it is mandatory that you have a REAL AND STABLE MAXIMUM (the same considerations hold for min-max); otherwise this procedure is very dangerous given the hyperbolic character of the ratio (having in the denominator something that can change, even slightly, without control is a curse!).
In all other cases, use z-scores. They clearly depend on the choice of an appropriate reference set to determine the mean and standard deviation but, once this reference set is in your hands, they let you judge immediately the relevance of a single observation (Wow, it is more than 3!!!). Moreover, z-scores allow for a very straightforward elimination of systematic errors and drifts: if you have an unavoidable 'day effect' in your experiment, no problem; you plan the same proportion of 'treated' and 'control' samples for each day and normalize by the mean and SD of that day.
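
The per-day normalization described in this answer can be sketched as follows. This is only an illustration: the function name, the (day, value) layout, and the measurements are hypothetical, and the population standard deviation is an arbitrary choice.

```python
from collections import defaultdict
from statistics import mean, pstdev

def z_score_by_group(samples):
    """Return z-scores computed within each group, in input order.

    `samples` is a list of (group, value) pairs; every group needs at
    least two distinct values so that its standard deviation is non-zero.
    """
    by_group = defaultdict(list)
    for group, value in samples:
        by_group[group].append(value)
    # Mean and standard deviation of each group (here: each day).
    stats = {g: (mean(vals), pstdev(vals)) for g, vals in by_group.items()}
    return [(value - stats[g][0]) / stats[g][1] for g, value in samples]

# Hypothetical measurements: Tuesday's readings drift upward by 10 units.
measurements = [
    ("mon", 10.0), ("mon", 12.0), ("mon", 14.0),
    ("tue", 20.0), ("tue", 22.0), ("tue", 24.0),
]

normalized = z_score_by_group(measurements)
# After per-day normalization the two days are directly comparable:
# the constant offset between them has disappeared.
print(normalized)
```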

– In my opinion, Z-score. This method preserves the range information (maximum and minimum) and incorporates the dispersion of the series (standard deviation / variance). If your data follow a Gaussian distribution, they are converted into an N(0,1) distribution, which makes comparison between series (probability calculations) easier.
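
As a rough illustration of comparing observations across series, the sketch below converts one observation from each of two made-up series to a z-score and looks up its upper-tail probability under N(0,1). The series, the observations, and the helper name `z_of` are assumptions, and the probabilities are only meaningful if each series is approximately Gaussian.

```python
from statistics import NormalDist, mean, pstdev

def z_of(observation, reference):
    """Z-score of one observation relative to a reference series."""
    return (observation - mean(reference)) / pstdev(reference)

standard_normal = NormalDist()  # N(0, 1)

# Two hypothetical series measured on very different scales.
series_a = [50, 60, 70, 80, 90]
series_b = [10, 12, 14, 16, 18]

z_a = z_of(85, series_a)
z_b = z_of(18, series_b)

# Probability of seeing a value at least this extreme under N(0, 1).
tail_a = 1 - standard_normal.cdf(z_a)
tail_b = 1 - standard_normal.cdf(z_b)

# On the z scale, 18 in series_b is the more unusual observation,
# even though 85 is the larger raw number.
print(z_a, tail_a)
print(z_b, tail_b)
```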

– It depends on the task objectives. For example, for neural networks, min-max normalization is recommended because of the activation functions. To avoid saturation, Basheer & Hajmeer (2000) recommend the range 0.1 to 0.9.
Another possibility is to use the Box-Cox transformation plus a constant, to avoid the problem of zeros.
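
Both suggestions in this answer can be sketched in a few lines. The function names, the fixed `lam` value, and the `shift` constant are assumptions for illustration; in practice the Box-Cox parameter is usually estimated from the data rather than chosen by hand.

```python
import math

def min_max_to_range(values, low=0.1, high=0.9):
    """Rescale values linearly so they span [low, high] instead of [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("scaling is undefined when all values are equal")
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]

def box_cox(values, lam, shift=0.0):
    """Box-Cox transform of (value + shift) for a fixed parameter lam.

    The transform needs strictly positive inputs, so `shift` is the
    constant added to move zeros (or negatives) into the valid domain.
    """
    shifted = [v + shift for v in values]
    if min(shifted) <= 0:
        raise ValueError("value + shift must be strictly positive")
    if lam == 0:
        return [math.log(s) for s in shifted]  # limiting case lam -> 0
    return [(s ** lam - 1) / lam for s in shifted]

# Hypothetical count data containing a zero.
counts = [0, 1, 3, 7, 15]

transformed = box_cox(counts, lam=0, shift=1)  # log(count + 1)
network_input = min_max_to_range(transformed)  # spans 0.1 to 0.9

print(transformed)
print(network_input)
```

Keeping inputs away from 0 and 1 leaves some headroom before a sigmoid-like activation saturates, which is the motivation given for the 0.1 to 0.9 range.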

References: https://www.researchgate.net/post/What_are_the_best_normalization_methods_Z-Score_Min-Max_etc_How_would_you_choose_a_data_normalization_method
https://www.codecademy.com/articles/normalization
