parseInt()
和parseFloat()
这两个常用 API 其实还是有很多“坑”的,以此文统一梳理一下。(本文比较适合常与数字打交道的 jser 或对这两 API 运作感兴趣的同学)
(github:https://github.com/MichealWayne,个人博客地址:https://blog.michealwayne.cn/),转载可联系[email protected]
在以前描述 js 数值时(Q3.说出以下转换数字数值转换的结果),有提到
parseInt
和parseFloat
的运行机制,其实这两 API 还是有蛮多坑的,虽然平时不太会踩到。
首先猜测以下的执行,自我检验一番:
/* parseInt */
parseInt('123.456.789');
parseInt('+123.456.789');
parseInt('123abc');
parseInt('abc123');
parseInt('1e6');
parseInt(' 1 ');
parseInt('');
parseInt('0');
parseInt('0x');
parseInt('0x11');
parseInt(new String('123'));
parseInt('a', 16);
parseInt(123.456, -1);
parseInt(123.456, 0);
parseInt(123.456, 1);
parseInt(123.456, 2);
parseInt(123.456, 40);
parseInt(123.456, 36);
parseInt(1e6);
parseInt(1n);
parseInt();
parseInt(null);
parseInt(false);
// 有几个“超纲”题
parseInt(0.00000001);
parseInt(123.456, -99999999999999999999999999);
parseInt(123.456, 99999999999999999999999999);
parseInt(123.456, 9999999999999999999999999);
parseInt(9999999999999999);
parseInt('11111111111111111');
parseInt('11111111111111111111');
parseInt(1e21);
parseInt('123aef', 12);
parseInt('123', NaN);
parseInt('0xf', NaN);
parseInt('123', Infinity);
parseInt('111', 2 ** 32 + 2.1);
parseInt(Symbol());
parseInt(parseInt);
const objTest1 = {};
parseInt(objTest1);
objTest1.toString = () => 123;
parseInt(objTest1);
/* parseFloat */
parseFloat('123.456.789');
parseFloat('123abc.456.789');
parseFloat(' 123abc ');
parseFloat('1e6');
parseFloat('+Infinity');
parseFloat(Infinity);
parseFloat(123.456);
parseFloat(1e6);
parseFloat(0.00000001);
parseFloat(0.1 + 0.2);
parseFloat('0x1a');
parseFloat(1n);
parseFloat();
parseFloat(null);
parseFloat(false);
// 有几个“超纲”题
parseFloat(9999999999999999);
parseFloat('11111111111111111');
parseFloat('11111111111111111111');
parseFloat(Symbol());
parseFloat(parseFloat);
const objTest2 = {};
parseFloat(objTest2);
objTest2.toString = () => 123;
parseFloat(objTest2);
无论啥内核、浏览器或是 NodeJs,都会遵循 ECMA 的主要规范,因此要思考上列的执行结果,可以重点了解 ECMA 规范的执行描述。
parseInt()
语法:
parseInt(string, radix)
其中参数:
radix
:可选,介于2
~36
之间的整数。告知parseInt()
函数string
(比如 11)是radix
(比如 2)进制的表示,如果radix
不存在,parseInt
将固定返回string
以十进制显示的数。
另外:
parseInt === Number.parseInt; // => true
简单翻译一下执行步骤:
inputString
,它是入参string
执行 ToString(string)
的字符串结果。(ToString
是一个内部 abstract operations、不对外,具体执行见文档或文末附录);ReturnIfAbrupt
,关于ReturnIfAbrupt
的执行其实还蛮复杂的,涉及 ECMA 规范的规格术语,本文不做描述);S
,它是inputString
创建的一个子字符串,它由第一个不是空白字符的代码单元和该代码单元之后的所有代码单元组成,即去除前置空格,也就是说parseInt('123')
和parseInt(' 123')
效果相同。如果找不到这样的单元,则S
为空字符串(""
);sign
为1
;S
不为空并且S
的第一个单元是0x002D
(HYPHEN-MINUS
,即减号),变量sign
改为-1
;S
不为空并且S
的第一个单元是0x002B
(PLUS SIGN
,即加号) 或0x002D
(HYPHEN-MINUS,即减号),去除S
的第一个单元,即去除'+'
/-
符号;R
为ToInt32(radix)
,即对进制声明进行ToInt32数字转换,也就是说parseInt('123', 8)
和 parseInt('123', '8')
效果相同;ReturnIfAbrupt
);stripPrefix
为 true
;R
不等于0
,则:
R
小于 2 或大于 36,直接返回NaN
;R
不等于 16,变量stripPrefix
改为 false
;R
等于0
,则变量R
改为10
;stripPrefix
值为true
,则:
S
字符串长度不小于 2 并且前两个字符单元是0x
或0X
,则删除这两字符,且设置变量R
为16
;Z
,如果变量S
包含一个不是变量R
数字的字符单元,则Z
为S
的子字符串,且由第一个这样的字符单元之前的所有代码单元组成,即parseInt('789', 8)
和parseInt('7', 8)
效果相同;否则,Z
为 S
;Z
为空,直接返回NaN
;mathInt
为由Z
以基数R
进行表示的数学进制整数值,其中使用字母A-Z
和a-z
表示值为 10 ~ 35 的数字(如果R
为 10,且Z
包含 20 个以上的有效数字,则根据实现的选择,第 20 位之后的每个有效数字都可以替换为 0)R
不是 2、4、8、10、16 或 32,则 mathInt
可能是对数学整数值的依赖实现的近似值,该值由 Z
以基数R
表示法表示。mathInt
等于 0,则:
sign
等于-1,则返回-0
;+0
;number
为mathInt
的 Number 值;sign * number
parseInt()
只能将字符串的前导部分解释为整数值;它忽略任何不能被解释为整数符号的一部分的代码单元,并且没有给出任何此类代码单元被忽略的迹象。
parseFloat()
语法:
parseFloat(string)
其中参数:
另外:
parseFloat === Number.parseFloat; // => true
官网描述:
执行步骤:
1.定义变量inputString
,它是入参string
执行 ToString(string)
的字符串结果。;
2.执行出现异常则返回(ReturnIfAbrupt
);
3.定义变量trimmedString
,它是inputString
创建的一个子字符串,它由第一个不是空白字符的代码单元和该代码单元之后的所有代码单元组成,即去除前置空格,也就是说parseFloat('123.456')
和parseFloat(' 123.456')
效果相同。如果找不到这样的单元,边trimmedString
为空字符串(""
);
4.如果trimmedString
或trimmedString
的任何前缀都不满足StrDecimalLiteral
的语法,则返回NaN
;
StrDecimalLiteral
语法:
5.定义变量numberString
,它是trimmedString
的最长前缀(可能是trimmedString
) 本身,numberString
满足StrDecimalLiteral
的语法。
6.定义变量mathFloat
,它是 numberString
的 MV
(mathematical value):从文字中导出MV
,其次对这个值进行四舍五入(也有 20 位的阈值处理),这一步处理与parseInt()
有很大的不同。至于具体 MV 基本就是大学里学的内容,可见文档;
7.如果mathFloat
等于 0,则:
trimmedString
的第一个字符等于"-"
,则返回-0
;+0
;8.返回mathFloat
的 Number 值;
parseFloat()
只能将字符串的前导部分解释为整数值;它忽略任何不能被解释为整数符号的一部分的代码单元,并且没有给出任何此类代码单元被忽略的迹象。
声明文件(lib.es5.d.ts
)很简单:
/**
* Converts a string to an integer.
* @param string A string to convert into a number.
* @param radix A value between 2 and 36 that specifies the base of the number in `string`.
* If this argument is not supplied, strings with a prefix of '0x' are considered hexadecimal.
* All other strings are considered decimal.
*/
declare function parseInt(string: string, radix?: number): number;
/**
* Converts a string to a floating-point number.
* @param string A string that contains a floating-point number.
*/
declare function parseFloat(string: string): number;
再次注意
NaN
也是"number"
哦
以典型的WebKit(依赖 v8)为例,可以看下parseInt()
和parseFloat
的具体代码实现和单测内容(版本:tags/9.9.56
)
parseInt
源码(主要文件:/Source/JavaScriptCore/runtime/ParseInt.h
)
主要代码:
// 入口,方法定义
ALWAYS_INLINE static double parseInt(StringView s, int radix)
{
if (s.is8Bit())
return parseInt(s, s.characters8(), radix);
return parseInt(s, s.characters16(), radix);
}
// ES5.1 15.1.2.2
template
ALWAYS_INLINE
static double parseInt(StringView s, const CharType* data, int radix)
{
// 1. Let inputString be ToString(string).
// 2. Let S be a newly created substring of inputString consisting of the first character that is not a
// StrWhiteSpaceChar and all characters following that character. (In other words, remove leading white
// space.) If inputString does not contain any such characters, let S be the empty string.
int length = s.length();
int p = 0;
while (p < length && isStrWhiteSpace(data[p]))
++p;
// 3. Let sign be 1.
// 4. If S is not empty and the first character of S is a minus sign -, let sign be -1.
// 5. If S is not empty and the first character of S is a plus sign + or a minus sign -, then remove the first character from S.
double sign = 1;
if (p < length) {
if (data[p] == '+')
++p;
else if (data[p] == '-') {
sign = -1;
++p;
}
}
// 6. Let R = ToInt32(radix).
// 7. Let stripPrefix be true.
// 8. If R != 0,then
// b. If R != 16, let stripPrefix be false.
// 9. Else, R == 0
// a. LetR = 10.
// 10. If stripPrefix is true, then
// a. If the length of S is at least 2 and the first two characters of S are either ―0x or ―0X,
// then remove the first two characters from S and let R = 16.
// 11. If S contains any character that is not a radix-R digit, then let Z be the substring of S
// consisting of all characters before the first such character; otherwise, let Z be S.
if ((radix == 0 || radix == 16) && length - p >= 2 && data[p] == '0' && (data[p + 1] == 'x' || data[p + 1] == 'X')) {
radix = 16;
p += 2;
} else if (radix == 0)
radix = 10;
// 8.a If R < 2 or R > 36, then return NaN.
if (radix < 2 || radix > 36)
return PNaN;
// 13. Let mathInt be the mathematical integer value that is represented by Z in radix-R notation, using the letters
// A-Z and a-z for digits with values 10 through 35. (However, if R is 10 and Z contains more than 20 significant
// digits, every significant digit after the 20th may be replaced by a 0 digit, at the option of the implementation;
// and if R is not 2, 4, 8, 10, 16, or 32, then mathInt may be an implementation-dependent approximation to the
// mathematical integer value that is represented by Z in radix-R notation.)
// 14. Let number be the Number value for mathInt.
int firstDigitPosition = p;
bool sawDigit = false;
double number = 0;
while (p < length) {
int digit = parseDigit(data[p], radix);
if (digit == -1)
break;
sawDigit = true;
number *= radix;
number += digit;
++p;
}
// 12. If Z is empty, return NaN.
if (!sawDigit)
return PNaN;
// Alternate code path for certain large numbers.
if (number >= mantissaOverflowLowerBound) {
if (radix == 10) {
size_t parsedLength;
number = parseDouble(s.substring(firstDigitPosition, p - firstDigitPosition), parsedLength);
} else if (radix == 2 || radix == 4 || radix == 8 || radix == 16 || radix == 32)
number = parseIntOverflow(s.substring(firstDigitPosition, p - firstDigitPosition), radix);
}
// 15. Return sign x number.
return sign * number;
}
代码中没有“骚”操作,从执行到备注都完全贴合规范。
(文件:chromium / v8 / v8 / 9.9.56 / . / test / webkit / parseInt-expected.txt
)
PASS parseInt('123') is 123
PASS parseInt('123x4') is 123
PASS parseInt('-123') is -123
PASS parseInt('0x123') is 0x123
PASS parseInt('0x123x4') is 0x123
PASS parseInt('-0x123x4') is -0x123
PASS parseInt('-') is Number.NaN
PASS parseInt('0x') is Number.NaN
PASS parseInt('-0x') is Number.NaN
PASS parseInt('123', undefined) is 123
PASS parseInt('123', null) is 123
PASS parseInt('123', 0) is 123
PASS parseInt('123', 10) is 123
PASS parseInt('123', 16) is 0x123
PASS parseInt('0x123', undefined) is 0x123
PASS parseInt('0x123', null) is 0x123
PASS parseInt('0x123', 0) is 0x123
PASS parseInt('0x123', 10) is 0
PASS parseInt('0x123', 16) is 0x123
PASS parseInt(Math.pow(10, 20)) is 100000000000000000000
PASS parseInt(Math.pow(10, 21)) is 1
PASS parseInt(Math.pow(10, -6)) is 0
PASS parseInt(Math.pow(10, -7)) is 1
PASS parseInt(-Math.pow(10, 20)) is -100000000000000000000
PASS parseInt(-Math.pow(10, 21)) is -1
PASS parseInt(-Math.pow(10, -6)) is -0
PASS parseInt(-Math.pow(10, -7)) is -1
PASS parseInt('0') is 0
PASS parseInt('-0') is -0
PASS parseInt(0) is 0
PASS parseInt(-0) is 0
PASS parseInt(2147483647) is 2147483647
PASS parseInt(2147483648) is 2147483648
PASS parseInt('2147483647') is 2147483647
PASS parseInt('2147483648') is 2147483648
PASS state = null; try { parseInt('123', throwingRadix); } catch (e) {} state; is "throwingRadix"
PASS state = null; try { parseInt(throwingString, throwingRadix); } catch (e) {} state; is "throwingString"
parseFloat
源码(主要文件:/Source/JavaScriptCore/runtime/JSGlobalObjectFunctions.cpp
)
static double parseFloat(StringView s)
{
unsigned size = s.length();
if (size == 1) {
UChar c = s[0];
if (isASCIIDigit(c))
return c - '0';
return PNaN;
}
if (s.is8Bit()) {
const LChar* data = s.characters8();
const LChar* end = data + size;
// Skip leading white space.
for (; data < end; ++data) {
if (!isStrWhiteSpace(*data))
break;
}
// Empty string.
if (data == end)
return PNaN;
return jsStrDecimalLiteral(data, end);
}
const UChar* data = s.characters16();
const UChar* end = data + size;
// Skip leading white space.
for (; data < end; ++data) {
if (!isStrWhiteSpace(*data))
break;
}
// Empty string.
if (data == end)
return PNaN;
return jsStrDecimalLiteral(data, end);
}
// See ecma-262 6th 11.8.3
template
static double jsStrDecimalLiteral(const CharType*& data, const CharType* end)
{
RELEASE_ASSERT(data < end);
size_t parsedLength;
double number = parseDouble(data, end - data, parsedLength);
if (parsedLength) {
data += parsedLength;
return number;
}
// Check for [+-]?Infinity
switch (*data) {
case 'I':
if (isInfinity(data, end)) {
data += SizeOfInfinity;
return std::numeric_limits::infinity();
}
break;
case '+':
if (isInfinity(data + 1, end)) {
data += SizeOfInfinity + 1;
return std::numeric_limits::infinity();
}
break;
case '-':
if (isInfinity(data + 1, end)) {
data += SizeOfInfinity + 1;
return -std::numeric_limits::infinity();
}
break;
}
// Not a number.
return PNaN;
}
相比之下,parseInt
的注释要更为完善。
(文件:chromium / v8 / v8 / 9.9.56 / . / test / webkit / parseFloat-expected.txt
)
PASS parseFloat() is NaN
PASS parseFloat('') is NaN
PASS parseFloat(' ') is NaN
PASS parseFloat(' 0') is 0
PASS parseFloat('0 ') is 0
PASS parseFloat('x0') is NaN
PASS parseFloat('0x') is 0
PASS parseFloat(' 1') is 1
PASS parseFloat('1 ') is 1
PASS parseFloat('x1') is NaN
PASS parseFloat('1x') is 1
PASS parseFloat(' 2.3') is 2.3
PASS parseFloat('2.3 ') is 2.3
PASS parseFloat('x2.3') is NaN
PASS parseFloat('2.3x') is 2.3
PASS parseFloat('0x2') is 0
PASS parseFloat('1' + nonASCIINonSpaceCharacter) is 1
PASS parseFloat(nonASCIINonSpaceCharacter + '1') is NaN
PASS parseFloat('1' + illegalUTF16Sequence) is 1
PASS parseFloat(illegalUTF16Sequence + '1') is NaN
PASS parseFloat(tab + '1') is 1
PASS parseFloat(nbsp + '1') is 1
PASS parseFloat(ff + '1') is 1
PASS parseFloat(vt + '1') is 1
PASS parseFloat(cr + '1') is 1
PASS parseFloat(lf + '1') is 1
PASS parseFloat(ls + '1') is 1
PASS parseFloat(ps + '1') is 1
PASS parseFloat(oghamSpaceMark + '1') is 1
PASS parseFloat(mongolianVowelSeparator + '1') is NaN
PASS parseFloat(enQuad + '1') is 1
PASS parseFloat(emQuad + '1') is 1
PASS parseFloat(enSpace + '1') is 1
PASS parseFloat(emSpace + '1') is 1
PASS parseFloat(threePerEmSpace + '1') is 1
PASS parseFloat(fourPerEmSpace + '1') is 1
PASS parseFloat(sixPerEmSpace + '1') is 1
PASS parseFloat(figureSpace + '1') is 1
PASS parseFloat(punctuationSpace + '1') is 1
PASS parseFloat(thinSpace + '1') is 1
PASS parseFloat(hairSpace + '1') is 1
PASS parseFloat(narrowNoBreakSpace + '1') is 1
PASS parseFloat(mediumMathematicalSpace + '1') is 1
PASS parseFloat(ideographicSpace + '1') is 1
从 ECMA 规范和典型内核的代码实现中我们可以发现,parseFloat
和parseInt
存在很多边界处理,这也是造成踩坑的主要原因。
至此可以再回过头想想最初的那些执行问题,绝大部分都能得到解释。至于“超纲”题,有兴趣可以去看下 ECMA 规范中的数字及类型转换部分。
Diagram(
ZeroOrMore('Space'),
Optional(
Choice(0,
'+',
'-',
), 'skip'
),
Choice(1,
'Infinity',
Sequence(
Choice(0,
Sequence(
ZeroOrMore('0-9'),
Optional('.', 'skip'),
OneOrMore('0-9'),
),
),
Optional(
Sequence(
Choice(0,
'e',
'E',
),
Optional(
Choice(0,
'+',
'-',
), 'skip'
),
OneOrMore('0-9'),
)
, 'skip')
)
),
ZeroOrMore('Space'),
)
Diagram(
ZeroOrMore('Space'),
Optional(
Choice(0,
'+',
'-',
), 'skip'
),
ZeroOrMore('0-R'), // R 为进制最大值
ZeroOrMore('Space'),
)
有建议或转载可 -> [email protected]