Truncating Mixed Chinese/English Strings in PHP Without Garbled Output

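While studying MySQL character sets, I solved the problem of truncating mixed Chinese and English strings in PHP without producing garbled text. The core of the method is deciding how many bytes to take for each character: in UTF-8, the leading bits of a character's first byte tell you how many bytes that character occupies.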
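To make the byte-counting idea concrete, here is a minimal sketch of how a UTF-8 lead byte maps to a character length. The helper name `utf8SeqLen` is hypothetical and not part of the original post, and the sketch only covers the 1- to 4-byte forms permitted by the current UTF-8 standard (RFC 3629).

```php
<?php
// Hypothetical helper (not from the original post): given a string, inspect
// its first byte and return how many bytes that UTF-8 character occupies.
function utf8SeqLen(string $char): int
{
    $b = ord($char);                     // ord() looks at the first byte only
    if ($b < 0x80) {
        return 1;                        // 0xxxxxxx: ASCII
    }
    if (($b & 0xE0) === 0xC0) {
        return 2;                        // 110xxxxx
    }
    if (($b & 0xF0) === 0xE0) {
        return 3;                        // 1110xxxx: e.g. common Chinese characters
    }
    if (($b & 0xF8) === 0xF0) {
        return 4;                        // 11110xxx
    }
    return 1;                            // continuation or invalid byte: step one byte
}

echo utf8SeqLen('a'), "\n";              // 1
echo utf8SeqLen("\xE4\xB8\xAD"), "\n";   // 3 (the bytes of U+4E2D, 中)
```

Cutting a string only at offsets that are sums of these lengths is what keeps a multi-byte character from being split in half.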

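<?php
//110x xxxx & 1110 0000 -> 1100 0000
//1110 xxxx & 1111 0000 -> 1110 0000
//Bitwise operations are not affected by the high bit of an ASCII byte being 0; that only matters once the byte is converted to a binary string, because decbin() drops leading zeros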
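function utf8sub($str, $len) {
	if ($len < 0) {
		return '';
	}
	$res = '';
	$offset = 0;            // current byte offset into $str
	$chars = 0;             // number of characters taken so far
	$length = strlen($str); // byte length of the string to truncate
	while ($chars < $len && $offset < $length) {
		// Take one byte first; substr() cuts by bytes, not by characters
		$high = decbin(ord(substr($str, $offset, 1)));
		// Default to one byte so an unexpected continuation byte never reuses a stale count
		$count = 1;
		// Key step: the leading bits of the first byte give the character's byte length
		if (strlen($high) < 8) {
			// ASCII: decbin() drops leading zeros, so the binary string has at most 7 digits
			$count = 1;
		} elseif (substr($high, 0, 3) == '110') {
			$count = 2; // two-byte character
		} elseif (substr($high, 0, 4) == '1110') {
			$count = 3; // three-byte character
		} elseif (substr($high, 0, 5) == '11110') {
			$count = 4;
		} elseif (substr($high, 0, 6) == '111110') {
			$count = 5; // 5- and 6-byte forms exist only in the original UTF-8 design, not in RFC 3629
		} elseif (substr($high, 0, 7) == '1111110') {
			$count = 6;
		}
		$res .= substr($str, $offset, $count);
		$chars += 1;
		$offset += $count;
	}
	return $res;
}

$str = 'PHP中英文截取无乱码'; // sample input, assumed here; the original listing never defines $str
echo utf8sub($str, 5), "\n";
echo utf8sub($str, 10), "\n";
?>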
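The same truncation can also be sketched with bit masks instead of `decbin()` strings. The name `utf8subMask` and the sample string are hypothetical, chosen only for illustration. The cross-check at the end assumes the mbstring extension is loaded; where it is, the built-in `mb_substr($str, 0, $len, 'UTF-8')` should give the same result for valid UTF-8 input and is usually the simpler choice.

```php
<?php
// Hypothetical variant (not from the original post): same idea as utf8sub(),
// but the lead byte is classified with bit masks rather than binary strings.
function utf8subMask(string $str, int $len): string
{
    $offset = 0;                 // current byte position
    $chars  = 0;                 // characters taken so far
    $length = strlen($str);      // total number of bytes
    while ($chars < $len && $offset < $length) {
        $b = ord($str[$offset]);
        if (($b & 0xE0) === 0xC0) {
            $count = 2;          // 110xxxxx
        } elseif (($b & 0xF0) === 0xE0) {
            $count = 3;          // 1110xxxx
        } elseif (($b & 0xF8) === 0xF0) {
            $count = 4;          // 11110xxx
        } else {
            $count = 1;          // ASCII, or an unexpected byte: advance one byte
        }
        $offset += $count;
        $chars++;
    }
    // min() guards against a truncated multi-byte sequence at the end of the input
    return substr($str, 0, min($offset, $length));
}

$sample = 'PHP中英文abc';        // sample input, assumed for illustration
echo utf8subMask($sample, 5), "\n";   // PHP中英

// Optional cross-check against mbstring, if the extension is available
if (function_exists('mb_substr')) {
    var_dump(utf8subMask($sample, 5) === mb_substr($sample, 0, 5, 'UTF-8'));
}
```

Masking works directly on the integer value of the byte, so the dropped leading zero that the string-based version has to work around never comes into play.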


