微信公众号最佳实践 ( 9.7)智能问答,关键词回复,中文分词

智能问答

前面我们使用的都是
  • 基于固定查询指令的回复,这样好处是内容格式统一,方便软件开发人员编写程序做出分析,回复精准的内容给用户
  • 但在生活中,人们问的内容很随意,甚至千差万别,这时,回复内容想要和用户的问题相匹配,就需要更智能的程序了

微信公众号最佳实践 ( 9.7)智能问答,关键词回复,中文分词_第1张图片

关键词回复

我们需先定一个数组,数组中键为关键词,值为对应的回复,当用户输入的文字能匹配到某个关键词时,则回复该关键词对应的内容,我们定义“电话”,“地址”,”微信”,三个关键词,将其作为数组中的键,然后定义相应的值,当用户发送的内容中包含相应的关键词名称时,则回复对应的内容。

代码如下所示:

 private function receiveText($object)
    {
        $content = trim($object->Content);

        $keywords = array(
          "电话" => "0755-87654321",
          "地址" => "广东省深圳市南山区高新南一道飞亚达科技大厦(518057)",
          "A型号手机" => "A产品报价 3000元,基本参数为:****,",
          "B型号手机" => "A产品报价 4000元,基本参数为:****,",
          "C型号手机" => "A产品报价 2000元,基本参数为:****,",
          "微信" => "xxxxx"
        );
        foreach ($keywords as $key=>$value){
            $contain = strchr($content, $key);
            if (!empty($contain)){
                $reply = $value;
                break;
            }
        }

        $result = $this->transmitText($object, $reply);
        return $result;
    }

中文分词

除了关键词回复之外,还可以使用中文分词来帮助我们实现智能回复。

中文分词指的是将一个汉字序列切分成一个个单独词。
分词就是将连续的字序列按照一定的规范重新组合成词序列的过程。
我们知道在英文的行文中,单词之间是以空格作为自然分隔符的,而中文只有字、句、和段,能通过简单的分界符来简单划分,唯独词没有一个形式上的分界符,虽然英文也同样存在短语的划分问题,不过在词这一层上,中文要比英文复杂的多,困难的多。
中文分词就是要解决句子中词的划分问题,然后可以提取其中的关键词语进行搜索。
打个比方:“深圳天气怎么样”,这句话可分词为“深圳”,“天气”,“怎么样”三部分,因为它们代表不同词语类别的意思。

新浪云SEA提供了中文分词服务seaSegment,上述句子进行中文分词的代码如下所示


    $str="深圳天气怎么样";
    $seg = new SaeSegment();
    $ret = $seg->segment($str, 1);
    print_r($ret);
?>

代码运行后,结果如下所示:

Array ( 
    [0] => Array ( 
        [word] => 深圳 
        [word_tag] => 102 
        [index] => 0 ) 
    [1] => Array ( 
        [word] => 天气 
        [word_tag] => 95 
        [index] => 1 ) 
    [2] => Array ( 
        [word] => 怎么样 
        [word_tag] => 40 
        [index] => 2 
    ) 
)

上述结果中,

  • word表示分词
  • word_tag表示词性(102代表地点名词,95代表名词)
  • index表示词在句子中的位置索引

我们可以把分词功能写成函数,并且将名词作为分类标准,地点作为关键词组合成数组。

相应代码如下:


function sinasegment($str)
{
    $seg = new SaeSegment();
    $ret = $seg->segment($str, 1);

    if ($ret === false){
        return;
    }
    $category = "";
    $keyword = "";
    foreach ($ret as $key => $value) {
        if ($value["word_tag"] == 95){
            $category = $value["word"];
        }
        if ($value["word_tag"] == 102){
            $keyword = $value["word"];
        }
    }
    if (!empty($category) && !empty($keyword)){
        return array('category'=>$category, 'keyword'=>$keyword); 
    }else{
        return;
    }
}
?>

根据获取的功能类别及关键字,我们可以做出根据地点来查询其相应的自动回复,比如天气预报和空气质量,在前面的章节中,我们提到了将语音识别结果并入文本消息结果的查询,基于这些,我们可以开发出非常智能的**语音识别搜索**及**文本搜索**功能.

相应代码如下:

 private function receiveVoice($object)
    {
        $funcFlag = 0;

        if (isset($object->Recognition) && !empty($object->Recognition)){
            $content = strval($object->Recognition);
            include("segment.php");
            $result = sinasegment($content);
            if (is_array($result)){
                switch ($result['category'])
                {
                    case "天气":
                        $url = "http://apix.sinaapp.com/weather/?appkey=trialuser&city=".urlencode($result['keyword']);
                        $output = file_get_contents($url);
                        $replyContent = json_decode($output, true);
                        break;
                    default:
                        $replyContent = "还不支持这一功能:".$result['category'];
                        break;
                }
            }else{
                $replyContent = "不能理解你的内容:".$content;
            }
        }else{
            $replyContent = "未开启语音识别功能";
        }
        if (is_array($replyContent)){
            $resultStr = $this->transmitNews($object, $replyContent);
        }else{
            $resultStr = $this->transmitText($object, $replyContent);
        }
        return $resultStr;
    }

segment.php


function sinasegment($str)
{
    $seg = new SaeSegment();
    $ret = $seg->segment($str, 1);

    if ($ret === false){
        return;
    }
    $category = "";
    $keyword = "";
    foreach ($ret as $key => $value) {
        if ($value["word_tag"] == 95){
            $category = $value["word"];
        }
        if ($value["word_tag"] == 102){
            $keyword = $value["word"];
        }
    }
    if (!empty($category) && !empty($keyword)){
        return array('category'=>$category, 'keyword'=>$keyword); 
    }else{
        return;
    }
}
?>

关键词index.php运行图片:

微信公众号最佳实践 ( 9.7)智能问答,关键词回复,中文分词_第2张图片

关键词回复 index.php 整体代码如下:


/*
    CopyRight 2018 All Rights Reserved
*/

define("TOKEN", "weixin");

$wechatObj = new wechatCallbackapiTest();
if (!isset($_GET['echostr'])) {
    $wechatObj->responseMsg();
}else{
    $wechatObj->valid();
}

class wechatCallbackapiTest
{
    public function valid()
    {
        $echoStr = $_GET["echostr"];
        if($this->checkSignature()){
            echo $echoStr;
            exit;
        }
    }

    private function checkSignature()
    {
        $signature = $_GET["signature"];
        $timestamp = $_GET["timestamp"];
        $nonce = $_GET["nonce"];

        $token = TOKEN;
        $tmpArr = array($token, $timestamp, $nonce);
        sort($tmpArr, SORT_STRING);
        $tmpStr = implode( $tmpArr );
        $tmpStr = sha1( $tmpStr );

        if( $tmpStr == $signature ){
            return true;
        }else{
            return false;
        }
    }

    public function responseMsg()
    {
        $postStr = $GLOBALS["HTTP_RAW_POST_DATA"];
        if (!empty($postStr)){
            $this->logger("R ".$postStr);
            $postObj = simplexml_load_string($postStr, 'SimpleXMLElement', LIBXML_NOCDATA);
            $RX_TYPE = trim($postObj->MsgType);

            switch ($RX_TYPE)
            {
                case "event":
                    $result = $this->receiveEvent($postObj);
                    break;
                case "text":
                    $result = $this->receiveText($postObj);
                    break;
                default:
                    $result = "other msg type: ".$RX_TYPE;
                    break;
            }
            $this->logger("T ".$result);
            echo $result;
        }else {
            echo "";
            exit;
        }
    }

    private function receiveEvent($object)
    {
        $content = "";
        switch ($object->Event)
        {
            case "subscribe":
                $content = "欢迎关注 德强1012 ";
                $content .= isset($object->EventKey)?("\n来自二维码场景 ".$object->EventKey):"";
                break;
            case "unsubscribe":
                $content = "取消关注";
                break;
        }
        $result = $this->transmitText($object, $content);
        return $result;
    }

    private function receiveText($object)
    {
        $content = trim($object->Content);

        $keywords = array(
          "电话" => "0755-87654321",
          "地址" => "广东省深圳市南山区高新南一道飞亚达科技大厦(518057)",
          "A型号手机" => "A产品报价 3000元,基本参数为:****,",
          "B型号手机" => "A产品报价 4000元,基本参数为:****,",
          "C型号手机" => "A产品报价 2000元,基本参数为:****,",
          "微信" => "xxxxx"
        );
        foreach ($keywords as $key=>$value){
            $contain = strchr($content, $key);
            if (!empty($contain)){
                $reply = $value;
                break;
            }
        }

        $result = $this->transmitText($object, $reply);
        return $result;
    }

    private function transmitText($object, $content)
    {
        $textTpl = "
                        
                        
                        %s
                        
                        
                    ";
        $result = sprintf($textTpl, $object->FromUserName, $object->ToUserName, time(), $content);
        return $result;
    }

    private function transmitNews($object, $arr_item)
    {
        if(!is_array($arr_item))
            return;

        $itemTpl = "    
                            <![CDATA[%s]]>
                            
                            
                            
                        
                    ";
        $item_str = "";
        foreach ($arr_item as $item)
            $item_str .= sprintf($itemTpl, $item['Title'], $item['Description'], $item['PicUrl'], $item['Url']);

        $newsTpl = "
                        
                        
                        %s
                        
                        
                        %s
                        
                        $item_str
                    ";

        $result = sprintf($newsTpl, $object->FromUserName, $object->ToUserName, time(), count($arr_item));
        return $result;
    }

    private function logger($log_content)
    {

    }
}
?>

中文分词index.php运行图片

微信公众号最佳实践 ( 9.7)智能问答,关键词回复,中文分词_第3张图片
微信公众号最佳实践 ( 9.7)智能问答,关键词回复,中文分词_第4张图片

中文分词index.php整体代码如下:

说明:方倍工作室接口100,目前免费暂无
他们解释原因如下:

  • 接口100免费提供这些年,受到很多用户欢迎,每天访问量非常大,但由于极个别用户使用机器程序抓取数据以及故意使用DDOS恶意攻击,导致服务器不堪重负,维护费用成倍增长。我们决定关停这些接口,有需要的用户,可以联系QQ 1354386063,购买我们的接口源码,或者自行使用第三方收费接口。网页地址http://www.cnblogs.com/txw1958/p/weixin-api100.html

语音功能公众号显示开通了,不知道为啥没有用,检测中…..





define("TOKEN", "weixin");

$wechatObj = new wechatCallbackapiTest();
if (!isset($_GET['echostr'])) {
    $wechatObj->responseMsg();
}else{
    $wechatObj->valid();
}

class wechatCallbackapiTest
{
    public function valid()
    {
        $echoStr = $_GET["echostr"];
        if($this->checkSignature()){
            echo $echoStr;
            exit;
        }
    }

    private function checkSignature()
    {
        $signature = $_GET["signature"];
        $timestamp = $_GET["timestamp"];
        $nonce = $_GET["nonce"];

        $token = TOKEN;
        $tmpArr = array($token, $timestamp, $nonce);
        sort($tmpArr);
        $tmpStr = implode( $tmpArr );
        $tmpStr = sha1( $tmpStr );

        if( $tmpStr == $signature ){
            return true;
        }else{
            return false;
        }
    }

    public function responseMsg()
    {
        $postStr = $GLOBALS["HTTP_RAW_POST_DATA"];
        if (!empty($postStr)){
            $postObj = simplexml_load_string($postStr, 'SimpleXMLElement', LIBXML_NOCDATA);
            $RX_TYPE = trim($postObj->MsgType);

            switch ($RX_TYPE)
            {
                case "text":
                    $resultStr = $this->receiveText($postObj);
                    break;
                case "event":
                    $resultStr = $this->receiveEvent($postObj);
                    break;
                case "voice":
                    $resultStr = $this->receiveVoice($postObj);
                    break;
                default:
                    $resultStr = "";
                    break;
            }
            echo $resultStr;
        }else {
            echo "";
            exit;
        }
    }

    private function receiveText($object)
    {
        if (isset($object->Recognition)){
            $content = strval($object->Recognition);
            $mediaid = trim($object->MediaID);
        }else{
            $content = trim($object->Content);
        }
        if (isset($_SERVER['HTTP_APPNAME'])){
            include("segment.php");
            $result = sinasegment($content);
            if (is_array($result)){
                switch ($result['category'])
                {
                    case "天气":
                        //$url = "http://api100.duapp.com/weather/?appkey=book97&city=".urlencode($result['keyword']);
                        //$output = file_get_contents($url);
                        //$replyContent = json_decode($output, true);
                        //$replyContent = $result['keyword'];
                        $city = $result['keyword'];
                        include("weather2.php");
                        $content = getWeatherInfo($city);
                        $result = $this->transmitNews($object, $content);
                        return $result;
                        break;
                    case "空气":
                        //$url = "http://api100.duapp.com/airquality/?appkey=book97&city=".urlencode($result['keyword']);
                        //$output = file_get_contents($url);
                        //$replyContent = json_decode($output, true);
                        $keyword =  $result['keyword'];
                        include("airquality.php");
                        $content = getAirQualityChina($keyword);

                        if(is_array($content)){
                            $result = $this->transmitNews($object, $content);
                        }else{
                            $result = $this->transmitText($object, $content);
                        }
                        return $result;
                    default:
                        $replyContent = "还不支持这一功能:".$result['category'];
                        break;
                }
            }else{
                $replyContent = "不能理解你的内容:".$content;
            }
        }else{
            $replyContent = "只能运行在SAE环境!";
        }

        if (is_array($replyContent)){
            $resultStr = $this->transmitNews($object, $replyContent);
        }else{
            $resultStr = $this->transmitText($object, $replyContent);
        }
        return $resultStr;
    }

    private function receiveEvent($object)
    {
        $replyContent = "";
        switch ($object->Event)
        {
            case "subscribe":
                $replyContent = "欢迎关注 德强1012";
            case "unsubscribe":
                break;
            default:
                break;      

        }
        if (is_array($replyContent)){
            $resultStr = $this->transmitNews($object, $replyContent);
        }else{
            $resultStr = $this->transmitText($object, $replyContent);
        }
        return $resultStr;
    }


    private function receiveVoice($object)
    {
        $funcFlag = 0;

        if (isset($object->Recognition) && !empty($object->Recognition)){
            $content = strval($object->Recognition);
            include("segment.php");
            $result = sinasegment($content);
            if (is_array($result)){
                switch ($result['category'])
                {
                    case "天气":
                       // $url = "http://apix.sinaapp.com/weather/?appkey=trialuser&city=".urlencode($result['keyword']);
                        //$output = file_get_contents($url);
                        //$replyContent = json_decode($output, true);
                        $city = $result['keyword'];
                        include("weather2.php");
                        $content = getWeatherInfo($city);
                        $result = $this->transmitNews($object, $content);
                        return $result;
                        //$replyContent = $result['keyword'];
                        break;
                    default:
                        $replyContent = "还不支持这一功能:".$result['category'];
                        break;
                }
            }else{
                $replyContent = "不能理解你的内容:".$content;
            }
        }else{
            $replyContent = "未开启语音识别功能";
        }
        if (is_array($replyContent)){
            $resultStr = $this->transmitNews($object, $replyContent);
        }else{
            $resultStr = $this->transmitText($object, $replyContent);
        }
        return $resultStr;
    }


    private function transmitText($object, $content, $flag = 0)
    {
        $textTpl = "
                        
                        
                        %s
                        
                        
                        %d
                    ";
        $resultStr = sprintf($textTpl, $object->FromUserName, $object->ToUserName, time(), $content, $flag);
        return $resultStr;
    }

    private function transmitNews($object, $arr_item)
    {
        if(!is_array($arr_item))
            return;

        $itemTpl = "    
                            <![CDATA[%s]]>
                            
                            
                            
                        
                    ";
        $item_str = "";
        foreach ($arr_item as $item)
            $item_str .= sprintf($itemTpl, $item['Title'], $item['Description'], $item['PicUrl'], $item['Url']);

            $newsTpl = "
                            
                            
                            %s
                            
                            
                            %s
                            
                                $item_str
                            
                        ";

        $result = sprintf($newsTpl, $object->FromUserName, $object->ToUserName, time(), count($arr_item));
        return $result;
    }
}
?>

空气质量airquality.php整体代码如下:




function getAirQualityChina($city)
{
          //http://www.pm25.in/api/querys/aqi_details.json?stations=no&city=%E6%B7%B1%E5%9C%B3&token=5j1znBVAsnSf5xQyNQyq
    $url = "http://www.pm25.in/api/querys/aqi_details.json?stations=no&city=".urlencode($city)."&token=5j1znBVAsnSf5xQyNQyq";

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $output = curl_exec($ch);
    curl_close($ch);

    $cityAir = json_decode($output, true);
    if (isset($cityAir['error'])){
        return $cityAir['error'];
    }else{
        $result = "空气质量指数(AQI):".$cityAir[0]['aqi']."\n".
                  "空气质量等级:".$cityAir[0]['quality']."\n".
                  "细颗粒物(PM2.5):".$cityAir[0]['pm2_5']."\n".
                  "可吸入颗粒物(PM10):".$cityAir[0]['pm10']."\n".
                  "一氧化碳(CO):".$cityAir[0]['co']."\n".
                  "二氧化氮(NO2):".$cityAir[0]['no2']."\n".
                  "二氧化硫(SO2):".$cityAir[0]['so2']."\n".
                  "臭氧(O3):".$cityAir[0]['o3']."\n".
                  "更新时间:".preg_replace("/([a-zA-Z])/i", " ", $cityAir[0]['time_point']);
        $aqiArray = array(); 
        $aqiArray[] = array("Title" =>$cityAir[0]['area']."空气质量", "Description" =>$result, "PicUrl" =>"", "Url" =>"");
        return $aqiArray;
    }
}


?>

天气查询weather2.php整体代码如下:



//var_dump(getWeatherInfo("桃江"));

function getWeatherInfo($cityName)
{
    if ($cityName == "" || (strstr($cityName, "+"))){
        return "发送天气加城市,例如'天气深圳'";
    }
         //1a3cde429f38434f1811a75e1a90310c
    $ak = '1a3cde429f38434f1811a75e1a90310c';
    $sk = '';
    $url = 'http://api.map.baidu.com/telematics/v3/weather?ak=%s&location=%s&output=%s&sn=%s';
    $uri = '/telematics/v3/weather';
    $location = $cityName;
    $output = 'json';
    $querystring_arrays = array(
        'ak' => $ak,
        'location' => $location,
        'output' => $output
    );
    $querystring = http_build_query($querystring_arrays);
    $sn = md5(urlencode($uri.'?'.$querystring.$sk));
    $targetUrl = sprintf($url, $ak, urlencode($location), $output, $sn);
    // var_dump($targetUrl);
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $targetUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($ch);
    curl_close($ch);
    $result = json_decode($result, true);
    if ($result["error"] != 0){
        return $result["status"];
    }
    $curHour = (int)date('H',time());
    $weather = $result["results"][0];
    $weatherArray[] = array("Title" =>$weather['currentCity']."天气预报", "Description" =>"", "PicUrl" =>"", "Url" =>"");
    for ($i = 0; $i < count($weather["weather_data"]); $i++) {
        $weatherArray[] = array("Title"=>
            $weather["weather_data"][$i]["date"]."\n".
            $weather["weather_data"][$i]["weather"]." ".
            $weather["weather_data"][$i]["wind"]." ".
            $weather["weather_data"][$i]["temperature"],
        "Description"=>"", 
        "PicUrl"=>(($curHour >= 6) && ($curHour < 18))?$weather["weather_data"][$i]["dayPictureUrl"]:$weather["weather_data"][$i]["nightPictureUrl"], "Url"=>"");
    }
    return $weatherArray;
}


?>

分词功能函数segment.php整体代码如下:



// $str="深圳天气怎么样";
//$result = sinasegment($str);  
// print_r($result) ;
function sinasegment($str)
{
    $seg = new SaeSegment();
    $ret = $seg->segment($str, 1);

    if ($ret === false){
        return;
    }
    $category = "";
    $keyword = "";
    foreach ($ret as $key => $value) {
        if ($value["word_tag"] == 95){
            $category = $value["word"];
        }
        if ($value["word_tag"] == 102){
            $keyword = $value["word"];
        }
    }
    if (!empty($category) && !empty($keyword)){
        return array('category'=>$category, 'keyword'=>$keyword); 
    }else{
        return;
    }
}

?>

你可能感兴趣的:(微信公众号开发最佳实践)