Write Your Own iPhone WAP Browser: The BSD Socket Engine (A Hands-On Advanced iPhone Development Tutorial)

Author: Sun Dongfeng, 2009-12-01 (please credit the source when reposting)

 

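In "Write Your Own iPhone WAP Browser: Preliminaries" the author outlined the main steps in developing an iPhone WAP browser:

-        Wrap BSD sockets to issue HTTP requests.

-        Parse the fetched WML page into an XML data structure.

-        Render the WML tags that need to be shown in the interface.

-        Display the rendered WML tags in the interface (a UIView).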

 

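The preliminaries article also covered parsing a fetched XML page with tinyxml. This chapter covers building an HTTP engine on top of BSD sockets. The author's earlier article "Mastering iPhone Network Communication: BSD Socket" introduced the key techniques for network communication over BSD sockets on the iPhone, but it simply stored the requested WML page in a buffer. In practice you usually need to parse the response so as to separate the HTTP header from the HTTP body, and sometimes you need to examine each header line as well. To do that, the BSD socket engine first has to parse the data synchronously as it is received. The modified part is as follows: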

 

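NSMutableString* readString = [[NSMutableString alloc] init];
char readBuffer[1];

int br = 0;
NSMutableString* readHeaderBufferStr = [[NSMutableString alloc] init];
// recv() returns -1 on error and 0 when the peer closes, so loop only while br > 0.
while((br = recv(sockfd, readBuffer, sizeof(readBuffer), 0)) > 0)
{
    // Header bytes are ASCII; Latin-1 maps every byte to exactly one character.
    NSString* chunk = [[NSString alloc] initWithBytes:readBuffer length:br encoding:NSISOLatin1StringEncoding];
    [readHeaderBufferStr appendString:chunk];
    [chunk release];

    // Stop as soon as the blank line that ends the header has arrived.
    if([self RecvRespHeaderFinished:readHeaderBufferStr])
    {
        break;
    }
}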

 

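The buffer size is now 1, so each call reads a single byte into the buffer, and that byte is appended to readHeaderBufferStr. After each append, RecvRespHeaderFinished: is called with readHeaderBufferStr to check whether the HTTP header has been read completely. The method is implemented as follows: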

 

- (BOOL)RecvRespHeaderFinished:(NSString*) aReadBuffer
{
    int len = (int)[aReadBuffer length];
    // Too short to hold a complete response header yet.
    if(len < 12)
    {
        return NO;
    }

    // The header ends with an empty line, i.e. the sequence CR LF CR LF.
    if([aReadBuffer characterAtIndex:(len-4)] == (const unichar)'\r' &&
        [aReadBuffer characterAtIndex:(len-3)] == (const unichar)'\n' &&
        [aReadBuffer characterAtIndex:(len-2)] == (const unichar)'\r' &&
        [aReadBuffer characterAtIndex:(len-1)] == (const unichar)'\n')
    {
        NSLog(@"ResponseHeader = %@",aReadBuffer);
        int nCode = [self GetResponseCode:aReadBuffer];
        NSLog(@"get http response code = %d",nCode);

        // Anything outside 2xx is treated as a failure: log it and close the socket.
        if(nCode > 299 || nCode < 200)
        {
            NSLog(@"ErrMsg:Server response code is %d.",nCode);
            close(sockfd);
        }

        // Remember the body length announced by the server.
        contentlen = [[self GetHttpHdrFieldValue:aReadBuffer aField:EContentLength] intValue];
        NSLog(@"contentlen = %d",contentlen);
        return YES;
    }

    return NO;
}
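
The end-of-header test can also be written in plain C, which compiles unchanged inside an Objective-C source file. The sketch below is only an illustration, not the engine's actual code: header_finished and response_code are hypothetical helper names, and response_code is a guess at what GetResponseCode: might do, since that method's implementation is not shown in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 when the first len bytes of buf end with CR LF CR LF, else 0. */
static int header_finished(const char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 4 && memcmp(buf + len - 4, "\r\n\r\n", 4) == 0;
}

/* Extracts the status code from a status line such as "HTTP/1.1 200 OK".
   Returns -1 if the text does not start with a line of that shape. */
static int response_code(const char *header)
{
    int major = 0, minor = 0, code = 0;
    assert(header != NULL);
    if (sscanf(header, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}
```

Working on raw bytes like this avoids building an NSString just to look at the last four characters, which matters when the check runs once per received byte.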

 

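RecvRespHeaderFinished: decides whether the HTTP header is complete by checking whether the last four characters of the buffered string are '\r', '\n', '\r', '\n', in that order. If they are, it logs the header, calls GetHttpHdrFieldValue:aField: to read the value of the "Content-Length" line so that the length of the HTTP body is known, and returns YES. Otherwise it returns NO. The printed output is as follows: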

 

2009-12-01 20:39:03.337 BSDHttpExample[253:207] getIpAddressForHost :220.181.37.183

2009-12-01 20:39:03.404 BSDHttpExample[253:207] Connect errno is :0

2009-12-01 20:39:03.404 BSDHttpExample[253:207] Then the conn is not -1!

2009-12-01 20:39:03.405 BSDHttpExample[253:207] httpCotent is :GET / HTTP/1.1

Host:wap.baidu.com

 

2009-12-01 20:39:03.406 BSDHttpExample[253:207] Sended content is :GET / HTTP/1.1

Host:wap.baidu.com

 

2009-12-01 20:39:03.406 BSDHttpExample[253:207] Datas have been sended over!

send 38 bytes to 220.181.37.183

2009-12-01 20:39:03.501 BSDHttpExample[253:207] ResponseHeader = HTTP/1.1 200 OK

Date: Tue, 01 Dec 2009 12:39:03 GMT

Server: Apache

Content-Length: 4638

Content-Type: text/vnd.wap.wml;charset=utf-8

Age: 0

Cache-Control: no-cache

Expires: -1

Set-Cookie: BAIDU_WISE_UID=frontui_1259671143_7379; Max-Age=800000000; expires=Sun, 08-Apr-35 18:52:23 GMT; path=/; domain=.baidu.com;

Vary: Accept-Encoding,User-Agent

Connection: close
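
GetHttpHdrFieldValue:aField: is not listed in this article. As a rough idea of what such a lookup involves, here is a minimal C sketch that scans the header line by line for a field name and copies out its value. The names header_field_value and equal_ignore_case are hypothetical, and the sketch assumes the header text uses CRLF line endings and contains no body bytes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes of a and b. */
static int equal_ignore_case(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* Looks up header field `name` in the NUL-terminated text `header` and copies
   its value into out (truncated to outlen-1 characters).
   Returns 1 if the field was found, 0 otherwise. */
static int header_field_value(const char *header, const char *name,
                              char *out, size_t outlen)
{
    assert(header != NULL && name != NULL && out != NULL && outlen > 0);
    size_t namelen = strlen(name);
    const char *line = strstr(header, "\r\n");      /* end of the status line */
    while (line != NULL) {
        line += 2;                                   /* step over CR LF */
        const char *end = strstr(line, "\r\n");
        size_t linelen = end ? (size_t)(end - line) : strlen(line);
        if (linelen > namelen && line[namelen] == ':' &&
            equal_ignore_case(line, name, namelen)) {
            const char *v = line + namelen + 1;
            const char *vend = line + linelen;
            while (v < vend && (*v == ' ' || *v == '\t'))
                v++;                                 /* skip leading whitespace */
            size_t vlen = (size_t)(vend - v);
            if (vlen >= outlen)
                vlen = outlen - 1;
            memcpy(out, v, vlen);
            out[vlen] = '\0';
            return 1;
        }
        line = end;
    }
    return 0;
}
```

Called with the header logged above and the name "Content-Length", it would store "4638" in out, which atoi can then convert to the body length. Field names are compared without regard to case because HTTP header names are case-insensitive.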

 

 

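As the log shows, the header of the HTTP response was parsed successfully and the length of the HTTP body was obtained. The next step is to read the HTTP body. The code is as follows: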

 

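// Collect raw bytes first: a multi-byte UTF-8 character cannot be decoded one byte
// at a time, and doing so is what garbles the Chinese text in the recorded output below.
NSMutableData* readBodyData = [[NSMutableData alloc] init];
// recLen counts the body bytes received so far (assumed to be declared elsewhere and to start at 0).
// Stop once Content-Length bytes have arrived, the peer closes (0), or recv() fails (-1).
while(recLen < contentlen && (br = recv(sockfd, readBuffer, sizeof(readBuffer), 0)) > 0)
{
    recLen += br;
    [readBodyData appendBytes:readBuffer length:br];
}
// Decode the whole body as UTF-8, the charset announced in the Content-Type header.
// The result is nil if the bytes are not valid UTF-8.
NSMutableString* readBodyBufferStr = [[NSMutableString alloc] initWithData:readBodyData encoding:NSUTF8StringEncoding];
[readBodyData release];
NSLog(@"hava received all data = \n%@",readBodyBufferStr);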

 

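Comparing the number of body bytes received with the "Content-Length" value tells the engine when the whole HTTP body has arrived. The printed output is as follows: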

 

2009-12-01 20:39:03.503 BSDHttpExample[253:207] contentlen = 4638

2009-12-01 20:39:03.532 BSDHttpExample[253:207] hava received all data =

<?xml version="1.0" encoding="utf-8"?><!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" "http://www.wapforum.org/DTD/wml_1.1.xml"><wml><!--STATUS OK--><card title="□□□□∫□‰∏□‰∏□,‰Ω□□∞±□ü□□□ì"><p><img src="/r/wise/wapsearchindex/logoindexsmall.gif" alt="□□□□∫□□□□□°□" /><br/><input name="word" emptyok="true"/><br/><anchor>□êú□Ω□□°□<go href="/s" method="get"><postfield name="tn" value="webmain"/><postfield name="word" value="$(word)"/><postfield name="ssid" value="0"/><postfield name="from" value="0"/><postfield name="vit" value=""/><postfield name="bd_page_type" value="0"/><postfield name="uid" value="frontui_1259671143_7379"/><postfield name="st" value="111041"/><postfield name="pu" value="pd@1,uc@0"/><postfield name="rn" value="10"/><postfield name="pn" value="0"/></go></anchor>&nbsp;<anchor>□êúWap<go href="/s" method="get"><postfield name="tn" value="fwapadv"/><postfield name="word" value="$(word)"/><postfield name="ssid" value="0"/><postfield name="from" value="0"/><postfield name="bd_page_type" value="0"/><postfield name="uid" value="frontui_1259671143_7379"/><postfield name="vit" value=""/><postfield name="st" value="102041"/><postfield name="pu" value="pd@1,uc@0"/><postfield name="rn" value="10"/><postfield name="pn" value="0"/></go></anchor><br/>□ú□<a href="http://wap.baidu.com/news?tn=bdwcn&amp;statcms=index_jrjd&amp;word=todaynews&amp;rn=10&amp;pn=0&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□□□ó□</a>|<a href="/fengyun/fengyun_novel_1.jsp?ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0&amp;statcms=index_novel">□∞è□□□</a>|<a href="http://m.baidu.com/fengyun/jingmeibizhi.jsp?ssid=0&amp;from=0&amp;bd_page_type=1&amp;uid=uc_MTI1ODU5OTc0Mzo0Njc1NzYw_743&amp;pu=pd@1,uc@2&amp;statcms=img_jingmeibizhi">□□é□□□</a>|<a 
href="http://wap.baidu.com/fengyun/fengyun_fast_1.jsp?stat=novel_fast&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□≠□□ú</a><br/>□é□<a href="/fengyun/fengyun_game_index.jsp?statcms=index_game&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□□□ú∫□∏∏□àè</a>|<a href="http://wap.skycn.com/?statcms=soft&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□□□§á□Ω□‰□□</a><br/><br/><a href="/img?stat=img&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□□□□á</a>|<a href="http://wapp.baidu.com/?stat=tieba&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□□□ê□</a>|<a href="http://wapiknow.baidu.com/?stat=iknow&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□ü□□□ì</a>|<a href="http://waphi.baidu.com/?stat=waphi&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□∫□ó□</a><br/><a href="/news?stat=news&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□∞□ó□</a>|<a href="/dt/index.jsp?ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□ú∞□□□</a>|<a href="/tq?ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□§□□∞□</a>|<a href="/s?tn=wisedict&amp;mark=3&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□□□□∏</a><br/><a href="/s?tn=wisestock&amp;mark=5&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□°□□□</a>|<a href="/s?tn=wisetraffic&amp;mark=4&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□àó□Ω□□à□□è≠</a>|<a href="/more.jsp?ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0">□□□□§□</a><br/><br/><a 
href="http://mo.baidu.com/index1.wml">□□□□□π:□é□‰∏□□□□□∫□|□□□□∫□□□□□ú∫□□ì□□□□≥□</a><br/><a href="http://wap.hao123.com/index.wml?ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0&amp;vit=&amp;tt=G1F11">hao123□Ω□□ù□‰π□□□□</a><br/><br/><a href="/wxlm.jsp?ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0&amp;vit=&amp;tt=H1E11">□ó□□∫□□êú□□□□□□□□ü</a><br/><a href="/help/help.jsp?ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0&amp;vit=&amp;tt=I1211">□∏□□□□</a>|<a href="http://wapp.baidu.com/f?kw=%B0%D9%B6%C8%CA%D6%BB%FA%CB%D1%CB%F7%B0%EF%D6%FA&amp;ssid=0&amp;from=0&amp;bd_page_type=0&amp;uid=frontui_1259671143_7379&amp;pu=pd@1,uc@0&amp;vit=&amp;tt=78811">□è□□□à</a><br/>2009-12-1 20:39</p></card></wml>
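
Reading one byte per recv() call is simple, but it costs one system call per byte. A common alternative, sketched here on the assumption that Content-Length is already known and the caller's buffer is at least that large, is to ask for all remaining bytes and loop until they have arrived. recv_body is a hypothetical name and is not part of the engine described in this article.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Reads up to `want` bytes from sockfd into buf, looping because recv() may
   return fewer bytes than requested. Returns the number of bytes stored
   (less than `want` if the peer closed early), or -1 on a socket error. */
static ssize_t recv_body(int sockfd, char *buf, size_t want)
{
    assert(buf != NULL);
    size_t got = 0;
    while (got < want) {
        ssize_t n = recv(sockfd, buf + got, want - got, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;          /* real error: errno says why */
        }
        if (n == 0)
            break;              /* peer closed the connection */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

For the response above, the caller would allocate contentlen bytes and pass contentlen as want. A return value smaller than contentlen means the server closed the connection before sending the whole body.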

 

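The output shows that a complete WML page was received. Because a WML page conforms to the basic format rules of XML, the HTTP body can now be parsed with tinyxml to extract the WML tags that need rendering, in preparation for the page rendering covered in the next chapter. The code is as follows: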

 

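NSLog(@"**********Now start parsing xml data**********\n");

// Hand the body to the tinyxml-based parser as a C string.
XMLParserEx *xmlParser = XMLParserEx::GetInstance();
xmlParser->parsexml([readBodyBufferStr UTF8String]);

// Close the socket and release every string allocated above.
close(sockfd);
[readHeaderBufferStr release];
[readBodyBufferStr release];
[readString release];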

 

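The author calls the parsexml(const char* buffer) method written in "Write Your Own iPhone WAP Browser: Preliminaries" to parse the received HTTP body. The printed output is as follows: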

 

parse xml succeed

aChild value = STATUS OK

aChild value = card

attr name = title, attr value = □□□□∫□‰∏□‰∏□,‰Ω□□∞±□ü□□□ì

aChild value = p

aChild value = img

attr name = src, attr value = /r/wise/wapsearchindex/logoindexsmall.gif

attr name = alt, attr value = □□□□∫□□□□□°□

aChild value = br

aChild value = input

attr name = name, attr value = word

attr name = emptyok, attr value = true

aChild value = br

aChild value = anchor

aChild value = □êú□Ω□□°□

aChild Value = □êú□Ω□□°□

aChild value = go

attr name = href, attr value = /s

attr name = method, attr value = get

aChild value = postfield

attr name = name, attr value = tn

attr name = value, attr value = webmain

aChild value = postfield

attr name = name, attr value = word

attr name = value, attr value = $(word)

aChild value = postfield

attr name = name, attr value = ssid

attr name = value, attr value = 0

aChild value = postfield

attr name = name, attr value = from

attr name = value, attr value = 0

aChild value = postfield

attr name = name, attr value = vit

attr name = value, attr value =

aChild value = postfield

attr name = name, attr value = bd_page_type

attr name = value, attr value = 0

aChild value = postfield

attr name = name, attr value = uid

attr name = value, attr value = frontui_1259671143_7379

aChild value = postfield

attr name = name, attr value = st

attr name = value, attr value = 111041

aChild value = postfield

attr name = name, attr value = pu

attr name = value, attr value = pd@1,uc@0

aChild value = postfield

attr name = name, attr value = rn

attr name = value, attr value = 10

aChild value = postfield

attr name = name, attr value = pn

attr name = value, attr value = 0

aChild value = nbsp;

aChild Value = nbsp;

aChild value = anchor

aChild value = □êúWap

aChild Value = □êúWap

aChild value = go

attr name = href, attr value = /s

attr name = method, attr value = get

aChild value = postfield

attr name = name, attr value = tn

attr name = value, attr value = fwapadv

aChild value = postfield

attr name = name, attr value = word

attr name = value, attr value = $(word)

aChild value = postfield

attr name = name, attr value = ssid

attr name = value, attr value = 0

aChild value = postfield

attr name = name, attr value = from

attr name = value, attr value = 0

aChild value = postfield

attr name = name, attr value = bd_page_type

attr name = value, attr value = 0

aChild value = postfield

attr name = name, attr value = uid

attr name = value, attr value = frontui_1259671143_7379

aChild value = postfield

attr name = name, attr value = vit

attr name = value, attr value =

aChild value = postfield

attr name = name, attr value = st

attr name = value, attr value = 102041

aChild value = postfield

attr name = name, attr value = pu

attr name = value, attr value = pd@1,uc@0

aChild value = postfield

attr name = name, attr value = rn

attr name = value, attr value = 10

aChild value = postfield

attr name = name, attr value = pn

attr name = value, attr value = 0

aChild value = br

aChild value = □ú□

aChild Value = □ú□

aChild value = a

attr name = href, attr value = http://wap.baidu.com/news?tn=bdwcn&statcms=index_jrjd&word=todaynews&rn=10&pn=0&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□□□ó□

aChild Value = □□□□ó□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = /fengyun/fengyun_novel_1.jsp?ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0&statcms=index_novel

aChild value = □∞è□□□

aChild Value = □∞è□□□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = http://m.baidu.com/fengyun/jingmeibizhi.jsp?ssid=0&from=0&bd_page_type=1&uid=uc_MTI1ODU5OTc0Mzo0Njc1NzYw_743&pu=pd@1,uc@2&statcms=img_jingmeibizhi

aChild value = □□é□□□

aChild Value = □□é□□□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = http://wap.baidu.com/fengyun/fengyun_fast_1.jsp?stat=novel_fast&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□≠□□ú

aChild Value = □□≠□□ú

aChild value = br

aChild value = □é□

aChild Value = □é□

aChild value = a

attr name = href, attr value = /fengyun/fengyun_game_index.jsp?statcms=index_game&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□□□ú∫□∏∏□àè

aChild Value = □□□□ú∫□∏∏□àè

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = http://wap.skycn.com/?statcms=soft&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□□□§á□Ω□‰□□

aChild Value = □□□□§á□Ω□‰□□

aChild value = br

aChild value = br

aChild value = a

attr name = href, attr value = /img?stat=img&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□□□□á

aChild Value = □□□□□á

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = http://wapp.baidu.com/?stat=tieba&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□□□ê□

aChild Value = □□□□ê□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = http://wapiknow.baidu.com/?stat=iknow&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □ü□□□ì

aChild Value = □ü□□□ì

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = http://waphi.baidu.com/?stat=waphi&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□∫□ó□

aChild Value = □□∫□ó□

aChild value = br

aChild value = a

attr name = href, attr value = /news?stat=news&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□∞□ó□

aChild Value = □□∞□ó□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = /dt/index.jsp?ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □ú∞□□□

aChild Value = □ú∞□□□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = /tq?ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □§□□∞□

aChild Value = □§□□∞□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = /s?tn=wisedict&mark=3&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□□□□∏

aChild Value = □□□□□∏

aChild value = br

aChild value = a

attr name = href, attr value = /s?tn=wisestock&mark=5&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□°□□□

aChild Value = □□°□□□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = /s?tn=wisetraffic&mark=4&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □àó□Ω□□à□□è≠

aChild Value = □àó□Ω□□à□□è≠

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = /more.jsp?ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0

aChild value = □□□□§□

aChild Value = □□□□§□

aChild value = br

aChild value = br

aChild value = a

attr name = href, attr value = http://mo.baidu.com/index1.wml

aChild value = □□□□□π:□é□‰∏□□□□□∫□|□□□□∫□□□□□ú∫□□ì□□□□≥□

aChild Value = □□□□□π:□é□‰∏□□□□□∫□|□□□□∫□□□□□ú∫□□ì□□□□≥□

aChild value = br

aChild value = a

attr name = href, attr value = http://wap.hao123.com/index.wml?ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0&vit=&tt=G1F11

aChild value = hao123□Ω□□ù□‰π□□□□

aChild Value = hao123□Ω□□ù□‰π□□□□

aChild value = br

aChild value = br

aChild value = a

attr name = href, attr value = /wxlm.jsp?ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0&vit=&tt=H1E11

aChild value = □ó□□∫□□êú□□□□□□□□ü

aChild Value = □ó□□∫□□êú□□□□□□□□ü

aChild value = br

aChild value = a

attr name = href, attr value = /help/help.jsp?ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0&vit=&tt=I1211

aChild value = □∏□□□□

aChild Value = □∏□□□□

aChild value = |

aChild Value = |

aChild value = a

attr name = href, attr value = http://wapp.baidu.com/f?kw=%B0%D9%B6%C8%CA%D6%BB%FA%CB%D1%CB%F7%B0%EF%D6%FA&ssid=0&from=0&bd_page_type=0&uid=frontui_1259671143_7379&pu=pd@1,uc@0&vit=&tt=78811

aChild value = □è□□□à

aChild Value = □è□□□à

aChild value = br

aChild value = 2009-12-1 20:39

 

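As shown, tinyxml parsed out every tag in the WML page. In real browser development these tags need to be classified and stored for use by the rendering code. This is covered in the following chapters.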
