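Boost Search Engine

Project Background

First, what is a search engine? It is what we use every day, such as Baidu: we type in what we want to search for, and it returns related content. What does Baidu typically return? Take a look at a results page.

[Figure 1]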

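How a Search Engine Works

Here is the basic principle in brief.

The client sends a request to the server, for example a search for the keyword "boost". The server receives the request, searches its own resources, and sends the results back as a response.

[Figure 2]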

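The Boost Library

Boost is a well-tested, portable, open-source collection of C++ libraries that complements the standard library. It is very powerful, but it has one small shortcoming: its documentation offers no search. For example, if we want to look up a function, the cplusplus site supports that.

[Figure: image-20230909141645320]

The Boost documentation does not, and we do not know whether it will in the future.

[Figure: image-20230909141829732]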

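Project Goal

The goal of this project is simple: add a search feature for Boost. As described above, the server needs resources to search, and there are two ways to obtain them:

  • Search other web pages: this requires a crawler and a fair amount of extra work
  • Download Boost and search the resources locally

We take the second approach and download the Boost library.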

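High-Level Flow of the Boost Search Engine

Data Cleaning

After downloading Boost, we process every file with the .html suffix; this is data cleaning. From each HTML file we keep three things: the title, the content, and the url.

Building the Index

We build an index over the cleaned data so that later lookups are fast. There are many details here, which we cover later.

Handling Requests

We parse the request and use the index to fetch the results. Because there are usually many results, we sort them by weight before sending them to the client.

Front-End Page

The front end renders the returned results, and that completes the project.

[Figure 3]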

Tech Stack and Environment

Tech Stack

  • Back end: C/C++, C++11, STL, Boost, Jsoncpp, cppjieba, cpp-httplib
  • Front end: HTML5, CSS, JavaScript, jQuery, Ajax

Environment

  • CentOS 7 virtual machine, vim, gcc (g++), Makefile, VS Code

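Understanding Indexes

What is an index? Simply put, we assign each file a number, and that number identifies exactly one file. That is the basic idea. Indexes come in two kinds: forward and inverted.

  • Forward index: look up a document by its ID; the result is unique
  • Inverted index: look up document IDs by keyword

This may still sound abstract, so here is an example with two documents.

[Figure 4]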

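Forward Index

We assign an ID to every document.

Doc ID   Doc name   Doc content
1        Doc A      你好,我是大学生 (Hello, I am a college student)
2        Doc B      你好,我是社会人 (Hello, I am a working adult)

The forward index is straightforward: given a document ID, we get the document content directly.

Inverted Index

We segment every document into words and collect the distinct words. Under each distinct word we list the IDs of the documents that contain it.

Keyword                    Doc IDs
你好 (hello)               1, 2
我 (I)                     1, 2
是 (am)                    1, 2
大学生 (college student)   1
社会人 (working adult)     2

The inverted index maps a keyword to the IDs of the documents that contain it.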
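The two tables above can be sketched in a few lines of C++. This is only an illustration of the idea, not the index the project builds later: the names Doc, ForwardIndex, InvertedIndex, AddDoc and BuildExample are made up for this example, and the word lists are passed in by hand instead of coming from a real segmenter.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// One row of the forward index: the document itself.
struct Doc
{
  int id;
  std::string name;
  std::string content;
};

// Forward index: position i holds the document whose id is i + 1.
using ForwardIndex = std::vector<Doc>;
// Inverted index: keyword -> ids of the documents that contain it.
using InvertedIndex = std::unordered_map<std::string, std::vector<int>>;

// Register one document. `words` must already be segmented and free of
// duplicates; segmentation itself is outside the scope of this sketch.
static void AddDoc(const std::string &name, const std::string &content,
                   const std::vector<std::string> &words,
                   ForwardIndex *forward, InvertedIndex *inverted)
{
  assert(forward && inverted);
  int id = static_cast<int>(forward->size()) + 1;
  forward->push_back({id, name, content});
  for (const auto &word : words)
  {
    (*inverted)[word].push_back(id);
  }
}

// Build the two-document example from the tables above.
static void BuildExample(ForwardIndex *forward, InvertedIndex *inverted)
{
  AddDoc("Doc A", "你好,我是大学生", {"你好", "我", "是", "大学生"}, forward, inverted);
  AddDoc("Doc B", "你好,我是社会人", {"你好", "我", "是", "社会人"}, forward, inverted);
}
```

After BuildExample runs, the entry for 你好 lists both IDs, while 大学生 and 社会人 each list only one, which matches the inverted-index table.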

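How to Segment Text

Above we said that documents are split into words. Why? To make lookups efficient. And how? We could do it by hand, but there is already a library that does this for us (cppjieba, from the tech stack), so we simply use it. If we did segment by hand, the result might look like this:

  • 你好,我是大学生: 你好/我/是/大学生

  • 你好,我是社会人: 你好/我/是/社会人

Note that this segmentation is only illustrative; a real segmenter may split the text differently. There is also a way to improve efficiency: words such as "了", "从", "吗", "the" and "a" usually carry little meaning, so we can skip them during segmentation. Words like these are called stop words.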
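Stop-word filtering can be sketched as below. The function name RemoveStopWords and the stop list in the usage note are assumptions for illustration; a real project would normally load its stop list from a file.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Drop every word that appears in `stop_words`, keeping the order of the rest.
static void RemoveStopWords(const std::unordered_set<std::string> &stop_words,
                            std::vector<std::string> *words)
{
  assert(words);
  std::vector<std::string> kept;
  for (auto &word : *words)
  {
    if (stop_words.count(word) == 0)
    {
      kept.push_back(std::move(word));
    }
  }
  words->swap(kept);
}
```

With a stop list of {"the", "a", "了", "从", "吗"}, the word list {"the", "boost", "a", "filesystem", "了"} shrinks to {"boost", "filesystem"}, so fewer keywords end up in the inverted index.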

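Simulating a Lookup

Here is the lookup flow, step by step.

User enters "你好" -> look it up in the inverted index -> get the document IDs (1, 2) -> look those up in the forward index -> get the document content -> summarize each document as title + content (desc) + url -> build the response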
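The flow above might look like this in code. It is a sketch under stated assumptions: Doc, Result and Search are hypothetical names, document IDs are 0-based positions in the forward index (the tables earlier count from 1), and the description is simply the first desc_len bytes of the content, which is only safe for single-byte text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct Doc
{
  std::string title;
  std::string content;
  std::string url;
};

struct Result
{
  std::string title;
  std::string desc; // short excerpt of the content
  std::string url;
};

// keyword -> inverted index -> document ids -> forward index -> results
static std::vector<Result> Search(
    const std::string &keyword,
    const std::vector<Doc> &forward,
    const std::unordered_map<std::string, std::vector<std::size_t>> &inverted,
    std::size_t desc_len)
{
  std::vector<Result> results;
  auto it = inverted.find(keyword);
  if (it == inverted.end())
  {
    return results; // unknown keyword: nothing to return
  }
  for (std::size_t id : it->second)
  {
    if (id >= forward.size())
    {
      continue; // stale id, skip it
    }
    const Doc &doc = forward[id];
    results.push_back({doc.title, doc.content.substr(0, desc_len), doc.url});
  }
  return results;
}
```

Sorting by weight is deliberately left out here; the results come back in whatever order the inverted index stores the IDs.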

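Data Cleaning

First download Boost. We use the latest release, which is 1.83.0 at the time of writing. Download it to the desktop, upload it to the CentOS virtual machine with rz, and unpack it.

[Figure: image-20230909151825742]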

[qkj@localhost install]$ rz -E 

[qkj@localhost install]$ ll
total 141256
-rw-r--r--. 1 qkj qkj 144645738 Sep  9 00:15 boost_1_83_0.tar.gz
[qkj@localhost install]$ tar xzf boost_1_83_0.tar.gz 
[qkj@localhost install]$ ll
total 141260
drwxr-xr-x. 8 qkj qkj      4096 Aug  8 14:40 boost_1_83_0
-rw-r--r--. 1 qkj qkj 144645738 Sep  9 00:15 boost_1_83_0.tar.gz
[qkj@localhost install]$ 

Let's look at what the archive contains.

[qkj@localhost install]$ cd boost_1_83_0/
[qkj@localhost boost_1_83_0]$ ll
total 112
drwxr-xr-x. 139 qkj qkj  8192 Aug  8 14:40 boost
-rw-r--r--.   1 qkj qkj   851 Aug  8 14:02 boost-build.jam
-rw-r--r--.   1 qkj qkj 20245 Aug  8 14:02 boostcpp.jam
-rw-r--r--.   1 qkj qkj   989 Aug  8 14:02 boost.css
-rw-r--r--.   1 qkj qkj  6308 Aug  8 14:02 boost.png
-rw-r--r--.   1 qkj qkj  2486 Aug  8 14:02 bootstrap.bat
-rwxr-xr-x.   1 qkj qkj 10811 Aug  8 14:02 bootstrap.sh
drwxr-xr-x.   7 qkj qkj   196 Aug  8 14:14 doc
-rw-r--r--.   1 qkj qkj   769 Aug  8 14:02 index.htm
-rw-r--r--.   1 qkj qkj  5418 Aug  8 14:40 index.html
-rw-r--r--.   1 qkj qkj   291 Aug  8 14:02 INSTALL
-rw-r--r--.   1 qkj qkj 11947 Aug  8 14:02 Jamroot
drwxr-xr-x. 148 qkj qkj  4096 Aug  8 14:40 libs
-rw-r--r--.   1 qkj qkj  1338 Aug  8 14:02 LICENSE_1_0.txt
drwxr-xr-x.   4 qkj qkj   159 Aug  8 14:02 more
-rw-r--r--.   1 qkj qkj   542 Aug  8 14:02 README.md
-rw-r--r--.   1 qkj qkj  2608 Aug  8 14:02 rst.css
drwxr-xr-x.   2 qkj qkj   171 Aug  8 14:02 status
drwxr-xr-x.  14 qkj qkj   256 Aug  8 14:02 tools
[qkj@localhost boost_1_83_0]$ 

This is the whole Boost distribution. To keep the project simple, we only use the HTML files under doc/html. Indexing every HTML file in the distribution is left for later.

boost_1_83_0/doc/html
[qkj@localhost doc]$ cd html/
[qkj@localhost html]$ ll
total 2900
-rw-r--r--.  1 qkj qkj   3476 Aug  8 14:24 about.html
drwxr-xr-x.  2 qkj qkj     82 Aug  8 14:25 accumulators
-rw-r--r--.  1 qkj qkj   5858 Aug  8 14:25 accumulators.html
drwxr-xr-x.  2 qkj qkj    168 Aug  8 14:26 align
-rw-r--r--.  1 qkj qkj   4440 Aug  8 14:26 align.html
drwxr-xr-x.  2 qkj qkj     78 Aug  8 14:26 any
-rw-r--r--.  1 qkj qkj   9011 Aug  8 14:26 any.html
drwxr-xr-x.  3 qkj qkj     78 Aug  8 14:26 array
-rw-r--r--.  1 qkj qkj   8377 Aug  8 14:26 array.html
-rw-r--r--.  1 qkj qkj  36597 Aug  8 14:30 array_types.html
-rw-r--r--.  1 qkj qkj 286811 Aug  8 14:29 asio_HTML.manifest
-rw-r--r--.  1 qkj qkj   6685 Aug  8 14:35 Assignable.html
-rw-r--r--.  1 qkj qkj    700 Aug  8 14:02 atomic.html
-rw-r--r--.  1 qkj qkj  20627 Aug  8 14:30 auxiliary.html
drwxr-xr-x.  2 qkj qkj     31 Aug  8 14:02 bbv2
...

Next, copy everything under boost_1_83_0/doc/html into a single directory.

[qkj@localhost boost_searcher]$ mkdir data/input -p
[qkj@localhost boost_searcher]$ cp -rf ../../install/boost_1_83_0/doc/html/* data/input/

Check the result.

[qkj@localhost boost_searcher]$ cd data/input/
[qkj@localhost input]$ ll
total 2900
-rw-r--r--.  1 qkj qkj   3476 Sep  9 00:31 about.html
drwxr-xr-x.  2 qkj qkj     82 Sep  9 00:31 accumulators
-rw-r--r--.  1 qkj qkj   5858 Sep  9 00:31 accumulators.html
drwxr-xr-x.  2 qkj qkj    168 Sep  9 00:31 align
-rw-r--r--.  1 qkj qkj   4440 Sep  9 00:31 align.html
drwxr-xr-x.  2 qkj qkj     78 Sep  9 00:31 any
-rw-r--r--.  1 qkj qkj   9011 Sep  9 00:31 any.html
drwxr-xr-x.  3 qkj qkj     78 Sep  9 00:31 array
-rw-r--r--.  1 qkj qkj   8377 Sep  9 00:31 array.html

Now we can start stripping tags. Create a source file for that.

[qkj@localhost boost_searcher]$ touch parser.cc

Understanding Tags

Before stripping tags we need to know what they look like. Open any HTML file:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Chapter 45. Boost.YAP</title>
<link rel="stylesheet" href="../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
<link rel="up" href="libraries.html" title="Part I. The Boost C++ Libraries (BoostBook Subset)">
<link rel="prev" href="xpressive/appendices.html" title="Appendices">
<link rel="next" href="boost_yap/manual.html" title="Manual">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../boost.png"></td>
<td align="center"><a href="../../index.html">Home</a></td>
<td align="center"><a href="../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>

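Anything enclosed in <> is a tag, and tags generally come in pairs. For our purposes the tags carry no value, so we strip them. The cleaned data is stored in a directory of its own.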

[qkj@localhost boost_searcher]$ mkdir data/raw_html -p
[qkj@localhost boost_searcher]$ cd data/
[qkj@localhost data]$ ll
total 16
drwxrwxr-x. 58 qkj qkj 12288 Sep  9 00:31 input     // the source HTML files
drwxrwxr-x.  2 qkj qkj     6 Sep  9 00:44 raw_html  // the cleaned HTML data
[qkj@localhost data]$  

Next, how should we store the cleaned documents? First count the source HTML files.

[qkj@localhost input]$ ls -Rl | grep -E "*.html" | wc -l
8581
[qkj@localhost input]$

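We could create one output file per source HTML file, but that is a lot of files. Instead we write all the cleaned documents into a single file, separated by '\3', like this:

XXXXXXXXXXXXXXXXX\3YYYYYYYYYYYYYYYYYYYYY\3ZZZZZZZZZZZZZZZZZZZZZZZZZ\3

Why '\3'? In the ASCII table, control characters are non-printable. The documents we process (the HTML pages in data/input) consist almost entirely of printable characters, so a control character is very unlikely to appear in them and the separator will not collide with document content.

We do not use exactly that layout, though. Instead we remove every '\n' from a document and use the following layout:

title\3content\3url \n title\3content\3url \n title\3content\3url \n ...
This way a single getline(ifstream, line) fetches one whole document: title\3content\3url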
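Reading one record back could look like the sketch below. ParseLine and this local DocInfo struct are illustrative names, not the project's later code; the function assumes the line has already been fetched with getline, so it contains no trailing '\n'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct DocInfo
{
  std::string title;
  std::string content;
  std::string url;
};

// Split "title\3content\3url" into its three fields.
// Returns false unless the line contains exactly two '\3' separators.
static bool ParseLine(const std::string &line, DocInfo *doc)
{
  assert(doc);
  const char sep = '\3';
  std::size_t first = line.find(sep);
  if (first == std::string::npos)
  {
    return false;
  }
  std::size_t second = line.find(sep, first + 1);
  if (second == std::string::npos)
  {
    return false;
  }
  if (line.find(sep, second + 1) != std::string::npos)
  {
    return false; // a third separator means the record is malformed
  }
  doc->title = line.substr(0, first);
  doc->content = line.substr(first + 1, second - first - 1);
  doc->url = line.substr(second + 1);
  return true;
}
```

Because '\3' never appears in ordinary text, a plain find is enough here and no escaping scheme is needed.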

Create the file that will hold the tag-stripped output.

drwxrwxr-x. 58 qkj qkj 12288 Sep  9 01:03 input
drwxrwxr-x.  2 qkj qkj     6 Sep  9 01:03 raw_html
[qkj@localhost data]$ 
[qkj@localhost data]$ cd raw_html/
[qkj@localhost raw_html]$ touch raw.txt
[qkj@localhost raw_html]$ ll
total 0
-rw-rw-r--. 1 qkj qkj 0 Sep  9 02:32 raw.txt

Tag-Stripping Skeleton

Here is the basic skeleton of parser.cc.

#include <cassert>
#include <iostream>
#include <string>
#include <vector>
#include <boost/filesystem.hpp> // used by EnumFile below
// Directory that holds all the HTML pages
const std::string src_path = "data/input";

// Text file that stores the cleaned data of every page
const std::string output = "data/raw_html/raw.txt";

// Parsed form of one page
typedef struct DocInfo
{
  std::string title;   // document title
  std::string content; // document content
  std::string url;     // URL of the document on the official site
} DocInfo_t;

static bool EnumFile(const std::string &src_path, std::vector<std::string> *file_list);
static bool ParseHtml(const std::vector<std::string> &file_list, std::vector<DocInfo_t> *results);
static bool SaveHtml(const std::vector<DocInfo_t> &results, const std::string &output);

int main(void)
{
  // Paths of all the HTML files
  std::vector<std::string> file_list;

  // Step 1: EnumFile enumerates every file name (with path), HTML pages only,
  // so that the files can be read one by one later
  if (false == EnumFile(src_path, &file_list))
  {
    std::cerr << "failed to enumerate files" << std::endl;
    return 1;
  }

  // Step 2: read each file and parse it into a DocInfo_t
  std::vector<DocInfo_t> results;
  if (false == ParseHtml(file_list, &results))
  {
    std::cerr << "failed to parse files" << std::endl;
    return 2;
  }

  // Step 3: write the parsed documents to output;
  // fields are separated by \3 and documents by \n
  if (false == SaveHtml(results, output))
  {
    std::cerr << "failed to save file" << std::endl;
    return 3;
  }
  return 0;
}

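The basic idea is as follows.

  • Collect the paths of all source HTML files and store them in an array
  • Walk the array, strip the tags from each file, and pack what remains into a DocInfo_t struct holding title, content and url; collect the structs in an array
  • Walk the struct array and write the contents to the output file in the agreed format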

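Installing Boost

Before implementing the functions above, we need to install the Boost development package, because we use its functions.

[qkj@localhost BoostSearchEngine]$ sudo yum install -y boost-devel
[sudo] password for qkj: 

Here is a quick look at Boost; below is its documentation.

[Figure 5]

What we need are the file-system functions; here they are.

[Figure: image-20230909162857940]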

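Implementing EnumFile

Now we implement EnumFile. It collects the paths of all files with the .html suffix under the given src_path directory and stores them in the file_list array.

static bool EnumFile(const std::string &src_path, std::vector<std::string> *file_list)

The implementation: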

static bool EnumFile(const std::string &src_path, std::vector<std::string> *file_list)
{
  assert(file_list);
  namespace fs = boost::filesystem; // namespace alias, a common convention
  fs::path root_path(src_path);     // path object for the root directory

  if (fs::exists(root_path) == false) // check that the path exists
  {
    std::cerr << src_path << " does not exist" << std::endl;
    return false;
  }

  // A default-constructed iterator marks the end of the recursive traversal
  fs::recursive_directory_iterator end;
  for (fs::recursive_directory_iterator iter(root_path); iter != end; ++iter)
  {
    // Only regular files are of interest
    if (fs::is_regular_file(*iter) == false)
    {
      // directories and other non-regular entries
      continue;
    }

    // The regular file must have the .html suffix
    if (iter->path().extension() != ".html")
    {
      continue;
    }

    std::cout << "debug: " << iter->path().string() << std::endl;

    // At this point it is a regular file ending in .html
    file_list->push_back(iter->path().string());
  }

  return true;
}

Let's test it. Write a Makefile:

cc=g++
parser:parser.cc 
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem
.PHONY:clean
clean:
	rm parser

Build and run it; it works.

[qkj@localhost BoostSearchEngine]$ make
g++ -o parser parser.cc -std=c++11 -lboost_system -lboost_filesystem
[qkj@localhost BoostSearchEngine]$ ll
total 104
drwxrwxr-x. 4 qkj qkj    35 Sep  9 01:03 data
-rw-rw-r--. 1 qkj qkj   117 Sep  9 01:41 Makefile
-rwxrwxr-x. 1 qkj qkj 89152 Sep  9 01:43 parser
-rw-rw-r--. 1 qkj qkj  8398 Sep  9 01:43 parser.cc
[qkj@localhost BoostSearchEngine]$ ./parser 
debug: data/input/about.html
debug: data/input/accumulators/user_s_guide.html
debug: data/input/accumulators/acknowledgements.html
debug: data/input/accumulators/reference.html
debug: data/input/accumulators.html
...

Implementing ParseHtml

Now we parse each HTML file.

static bool ParseHtml(const std::vector<std::string> &file_list, std::vector<DocInfo_t> *results)

Here is the skeleton.

static bool ParseTitle(const std::string &file, std::string *title);
static bool ParseContent(const std::string &file, std::string *content);
static bool ParseUrl(const std::string &file_path, std::string *url);

static bool ParseHtml(const std::vector<std::string> &file_list, std::vector<DocInfo_t> *results)
{
  assert(results);
  for (auto &file_path : file_list)
  {
    // 1. Read the file
    std::string result;
    if (false == ns_util::FileUtil::ReadFile(file_path, &result))
    {
      continue;
    }

    DocInfo_t doc;
    // 2. Extract the title
    if (false == ParseTitle(result, &doc.title))
    {
      continue;
    }
    // 3. Extract the content, which essentially means stripping the tags
    if (false == ParseContent(result, &doc.content))
    {
      continue;
    }
    // 4. Build the url
    if (false == ParseUrl(file_path, &doc.url))
    {
      continue;
    }
    // Parsing of this file is complete at this point
    results->push_back(std::move(doc)); // move instead of copy
  }
  return true;
}

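To summarize the flow:

  • Read each file into a string
  • Extract the title from the string
  • Extract the content from the string
  • Build the url from the file path

We implement these functions one by one below.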

Reading File Contents

We put this function in a utility header, since we may need it again later.

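#pragma once
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>
// A collection of utilities
namespace ns_util
{
  /// @brief  File helpers
  class FileUtil
  {
  public:
    /// @brief Read the contents of a file into out
    /// @param file_path path of the file to read
    /// @param out receives the file contents, with line breaks removed
    /// @return true on success, false if the file cannot be opened
    static bool ReadFile(const std::string &file_path, std::string *out)
    {
      assert(out);
      std::ifstream in(file_path, std::ios::in);
      if (in.is_open() == false)
      {
        std::cerr << "failed to open " << file_path << std::endl;
        return false;
      }

      std::string line;
      // Note: getline discards the '\n', so the lines are joined without a separator
      while (std::getline(in, line))
      {
        *out += line;
      }

      in.close();
      return true;
    }
  };
}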
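As an aside, here is a sketch of an alternative that keeps the line breaks instead of dropping them, so that a later step can turn each '\n' into a space and words at the end of one line are not glued to the next. ReadWholeFile is a hypothetical name and is not part of the project's util header.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Read the whole file into *out in one go, preserving '\n' characters.
static bool ReadWholeFile(const std::string &file_path, std::string *out)
{
  assert(out);
  std::ifstream in(file_path, std::ios::in | std::ios::binary);
  if (!in.is_open())
  {
    return false;
  }
  // Copy every byte from the stream buffer into the string.
  out->assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
  return true;
}
```

Whether this is preferable depends on the rest of the pipeline: the getline version guarantees that no '\n' reaches the parser, while this version leaves that job to whoever consumes the string.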

Extracting the title

Look at one of our HTML files again: the title sits inside a <title> tag.

[Figure: image-20230909165910185]

We extract the title from the string.

static bool ParseTitle(const std::string &file, std::string *title)
{
  assert(title);
  std::size_t begin = file.find("<title>");

  if (begin == std::string::npos)
  {
    return false;
  }

  std::size_t end = file.find("</title>"); // find() searches forward from the start
  if (end == std::string::npos)
  {
    return false;
  }

  // Skip past the opening tag so that begin points at the title text
  begin += std::string("<title>").size();
  if (begin > end)
  {
    return false;
  }
  *title = file.substr(begin, end - begin);
  return true;
}
Extracting the content

To get the content we do not take the whole file as is; we strip the tags, and that calls for a state machine.

Tags are delimited by <>, so the state machine has two states: inside a tag and inside content. We assume the first character of the file is <.

static bool ParseContent(const std::string &file, std::string *content)
{
  assert(content);
  // This is the core of tag stripping.
  // We use a simple state machine.
  enum status
  {
    LABLE,
    CONTENT
  };

  enum status s = LABLE; // assume the first character is '<'

  for (char ch : file) // taken by value, not by reference, because ch is modified below
  {
    switch (s)
    {
    case LABLE:
      if (ch == '>')
      {
        // The current tag has been consumed
        s = CONTENT;
      }
      break;

    case CONTENT:
      if (ch == '<')
      {
        // Tags may follow each other directly, as in <><>
        s = LABLE;
      }
      else
      {
        // Detail: we do not want '\n' in the content,
        // because '\n' is the separator between documents.
        // No '\n' should appear here, since the file was read with getline,
        // but we should not rely on the caller for that.
        if (ch == '\n')
        {
          ch = ' ';
        }
        content->push_back(ch);
      }
      break;

    default:
      break;
    }
  }
  return true;
}
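One side effect of dropping tags without a trace is that text from neighbouring elements gets glued together, for example Home and Libraries from adjacent table cells. The sketch below is a variation on the state machine, not the version used in this project: StripTags is a made-up name, it starts in the content state instead of assuming a leading '<', and it emits a single space whenever a tag closes.

```cpp
#include <cassert>
#include <string>

// Strip tags like the state machine above, but leave one space where a tag
// ended so that text from neighbouring elements stays separated.
static bool StripTags(const std::string &html, std::string *text)
{
  assert(text);
  bool in_tag = false; // start in the content state
  for (char ch : html)
  {
    if (in_tag)
    {
      if (ch == '>')
      {
        in_tag = false;
        // Add a separator unless the output is empty or already ends in one.
        if (!text->empty() && text->back() != ' ')
        {
          text->push_back(' ');
        }
      }
      continue;
    }
    if (ch == '<')
    {
      in_tag = true;
      continue;
    }
    if (ch == '\n')
    {
      ch = ' '; // '\n' is reserved as the separator between documents
    }
    text->push_back(ch);
  }
  return true;
}
```

The trade-off is that inline tags in the middle of a word also produce a space, so whether this helps depends on how the text is segmented afterwards.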
Extracting the url

One thing needs explaining here. We have to build the url by concatenation, so look at how the official url relates to our local path.

Official url: https://www.boost.org/doc/libs/1_83_0/doc/html/accumulators.html
Local path:   data/input/accumulators.html   // because we copied the contents of doc/html/ into data/input

// Build the url by concatenation
url_head = "https://www.boost.org/doc/libs/1_83_0/doc/html";
url_tail = [data/input](removed) /accumulators.html
         => url_tail = /accumulators.html

url = url_head + url_tail ; this yields the link on the official site

Here is the code.

static bool ParseUrl(const std::string &file_path, std::string *url)
{
  assert(url);

  //  url_head = "https://www.boost.org/doc/libs/1_83_0/doc/html"
  //  url_tail = "/accumulators.html"
  std::string url_head = "https://www.boost.org/doc/libs/1_83_0/doc/html";
  std::string url_tail = file_path.substr(src_path.size());
  *url = url_head + url_tail;

  return true;
}
Now let's verify the result with a helper function.

void ShowDoc(const DocInfo_t &doc)
{
  std::cout << "title: " << doc.title << std::endl;
  std::cout << "content: " << doc.content << std::endl;
  std::cout << "url: " << doc.url << std::endl;
}
static bool ParseHtml(const std::vector<std::string> &file_list, std::vector<DocInfo_t> *results)
{
  assert(results);
  for (auto &file_path : file_list)
  {
    // 1. Read the file
    std::string result;
    if (false == ns_util::FileUtil::ReadFile(file_path, &result))
    {
      continue;
    }

    DocInfo_t doc;
    // 2. Extract the title
    if (false == ParseTitle(result, &doc.title))
    {
      continue;
    }
    // 3. Extract the content, which essentially means stripping the tags
    if (false == ParseContent(result, &doc.content))
    {
      continue;
    }
    // 4. Build the url
    if (false == ParseUrl(file_path, &doc.url))
    {
      continue;
    }
    // for debug
    ShowDoc(doc);
    // break;
    // Parsing of this file is complete at this point
    results->push_back(std::move(doc)); // move instead of copy
  }

  return true;
}

This is the test output.

title: Struct template result<This(InputIterator, InputIterator)>
content: Struct template result<This(InputIterator, InputIterator)>HomeLibrariesPeopleFAQMoreStruct template result<This(InputIterator, InputIterator)>boost::proto::functional::distance::result<This(InputIterator, InputIterator)>Synopsis// In header: <boost/proto/functional/std/iterator.hpp>template<typename This, typename InputIterator> struct result<This(InputIterator, InputIterator)> {  // types  typedef typename std::iterator_traits<      typename boost::remove_const<        typename boost::remove_reference<InputIterator>::type      >::type    >::difference_type type;};Copyright © 2008 Eric Niebler        Distributed under the Boost Software License, Version 1.0. (See accompanying        file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)      
url: https://www.boost.org/doc/libs/1_83_0/doc/html/boost/proto/functional/distance/resu_1_3_32_5_26_2_1_1_2_4.html

Opening this url on the official site confirms that it is the right page.

[Figure 6]
  <h3>Implementing <code>SaveHtml</code></h3> 
  <pre><code class="prism language-cpp"><span class="token keyword">static</span> <span class="token keyword">bool</span> <span class="token function">SaveHtml</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>DocInfo_t<span class="token operator">></span> <span class="token operator">&</span>results<span class="token punctuation">,</span> <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>output<span class="token punctuation">)</span><span class="token punctuation">;</span>
</code></pre> 
  <p>We now have a struct for every file. Next we write them to the output file in the required format.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">static</span> <span class="token keyword">bool</span> <span class="token function">SaveHtml</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>DocInfo_t<span class="token operator">></span> <span class="token operator">&</span>results<span class="token punctuation">,</span> <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>output<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">SEP</span> <span class="token string">"\3"</span></span>
  <span class="token comment">// Record layout (the '\n' characters inside each document's content were already removed):</span>
  <span class="token comment">// title\3content\3url\n title\3content\3url\n title\3content\3url\n</span>

  <span class="token comment">// explicit basic_ofstream (const char* filename,</span>
  <span class="token comment">//                       ios_base::openmode mode = ios_base::out);</span>
  std<span class="token double-colon punctuation">::</span>ofstream <span class="token function">out</span><span class="token punctuation">(</span>output<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>out <span class="token operator">|</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>binary<span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token keyword">if</span> <span class="token punctuation">(</span>out<span class="token punctuation">.</span><span class="token function">is_open</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"Failed to open file "</span> <span class="token operator"><<</span> output <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>e <span class="token operator">:</span> results<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>string str <span class="token operator">=</span> e<span class="token punctuation">.</span>title<span class="token punctuation">;</span>
    str <span class="token operator">+=</span> SEP<span class="token punctuation">;</span>

    str <span class="token operator">+=</span> e<span class="token punctuation">.</span>content<span class="token punctuation">;</span>
    str <span class="token operator">+=</span> SEP<span class="token punctuation">;</span>

    str <span class="token operator">+=</span> e<span class="token punctuation">.</span>url<span class="token punctuation">;</span>
    str <span class="token operator">+=</span> <span class="token string">"\n"</span><span class="token punctuation">;</span>
    out<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>str<span class="token punctuation">.</span><span class="token function">c_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> str<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  out<span class="token punctuation">.</span><span class="token function">close</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
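  <p>To make the record layout concrete, here is a minimal sketch of reading one saved line back into its three fields. The <code>Record</code> type and the <code>ParseRecord</code> function are hypothetical names used only for illustration; they are not part of the project code shown above, and the sketch assumes the fields themselves never contain the <code>\3</code> separator.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type mirroring the three fields written by SaveHtml.
struct Record {
  std::string title;
  std::string content;
  std::string url;
};

// Splits one line of the form title\3content\3url (without the trailing '\n').
// Returns false when the line does not contain exactly three fields.
bool ParseRecord(const std::string &line, Record *out) {
  assert(out != nullptr);
  const char kSep = '\3';
  std::vector<std::string> fields;
  std::size_t start = 0;
  while (true) {
    const std::size_t pos = line.find(kSep, start);
    if (pos == std::string::npos) {
      fields.push_back(line.substr(start));  // last field
      break;
    }
    fields.push_back(line.substr(start, pos - start));
    start = pos + 1;
  }
  if (fields.size() != 3) return false;
  out->title = fields[0];
  out->content = fields[1];
  out->url = fields[2];
  return true;
}
```

  <p>Using a control character such as <code>\3</code> as the separator is convenient because it is very unlikely to appear in the text extracted from an HTML page, so a plain split is enough.</p>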
  <p>First, check that the file was written.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/539b06fcc75847068f3982455af8a815.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/539b06fcc75847068f3982455af8a815.jpg" alt="Boost搜索引擎_第7张图片" width="650" height="106" style="border:1px solid black;"></a></p> 
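  <p>Next, check that every document was saved: the number of HTML files under the input directory should match the number of lines in <code>raw.txt</code>.</p> 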
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">ls</span> ./data/input/ <span class="token parameter variable">-Rl</span> <span class="token operator">|</span> <span class="token function">grep</span> <span class="token parameter variable">-E</span> <span class="token string">"*.html"</span> <span class="token operator">|</span> <span class="token function">wc</span> <span class="token parameter variable">-l</span>
<span class="token number">8581</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">cat</span> ./data/raw_html/raw.txt <span class="token operator">|</span> <span class="token function">wc</span> <span class="token parameter variable">-l</span>
<span class="token number">8581</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
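  <p>The line count only shows that the number of records matches. As a slightly stricter check, the sketch below also verifies that every line has exactly two <code>\3</code> separators, that is, three fields. <code>CountValidRecords</code> is a hypothetical helper, not part of the project code; it reads from any <code>std::istream</code>, so it can be pointed at an <code>std::ifstream</code> opened on <code>raw.txt</code> or at an in-memory stream.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads the stream line by line.
// Returns the number of lines with exactly two '\3' separators (three fields);
// *total receives the number of lines read.
std::size_t CountValidRecords(std::istream &in, std::size_t *total) {
  assert(total != nullptr);
  std::size_t valid = 0;
  *total = 0;
  std::string line;
  while (std::getline(in, line)) {
    ++*total;
    if (std::count(line.begin(), line.end(), '\3') == 2) ++valid;
  }
  return valid;
}
```

  <p>If the returned value differs from the total, at least one record is malformed and would not split cleanly when the index is built.</p>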
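  <h1>Building the index</h1> 
  <p>Next we build the index. Building an index really means constructing data structures for storage and search, which speeds up the lookup path keyword -> document ID -> document content. As discussed above, we build both a forward index and an inverted index.</p> 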
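  <p>Before looking at the real framework, the idea can be illustrated with a toy version. The sketch below is not the project's <code>Index</code> class: <code>ToyIndex</code> and its member names are hypothetical, the documents are assumed to be tokenized already, and document IDs start at 0 because they are simply positions in a vector.</p>

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Toy index over already-tokenized documents (illustrative only).
class ToyIndex {
public:
  // Adds a document and returns its id, which is its position in the forward index.
  std::uint64_t Add(const std::string &content,
                    const std::vector<std::string> &tokens) {
    const std::uint64_t id = forward_.size();
    forward_.push_back(content);
    for (const std::string &t : tokens) {
      std::vector<std::uint64_t> &ids = inverted_[t];
      // Ids are appended in increasing order, so checking the back is enough
      // to keep one entry per document even when a token repeats.
      if (ids.empty() || ids.back() != id) ids.push_back(id);
    }
    return id;
  }

  // Forward lookup: document id -> content; nullptr when the id is out of range.
  const std::string *Content(std::uint64_t id) const {
    return id < forward_.size() ? &forward_[id] : nullptr;
  }

  // Inverted lookup: keyword -> document ids; nullptr when the keyword is unknown.
  const std::vector<std::uint64_t> *Lookup(const std::string &word) const {
    const auto it = inverted_.find(word);
    return it == inverted_.end() ? nullptr : &it->second;
  }

private:
  std::vector<std::string> forward_;
  std::unordered_map<std::string, std::vector<std::uint64_t>> inverted_;
};
```

  <p>A search then runs in two steps: <code>Lookup</code> turns the keyword into document IDs, and <code>Content</code> turns each ID into the document itself.</p>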
  <h2>Installing and using jieba</h2> 
  <p>For word segmentation we use the cppjieba tool; cloning the repository with the command below is enough.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">git</span> clone https://github.com/yanyiwu/cppjieba.git
</code></pre> 
  <p>Let's look at what cppjieba contains.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ tree cppjieba/
cppjieba/
├── ChangeLog.md
├── CMakeLists.txt
├── deps
│   ├── CMakeLists.txt
│   ├── gtest
│   │   ├── CMakeLists.txt
│   │   ├── include
│   │   │   └── gtest
│   │   │       ├── gtest-death-test.h
│   │   │       ├── gtest.h
│   │   │       ├── gtest-message.h
│   │   │       ├── gtest-param-test.h
│   │   │       ├── gtest-param-test.h.pump
│   │   │       ├── gtest_pred_impl.h
│   │   │       ├── gtest-printers.h
│   │   │       ├── gtest_prod.h
│   │   │       ├── gtest-spi.h
│   │   │       ├── gtest-test-part.h
│   │   │       ├── gtest-typed-test.h
│   │   │       └── internal
│   │   │           ├── gtest-death-test-internal.h
│   │   │           ├── gtest-filepath.h
│   │   │           ├── gtest-internal.h
│   │   │           ├── gtest-linked_ptr.h
│   │   │           ├── gtest-param-util-generated.h
│   │   │           ├── gtest-param-util-generated.h.pump
│   │   │           ├── gtest-param-util.h
│   │   │           ├── gtest-port.h
│   │   │           ├── gtest-string.h
│   │   │           ├── gtest-tuple.h
│   │   │           ├── gtest-tuple.h.pump
│   │   │           ├── gtest-type-util.h
│   │   │           └── gtest-type-util.h.pump
│   │   └── src
│   │       ├── gtest-all.cc
│   │       ├── gtest.cc
│   │       ├── gtest-death-test.cc
│   │       ├── gtest-filepath.cc
│   │       ├── gtest-internal-inl.h
│   │       ├── gtest_main.cc
│   │       ├── gtest-port.cc
│   │       ├── gtest-printers.cc
│   │       ├── gtest-test-part.cc
│   │       └── gtest-typed-test.cc
│   └── limonp
├── dict
│   ├── hmm_model.utf8
│   ├── idf.utf8
│   ├── jieba.dict.utf8
│   ├── pos_dict
│   │   ├── char_state_tab.utf8
│   │   ├── prob_emit.utf8
│   │   ├── prob_start.utf8
│   │   └── prob_trans.utf8
│   ├── README.md
│   ├── stop_words.utf8
│   └── user.dict.utf8
├── include
│   └── cppjieba
│       ├── DictTrie.hpp
│       ├── FullSegment.hpp
│       ├── HMMModel.hpp
│       ├── HMMSegment.hpp
│       ├── Jieba.hpp
│       ├── KeywordExtractor.hpp
│       ├── MixSegment.hpp
│       ├── MPSegment.hpp
│       ├── PosTagger.hpp
│       ├── PreFilter.hpp
│       ├── QuerySegment.hpp
│       ├── SegmentBase.hpp
│       ├── SegmentTagged.hpp
│       ├── TextRankExtractor.hpp
│       ├── Trie.hpp
│       └── Unicode.hpp
├── LICENSE
├── README_EN.md
├── README.md
└── <span class="token builtin class-name">test</span>
    ├── CMakeLists.txt
    ├── demo.cpp
    ├── load_test.cpp
    ├── testdata
    │   ├── curl.res
    │   ├── extra_dict
    │   │   └── jieba.dict.small.utf8
    │   ├── gbk_dict
    │   │   ├── hmm_model.gbk
    │   │   └── jieba.dict.gbk
    │   ├── jieba.dict.0.1.utf8
    │   ├── jieba.dict.0.utf8
    │   ├── jieba.dict.1.utf8
    │   ├── jieba.dict.2.utf8
    │   ├── load_test.urls
    │   ├── review.100
    │   ├── review.100.res
    │   ├── server.conf
    │   ├── testlines.gbk
    │   ├── testlines.utf8
    │   ├── userdict.2.utf8
    │   ├── userdict.english
    │   ├── userdict.utf8
    │   └── weicheng.utf8
    └── unittest
        ├── CMakeLists.txt
        ├── gtest_main.cpp
        ├── jieba_test.cpp
        ├── keyword_extractor_test.cpp
        ├── pos_tagger_test.cpp
        ├── pre_filter_test.cpp
        ├── segments_test.cpp
        ├── textrank_test.cpp
        ├── trie_test.cpp
        └── unicode_test.cpp

<span class="token number">16</span> directories, <span class="token number">98</span> files
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
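  <p>Two directories matter to us here.</p> 
  <ul> 
   <li>cppjieba/include : the header files</li> 
   <li>cppjieba/dict : the dictionaries</li> 
  </ul> 
  <blockquote> 
   <p>Now let's try jieba segmentation. The repository ships a demo.cpp file for testing, so we copy it to a working location.</p> 
  </blockquote> 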
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost test<span class="token punctuation">]</span>$ <span class="token builtin class-name">pwd</span>
/home/qkj/install/cppjieba/test
<span class="token punctuation">[</span>qkj@localhost test<span class="token punctuation">]</span>$ ll
total <span class="token number">16</span>
-rw-rw-r--. <span class="token number">1</span> qkj qkj  <span class="token number">148</span> Sep  <span class="token number">9</span> 03:38 CMakeLists.txt
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">2797</span> Sep  <span class="token number">9</span> 03:38 demo.cpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">1532</span> Sep  <span class="token number">9</span> 03:38 load_test.cpp
drwxrwxr-x. <span class="token number">4</span> qkj qkj <span class="token number">4096</span> Sep  <span class="token number">9</span> 03:38 testdata
drwxrwxr-x. <span class="token number">2</span> qkj qkj  <span class="token number">255</span> Sep  <span class="token number">9</span> 03:38 unittest
<span class="token punctuation">[</span>qkj@localhost test<span class="token punctuation">]</span>$ <span class="token function">cp</span> demo.cpp <span class="token punctuation">..</span>/<span class="token punctuation">..</span>
<span class="token punctuation">[</span>qkj@localhost test<span class="token punctuation">]</span>$ <span class="token builtin class-name">cd</span> <span class="token punctuation">..</span>/<span class="token punctuation">..</span>/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ll
total <span class="token number">8</span>
drwxr-xr-x. <span class="token number">8</span> qkj qkj <span class="token number">4096</span> Aug  <span class="token number">8</span> <span class="token number">14</span>:40 boost_1_83_0
drwxrwxr-x. <span class="token number">8</span> qkj qkj  <span class="token number">215</span> Sep  <span class="token number">9</span> 03:38 cppjieba
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">2797</span> Sep  <span class="token number">9</span> 03:49 demo.cpp
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>We cannot compile it directly yet; the compiler reports an error.</p> 
  <pre><code class="prism language-cpp"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ g<span class="token operator">++</span> demo<span class="token punctuation">.</span>cpp 
demo<span class="token punctuation">.</span>cpp<span class="token operator">:</span><span class="token number">1</span><span class="token operator">:</span><span class="token number">10</span><span class="token operator">:</span> fatal error<span class="token operator">:</span> cppjieba<span class="token operator">/</span>Jieba<span class="token punctuation">.</span>hpp<span class="token operator">:</span> No such file <span class="token operator">or</span> directory
 <span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"cppjieba/Jieba.hpp"</span></span>
          <span class="token operator">^</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span>
compilation terminated<span class="token punctuation">.</span>
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
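  <p>This happens because the header and dictionary paths do not match what demo.cpp expects from its new location, so we add symbolic links.</p> 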
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span>  cppjieba/include/ inc
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span>  cppjieba/dict/ dict
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ll
total <span class="token number">8</span>
drwxr-xr-x. <span class="token number">8</span> qkj qkj <span class="token number">4096</span> Aug  <span class="token number">8</span> <span class="token number">14</span>:40 boost_1_83_0
drwxrwxr-x. <span class="token number">8</span> qkj qkj  <span class="token number">215</span> Sep  <span class="token number">9</span> 03:38 cppjieba
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">2797</span> Sep  <span class="token number">9</span> 03:49 demo.cpp
lrwxrwxrwx. <span class="token number">1</span> qkj qkj   <span class="token number">14</span> Sep  <span class="token number">9</span> 03:50 dict -<span class="token operator">></span> cppjieba/dict/
lrwxrwxrwx. <span class="token number">1</span> qkj qkj   <span class="token number">17</span> Sep  <span class="token number">9</span> 03:50 inc -<span class="token operator">></span> cppjieba/include/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">cp</span> <span class="token parameter variable">-rf</span> cppjieba/deps/limonp/ cppjieba/include/cppjieba/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>Next we modify demo.cpp to use the new paths.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/ac95d7d621a54572883c91ab6a6b566a.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/ac95d7d621a54572883c91ab6a6b566a.jpg" alt="Boost搜索引擎_第8张图片" width="650" height="229" style="border:1px solid black;"></a></p> 
  <p>Compiling again still fails.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ g++ demo.cpp 
In <span class="token function">file</span> included from inc/cppjieba/Jieba.hpp:4,
                 from demo.cpp:1:
inc/cppjieba/QuerySegment.hpp:7:10: fatal error: limonp/Logging.hpp: No such <span class="token function">file</span> or directory
 <span class="token comment">#include "limonp/Logging.hpp"</span>
          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
</code></pre> 
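  <p>This is because cppjieba/deps/limonp is actually an empty directory (it is a git submodule that a plain <code>git clone</code> does not populate), so the copy made earlier contains nothing.</p> 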
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token builtin class-name">cd</span>  cppjieba/include/cppjieba/limonp/
<span class="token punctuation">[</span>qkj@localhost limonp<span class="token punctuation">]</span>$ ll
total <span class="token number">0</span>
<span class="token punctuation">[</span>qkj@localhost limonp<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>So we need to download it manually.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">git</span> clone https://github.com/yanyiwu/limonp.git
</code></pre> 
  <p>Then copy the downloaded headers into cppjieba/deps/limonp, and copy that directory again into cppjieba/include/cppjieba/.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">cp</span> <span class="token parameter variable">-rf</span> limonp/include/limonp/ cppjieba/deps/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">cp</span> <span class="token parameter variable">-rf</span> cppjieba/deps/limonp/ cppjieba/include/cppjieba/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>That is all; now compile again.</p> 
  <pre><code class="prism language-cpp"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ g<span class="token operator">++</span> demo<span class="token punctuation">.</span>cpp <span class="token operator">-</span>std<span class="token operator">=</span>c<span class="token operator">++</span><span class="token number">11</span>
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ll
total <span class="token number">480</span>
<span class="token operator">-</span>rwxrwxr<span class="token operator">-</span>x<span class="token punctuation">.</span> <span class="token number">1</span> qkj qkj <span class="token number">482896</span> Sep  <span class="token number">9</span> <span class="token number">05</span><span class="token operator">:</span><span class="token number">50</span> a<span class="token punctuation">.</span>out
drwxr<span class="token operator">-</span>xr<span class="token operator">-</span>x<span class="token punctuation">.</span> <span class="token number">8</span> qkj qkj   <span class="token number">4096</span> Aug  <span class="token number">8</span> <span class="token number">14</span><span class="token operator">:</span><span class="token number">40</span> boost_1_83_0
drwxrwxr<span class="token operator">-</span>x<span class="token punctuation">.</span> <span class="token number">8</span> qkj qkj    <span class="token number">215</span> Sep  <span class="token number">9</span> <span class="token number">03</span><span class="token operator">:</span><span class="token number">38</span> cppjieba
<span class="token operator">-</span>rw<span class="token operator">-</span>rw<span class="token operator">-</span>r<span class="token operator">--</span><span class="token punctuation">.</span> <span class="token number">1</span> qkj qkj   <span class="token number">2852</span> Sep  <span class="token number">9</span> <span class="token number">05</span><span class="token operator">:</span><span class="token number">28</span> demo<span class="token punctuation">.</span>cpp
lrwxrwxrwx<span class="token punctuation">.</span> <span class="token number">1</span> qkj qkj     <span class="token number">14</span> Sep  <span class="token number">9</span> <span class="token number">03</span><span class="token operator">:</span><span class="token number">50</span> dict <span class="token operator">-></span> cppjieba<span class="token operator">/</span>dict<span class="token operator">/</span>
lrwxrwxrwx<span class="token punctuation">.</span> <span class="token number">1</span> qkj qkj     <span class="token number">17</span> Sep  <span class="token number">9</span> <span class="token number">03</span><span class="token operator">:</span><span class="token number">50</span> inc <span class="token operator">-></span> cppjieba<span class="token operator">/</span>include<span class="token operator">/</span>
drwxrwxr<span class="token operator">-</span>x<span class="token punctuation">.</span> <span class="token number">6</span> qkj qkj    <span class="token number">171</span> Sep  <span class="token number">9</span> <span class="token number">05</span><span class="token operator">:</span><span class="token number">46</span> limonp
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token punctuation">.</span><span class="token operator">/</span>a<span class="token punctuation">.</span>out 
他来到了网易杭研大厦
<span class="token punctuation">[</span>demo<span class="token punctuation">]</span> Cut With HMM
他<span class="token operator">/</span>来到<span class="token operator">/</span>了<span class="token operator">/</span>网易<span class="token operator">/</span>杭研<span class="token operator">/</span>大厦
<span class="token punctuation">[</span>demo<span class="token punctuation">]</span> Cut Without HMM 
他<span class="token operator">/</span>来到<span class="token operator">/</span>了<span class="token operator">/</span>网易<span class="token operator">/</span>杭<span class="token operator">/</span>研<span class="token operator">/</span>大厦
我来到北京清华大学
<span class="token punctuation">[</span>demo<span class="token punctuation">]</span> CutAll
我<span class="token operator">/</span>来到<span class="token operator">/</span>北京<span class="token operator">/</span>清华<span class="token operator">/</span>清华大学<span class="token operator">/</span>华大<span class="token operator">/</span>大学
小明硕士毕业于中国科学院计算所,后在日本京都大学深造
<span class="token punctuation">[</span>demo<span class="token punctuation">]</span> CutForSearch
小明<span class="token operator">/</span>硕士<span class="token operator">/</span>毕业<span class="token operator">/</span>于<span class="token operator">/</span>中国<span class="token operator">/</span>科学<span class="token operator">/</span>学院<span class="token operator">/</span>科学院<span class="token operator">/</span>中国科学院<span class="token operator">/</span>计算<span class="token operator">/</span>计算所<span class="token operator">/</span>,<span class="token operator">/</span>后<span class="token operator">/</span>在<span class="token operator">/</span>日本<span class="token operator">/</span>京都<span class="token operator">/</span>大学<span class="token operator">/</span>日本京都大学<span class="token operator">/</span>深造
</code></pre> 
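  <p>The segmentation output above still contains words such as 了 and 于, which the earlier discussion calls stop words. As a hedged sketch of how they could be dropped after segmentation, the code below loads a stop-word list and filters a token vector. <code>LoadStopWords</code> and <code>RemoveStopWords</code> are hypothetical helpers rather than cppjieba APIs, and the sketch assumes the list (for example <code>dict/stop_words.utf8</code>) holds one word per line.</p>

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Loads one stop word per line (assumed file layout); blank lines are skipped.
std::unordered_set<std::string> LoadStopWords(std::istream &in) {
  std::unordered_set<std::string> words;
  std::string line;
  while (std::getline(in, line)) {
    if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
    if (!line.empty()) words.insert(line);
  }
  return words;
}

// Returns the tokens that are not stop words, preserving their order.
std::vector<std::string> RemoveStopWords(
    const std::vector<std::string> &tokens,
    const std::unordered_set<std::string> &stop) {
  std::vector<std::string> kept;
  for (const std::string &t : tokens) {
    if (stop.count(t) == 0) kept.push_back(t);
  }
  return kept;
}
```

  <p>Keeping the stop words in an <code>std::unordered_set</code> makes each membership test a hash lookup, which matters when every token of every document is checked.</p>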
  <h2>Index framework</h2> 
  <p>First we create a file.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">touch</span> index.hpp
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ ll
total <span class="token number">124</span>
drwxrwxr-x. <span class="token number">4</span> qkj qkj     <span class="token number">35</span> Sep  <span class="token number">9</span> 01:03 data
-rw-rw-r--. <span class="token number">1</span> qkj qkj      <span class="token number">0</span> Sep  <span class="token number">9</span> 02:48 index.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">117</span> Sep  <span class="token number">9</span> 01:41 Makefile
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">110008</span> Sep  <span class="token number">9</span> 02:48 parser
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6361</span> Sep  <span class="token number">9</span> 02:47 parser.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">783</span> Sep  <span class="token number">9</span> 02:48 util.hpp
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>The goal is clear: build a forward index and an inverted index, and expose two lookup interfaces.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">namespace</span> ns_index
<span class="token punctuation">{</span>
  <span class="token keyword">struct</span> <span class="token class-name">DocInfo</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>string title<span class="token punctuation">;</span>   <span class="token comment">// document title</span>
    std<span class="token double-colon punctuation">::</span>string content<span class="token punctuation">;</span> <span class="token comment">// document content</span>
    std<span class="token double-colon punctuation">::</span>string url<span class="token punctuation">;</span>     <span class="token comment">// URL on the official site</span>

    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span> <span class="token comment">// document id (explained later)</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token comment">/// @brief Helper element stored in the inverted index</span>
  <span class="token keyword">struct</span> <span class="token class-name">InvertedElem</span>
  <span class="token punctuation">{</span>
    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span>  <span class="token comment">// document id</span>
    std<span class="token double-colon punctuation">::</span>string word<span class="token punctuation">;</span> <span class="token comment">// keyword</span>
    <span class="token keyword">int</span> weight<span class="token punctuation">;</span>       <span class="token comment">// weight (explained later)</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token comment">// Inverted list (posting list): the group of InvertedElem entries reached through one keyword</span>
  <span class="token keyword">typedef</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span><span class="token keyword">struct</span> <span class="token class-name">InvertedElem</span><span class="token operator">></span> InvertedList<span class="token punctuation">;</span>

  <span class="token keyword">class</span> <span class="token class-name">Index</span>
  <span class="token punctuation">{</span>
  <span class="token keyword">public</span><span class="token operator">:</span>

    <span class="token comment">/// @brief Look up the forward index by doc_id, i.e. fetch the document content</span>
    <span class="token comment">/// @param doc_id  document id</span>
    <span class="token comment">/// @return pointer to the document struct</span>
    <span class="token keyword">struct</span> <span class="token class-name">DocInfo</span> <span class="token operator">*</span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/// @brief Get the inverted list for a keyword</span>
    <span class="token comment">/// @param word keyword</span>
    <span class="token comment">/// @return pointer to the inverted list</span>
    InvertedList <span class="token operator">*</span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>word<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

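    <span class="token comment">/// @brief Build the forward and inverted indexes from the cleaned data file; this is the most important step</span>
    <span class="token comment">/// @param src_path path of the cleaned (tag-stripped) data file</span>
    <span class="token comment">/// @return</span>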
    <span class="token keyword">bool</span> <span class="token function">BuildIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src_path<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// build the forward index</span>
      <span class="token comment">// build the inverted index</span>
      <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
      
  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">// These two helpers are internal and not exposed to callers.</span>
    <span class="token comment">/// @brief Build a forward index entry from one line, so that a document id maps to its content</span>
    <span class="token comment">/// @param line a string holding all the saved fields of one html document</span>
    <span class="token comment">/// @return</span>
    DocInfo <span class="token operator">*</span><span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>line<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/// @brief Build the inverted index from one document struct; requires word segmentation</span>
    <span class="token comment">/// @param doc  the document struct</span>
    <span class="token comment">/// @return</span>
    <span class="token keyword">bool</span> <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> DocInfo <span class="token operator">&</span>doc<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">// Forward index: the vector subscript doubles as the doc id, which gives efficient lookup of the content</span>
    std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span><span class="token keyword">struct</span> <span class="token class-name">DocInfo</span><span class="token operator">></span> forward_index<span class="token punctuation">;</span>
    <span class="token comment">// Inverted index: a keyword may appear in many documents, so each keyword maps to a group of InvertedElem</span>
    std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> InvertedList<span class="token operator">></span> inverted_index<span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Next, we implement each of these functions in turn.</p> 
  <h3>BuildIndex: building the index</h3> 
  <pre><code class="prism language-cpp"><span class="token keyword">bool</span> <span class="token function">BuildIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src_path<span class="token punctuation">)</span><span class="token punctuation">;</span>
</code></pre> 
  <p>This function builds both indexes from the data we have already cleaned. Each line of the cleaned file holds one document.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">bool</span> <span class="token function">BuildIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src_path<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  std<span class="token double-colon punctuation">::</span>ifstream <span class="token function">in</span><span class="token punctuation">(</span>src_path<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>in <span class="token operator">|</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>binary<span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token keyword">if</span> <span class="token punctuation">(</span>in<span class="token punctuation">.</span><span class="token function">is_open</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"input file "</span> <span class="token operator"><<</span> src_path <span class="token operator"><<</span> <span class="token string">" could not be opened"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">int</span> count <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span> <span class="token comment">// lets us watch the indexing progress</span>
  std<span class="token double-colon punctuation">::</span>string line<span class="token punctuation">;</span> 
  <span class="token keyword">while</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">getline</span><span class="token punctuation">(</span>in<span class="token punctuation">,</span> line<span class="token punctuation">)</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    <span class="token comment">// Each line holds the cleaned content of one HTML document</span>
    <span class="token comment">// Build the forward index entry</span>
    DocInfo <span class="token operator">*</span>doc <span class="token operator">=</span> <span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span>line<span class="token punctuation">)</span><span class="token punctuation">;</span> 
    
    <span class="token keyword">if</span> <span class="token punctuation">(</span>doc <span class="token operator">==</span> <span class="token keyword">nullptr</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"failed to build forward index entry: "</span> <span class="token operator"><<</span> line <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
      <span class="token keyword">continue</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Build the inverted index entries for this document</span>
    <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token operator">*</span>doc<span class="token punctuation">)</span><span class="token punctuation">;</span>
    count<span class="token operator">++</span><span class="token punctuation">;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span>count <span class="token operator">%</span> <span class="token number">50</span> <span class="token operator">==</span> <span class="token number">0</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// A progress bar could be added later</span>
       std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"documents indexed so far: "</span> <span class="token operator"><<</span> count <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
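  <h4>Building the forward index</h4> 
  <p>This part is easy to implement. The array subscript is naturally the document ID, so all we need to do is turn each cleaned document into a struct and append it to the array.</p> 
  <pre><code class="prism language-cpp"><span class="token comment">/// @brief Builds a forward index entry from a string, so that a doc id leads to the document content</span>
<span class="token comment">/// @param line a string holding the full content of one HTML document</span>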
<span class="token comment">/// @return</span>
DocInfo <span class="token operator">*</span><span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>line<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// title\3content\3url\n</span>

  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> results<span class="token punctuation">;</span>
  <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string sep <span class="token operator">=</span> <span class="token string">"\3"</span><span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">StringUtil</span><span class="token double-colon punctuation">::</span><span class="token function">Split</span><span class="token punctuation">(</span>line<span class="token punctuation">,</span> <span class="token operator">&</span>results<span class="token punctuation">,</span> sep<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// string splitting helper from the utility header</span>

  <span class="token keyword">if</span> <span class="token punctuation">(</span>results<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token number">3</span><span class="token punctuation">)</span>
    <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>

  DocInfo doc<span class="token punctuation">;</span>
  doc<span class="token punctuation">.</span>title <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
  doc<span class="token punctuation">.</span>content <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
  doc<span class="token punctuation">.</span>url <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">2</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
  <span class="token comment">// The doc id is simply the array subscript</span>
  doc<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// note: this is the size of the forward index before the insert</span>

  forward_index<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">move</span><span class="token punctuation">(</span>doc<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>forward_index<span class="token punctuation">[</span>forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
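  <p>The function above does two things: it splits the line into three fields, then uses the vector's current size as the new document's id. The following is a minimal, Boost-free sketch of that idea, not the project's code. The names <code>Doc</code>, <code>ParseLine</code> and <code>AddDoc</code> are invented for this illustration, and unlike the version above the sketch returns the id instead of a pointer.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the article's DocInfo.
struct Doc {
    std::string title;
    std::string content;
    std::string url;
    std::uint64_t doc_id = 0;
};

// Parses "title\3content\3url". Returns false unless the line
// contains exactly two '\3' separators, i.e. exactly three fields.
bool ParseLine(const std::string &line, Doc *out) {
    const char sep = '\3';
    const std::size_t first = line.find(sep);
    if (first == std::string::npos) return false;
    const std::size_t second = line.find(sep, first + 1);
    if (second == std::string::npos) return false;
    if (line.find(sep, second + 1) != std::string::npos) return false;
    out->title = line.substr(0, first);
    out->content = line.substr(first + 1, second - first - 1);
    out->url = line.substr(second + 1);
    return true;
}

// Appends the parsed document; its id is the subscript it lands at.
// Returns the assigned id, or -1 if the line is malformed.
long long AddDoc(std::vector<Doc> *index, const std::string &line) {
    Doc doc;
    if (!ParseLine(line, &doc)) return -1;
    doc.doc_id = index->size();
    index->push_back(std::move(doc));
    return static_cast<long long>(index->size() - 1);
}
```

  <p>Returning the id sidesteps one subtlety: a pointer into a <code>std::vector</code> can dangle once a later <code>push_back</code> reallocates the storage. The article's <code>BuildIndex</code> uses the returned pointer immediately, before the next insertion, so it is not affected.</p>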
  <p>Now let's write the corresponding code in the utility header.</p> 
  <pre><code class="prism language-cpp"><span class="token comment">/// @brief String splitting helper</span>
<span class="token keyword">class</span> <span class="token class-name">StringUtil</span>
<span class="token punctuation">{</span>
<span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token keyword">static</span> <span class="token keyword">void</span> <span class="token function">Split</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>target<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> <span class="token operator">*</span>out<span class="token punctuation">,</span> <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>sep<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token function">assert</span><span class="token punctuation">(</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token comment">// Use Boost's ready-made split; token_compress_on merges adjacent separators</span>
      boost<span class="token double-colon punctuation">::</span><span class="token function">split</span><span class="token punctuation">(</span><span class="token operator">*</span>out<span class="token punctuation">,</span> target<span class="token punctuation">,</span> boost<span class="token double-colon punctuation">::</span><span class="token function">is_any_of</span><span class="token punctuation">(</span>sep<span class="token punctuation">)</span><span class="token punctuation">,</span>
                   boost<span class="token double-colon punctuation">::</span>token_compress_on<span class="token punctuation">)</span><span class="token punctuation">;</span>
	<span class="token punctuation">}</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>
</code></pre> 
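  <p>One consequence of <code>boost::token_compress_on</code> is worth seeing concretely: adjacent separators are merged, so a record whose middle field is empty yields two fields instead of three and is rejected by the <code>results.size() != 3</code> check. The sketch below only imitates that merging behaviour with the standard library for a single-character separator; <code>SplitFields</code> is a hypothetical helper, not Boost's implementation.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits `text` on `sep`. When `compress` is true, a run of adjacent
// separators counts as a single separator, so empty fields in the
// middle of the string disappear.
std::vector<std::string> SplitFields(const std::string &text, char sep, bool compress) {
    std::vector<std::string> out;
    std::string cur;
    bool prev_was_sep = false;
    for (char c : text) {
        if (c == sep) {
            // Skip the cut only when compressing and the previous char was also a separator.
            if (!(compress && prev_was_sep)) {
                out.push_back(cur);
                cur.clear();
            }
            prev_was_sep = true;
        } else {
            cur.push_back(c);
            prev_was_sep = false;
        }
    }
    out.push_back(cur);  // the last field has no trailing separator
    return out;
}
```

  <p>With the input <code>"title\3\3url"</code>, the compressed split returns two fields and the uncompressed one returns three, the middle one empty. Whether dropping such records is acceptable depends on how the cleaning step treats pages with no body text.</p>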
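  <h4>Building the inverted index</h4> 
  <p>Next we build the inverted index from the DocInfo struct that was just added to the forward index. This step requires word segmentation.</p> 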
  <pre><code class="prism language-cpp"><span class="token keyword">struct</span> <span class="token class-name">word_cnt</span>
<span class="token punctuation">{</span>
  <span class="token keyword">int</span> title_cnt<span class="token punctuation">;</span>
  <span class="token keyword">int</span> content_cnt<span class="token punctuation">;</span>
  <span class="token function">word_cnt</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">title_cnt</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token function">content_cnt</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>

<span class="token keyword">bool</span> <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> DocInfo <span class="token operator">&</span>doc<span class="token punctuation">)</span>
<span class="token punctuation">{</span>

  <span class="token comment">// Temporary per-document word frequencies</span>
  std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> word_cnt<span class="token operator">></span> word_map<span class="token punctuation">;</span>
  
  <span class="token comment">// 1. Tokenize the title</span>
  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> title_words<span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>doc<span class="token punctuation">.</span>title<span class="token punctuation">,</span> <span class="token operator">&</span>title_words<span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token comment">// The index is case-insensitive: every word is stored in lowercase,</span>
  <span class="token comment">// so user queries must be lowercased in the same way</span>
  <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> title_words<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span> 
    word_map<span class="token punctuation">[</span>s<span class="token punctuation">]</span><span class="token punctuation">.</span>title_cnt<span class="token operator">++</span><span class="token punctuation">;</span> <span class="token comment">// operator[] creates a zeroed word_cnt the first time a word is seen</span>
  <span class="token punctuation">}</span>

    
  <span class="token comment">// 2. Tokenize the document content</span>
  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> content_words<span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>doc<span class="token punctuation">.</span>content<span class="token punctuation">,</span> <span class="token operator">&</span>content_words<span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> s <span class="token operator">:</span> content_words<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
    word_map<span class="token punctuation">[</span>s<span class="token punctuation">]</span><span class="token punctuation">.</span>content_cnt<span class="token operator">++</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  
  <span class="token comment">// At this point every word has its occurrence counts for the title and for the content</span>
   
  <span class="token comment">// 3. Build the posting lists</span>
  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>word_pair <span class="token operator">:</span> word_map<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    <span class="token comment">/*
    struct InvertedElem
    {
        uint64_t doc_id;  // document id
        std::string word; // keyword
        int weight;       // weight, explained below
    };
    */</span>
    
    InvertedElem item<span class="token punctuation">;</span> 
    
    item<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> doc<span class="token punctuation">.</span>doc_id<span class="token punctuation">;</span> <span class="token comment">// this is why DocInfo stores its own id</span>
    item<span class="token punctuation">.</span>word <span class="token operator">=</span> word_pair<span class="token punctuation">.</span>first<span class="token punctuation">;</span>
    item<span class="token punctuation">.</span>weight <span class="token operator">=</span> <span class="token function">_build_relevance</span><span class="token punctuation">(</span>word_pair<span class="token punctuation">.</span>second<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// computes the weight</span>
    
    
    <span class="token comment">// Append to the posting list for this word</span>
    <span class="token comment">// typedef std::vector<struct InvertedElem> InvertedList;</span>
    <span class="token comment">// std::unordered_map<std::string, InvertedList> inverted_index;</span>
    InvertedList <span class="token operator">&</span>inverted_list <span class="token operator">=</span> inverted_index<span class="token punctuation">[</span>word_pair<span class="token punctuation">.</span>first<span class="token punctuation">]</span><span class="token punctuation">;</span>
    inverted_list<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">move</span><span class="token punctuation">(</span>item<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
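  <p>To see the three steps in isolation (count per-document word frequencies, turn them into a weight, append to each word's posting list), here is a small sketch that runs without cppjieba. It swaps jieba for a plain whitespace tokenizer with ASCII lowercasing, which is only adequate for simple English text. <code>Posting</code>, <code>PostingIndex</code>, <code>Tokenize</code> and <code>AddToIndex</code> are names invented for this example, and the weight formula is the same 10-to-1 title/content ratio used later in this article.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the article's InvertedElem.
struct Posting {
    std::uint64_t doc_id;
    int weight;
};

using PostingIndex = std::unordered_map<std::string, std::vector<Posting>>;

// Whitespace tokenizer with ASCII lowercasing; a stand-in for jieba.
std::vector<std::string> Tokenize(const std::string &text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string w;
    while (in >> w) {
        for (char &c : w) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        words.push_back(w);
    }
    return words;
}

// Adds one document's words to the inverted index.
void AddToIndex(PostingIndex *index, std::uint64_t doc_id,
                const std::string &title, const std::string &content) {
    struct Count { int title = 0; int content = 0; };
    std::unordered_map<std::string, Count> counts;

    // Step 1 and 2: count occurrences in the title and in the content.
    for (const auto &w : Tokenize(title)) counts[w].title++;
    for (const auto &w : Tokenize(content)) counts[w].content++;

    // Step 3: one posting per distinct word in this document.
    for (const auto &kv : counts) {
        const int weight = 10 * kv.second.title + kv.second.content;
        (*index)[kv.first].push_back(Posting{doc_id, weight});
    }
}
```

  <p>Because documents are added in id order, each posting list ends up ordered by <code>doc_id</code>; ordering by weight is left to query time.</p>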
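  <h5>Bringing in jieba</h5> 
  <p>Because the inverted index needs word segmentation, we bring in cppjieba and wrap the string-cutting call in a utility class. The library headers and its dictionaries are made available to the project through symbolic links.</p> 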
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span> /home/qkj/install/cppjieba/include/cppjieba cppjieba
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span> /home/qkj/install/cppjieba/dict/ dict
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ ll
total <span class="token number">24</span>
lrwxrwxrwx. <span class="token number">1</span> qkj qkj   <span class="token number">43</span> Sep  <span class="token number">9</span> 06:00 cppjieba -<span class="token operator">></span> /home/qkj/install/cppjieba/include/cppjieba
drwxrwxr-x. <span class="token number">4</span> qkj qkj   <span class="token number">35</span> Sep  <span class="token number">9</span> 01:03 data
lrwxrwxrwx. <span class="token number">1</span> qkj qkj   <span class="token number">32</span> Sep  <span class="token number">9</span> 06:01 dict -<span class="token operator">></span> /home/qkj/install/cppjieba/dict/
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">6379</span> Sep  <span class="token number">9</span> 03:15 index.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj  <span class="token number">117</span> Sep  <span class="token number">9</span> 01:41 Makefile
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">6361</span> Sep  <span class="token number">9</span> 02:47 parser.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">1199</span> Sep  <span class="token number">9</span> 03:15 util.hpp
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>With that in place, we can write the word-segmentation utility.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> DICT_PATH <span class="token operator">=</span> <span class="token string">"./dict/jieba.dict.utf8"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> HMM_PATH <span class="token operator">=</span> <span class="token string">"./dict/hmm_model.utf8"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> USER_DICT_PATH <span class="token operator">=</span> <span class="token string">"./dict/user.dict.utf8"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> IDF_PATH <span class="token operator">=</span> <span class="token string">"./dict/idf.utf8"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> STOP_WORD_PATH <span class="token operator">=</span> <span class="token string">"./dict/stop_words.utf8"</span><span class="token punctuation">;</span>

<span class="token comment">/// @brief Thin wrapper around cppjieba word segmentation</span>
<span class="token keyword">class</span> <span class="token class-name">JiebaUtil</span>
<span class="token punctuation">{</span>
<span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token keyword">static</span> <span class="token keyword">void</span> <span class="token function">CutString</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> <span class="token operator">*</span>out<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
    	<span class="token function">assert</span><span class="token punctuation">(</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
    	jieba<span class="token punctuation">.</span><span class="token function">CutForSearch</span><span class="token punctuation">(</span>src<span class="token punctuation">,</span> <span class="token operator">*</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
	<span class="token punctuation">}</span>
<span class="token keyword">private</span><span class="token operator">:</span>
	<span class="token keyword">static</span> cppjieba<span class="token double-colon punctuation">::</span>Jieba jieba<span class="token punctuation">;</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>
cppjieba<span class="token double-colon punctuation">::</span>Jieba <span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">jieba</span><span class="token punctuation">(</span>DICT_PATH<span class="token punctuation">,</span> HMM_PATH<span class="token punctuation">,</span> USER_DICT_PATH<span class="token punctuation">,</span> IDF_PATH<span class="token punctuation">,</span> STOP_WORD_PATH<span class="token punctuation">)</span><span class="token punctuation">;</span>
</code></pre> 
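  <h5>Computing the weight</h5> 
  <p>First, what is a weight? It can be understood as a measure of how relevant a document is to a keyword: the more often the keyword appears in a document, the higher that document's weight for it. We keep the calculation simple here and only distinguish where the word appears, so that an occurrence in the title counts for more than an occurrence in the body.</p> 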
  <pre><code class="prism language-cpp">    <span class="token keyword">int</span> <span class="token function">_build_relevance</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">struct</span> <span class="token class-name">word_cnt</span> <span class="token operator">&</span>word<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// A title occurrence counts ten times as much as a body occurrence</span>
      <span class="token keyword">constexpr</span> <span class="token keyword">int</span> kTitleWeight <span class="token operator">=</span> <span class="token number">10</span><span class="token punctuation">;</span>
      <span class="token keyword">constexpr</span> <span class="token keyword">int</span> kContentWeight <span class="token operator">=</span> <span class="token number">1</span><span class="token punctuation">;</span>
      <span class="token keyword">return</span> kTitleWeight <span class="token operator">*</span> word<span class="token punctuation">.</span>title_cnt <span class="token operator">+</span> kContentWeight <span class="token operator">*</span> word<span class="token punctuation">.</span>content_cnt<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
</code></pre> 
  <p>So what is the weight used for? When we search, one keyword can correspond to many documents, and the weight lets us put the most relevant documents first.</p> 
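  <p>As a worked example of the formula, a word that appears twice in the title and once in the body gets a weight of 10 * 2 + 1 * 1 = 21. Ranking then amounts to sorting a posting list by weight in descending order. The sketch below shows one way to do that; <code>Hit</code> and <code>RankByWeight</code> are hypothetical names, and the use of a stable sort (so that ties keep their original order) is a choice made for this example rather than something the article prescribes.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical search hit: a document id plus its weight for the query word.
struct Hit {
    std::uint64_t doc_id;
    int weight;
};

// Returns the hits ordered from highest to lowest weight.
// Hits with equal weight keep their original relative order.
std::vector<Hit> RankByWeight(std::vector<Hit> hits) {
    std::stable_sort(hits.begin(), hits.end(),
                     [](const Hit &a, const Hit &b) { return a.weight > b.weight; });
    return hits;
}
```

  <p>The function takes its argument by value on purpose: the stored posting list stays untouched and only the copy returned to the caller is reordered.</p>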
  <p>At this point our data structures look like this.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/ccc26dd4848f41eb87fd09db422de84b.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/ccc26dd4848f41eb87fd09db422de84b.jpg" alt="Boost搜索引擎_第9张图片" width="650" height="250" style="border:1px solid black;"></a></p> 
  <h3><code>GetForwardIndex</code></h3> 
  <p>This function looks up a document's content by its ID.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">struct</span> <span class="token class-name">DocInfo</span> <span class="token operator">*</span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// doc_id is unsigned, so only the upper bound needs checking</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>doc_id <span class="token operator">>=</span> forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"doc id "</span> <span class="token operator"><<</span> doc_id <span class="token operator"><<</span> <span class="token string">" is out of range"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>forward_index<span class="token punctuation">[</span>doc_id<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h3><code>GetInvertedList</code></h3> 
  <p>This function returns the posting list for a keyword.</p> 
  <pre><code class="prism language-cpp">InvertedList <span class="token operator">*</span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>word<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token keyword">auto</span> it <span class="token operator">=</span> inverted_index<span class="token punctuation">.</span><span class="token function">find</span><span class="token punctuation">(</span>word<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>it <span class="token operator">==</span> inverted_index<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"keyword "</span> <span class="token operator"><<</span> word <span class="token operator"><<</span> <span class="token string">" not found"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>it<span class="token operator">-></span>second<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
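  <p>To make the calling convention concrete, here is a minimal, self-contained sketch of the same lookup pattern. The free function <code>Lookup</code> and the sample entries are hypothetical and exist only for illustration; the project's real version is the member function shown above. The point is that callers must check for <code>nullptr</code> before dereferencing the result.</p> 
  <pre><code class="prism language-cpp">#include &lt;cstdint&gt;
#include &lt;iostream&gt;
#include &lt;string&gt;
#include &lt;unordered_map&gt;
#include &lt;vector&gt;

// Same shape as the article's InvertedElem.
struct InvertedElem
{
  uint64_t doc_id;  // document ID
  std::string word; // keyword
  int weight;       // relevance weight
};

using InvertedList = std::vector&lt;InvertedElem&gt;;
using InvertedIndex = std::unordered_map&lt;std::string, InvertedList&gt;;

// Hypothetical free-function version of GetInvertedList:
// returns nullptr when the keyword is absent.
const InvertedList *Lookup(const InvertedIndex &amp;index, const std::string &amp;word)
{
  auto it = index.find(word);
  if (it == index.end())
    return nullptr;
  return &amp;(it-&gt;second);
}

int main()
{
  InvertedIndex index;
  // Made-up sample data: "boost" appears in documents 0 and 3.
  index["boost"].push_back({0, "boost", 11});
  index["boost"].push_back({3, "boost", 2});

  const InvertedList *list = Lookup(index, "boost");
  if (list != nullptr)
  {
    for (const auto &amp;elem : *list)
      std::cout &lt;&lt; "doc " &lt;&lt; elem.doc_id &lt;&lt; " weight " &lt;&lt; elem.weight &lt;&lt; std::endl;
  }

  if (Lookup(index, "missing") == nullptr)
    std::cout &lt;&lt; "keyword not found" &lt;&lt; std::endl;
  return 0;
}
</code></pre> 
  <p>Returning a pointer rather than a copy avoids duplicating a potentially long list. The pointer is only valid while the index itself is alive and unchanged.</p> 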
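  <p>One small task remains: turning <code>Index</code> into a singleton.</p> 
  <h2>Making Index a Singleton</h2> 
  <p>We make <code>Index</code> a singleton for two reasons. First, the Boost search engine never needs more than one <code>Index</code> object; a single index is enough to serve every lookup. Second, building an index is very expensive: every page has to be tokenized, counted, filled in, and inserted, so building it more than once would waste a great deal of time.</p> 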
  <pre><code class="prism language-cpp"><span class="token keyword">namespace</span> ns_index
<span class="token punctuation">{</span>
  <span class="token keyword">struct</span> <span class="token class-name">DocInfo</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>string title<span class="token punctuation">;</span>   <span class="token comment">// document title</span>
    std<span class="token double-colon punctuation">::</span>string content<span class="token punctuation">;</span> <span class="token comment">// document content</span>
    std<span class="token double-colon punctuation">::</span>string url<span class="token punctuation">;</span>     <span class="token comment">// URL on the official site</span>

    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span> <span class="token comment">// document ID; needed when building the inverted index</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token comment">/// @brief One entry of an inverted list (helper for the inverted index)</span>
  <span class="token keyword">struct</span> <span class="token class-name">InvertedElem</span>
  <span class="token punctuation">{</span>
    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span>  <span class="token comment">// document ID</span>
    std<span class="token double-colon punctuation">::</span>string word<span class="token punctuation">;</span> <span class="token comment">// keyword</span>
    <span class="token keyword">int</span> weight<span class="token punctuation">;</span>       <span class="token comment">// weight</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token comment">// Inverted list: the group of InvertedElem entries that one keyword maps to</span>
  <span class="token keyword">typedef</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span><span class="token keyword">struct</span> <span class="token class-name">InvertedElem</span><span class="token operator">></span> InvertedList<span class="token punctuation">;</span>

  <span class="token keyword">class</span> <span class="token class-name">Index</span>
  <span class="token punctuation">{</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token function">Index</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token function">Index</span><span class="token punctuation">(</span><span class="token keyword">const</span> Index <span class="token operator">&</span><span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">delete</span><span class="token punctuation">;</span>
    Index <span class="token operator">&</span><span class="token keyword">operator</span><span class="token operator">=</span><span class="token punctuation">(</span><span class="token keyword">const</span> Index <span class="token operator">&</span><span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">delete</span><span class="token punctuation">;</span>
    <span class="token keyword">static</span> Index <span class="token operator">*</span>instance<span class="token punctuation">;</span>
    <span class="token keyword">static</span> std<span class="token double-colon punctuation">::</span>mutex mtx<span class="token punctuation">;</span>

  <span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token operator">~</span><span class="token function">Index</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
    <span class="token punctuation">}</span>
    <span class="token keyword">static</span> Index <span class="token operator">*</span><span class="token function">GetInstance</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// Creating the instance is not thread-safe on its own, so guard it with a lock</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> instance<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        mtx<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span>instance <span class="token operator">==</span> <span class="token keyword">nullptr</span><span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          instance <span class="token operator">=</span> <span class="token keyword">new</span> Index<span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
        mtx<span class="token punctuation">.</span><span class="token function">unlock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      <span class="token keyword">return</span> instance<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/// @brief Look up the forward index by doc_id, i.e. fetch the document content</span>
    <span class="token comment">/// @param doc_id  document ID</span>
    <span class="token comment">/// @return pointer to the document struct, or nullptr if doc_id is out of range</span>
    <span class="token keyword">struct</span> <span class="token class-name">DocInfo</span> <span class="token operator">*</span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// doc_id is unsigned, so only the upper bound needs checking</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span>doc_id <span class="token operator">>=</span> forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"doc_id "</span> <span class="token operator"><<</span> doc_id <span class="token operator"><<</span> <span class="token string">" is out of range"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>forward_index<span class="token punctuation">[</span>doc_id<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

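    <span class="token comment">/// @brief Look up the inverted list for a keyword</span>
    <span class="token comment">/// @param word keyword</span>
    <span class="token comment">/// @return pointer to the inverted list, or nullptr if the keyword is not indexed</span>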
    InvertedList <span class="token operator">*</span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>word<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">auto</span> it <span class="token operator">=</span> inverted_index<span class="token punctuation">.</span><span class="token function">find</span><span class="token punctuation">(</span>word<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span>it <span class="token operator">==</span> inverted_index<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"keyword "</span> <span class="token operator"><<</span> word <span class="token operator"><<</span> <span class="token string">" not found"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>it<span class="token operator">-></span>second<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

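    <span class="token comment">/// @brief Build the forward and inverted indexes from the cleaned data file; this is the most important step</span>
    <span class="token comment">/// @param src_path path of the file produced by stripping the HTML tags</span>
    <span class="token comment">/// @return true on success, false if the file cannot be opened</span>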
    <span class="token keyword">bool</span> <span class="token function">BuildIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src_path<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      std<span class="token double-colon punctuation">::</span>ifstream <span class="token function">in</span><span class="token punctuation">(</span>src_path<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>in <span class="token operator">|</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>binary<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token keyword">if</span> <span class="token punctuation">(</span>in<span class="token punctuation">.</span><span class="token function">is_open</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"file path "</span> <span class="token operator"><<</span> src_path <span class="token operator"><<</span> <span class="token string">" is invalid"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
        <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">int</span> count <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
      std<span class="token double-colon punctuation">::</span>string line<span class="token punctuation">;</span>
      <span class="token keyword">while</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">getline</span><span class="token punctuation">(</span>in<span class="token punctuation">,</span> line<span class="token punctuation">)</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        <span class="token comment">// Each line holds the cleaned content of one HTML document</span>
        <span class="token comment">// Build the forward index entry</span>
        DocInfo <span class="token operator">*</span>doc <span class="token operator">=</span> <span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span>line<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span>doc <span class="token operator">==</span> <span class="token keyword">nullptr</span><span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"failed to build forward index entry for line: "</span> <span class="token operator"><<</span> line <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
          <span class="token keyword">continue</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>

        <span class="token comment">// Build the inverted index entries</span>
        <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token operator">*</span>doc<span class="token punctuation">)</span><span class="token punctuation">;</span>
        count<span class="token operator">++</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span>count <span class="token operator">%</span> <span class="token number">50</span> <span class="token operator">==</span> <span class="token number">0</span><span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token comment">// A progress bar could be added here later</span>
          <span class="token comment">// LOG(NORMAL, "Indexed " + std::to_string(count) + " documents so far");</span>
          std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"documents indexed so far: "</span> <span class="token operator"><<</span> count <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
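    <span class="token comment">/// @brief Build one forward index entry from a line, so that a document ID maps to its content</span>
    <span class="token comment">/// @param line a string holding the full cleaned content of one HTML document</span>
    <span class="token comment">/// @return pointer to the new entry, or nullptr if the line is malformed</span>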
    DocInfo <span class="token operator">*</span><span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>line<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// title\3content\3url\n</span>

      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> results<span class="token punctuation">;</span>
      <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string sep <span class="token operator">=</span> <span class="token string">"\3"</span><span class="token punctuation">;</span>
      
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">StringUtil</span><span class="token double-colon punctuation">::</span><span class="token function">Split</span><span class="token punctuation">(</span>line<span class="token punctuation">,</span> <span class="token operator">&</span>results<span class="token punctuation">,</span> sep<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token keyword">if</span> <span class="token punctuation">(</span>results<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token number">3</span><span class="token punctuation">)</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>

      DocInfo doc<span class="token punctuation">;</span>
      doc<span class="token punctuation">.</span>title <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
      doc<span class="token punctuation">.</span>content <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
      doc<span class="token punctuation">.</span>url <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">2</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
      doc<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the ID is the position this document will occupy in the forward index</span>

      forward_index<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">move</span><span class="token punctuation">(</span>doc<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>forward_index<span class="token punctuation">[</span>forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Used for word-frequency counting</span>
    <span class="token keyword">struct</span> <span class="token class-name">word_cnt</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">int</span> title_cnt<span class="token punctuation">;</span>
      <span class="token keyword">int</span> content_cnt<span class="token punctuation">;</span>
      <span class="token function">word_cnt</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">title_cnt</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token function">content_cnt</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token punctuation">}</span><span class="token punctuation">;</span>

    <span class="token comment">/// @brief Build the inverted index entries for one document; this requires tokenization</span>
    <span class="token comment">/// @param doc  the document struct</span>
    <span class="token comment">/// @return true on success</span>
    <span class="token keyword">bool</span> <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> DocInfo <span class="token operator">&</span>doc<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>

      <span class="token comment">// Temporary word-frequency table</span>
      std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> word_cnt<span class="token operator">></span> word_map<span class="token punctuation">;</span>
      <span class="token comment">// 1. Tokenize the title</span>
      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> title_words<span class="token punctuation">;</span>
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>doc<span class="token punctuation">.</span>title<span class="token punctuation">,</span> <span class="token operator">&</span>title_words<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token comment">// The index is case-insensitive, so every token is lowercased</span>
      <span class="token comment">// The user's query must be lowercased in the same way</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> title_words<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
        word_map<span class="token punctuation">[</span>s<span class="token punctuation">]</span><span class="token punctuation">.</span>title_cnt<span class="token operator">++</span><span class="token punctuation">;</span> <span class="token comment">// operator[] inserts a zero-initialized word_cnt on first use</span>
      <span class="token punctuation">}</span>

      <span class="token comment">// 2. Tokenize the content</span>
      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> content_words<span class="token punctuation">;</span>
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>doc<span class="token punctuation">.</span>content<span class="token punctuation">,</span> <span class="token operator">&</span>content_words<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> s <span class="token operator">:</span> content_words<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
        word_map<span class="token punctuation">[</span>s<span class="token punctuation">]</span><span class="token punctuation">.</span>content_cnt<span class="token operator">++</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      <span class="token comment">// 3. Build the inverted list entries</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>word_pair <span class="token operator">:</span> word_map<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        InvertedElem item<span class="token punctuation">;</span>
        item<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> doc<span class="token punctuation">.</span>doc_id<span class="token punctuation">;</span> <span class="token comment">// this is why DocInfo carries a doc_id</span>
        item<span class="token punctuation">.</span>word <span class="token operator">=</span> word_pair<span class="token punctuation">.</span>first<span class="token punctuation">;</span>
        item<span class="token punctuation">.</span>weight <span class="token operator">=</span> <span class="token function">_build_relevance</span><span class="token punctuation">(</span>word_pair<span class="token punctuation">.</span>second<span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment">// Append the entry to this word's inverted list</span>
        InvertedList <span class="token operator">&</span>inverted_list <span class="token operator">=</span> inverted_index<span class="token punctuation">[</span>word_pair<span class="token punctuation">.</span>first<span class="token punctuation">]</span><span class="token punctuation">;</span>
        inverted_list<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">move</span><span class="token punctuation">(</span>item<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">/// @brief Compute the relevance weight of a word within one document</span>
    <span class="token comment">/// @param word occurrence counts of the word in the title and in the content</span>
    <span class="token comment">/// @return the weight; a title hit counts ten times as much as a content hit</span>
    <span class="token keyword">int</span> <span class="token function">_build_relevance</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">struct</span> <span class="token class-name">word_cnt</span> <span class="token operator">&</span>word<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">const</span> <span class="token keyword">int</span> title_weight <span class="token operator">=</span> <span class="token number">10</span><span class="token punctuation">;</span>  <span class="token comment">// weight of one occurrence in the title</span>
      <span class="token keyword">const</span> <span class="token keyword">int</span> content_weight <span class="token operator">=</span> <span class="token number">1</span><span class="token punctuation">;</span> <span class="token comment">// weight of one occurrence in the content</span>
      <span class="token keyword">return</span> title_weight <span class="token operator">*</span> word<span class="token punctuation">.</span>title_cnt <span class="token operator">+</span> content_weight <span class="token operator">*</span> word<span class="token punctuation">.</span>content_cnt<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">// Forward index: the vector subscript doubles as the document ID, so lookup by ID is direct</span>
    std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span><span class="token keyword">struct</span> <span class="token class-name">DocInfo</span><span class="token operator">></span> forward_index<span class="token punctuation">;</span>
    <span class="token comment">// Inverted index: a keyword may appear in many documents, so each keyword maps to a list of InvertedElem</span>
    std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> InvertedList<span class="token operator">></span> inverted_index<span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  Index <span class="token operator">*</span>Index<span class="token double-colon punctuation">::</span>instance <span class="token operator">=</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>mutex Index<span class="token double-colon punctuation">::</span>mtx<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
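  <p>The listing above creates the instance with double-checked locking on a plain pointer. Strictly speaking, the first unlocked read of <code>instance</code> can race with the write made under the lock. As an alternative sketch, not the author's implementation, a function-local static gives the same "build once, share everywhere" behaviour, because C++11 guarantees that such a static is initialized exactly once even when several threads arrive at the same time. The class name <code>DemoIndex</code> is hypothetical and stands in for <code>Index</code>.</p> 
  <pre><code class="prism language-cpp">#include &lt;iostream&gt;

// Hypothetical stand-in for the article's Index class.
class DemoIndex
{
public:
  // The function-local static is constructed on first call;
  // C++11 makes that initialization thread-safe.
  static DemoIndex &amp;GetInstance()
  {
    static DemoIndex instance;
    return instance;
  }

  // Forbid copies so that only one object can ever exist.
  DemoIndex(const DemoIndex &amp;) = delete;
  DemoIndex &amp;operator=(const DemoIndex &amp;) = delete;

private:
  DemoIndex() {}
};

int main()
{
  DemoIndex &amp;a = DemoIndex::GetInstance();
  DemoIndex &amp;b = DemoIndex::GetInstance();
  // Both references name the same object.
  std::cout &lt;&lt; std::boolalpha &lt;&lt; (&amp;a == &amp;b) &lt;&lt; std::endl; // prints "true"
  return 0;
}
</code></pre> 
  <p>This variant needs neither the static pointer nor the mutex. The trade-off is that <code>GetInstance</code> returns a reference instead of a pointer, so call sites written against the pointer version would need a small adjustment.</p> 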
  <h1>Search Module</h1> 
  <p>Now we start writing the search module. We begin with the basic code structure, in a new file of its own.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">touch</span> searcher.hpp 
</code></pre> 
  <p>Here is the skeleton.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">namespace</span> ns_searcher
<span class="token punctuation">{</span>

  <span class="token keyword">struct</span> <span class="token class-name">InvertedElemPrint</span>
  <span class="token punctuation">{</span>
    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span> <span class="token comment">// document ID</span>

    <span class="token keyword">int</span> weight<span class="token punctuation">;</span>                     <span class="token comment">// weight</span>
    std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span> <span class="token comment">// keywords</span>
    <span class="token function">InvertedElemPrint</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">doc_id</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token function">weight</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token keyword">class</span> <span class="token class-name">Searcher</span>
  <span class="token punctuation">{</span>
  <span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token function">Searcher</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token operator">~</span><span class="token function">Searcher</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token comment">// input: the file produced by the tag-stripping (data cleaning) step</span>
    <span class="token keyword">void</span> <span class="token function">InitSearcher</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>input<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token comment">// 1. Get the index object</span>
        <span class="token comment">// 2. Build the index through that object</span>
    <span class="token punctuation">}</span>
    <span class="token comment">// query: the word or sentence we want to search for</span>
    <span class="token comment">// json_string: the result, returned as a JSON string</span>
    <span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token comment">// 1. Segment the query into words, converting them to lowercase</span>
        <span class="token comment">// 2. Fetch the inverted list (posting list) for each keyword</span>
        <span class="token comment">// 3. Merge and sort: order the results by weight, descending</span>
        <span class="token comment">// 4. Build the JSON string</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    ns_index<span class="token double-colon punctuation">::</span>Index <span class="token operator">*</span>index<span class="token punctuation">;</span> <span class="token comment">// the index used to serve lookups</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h2>InitSearcher</h2> 
  <p>This function does the initialization work, which has two steps.</p> 
  <ul> 
   <li>Get the index object</li> 
   <li>Build the index through that object</li> 
  </ul> 
  <pre><code class="prism language-cpp"><span class="token keyword">void</span> <span class="token function">InitSearcher</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>input<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// Get (or create) the singleton index object</span>
  index <span class="token operator">=</span> ns_index<span class="token double-colon punctuation">::</span><span class="token class-name">Index</span><span class="token double-colon punctuation">::</span><span class="token function">GetInstance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// std::cout << "got the index singleton" << std::endl;</span>
  <span class="token comment">// Build the forward and inverted indexes from the input file</span>
  index<span class="token operator">-></span><span class="token function">BuildIndex</span><span class="token punctuation">(</span>input<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// std::cout << "built the forward and inverted indexes" << std::endl;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h2>Search</h2> 
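  <p>This function implements the actual lookup. Given the text the user wants to search for, it works as follows:</p> 
  <ul> 
   <li>Segment the input into words, lowercase them, and keep them in an array</li> 
   <li>For each word in the array, fetch its inverted list (posting list) and append all of its entries to one combined list</li> 
   <li>Sort the combined list by weight in descending order</li> 
   <li>Use each doc id in the list to look up the document content, and build the JSON string</li> 
  </ul> 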
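  <p>The first three steps can be sketched on a toy index using only the standard library. This is a minimal sketch, not the project's code: <code>Posting</code>, <code>ToyIndex</code>, <code>SplitWords</code> and <code>ToySearch</code> are hypothetical names, a plain <code>std::unordered_map</code> stands in for <code>ns_index::Index</code>, and splitting on whitespace stands in for the cppjieba segmentation.</p> 

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-ins for the project's types (hypothetical, for illustration only).
struct Posting {
  uint64_t doc_id;
  int weight;
};
using PostingList = std::vector<Posting>;
using ToyIndex = std::unordered_map<std::string, PostingList>;

// Assumption: whitespace splitting replaces the real word segmentation.
std::vector<std::string> SplitWords(const std::string &query) {
  std::istringstream in(query);
  std::vector<std::string> words;
  std::string w;
  while (in >> w) words.push_back(w);
  return words;
}

// Lowercase a copy of the string (ASCII only).
std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Steps 1-3: segment, look up each word, merge, sort by weight descending.
PostingList ToySearch(const ToyIndex &index, const std::string &query) {
  PostingList all;
  for (const std::string &word : SplitWords(query)) {
    auto it = index.find(ToLower(word));
    if (it == index.end()) continue;  // word is not in the index
    all.insert(all.end(), it->second.begin(), it->second.end());
  }
  std::sort(all.begin(), all.end(), [](const Posting &a, const Posting &b) {
    return a.weight > b.weight;
  });
  return all;
}
```

  <p>If the toy index maps <code>boost</code> to documents 1 and 2 and <code>filesystem</code> to document 2, then <code>ToySearch(index, "Boost FileSystem")</code> returns three entries, and document 2 appears twice. The project's own version, built on <code>ns_index</code> and cppjieba, follows.</p> 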
  <pre><code class="prism language-cpp"><span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// 1. Segment the query into words first; the lookup comes afterwards</span>
  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>query<span class="token punctuation">,</span> <span class="token operator">&</span>words<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// 2. Look up each word of the segmentation result in turn</span>
  ns_index<span class="token double-colon punctuation">::</span>InvertedList inverted_list_all<span class="token punctuation">;</span> <span class="token comment">// holds the entries of every inverted list we find</span>

  <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> words<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the index was built case-insensitively, so the query must be lowercased too</span>

    <span class="token comment">// Look up the inverted index first</span>
    ns_index<span class="token double-colon punctuation">::</span>InvertedList <span class="token operator">*</span>inverted_list <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> inverted_list<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">continue</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Found: append every entry of this inverted list to the combined list</span>
    <span class="token comment">// Imperfect: one word can match several documents and one document can match several words, so a document may be appended more than once.</span>
    inverted_list_all<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list<span class="token operator">-></span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list<span class="token operator">-></span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token comment">// 3. Sort the combined list by weight, descending</span>
  std<span class="token double-colon punctuation">::</span><span class="token function">sort</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> ns_index<span class="token double-colon punctuation">::</span>InvertedElem <span class="token operator">&</span>e1<span class="token punctuation">,</span> <span class="token keyword">const</span> ns_index<span class="token double-colon punctuation">::</span>InvertedElem <span class="token operator">&</span>e2<span class="token punctuation">)</span>
            <span class="token punctuation">{</span>
              <span class="token keyword">return</span> e1<span class="token punctuation">.</span>weight <span class="token operator">></span> e2<span class="token punctuation">.</span>weight<span class="token punctuation">;</span>
            <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token comment">// 4. Build the JSON string (serialization); filled in after the jsoncpp introduction below</span>
<span class="token punctuation">}</span>
</code></pre> 
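  <p>The implementation above is not perfect. One word can map to several document ids, so when a query contains several keywords, the same document id can end up in the result more than once. Take the following example.</p> 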
  <table> 
   <thead> 
    <tr> 
     <th>Keyword</th> 
     <th>Document IDs</th> 
    </tr> 
   </thead> 
   <tbody> 
    <tr> 
     <td>你好 (hello)</td> 
     <td>1, 2</td> 
    </tr> 
    <tr> 
     <td>我 (I)</td> 
     <td>1, 2</td> 
    </tr> 
    <tr> 
     <td>是 (am)</td> 
     <td>1, 2</td> 
    </tr> 
    <tr> 
     <td>大学生 (college student)</td> 
     <td>1</td> 
    </tr> 
    <tr> 
     <td>社会人 (working adult)</td> 
     <td>2</td> 
    </tr> 
   </tbody> 
  </table> 
  <blockquote> 
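   <p>Segmenting the query "你好,我" gives one inverted list per word. Appending both to the combined list produces [doc 1, doc 2, doc 1, doc 2], so every document shows up twice. We will fix this later.</p> 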
  </blockquote> 
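  <p>One possible way to remove the duplicates is to merge all hits that share a doc id before sorting. The sketch below is an assumption, not necessarily the fix used later in this project: <code>Hit</code>, <code>MergedHit</code> and <code>MergeByDoc</code> are hypothetical names, <code>MergedHit</code> simply reuses the three fields of <code>InvertedElemPrint</code> from the skeleton, and summing the weights is just one reasonable choice.</p> 

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical posting type: one (word, document, weight) hit.
struct Hit {
  std::string word;
  uint64_t doc_id;
  int weight;
};

// One entry per document after merging (same fields as InvertedElemPrint).
struct MergedHit {
  uint64_t doc_id = 0;
  int weight = 0;
  std::vector<std::string> words;
};

// Collapse hits that share a doc_id: sum the weights, remember the words,
// then sort by total weight in descending order.
std::vector<MergedHit> MergeByDoc(const std::vector<Hit> &hits) {
  std::unordered_map<uint64_t, MergedHit> by_doc;
  for (const Hit &h : hits) {
    MergedHit &m = by_doc[h.doc_id];  // default-constructed on first sight
    m.doc_id = h.doc_id;
    m.weight += h.weight;
    m.words.push_back(h.word);
  }
  std::vector<MergedHit> merged;
  merged.reserve(by_doc.size());
  for (auto &kv : by_doc) merged.push_back(std::move(kv.second));
  std::sort(merged.begin(), merged.end(),
            [](const MergedHit &a, const MergedHit &b) {
              if (a.weight != b.weight) return a.weight > b.weight;
              return a.doc_id < b.doc_id;  // deterministic tie-break
            });
  return merged;
}
```

  <p>With this approach the query "你好,我" yields one entry per document instead of two, and a document that matches more keywords naturally ranks higher.</p> 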
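  <h3>Installing and Using jsoncpp</h3> 
  <p>Since we need to build a JSON string, here is a quick look at installing and using <code>jsoncpp</code>. JSON is the format we use for serialization and deserialization.</p> 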
  <pre><code class="prism language-cpp"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ sudo yum install <span class="token operator">-</span>y jsoncpp<span class="token operator">-</span>devel
</code></pre> 
  <p>Now let's try it out.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">touch</span> test.cc
</code></pre> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><iostream></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><string></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><jsoncpp/json/json.h></span></span>

<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  Json<span class="token double-colon punctuation">::</span>Value root<span class="token punctuation">;</span>
  
  Json<span class="token double-colon punctuation">::</span>Value item1<span class="token punctuation">;</span>
  item1<span class="token punctuation">[</span><span class="token string">"key1"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token string">"value11"</span><span class="token punctuation">;</span>
  item1<span class="token punctuation">[</span><span class="token string">"key2"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token string">"value22"</span><span class="token punctuation">;</span>

  Json<span class="token double-colon punctuation">::</span>Value item2<span class="token punctuation">;</span>
  item2<span class="token punctuation">[</span><span class="token string">"key1"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token string">"value1"</span><span class="token punctuation">;</span>
  item2<span class="token punctuation">[</span><span class="token string">"key2"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token string">"value2"</span><span class="token punctuation">;</span>

  root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>item1<span class="token punctuation">)</span><span class="token punctuation">;</span>
  root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>item2<span class="token punctuation">)</span><span class="token punctuation">;</span>

  Json<span class="token double-colon punctuation">::</span>StyledWriter writer<span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>string s <span class="token operator">=</span> writer<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>root<span class="token punctuation">)</span><span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> s <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Compile and run it; here is the output.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ g++ test.cc  <span class="token parameter variable">-ljsoncpp</span>
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ./a.out 
<span class="token punctuation">[</span>
   <span class="token punctuation">{</span>
      <span class="token string">"key1"</span> <span class="token builtin class-name">:</span> <span class="token string">"value11"</span>,
      <span class="token string">"key2"</span> <span class="token builtin class-name">:</span> <span class="token string">"value22"</span>
   <span class="token punctuation">}</span>,
   <span class="token punctuation">{</span>
      <span class="token string">"key1"</span> <span class="token builtin class-name">:</span> <span class="token string">"value1"</span>,
      <span class="token string">"key2"</span> <span class="token builtin class-name">:</span> <span class="token string">"value2"</span>
   <span class="token punctuation">}</span>
<span class="token punctuation">]</span>

<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>With that in place, we can finish the <code>Search</code> function.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// 1. Segment the query into words first; the lookup comes afterwards</span>
  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>query<span class="token punctuation">,</span> <span class="token operator">&</span>words<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// 2. Look up each word of the segmentation result in turn</span>
  ns_index<span class="token double-colon punctuation">::</span>InvertedList inverted_list_all<span class="token punctuation">;</span> <span class="token comment">// holds the entries of every inverted list we find</span>

  <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> words<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the index was built case-insensitively, so the query must be lowercased too</span>

    <span class="token comment">// Look up the inverted index first</span>
    ns_index<span class="token double-colon punctuation">::</span>InvertedList <span class="token operator">*</span>inverted_list <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> inverted_list<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">continue</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Found: append every entry of this inverted list to the combined list</span>
    <span class="token comment">// Imperfect: one word can match several documents and one document can match several words, so a document may be appended more than once.</span>
    inverted_list_all<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list<span class="token operator">-></span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list<span class="token operator">-></span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token comment">// 3. Sort the combined list by weight, descending</span>
  std<span class="token double-colon punctuation">::</span><span class="token function">sort</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> ns_index<span class="token double-colon punctuation">::</span>InvertedElem <span class="token operator">&</span>e1<span class="token punctuation">,</span> <span class="token keyword">const</span> ns_index<span class="token double-colon punctuation">::</span>InvertedElem <span class="token operator">&</span>e2<span class="token punctuation">)</span>
            <span class="token punctuation">{</span>
              <span class="token keyword">return</span> e1<span class="token punctuation">.</span>weight <span class="token operator">></span> e2<span class="token punctuation">.</span>weight<span class="token punctuation">;</span>
            <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// 4. Build the JSON string (serialization)</span>
  Json<span class="token double-colon punctuation">::</span>Value root<span class="token punctuation">;</span>
  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">:</span> inverted_list_all<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    <span class="token comment">// Look up the document in the forward index</span>
    ns_index<span class="token double-colon punctuation">::</span>DocInfo <span class="token operator">*</span>doc <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span>item<span class="token punctuation">.</span>doc_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> doc<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">continue</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Got the document; copy its fields into a JSON object</span>
    Json<span class="token double-colon punctuation">::</span>Value elem<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"title"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>title<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"desc"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>content<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"url"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>url<span class="token punctuation">;</span>

    root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>elem<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// appended in sorted order</span>
  <span class="token punctuation">}</span>

  Json<span class="token double-colon punctuation">::</span>StyledWriter writer<span class="token punctuation">;</span> <span class="token comment">// we use this human-readable format for now</span>
  <span class="token operator">*</span>json_string <span class="token operator">=</span> writer<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>root<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h2>Search Test</h2> 
  <p>Now let's test the search module end to end.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"searcher.hpp"</span></span>
<span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string input <span class="token operator">=</span> <span class="token string">"data/raw_html/raw.txt"</span><span class="token punctuation">;</span>
<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  ns_searcher<span class="token double-colon punctuation">::</span>Searcher <span class="token operator">*</span>search <span class="token operator">=</span> <span class="token keyword">new</span> ns_searcher<span class="token double-colon punctuation">::</span><span class="token function">Searcher</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  search<span class="token operator">-></span><span class="token function">InitSearcher</span><span class="token punctuation">(</span>input<span class="token punctuation">)</span><span class="token punctuation">;</span>

  std<span class="token double-colon punctuation">::</span>string query<span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>string json_string<span class="token punctuation">;</span>
  
  <span class="token keyword">while</span> <span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
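    std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"Please enter a query# "</span><span class="token punctuation">;</span>
    <span class="token comment">//std::cin >> query;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token operator">!</span>std<span class="token double-colon punctuation">::</span><span class="token function">getline</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>cin<span class="token punctuation">,</span> query<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token comment">// stop at end of input instead of looping forever</span>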
    <span class="token comment">//std::cout << query;</span>
    search<span class="token operator">-></span><span class="token function">Search</span><span class="token punctuation">(</span>query<span class="token punctuation">,</span> <span class="token operator">&</span>json_string<span class="token punctuation">)</span><span class="token punctuation">;</span>
    std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> json_string <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Here is the Makefile.</p> 
  <pre><code class="prism language-makefile">cc=g++
PARSER=parser
SSVR=search_server 

.PHONY:all
all:$(PARSER) $(SSVR)

$(SSVR):server.cc
	$(cc) -o $@ $^ -std=c++11  -lboost_system -lboost_filesystem -ljsoncpp

$(PARSER):parser.cc
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem

.PHONY:clean
clean:
	rm -f $(PARSER) $(SSVR)
</code></pre> 
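  <p>Let's run a test. Below is the result for a single HTML document. The <code>desc</code> field contains the entire document content, which is far too long, so we should cut out just a short excerpt of it.</p> 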
  <pre><code>{
      "desc" : "Struct template bound_launcherHomeLibrariesPeopleFAQMoreStruct template bound_launcherboost::process::v2::bound_launcher — Utility class to bind initializers to a launcher. Synopsis// In header: <boost/process/v2/bind_launcher.hpp>template<typename Launcher, typename ... Init> struct bound_launcher {  // construct/copy/destruct  template<typename Launcher_, typename ... Init_>     bound_launcher(Launcher_ &&, Init_ &&...);  // public member functions  template<typename ExecutionContext, typename Args, typename ... Inits>     auto operator()(ExecutionContext &,                     const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  template<typename ExecutionContext, typename Args, typename ... Inits>     auto operator()(ExecutionContext &, error_code &,                     const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  template<typename Executor, typename Args, typename ... Inits>     auto operator()(Executor,                     const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  template<typename Executor, typename Args, typename ... Inits>     auto operator()(Executor, error_code &,                     const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  // private member functions  template<std::size_t ... Idx, typename ExecutionContext, typename Args,            typename ... 
Inits>     auto invoke(unspecified, ExecutionContext &,                 const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type &,                 Args &&, Inits &&...);  template<std::size_t ... Idx, typename ExecutionContext, typename Args,            typename ... Inits>     auto invoke(unspecified, ExecutionContext &, error_code &,                 const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type &,                 Args &&, Inits &&...);  template<std::size_t ... Idx, typename Executor, typename Args,            typename ... Inits>     auto invoke(unspecified, Executor,                 const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type &,                 Args &&, Inits &&...);  template<std::size_t ... Idx, typename Executor, typename Args,            typename ... Inits>     auto invoke(unspecified, Executor, error_code &,                 const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type &,                 Args &&, Inits &&...);};DescriptionThis can be used when multiple processes shared some settings, e.g. Template Parameterstypename LauncherThe inner launcher to be used typename ... Initbound_launcher         public       construct/copy/destructtemplate<typename Launcher_, typename ... Init_>   bound_launcher(Launcher_ && l, Init_ &&... init);bound_launcher public member functionstemplate<typename ExecutionContext, typename Args, typename ... 
Inits>   auto operator()(ExecutionContext & context,                   const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type & executable,                   Args && args, Inits &&... inits);template<typename ExecutionContext, typename Args, typename ... Inits>   auto operator()(ExecutionContext & context, error_code & ec,                   const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type & executable,                   Args && args, Inits &&... inits);template<typename Executor, typename Args, typename ... Inits>   auto operator()(Executor exec,                   const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type & executable,                   Args && args, Inits &&... inits);template<typename Executor, typename Args, typename ... Inits>   auto operator()(Executor exec, error_code & ec,                   const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type & executable,                   Args && args, Inits &&... inits);bound_launcher private member functionstemplate<std::size_t ... Idx, typename ExecutionContext, typename Args,          typename ... Inits>   auto invoke(unspecified, ExecutionContext & context,               const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type & executable,               Args && args, Inits &&... inits);template<std::size_t ... Idx, typename ExecutionContext, typename Args,          typename ... 
Inits>   auto invoke(unspecified, ExecutionContext & context, error_code & ec,               const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type & executable,               Args && args, Inits &&... inits);template<std::size_t ... Idx, typename Executor, typename Args,          typename ... Inits>   auto invoke(unspecified, Executor exec,               const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type & executable,               Args && args, Inits &&... inits);template<std::size_t ... Idx, typename Executor, typename Args,          typename ... Inits>   auto invoke(unspecified, Executor exec, error_code & ec,               const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type & executable,               Args && args, Inits &&... inits);Copyright © 2006-2012 Julio M. Merino Vidal, Ilya Sokolov,      Felipe Tanus, Jeff Flinn, Boris SchaelingCopyright © 2016 Klemens D. Morgenstern        Distributed under the Boost Software License, Version 1.0. (See accompanying        file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)      ",
      "title" : "Struct template bound_launcher",
      "url" : "https://www.boost.org/doc/libs/1_83_0/doc/html/boost/process/v2/bound_launcher.html"
   },
</code></pre> 
  <h2>Extracting the summary</h2> 
  <pre><code class="prism language-cpp"><span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// ...</span>
  <span class="token comment">// 4. Build the JSON string via Jsoncpp serialization</span>
  Json<span class="token double-colon punctuation">::</span>Value root<span class="token punctuation">;</span>
  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">:</span> inverted_list_all<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    <span class="token comment">// ....</span>
    <span class="token comment">// The document content has been fetched at this point</span>
    Json<span class="token double-colon punctuation">::</span>Value elem<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"title"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>title<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"desc"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token function">make_summary</span><span class="token punctuation">(</span>doc<span class="token operator">-></span>content<span class="token punctuation">,</span> item<span class="token punctuation">.</span>word<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the summary is extracted around the keyword</span>
    elem<span class="token punctuation">[</span><span class="token string">"url"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>url<span class="token punctuation">;</span>

    root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>elem<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// elements are appended in order</span>
  <span class="token punctuation">}</span>

  Json<span class="token double-colon punctuation">::</span>StyledWriter writer<span class="token punctuation">;</span> <span class="token comment">// use the styled (pretty-printed) writer for now</span>
  <span class="token operator">*</span>json_string <span class="token operator">=</span> writer<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>root<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>We could cut the summary from anywhere in the content, but normally we want the text surrounding the search keyword.</p> 
  <pre><code class="prism language-cpp">std<span class="token double-colon punctuation">::</span>string <span class="token function">make_summary</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>content<span class="token punctuation">,</span> <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>word<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// Caveat: content comes from the forward index and keeps its original case,</span>
  <span class="token comment">// while word has already been lowercased, so the match must ignore case.</span>
  <span class="token comment">// Also, the keyword is not guaranteed to appear in content (this is rare).</span>
  <span class="token comment">// std::size_t pos = content.find(word);</span>
  <span class="token comment">// if (pos == std::string::npos)</span>
  <span class="token comment">//   return "Node";</span>

  <span class="token keyword">auto</span> item <span class="token operator">=</span> std<span class="token double-colon punctuation">::</span><span class="token function">search</span><span class="token punctuation">(</span>content<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> content<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> word<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> word<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                          <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">int</span> x<span class="token punctuation">,</span> <span class="token keyword">int</span> y<span class="token punctuation">)</span>
                          <span class="token punctuation">{</span>
                            <span class="token keyword">return</span> std<span class="token double-colon punctuation">::</span><span class="token function">tolower</span><span class="token punctuation">(</span>x<span class="token punctuation">)</span> <span class="token operator">==</span> std<span class="token double-colon punctuation">::</span><span class="token function">tolower</span><span class="token punctuation">(</span>y<span class="token punctuation">)</span><span class="token punctuation">;</span>
                          <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>item <span class="token operator">==</span> content<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token keyword">return</span> <span class="token string">"Node"</span><span class="token punctuation">;</span>

  <span class="token comment">// Found: compute the distance from begin() to the iterator</span>
  std<span class="token double-colon punctuation">::</span>size_t pos <span class="token operator">=</span> std<span class="token double-colon punctuation">::</span><span class="token function">distance</span><span class="token punctuation">(</span>content<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> item<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>size_t prev_step <span class="token operator">=</span> <span class="token number">50</span><span class="token punctuation">;</span>
  <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>size_t next_step <span class="token operator">=</span> <span class="token number">100</span><span class="token punctuation">;</span>
  <span class="token comment">// Take 50 bytes before the match and 100 bytes after it</span>
  std<span class="token double-colon punctuation">::</span>size_t begin <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
  <span class="token comment">// size_t is unsigned, so subtracting without this check would wrap around</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>pos <span class="token operator">></span> prev_step<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    begin <span class="token operator">=</span> pos <span class="token operator">-</span> prev_step<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
   
  std<span class="token double-colon punctuation">::</span>size_t end <span class="token operator">=</span> pos <span class="token operator">+</span> next_step<span class="token punctuation">;</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>end <span class="token operator">></span> content<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    end <span class="token operator">=</span> content<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token comment">// Guard: only build the summary when the range is non-empty</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>end <span class="token operator">></span> begin<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>string desc <span class="token operator">=</span> content<span class="token punctuation">.</span><span class="token function">substr</span><span class="token punctuation">(</span>begin<span class="token punctuation">,</span> end <span class="token operator">-</span> begin<span class="token punctuation">)</span><span class="token punctuation">;</span>
    desc <span class="token operator">+=</span> <span class="token string">"...."</span><span class="token punctuation">;</span>
    <span class="token keyword">return</span> desc<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">else</span>
    <span class="token keyword">return</span> <span class="token string">"Node"</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
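  <p>The windowing logic above can also be tried in isolation. The following is a minimal sketch, not the article's function: the name summary_window and its default step sizes are assumptions made for illustration, it returns an empty string instead of "Node" when nothing matches, and it casts to unsigned char before calling std::tolower so that bytes above 127 are handled safely.</p>

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper (not from the article): returns the part of `content`
// surrounding the first case-insensitive occurrence of `word`, or an empty
// string when `word` is empty or does not occur.
std::string summary_window(const std::string &content, const std::string &word,
                           std::size_t prev_step = 50, std::size_t next_step = 100)
{
  if (word.empty())
    return "";

  // Compare byte by byte, ignoring ASCII case.
  auto it = std::search(content.begin(), content.end(), word.begin(), word.end(),
                        [](unsigned char x, unsigned char y)
                        { return std::tolower(x) == std::tolower(y); });
  if (it == content.end())
    return "";

  std::size_t pos = static_cast<std::size_t>(std::distance(content.begin(), it));
  // size_t is unsigned: clamp to 0 instead of subtracting blindly.
  std::size_t begin = pos > prev_step ? pos - prev_step : 0;
  // Clamp the right edge to the end of the content.
  std::size_t end = std::min(pos + next_step, content.size());
  return content.substr(begin, end - begin);
}
```

  <p>Because the window is cut by byte offsets, it may start or end in the middle of a multi-byte character; for the mostly ASCII Boost documentation this is usually acceptable.</p>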
  <p><a href="http://img.e-com-net.com/image/info8/d80e8e59b357470cbc2ff9a84b4aaaf1.png" target="_blank"><img src="http://img.e-com-net.com/image/info8/d80e8e59b357470cbc2ff9a84b4aaaf1.png" alt="Boost search engine, figure 10" width="851" height="283" style="border:1px solid black;"></a></p> 
  <p>Let's test it.</p> 
  <pre><code class="prism language-shell">Enter a keyword<span class="token comment"># filesystem</span>
<span class="token punctuation">[</span>
   <span class="token punctuation">{</span>
      <span class="token string">"desc"</span> <span class="token builtin class-name">:</span> <span class="token string">"boost::asio::execution_context & >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  templ...."</span>,
      <span class="token string">"title"</span> <span class="token builtin class-name">:</span> <span class="token string">"Struct template bound_launcher"</span>,
      <span class="token string">"url"</span> <span class="token builtin class-name">:</span> <span class="token string">"https://www.boost.org/doc/libs/1_83_0/doc/html/boost/process/v2/bound_launcher.html"</span>
   <span class="token punctuation">}</span>,
   <span class="token punctuation">..</span><span class="token punctuation">..</span>.
<span class="token punctuation">]</span>
</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/22f678bdad044a91850ba63bf18cfd9c.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/22f678bdad044a91850ba63bf18cfd9c.jpg" alt="Boost search engine, figure 11" width="650" height="217" style="border:1px solid black;"></a></p> 
  <h2>Integrated debugging</h2> 
  <p>Next we check that the results produced by the code above really are sorted by weight in descending order. We verify this at the point where the JSON string is built.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/278d900e1927448b98a58688e78b2216.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/278d900e1927448b98a58688e78b2216.jpg" alt="Boost search engine, figure 12" width="650" height="325" style="border:1px solid black;"></a></p> 
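  <p>The idea is this: we collect every entry from the inverted lists and look up each document body by its id. Each inverted-list entry also carries a weight, so we can print that weight in the JSON output and inspect the order.</p> 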
  <pre><code class="prism language-shell">Enter a keyword<span class="token comment"># split</span>
<span class="token punctuation">[</span>
   <span class="token punctuation">{</span>
      <span class="token string">"desc"</span> <span class="token builtin class-name">:</span> <span class="token string">"Class template split_iteratorHomeLibrariesPeopleFAQMoreClass template split_iteratorboost::algorithm::split_iterato...."</span>,
      <span class="token string">"title"</span> <span class="token builtin class-name">:</span> <span class="token string">"Class template split_iterator"</span>,
      <span class="token string">"url"</span> <span class="token builtin class-name">:</span> <span class="token string">"https://www.boost.org/doc/libs/1_83_0/doc/html/boost/algorithm/split_iterator.html"</span>,
      <span class="token string">"weight"</span> <span class="token builtin class-name">:</span> <span class="token number">37</span>
   <span class="token punctuation">}</span>,
   <span class="token punctuation">{</span>
      <span class="token string">"desc"</span> <span class="token builtin class-name">:</span> <span class="token string">"ual, BucketTraits, SizeType, BoolFlags >::type split_bucket_hash_equal_t;  typedef split_bucket_hash_equal_t::key_equal                            ...."</span>,
      <span class="token string">"title"</span> <span class="token builtin class-name">:</span> <span class="token string">"Struct template hashdata_internal"</span>,
      <span class="token string">"url"</span> <span class="token builtin class-name">:</span> <span class="token string">"https://www.boost.org/doc/libs/1_83_0/doc/html/boost/intrusive/hashdata_internal.html"</span>,
      <span class="token string">"weight"</span> <span class="token builtin class-name">:</span> <span class="token number">20</span>
   <span class="token punctuation">}</span>,
   <span class="token punctuation">..</span><span class="token punctuation">..</span>.
<span class="token punctuation">]</span>
</code></pre> 
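  <p>The ordering checked above can be sketched as a plain sort on the weight field. This is only an illustration: the struct name InvertedElem, its field names, and the function sort_by_weight are assumptions, since the section shows only item.word and the weight value in the JSON output.</p>

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Assumed shape of one inverted-list entry; the names are illustrative.
struct InvertedElem
{
  std::uint64_t doc_id; // id used to look the document up in the forward index
  std::string word;     // keyword that produced this hit
  int weight;           // relevance of the keyword within the document
};

// Reorder the entries so that the highest weight comes first.
// stable_sort keeps the original order of entries with equal weight.
void sort_by_weight(std::vector<InvertedElem> &list)
{
  std::stable_sort(list.begin(), list.end(),
                   [](const InvertedElem &a, const InvertedElem &b)
                   { return a.weight > b.weight; });
}
```

  <p>With the two results shown above, the entry with weight 37 would be placed before the one with weight 20.</p>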
  <p>A few takeaways from this debugging session:</p> 
  <ul> 
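   <li>When computing weights we count the title first, but the content was produced by stripping tags from the whole page, so it still contains the title text. A word in the title is therefore counted twice, once as title and once as content, and a single title hit contributes 11 to the weight.</li> 
   <li>We do not know the exact segmentation rules the tokenizer applies, but that does not matter here.</li> 
   <li>One issue remains from the work above: duplicate documents in the results.</li> 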
  </ul> 
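  <p>For the duplicate-document issue, one possible approach is to merge hits by document id before sorting. The sketch below assumes that duplicates arise when several query words hit the same document; the types Hit and MergedHit and the function merge_by_doc are hypothetical names introduced only for this example, and summing the weights is one design choice among several.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical input: one hit per (document, keyword) pair.
struct Hit
{
  std::uint64_t doc_id;
  std::string word;
  int weight;
};

// Hypothetical output: one entry per document with all matching keywords.
struct MergedHit
{
  std::uint64_t doc_id;
  std::vector<std::string> words;
  int weight; // sum of the weights of all merged hits
};

// Merge hits that refer to the same document, keeping first-seen order.
std::vector<MergedHit> merge_by_doc(const std::vector<Hit> &hits)
{
  std::unordered_map<std::uint64_t, std::size_t> slot; // doc_id -> index in out
  std::vector<MergedHit> out;
  for (const auto &h : hits)
  {
    auto it = slot.find(h.doc_id);
    if (it == slot.end())
    {
      slot[h.doc_id] = out.size();
      out.push_back(MergedHit{h.doc_id, {h.word}, h.weight});
    }
    else
    {
      out[it->second].words.push_back(h.word);
      out[it->second].weight += h.weight;
    }
  }
  return out;
}
```

  <p>After merging, each document appears once, and the combined weight can be used for the descending sort.</p>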
  <p>Once debugging is done, we rename the file.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">mv</span> server.cc debug.cc
</code></pre> 
  <p>We update the Makefile accordingly.</p> 
  <pre><code class="prism language-makefile">cc=g++
PARSER=parser
DUG=debug

.PHONY:all
all:$(PARSER) $(DUG)

$(DUG):debug.cc
	$(cc) -o $@ $^ -std=c++11  -lboost_system -lboost_filesystem -ljsoncpp

$(PARSER):parser.cc
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem

.PHONY:clean
clean:
	rm -f $(PARSER) $(DUG)
</code></pre> 
  <h1>Search server</h1> 
  <p>Now we start writing the networked version of the server. First we create the file.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">touch</span> http_server.cc
</code></pre> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"searcher.hpp"</span></span>
<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>The Makefile gets a new target as well.</p> 
  <pre><code class="prism language-makefile">cc=g++
PARSER=parser
DUG=debug
HTTP_SERVER=http_server 
.PHONY:all
all:$(PARSER) $(DUG) $(HTTP_SERVER)

$(DUG):debug.cc
	$(cc) -o $@ $^ -std=c++11  -lboost_system -lboost_filesystem -ljsoncpp

$(PARSER):parser.cc
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem

$(HTTP_SERVER):http_server.cc
	$(cc) -o $@ $^ -std=c++11 -lpthread -lboost_system -lboost_filesystem -ljsoncpp

.PHONY:clean
clean:
	rm -f $(PARSER) $(DUG) $(HTTP_SERVER)

</code></pre> 
  <p>Let's build and check the result.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">make</span>
g++ <span class="token parameter variable">-o</span> parser parser.cc <span class="token parameter variable">-std</span><span class="token operator">=</span>c++11 <span class="token parameter variable">-lboost_system</span> <span class="token parameter variable">-lboost_filesystem</span>
g++ <span class="token parameter variable">-o</span> debug debug.cc <span class="token parameter variable">-std</span><span class="token operator">=</span>c++11  <span class="token parameter variable">-lboost_system</span> <span class="token parameter variable">-lboost_filesystem</span> <span class="token parameter variable">-ljsoncpp</span>
g++ <span class="token parameter variable">-o</span> http_server http_server.cc <span class="token parameter variable">-std</span><span class="token operator">=</span>c++11 <span class="token parameter variable">-lpthread</span> <span class="token parameter variable">-lboost_system</span> <span class="token parameter variable">-lboost_filesystem</span> <span class="token parameter variable">-ljsoncpp</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ ll
total <span class="token number">1548</span>
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">43</span> Sep  <span class="token number">9</span> 06:00 cppjieba -<span class="token operator">></span> /home/qkj/install/cppjieba/include/cppjieba
drwxrwxr-x. <span class="token number">4</span> qkj qkj     <span class="token number">35</span> Sep  <span class="token number">9</span> 01:03 data
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">658128</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 debug
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">483</span> Sep  <span class="token number">9</span> 09:16 debug.cc
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">32</span> Sep  <span class="token number">9</span> 06:01 dict -<span class="token operator">></span> /home/qkj/install/cppjieba/dict/
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">401400</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 http_server
-rw-rw-r--. <span class="token number">1</span> qkj qkj     <span class="token number">51</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 http_server.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6102</span> Sep  <span class="token number">9</span> 08:33 index.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">446</span> Sep  <span class="token number">9</span> <span class="token number">19</span>:58 Makefile
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">481760</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 parser
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6361</span> Sep  <span class="token number">9</span> 02:47 parser.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">4626</span> Sep  <span class="token number">9</span> <span class="token number">19</span>:42 searcher.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">1779</span> Sep  <span class="token number">9</span> 08:27 util.hpp
</code></pre> 
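  <h2>Upgrading gcc</h2> 
  <p>We could write the networking layer ourselves, and may do so in a later upgrade, but here we use the cpp-httplib library, which is simple to use. One caveat: cpp-httplib needs a reasonably recent compiler. With an old one it either fails to compile or misbehaves at run time.</p> 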
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ gcc <span class="token parameter variable">-v</span>
Using built-in specs.
<span class="token assign-left variable">COLLECT_GCC</span><span class="token operator">=</span>gcc
<span class="token assign-left variable">COLLECT_LTO_WRAPPER</span><span class="token operator">=</span>/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: <span class="token punctuation">..</span>/configure <span class="token parameter variable">--prefix</span><span class="token operator">=</span>/usr <span class="token parameter variable">--mandir</span><span class="token operator">=</span>/usr/share/man <span class="token parameter variable">--infodir</span><span class="token operator">=</span>/usr/share/info --with-bugurl<span class="token operator">=</span>http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads<span class="token operator">=</span>posix --enable-checking<span class="token operator">=</span>release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style<span class="token operator">=</span>gnu --enable-languages<span class="token operator">=</span>c,c++,objc,objc++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl<span class="token operator">=</span>/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog<span class="token operator">=</span>/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune<span class="token operator">=</span>generic --with-arch_32<span class="token operator">=</span>x86-64 <span class="token parameter variable">--build</span><span class="token operator">=</span>x86_64-redhat-linux
Thread model: posix
gcc version <span class="token number">4.8</span>.5 <span class="token number">20150623</span> <span class="token punctuation">(</span>Red Hat <span class="token number">4.8</span>.5-44<span class="token punctuation">)</span> <span class="token punctuation">(</span>GCC<span class="token punctuation">)</span>
</code></pre> 
  <p>Now we upgrade it.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> yum <span class="token function">install</span> centos-release-scl
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> yum <span class="token function">install</span> devtoolset-8-gcc*
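<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ scl <span class="token builtin class-name">enable</span> devtoolset-8 <span class="token function">bash</span>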
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token builtin class-name">source</span> /opt/rh/devtoolset-8/enable
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">mv</span> /usr/bin/gcc /usr/bin/gcc-4.8.5
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">ln</span> <span class="token parameter variable">-s</span> /opt/rh/devtoolset-8/root/bin/gcc /usr/bin/gcc
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">mv</span> /usr/bin/g++ /usr/bin/g++-4.8.5
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">ln</span> <span class="token parameter variable">-s</span> /opt/rh/devtoolset-8/root/bin/g++ /usr/bin/g++
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">mv</span> /usr/bin/c++ /usr/bin/c++-4.8.5
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">ln</span> <span class="token parameter variable">-s</span> /opt/rh/devtoolset-8/root/bin/c++ /usr/bin/c++
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ gcc <span class="token parameter variable">-v</span>
Using built-in specs.
<span class="token assign-left variable">COLLECT_GCC</span><span class="token operator">=</span>gcc
<span class="token assign-left variable">COLLECT_LTO_WRAPPER</span><span class="token operator">=</span>/opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper
Target: x86_64-redhat-linux
Configured with: <span class="token punctuation">..</span>/configure --enable-bootstrap --enable-languages<span class="token operator">=</span>c,c++,fortran,lto <span class="token parameter variable">--prefix</span><span class="token operator">=</span>/opt/rh/devtoolset-8/root/usr <span class="token parameter variable">--mandir</span><span class="token operator">=</span>/opt/rh/devtoolset-8/root/usr/share/man <span class="token parameter variable">--infodir</span><span class="token operator">=</span>/opt/rh/devtoolset-8/root/usr/share/info --with-bugurl<span class="token operator">=</span>http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads<span class="token operator">=</span>posix --enable-checking<span class="token operator">=</span>release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style<span class="token operator">=</span>gnu --with-default-libstdcxx-abi<span class="token operator">=</span>gcc4-compatible --enable-plugin --enable-initfini-array --with-isl<span class="token operator">=</span>/builddir/build/BUILD/gcc-8.3.1-20190311/obj-x86_64-redhat-linux/isl-install --disable-libmpx --enable-gnu-indirect-function --with-tune<span class="token operator">=</span>generic --with-arch_32<span class="token operator">=</span>x86-64 <span class="token parameter variable">--build</span><span class="token operator">=</span>x86_64-redhat-linux
Thread model: posix
gcc version <span class="token number">8.3</span>.1 <span class="token number">20190311</span> <span class="token punctuation">(</span>Red Hat <span class="token number">8.3</span>.1-3<span class="token punctuation">)</span> <span class="token punctuation">(</span>GCC<span class="token punctuation">)</span> 
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
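  <p>Besides gcc -v, the compiler that the Makefile actually invokes can be checked from inside a program. The sketch below relies only on the predefined macros __GNUC__, __GNUC_MINOR__, __clang_major__, __clang_minor__ and __cplusplus; the function names describe_compiler and cpp_standard are made up for this example.</p>

```cpp
#include <string>

// Returns a short description of the compiler that built this file,
// based on the compiler's predefined version macros.
std::string describe_compiler()
{
#if defined(__clang__)
  return "clang " + std::to_string(__clang_major__) + "." + std::to_string(__clang_minor__);
#elif defined(__GNUC__)
  return "gcc " + std::to_string(__GNUC__) + "." + std::to_string(__GNUC_MINOR__);
#else
  return "unknown compiler";
#endif
}

// Returns the value of __cplusplus, which encodes the language standard in
// effect (for example 201103 when building with -std=c++11).
long cpp_standard()
{
  return __cplusplus;
}
```

  <p>Printing both values from a small main function and building it through the project's Makefile shows whether the upgraded g++ is the one being picked up.</p>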
  <h2>Adding the cpp-httplib library</h2> 
  <p>We download version 0.7.15, because newer versions may produce errors at run time. We download the archive to the desktop and then transfer it to the virtual machine, either by dragging it in or with rz; use whichever method works in your setup.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/3149b9de962e420ea7b8ede3f5549547.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/3149b9de962e420ea7b8ede3f5549547.jpg" alt="Boost search engine, figure 13" width="650" height="120" style="border:1px solid black;"></a></p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ rz <span class="token parameter variable">-E</span> 

<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ll
total <span class="token number">596</span>
-rwxrwxr-x. <span class="token number">1</span> qkj qkj  <span class="token number">15424</span> Sep  <span class="token number">9</span> 09:09 a.out
drwxr-xr-x. <span class="token number">8</span> qkj qkj   <span class="token number">4096</span> Aug  <span class="token number">8</span> <span class="token number">14</span>:40 boost_1_83_0
-rw-r--r--. <span class="token number">1</span> qkj qkj <span class="token number">584053</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:23 cpp-httplib-v0.7.15.zip
drwxrwxr-x. <span class="token number">8</span> qkj qkj    <span class="token number">215</span> Sep  <span class="token number">9</span> 03:38 cppjieba
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">421</span> Sep  <span class="token number">9</span> 09:09 test.cc
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>After unpacking the archive, we create a symbolic link to it inside our project.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span> /home/qkj/install/cpp-httplib-v0.7.15/ cpp-httplib
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ ll
total <span class="token number">1548</span>
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">38</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:30 cpp-httplib -<span class="token operator">></span> /home/qkj/install/cpp-httplib-v0.7.15/
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">43</span> Sep  <span class="token number">9</span> 06:00 cppjieba -<span class="token operator">></span> /home/qkj/install/cppjieba/include/cppjieba
drwxrwxr-x. <span class="token number">4</span> qkj qkj     <span class="token number">35</span> Sep  <span class="token number">9</span> 01:03 data
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">658128</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 debug
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">483</span> Sep  <span class="token number">9</span> 09:16 debug.cc
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">32</span> Sep  <span class="token number">9</span> 06:01 dict -<span class="token operator">></span> /home/qkj/install/cppjieba/dict/
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">401400</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 http_server
-rw-rw-r--. <span class="token number">1</span> qkj qkj     <span class="token number">51</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 http_server.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6102</span> Sep  <span class="token number">9</span> 08:33 index.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">446</span> Sep  <span class="token number">9</span> <span class="token number">19</span>:58 Makefile
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">481760</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 parser
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6361</span> Sep  <span class="token number">9</span> 02:47 parser.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">4626</span> Sep  <span class="token number">9</span> <span class="token number">19</span>:42 searcher.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">1779</span> Sep  <span class="token number">9</span> 08:27 util.hpp
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
  <h3>Testing cpp-httplib</h3> 
  <p>Next, we test the httplib library.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/3bf94b148acd4275b7ef33a80f22dc28.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/3bf94b148acd4275b7ef33a80f22dc28.jpg" alt="Boost搜索引擎_第14张图片" width="650" height="135" style="border:1px solid black;"></a></p> 
  <p>Let's try building it first.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">make</span>
g++ <span class="token parameter variable">-o</span> http_server http_server.cc <span class="token parameter variable">-std</span><span class="token operator">=</span>c++11  <span class="token parameter variable">-lboost_system</span> <span class="token parameter variable">-lboost_filesystem</span> <span class="token parameter variable">-ljsoncpp</span>
/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/libstdc++_nonshared.a<span class="token punctuation">(</span>thread48.o<span class="token punctuation">)</span>: In <span class="token keyword">function</span> <span class="token variable"><span class="token variable">`</span>std::thread::_M_start_thread<span class="token punctuation">(</span>std::unique_ptr<span class="token operator"><</span>std::thread::_State, std::default_delete<span class="token operator"><</span>std::thread::_State<span class="token operator">></span> <span class="token operator">></span>, void <span class="token punctuation">(</span>*<span class="token punctuation">)</span><span class="token punctuation">(</span><span class="token punctuation">))</span>':
<span class="token punctuation">(</span>.text._ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE+0x11<span class="token punctuation">)</span>: undefined reference to <span class="token variable">`</span></span>pthread_create<span class="token string">'
/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/libstdc++_nonshared.a(thread48.o): In function `std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)())'</span><span class="token builtin class-name">:</span>
<span class="token punctuation">(</span>.text._ZNSt6thread15_M_start_threadESt10shared_ptrINS_10_Impl_baseEEPFvvE+0x60<span class="token punctuation">)</span>: undefined reference to <span class="token variable"><span class="token variable">`</span>pthread_create'
/tmp/ccGWpu61.o: In <span class="token keyword">function</span> <span class="token variable">`</span></span>std::thread::thread<span class="token operator"><</span>httplib::ThreadPool::worker, , void<span class="token operator">></span><span class="token punctuation">(</span>httplib::ThreadPool::worker<span class="token operator">&&</span><span class="token punctuation">)</span><span class="token string">':
http_server.cc:(.text._ZNSt6threadC2IN7httplib10ThreadPool6workerEJEvEEOT_DpOT0_[_ZNSt6threadC5IN7httplib10ThreadPool6workerEJEvEEOT_DpOT0_]+0x21): undefined reference to `pthread_create'</span>
collect2: error: ld returned <span class="token number">1</span> <span class="token builtin class-name">exit</span> status
make: *** <span class="token punctuation">[</span>http_server<span class="token punctuation">]</span> Error <span class="token number">1</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
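  <p>The link step fails with undefined references to <code>pthread_create</code> because httplib uses threads internally, so we need to link against the pthread library.</p> 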
  <pre><code class="prism language-makefile">cc=g++
PARSER=parser
DUG=debug
HTTP_SERVER=http_server 
.PHONY:all
all:$(PARSER) $(DUG) $(HTTP_SERVER)

$(DUG):debug.cc
	$(cc) -o $@ $^ -std=c++11  -lboost_system -lboost_filesystem -ljsoncpp

$(PARSER):parser.cc
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem

$(HTTP_SERVER):http_server.cc
	$(cc) -o $@ $^ -std=c++11 -lpthread -lboost_system -lboost_filesystem -ljsoncpp

.PHONY:clean
clean:
	rm -f $(PARSER) $(DUG) $(HTTP_SERVER)

</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/288805c2e0964ba894e7e50fa48c349b.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/288805c2e0964ba894e7e50fa48c349b.jpg" alt="image-20230910113735355" width="650" height="44"></a></p> 
  <p>We continue testing by building a simple feature first. This library is very pleasant to use.<a href="http://img.e-com-net.com/image/info8/3e6fbcd8f074418c8c5f6a90aecadcbf.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/3e6fbcd8f074418c8c5f6a90aecadcbf.jpg" alt="image-20230910113849136" width="650" height="69"></a></p> 
  <p>Here is our code.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"cpp-httplib/httplib.h"</span></span>
<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  httplib<span class="token double-colon punctuation">::</span>Server svr<span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">Get</span><span class="token punctuation">(</span><span class="token string">"/hi"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> httplib<span class="token double-colon punctuation">::</span>Request<span class="token operator">&</span> req<span class="token punctuation">,</span> httplib<span class="token double-colon punctuation">::</span>Response<span class="token operator">&</span> rsp<span class="token punctuation">)</span><span class="token punctuation">{</span>
    rsp<span class="token punctuation">.</span><span class="token function">set_content</span><span class="token punctuation">(</span><span class="token string">"hello world!"</span><span class="token punctuation">,</span> <span class="token string">"text/plain; charset=utf-8"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">listen</span><span class="token punctuation">(</span><span class="token string">"0.0.0.0"</span><span class="token punctuation">,</span> <span class="token number">8081</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
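  <p>One detail worth noting about the route string: as far as we know, cpp-httplib 0.7.x treats the first argument of <code>Get</code> as a regular expression and matches it against the whole request path, which always begins with <code>/</code>. The sketch below only illustrates that rule with the standard library; <code>RouteMatches</code> is a hypothetical helper of ours, not part of cpp-httplib.</p> 

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a cpp-httplib API): returns true when the
// entire request path matches the route pattern, treated as a regex.
bool RouteMatches(const std::string& pattern, const std::string& path)
{
  // std::regex_match succeeds only if the whole string matches,
  // so a pattern without the leading '/' cannot match "/hi".
  return std::regex_match(path, std::regex(pattern));
}
```

  <p>Under this assumption, <code>RouteMatches("/hi", "/hi")</code> is true while <code>RouteMatches("hi", "/hi")</code> is false, which is why route strings are normally written with a leading slash.</p> 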
  <p><a href="http://img.e-com-net.com/image/info8/b75b10d0462c441587bf4c17f9e22730.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/b75b10d0462c441587bf4c17f9e22730.jpg" alt="image-20230910114502524" width="650" height="39"></a></p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">netstat</span> <span class="token parameter variable">-ntlp</span>
<span class="token punctuation">(</span>Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.<span class="token punctuation">)</span>
Active Internet connections <span class="token punctuation">(</span>only servers<span class="token punctuation">)</span>
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">127.0</span>.0.1:44227         <span class="token number">0.0</span>.0.0:*               LISTEN      <span class="token number">1903</span>/node           
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">0.0</span>.0.0:111             <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">0.0</span>.0.0:8081            <span class="token number">0.0</span>.0.0:*               LISTEN      <span class="token number">4191</span>/./http_server  
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">192.168</span>.122.1:53        <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">0.0</span>.0.0:22              <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">127.0</span>.0.1:631           <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">127.0</span>.0.1:25            <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp6       <span class="token number">0</span>      <span class="token number">0</span> :::111                  :::*                    LISTEN      -                   
tcp6       <span class="token number">0</span>      <span class="token number">0</span> :::22                   :::*                    LISTEN      -                   
tcp6       <span class="token number">0</span>      <span class="token number">0</span> ::1:631                 :::*                    LISTEN      -                   
tcp6       <span class="token number">0</span>      <span class="token number">0</span> ::1:25                  :::*                    LISTEN      -                   
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 

</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/871601ba555c4c8b9390360e7fd4bba4.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/871601ba555c4c8b9390360e7fd4bba4.jpg" alt="Boost搜索引擎_第15张图片" width="650" height="229" style="border:1px solid black;"></a></p> 
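  <h3>Opening the Port</h3> 
  <p>This happens because the virtual machine has not opened the port to access from the outside network, so we need to open it. Let's check which ports are currently open; the rule for opening one is shown below.</p> 
  <p>Opening a port on CentOS</p> 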
  <p><a href="http://img.e-com-net.com/image/info8/c47d79b3f3c543b88e120f52e58730a3.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/c47d79b3f3c543b88e120f52e58730a3.jpg" alt="image-20230910120405201" width="650" height="91"></a></p> 
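  <h2>Setting the Root Directory</h2> 
  <p>A web server normally has a root directory from which it serves static files. Creating one is all we need to do here.</p> 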
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">mkdir</span> wwwroot
</code></pre> 
  <p>Then we set the root directory on the server.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"cpp-httplib/httplib.h"</span></span>
<span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string root_path <span class="token operator">=</span> <span class="token string">"./wwwroot"</span><span class="token punctuation">;</span>

<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  httplib<span class="token double-colon punctuation">::</span>Server svr<span class="token punctuation">;</span>
  <span class="token comment">// Set the web root directory</span>
  svr<span class="token punctuation">.</span><span class="token function">set_base_dir</span><span class="token punctuation">(</span>root_path<span class="token punctuation">.</span><span class="token function">c_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">Get</span><span class="token punctuation">(</span><span class="token string">"/hi"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> httplib<span class="token double-colon punctuation">::</span>Request<span class="token operator">&</span> req<span class="token punctuation">,</span> httplib<span class="token double-colon punctuation">::</span>Response<span class="token operator">&</span> rsp<span class="token punctuation">)</span><span class="token punctuation">{</span>
    rsp<span class="token punctuation">.</span><span class="token function">set_content</span><span class="token punctuation">(</span><span class="token string">"hello world!"</span><span class="token punctuation">,</span> <span class="token string">"text/plain; charset=utf-8"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">listen</span><span class="token punctuation">(</span><span class="token string">"0.0.0.0"</span><span class="token punctuation">,</span> <span class="token number">8081</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Let's test again.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/62122d8ad57748468646e5fc00205b35.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/62122d8ad57748468646e5fc00205b35.jpg" alt="Boost搜索引擎_第16张图片" width="650" height="192" style="border:1px solid black;"></a></p> 
  <p>Note that this is because our root directory is still empty. By convention, the default page is a file named index.html, so we create one here.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost wwwroot<span class="token punctuation">]</span>$ <span class="token function">touch</span> index.html
<span class="token punctuation">[</span>qkj@localhost wwwroot<span class="token punctuation">]</span>$ ll
total <span class="token number">8</span>
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">0</span> Sep  <span class="token number">9</span> <span class="token number">21</span>:10 index.html
</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/f72e574431ef41f19d914fb8ee44dcee.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/f72e574431ef41f19d914fb8ee44dcee.jpg" alt="Boost搜索引擎_第17张图片" width="650" height="157" style="border:1px solid black;"></a></p> 
  <p><a href="http://img.e-com-net.com/image/info8/9df106701fb7466c92dca77cfffb6cbd.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/9df106701fb7466c92dca77cfffb6cbd.jpg" alt="Boost搜索引擎_第18张图片" width="650" height="146" style="border:1px solid black;"></a></p> 
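  <p>To make the role of index.html concrete, here is a minimal sketch of how a static file server commonly maps a request path to a file under the root directory. It reflects our understanding of the general convention and is not copied from cpp-httplib; <code>ResolveStaticPath</code> is a hypothetical name, and the sketch deliberately ignores safety checks such as rejecting <code>..</code> in the path.</p> 

```cpp
#include <string>

// Hypothetical helper: maps a request path to a file under the web root.
// A path ending in '/' is treated as a directory, so the conventional
// default page "index.html" is appended.
std::string ResolveStaticPath(const std::string& base_dir,
                              const std::string& request_path)
{
  std::string path = base_dir + request_path;
  if (!path.empty() && path.back() == '/')
  {
    path += "index.html";
  }
  return path;
}
```

  <p>With the root set to <code>./wwwroot</code>, a request for <code>/</code> resolves to <code>./wwwroot/index.html</code>, which is why the page appears once that file exists.</p> 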
  <h2>Writing the Search Server</h2> 
  <p>Now we can write the search server itself, which is quite simple.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"cpp-httplib/httplib.h"</span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"searcher.hpp"</span></span>

<span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string root_path <span class="token operator">=</span> <span class="token string">"./wwwroot"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string input <span class="token operator">=</span> <span class="token string">"data/raw_html/raw.txt"</span><span class="token punctuation">;</span>
<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// Initialize the searcher</span>
  ns_searcher<span class="token double-colon punctuation">::</span>Searcher search<span class="token punctuation">;</span>
  search<span class="token punctuation">.</span><span class="token function">InitSearcher</span><span class="token punctuation">(</span>input<span class="token punctuation">)</span><span class="token punctuation">;</span>

  httplib<span class="token double-colon punctuation">::</span>Server svr<span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">set_base_dir</span><span class="token punctuation">(</span>root_path<span class="token punctuation">.</span><span class="token function">c_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// Set the web root directory</span>

  svr<span class="token punctuation">.</span><span class="token function">Get</span><span class="token punctuation">(</span><span class="token string">"/s"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token operator">&</span>search<span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> httplib<span class="token double-colon punctuation">::</span>Request <span class="token operator">&</span>req<span class="token punctuation">,</span> httplib<span class="token double-colon punctuation">::</span>Response <span class="token operator">&</span>rsp<span class="token punctuation">)</span>
          <span class="token punctuation">{</span>
            <span class="token keyword">if</span> <span class="token punctuation">(</span>req<span class="token punctuation">.</span><span class="token function">has_param</span><span class="token punctuation">(</span><span class="token string">"word"</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
            <span class="token punctuation">{</span>
              rsp<span class="token punctuation">.</span><span class="token function">set_content</span><span class="token punctuation">(</span><span class="token string">"必须要搜索关键字"</span><span class="token punctuation">,</span> <span class="token string">"text/plain; charset=utf-8"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
              <span class="token keyword">return</span><span class="token punctuation">;</span>
            <span class="token punctuation">}</span>

            std<span class="token double-colon punctuation">::</span>string word <span class="token operator">=</span> req<span class="token punctuation">.</span><span class="token function">get_param_value</span><span class="token punctuation">(</span><span class="token string">"word"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"用户搜索的: "</span> <span class="token operator"><<</span> word <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>

            std<span class="token double-colon punctuation">::</span>string json_string<span class="token punctuation">;</span>
            search<span class="token punctuation">.</span><span class="token function">Search</span><span class="token punctuation">(</span>word<span class="token punctuation">,</span> <span class="token operator">&</span>json_string<span class="token punctuation">)</span><span class="token punctuation">;</span>
            rsp<span class="token punctuation">.</span><span class="token function">set_content</span><span class="token punctuation">(</span>json_string<span class="token punctuation">,</span> <span class="token string">"application/json"</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"服务器启动成功"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>

  svr<span class="token punctuation">.</span><span class="token function">listen</span><span class="token punctuation">(</span><span class="token string">"0.0.0.0"</span><span class="token punctuation">,</span> <span class="token number">8081</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/72a6333e42a44e988154a299c28b3a05.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/72a6333e42a44e988154a299c28b3a05.jpg" alt="image-20230910122016183" width="650" height="86"></a></p> 
  <p><a href="http://img.e-com-net.com/image/info8/9038f34ecdc945a09d5ddde6402678f4.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/9038f34ecdc945a09d5ddde6402678f4.jpg" alt="Boost搜索引擎_第19张图片" width="650" height="113" style="border:1px solid black;"></a></p> 
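  <p>The handler above reads the keyword from a request such as <code>/s?word=boost</code>. The library does the parsing for us, but a rough sketch of the idea may help: split the query string into key/value pairs and undo the percent-encoding that browsers apply (which matters for Chinese keywords). <code>UrlDecode</code> and <code>ParseQuery</code> are hypothetical helpers written for illustration only; they are not cpp-httplib functions, and unlike a real implementation this sketch keeps only the last value for a repeated key.</p> 

```cpp
#include <cctype>
#include <map>
#include <string>

// Decodes %XX escapes and '+' (meaning space) in a URL query component.
std::string UrlDecode(const std::string& in)
{
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    if (in[i] == '%' && i + 2 < in.size() &&
        std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
        std::isxdigit(static_cast<unsigned char>(in[i + 2])))
    {
      // Two hex digits follow: turn them into a single byte.
      out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
      i += 2;
    }
    else if (in[i] == '+')
    {
      out += ' ';
    }
    else
    {
      out += in[i];
    }
  }
  return out;
}

// Splits "k1=v1&k2=v2" into a map; a key without '=' gets an empty value.
std::map<std::string, std::string> ParseQuery(const std::string& query)
{
  std::map<std::string, std::string> params;
  std::size_t start = 0;
  while (start <= query.size())
  {
    std::size_t end = query.find('&', start);
    if (end == std::string::npos)
    {
      end = query.size();
    }
    const std::string pair = query.substr(start, end - start);
    if (!pair.empty())
    {
      const std::size_t eq = pair.find('=');
      const std::string key = UrlDecode(pair.substr(0, eq));
      const std::string value =
          (eq == std::string::npos) ? "" : UrlDecode(pair.substr(eq + 1));
      params[key] = value;
    }
    start = end + 1;
  }
  return params;
}
```

  <p>For example, <code>ParseQuery("word=file%20system")</code> yields a map whose <code>word</code> entry is <code>file system</code>, the same string our handler would then pass to <code>Search</code>.</p> 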
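  <h1>Front-End Code</h1> 
  <p>The front-end part is optional, and we will not go into it in depth here. If you want to learn it, see the site recommended below.</p> 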
  <ul> 
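   <li>HTML: defines the page structure, the skeleton of a web page</li> 
   <li>CSS: defines the page style, the flesh and skin of a web page</li> 
   <li>JS: handles front-end/back-end interaction, the soul of a web page</li> 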
  </ul> 
  <blockquote> 
   <p>Recommended site for learning front-end development: http://www.w3school.com.cn</p> 
  </blockquote> 
  <h2>Page Structure</h2> 
  <p>The page structure we designed looks like this.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/52dd3d674c194fe7b117abe5ea57554c.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/52dd3d674c194fe7b117abe5ea57554c.jpg" alt="Boost搜索引擎_第20张图片" width="650" height="460" style="border:1px solid black;"></a></p> 
  <p>Following the layout above, our HTML can be written like this.</p> 
  <pre><code class="prism language-html"><span class="token doctype"><span class="token punctuation"><!</span><span class="token doctype-tag">DOCTYPE</span> <span class="token name">html</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>html</span> <span class="token attr-name">lang</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>en<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>head</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">charset</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>UTF-8<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">http-equiv</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>X-UA-Compatible<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>IE=edge<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">name</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>viewport<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>width=device-width, initial-scale=1.0<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>title</span><span class="token punctuation">></span></span>boost 搜索引擎<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>title</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>head</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>body</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>container<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>search<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>input</span> <span class="token attr-name">type</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>text<span class="token punctuation">"</span></span> <span class="token attr-name">value</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>输入搜索关键字...<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>button</span><span class="token punctuation">></span></span>搜索一下<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>button</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>result<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>body</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>html</span><span class="token punctuation">></span></span>
</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/42e984ae2ffb4c8fa41abacfcdc5bf5b.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/42e984ae2ffb4c8fa41abacfcdc5bf5b.jpg" alt="Boost search engine, figure 21" width="650" height="288" style="border:1px solid black;"></a></p> 
  <h2>Page styling</h2> 
  <p>The bare skeleton above looks rather plain, so here we add some CSS to make it presentable.</p> 
  <pre><code class="prism language-html"><span class="token doctype"><span class="token punctuation"><!</span><span class="token doctype-tag">DOCTYPE</span> <span class="token name">html</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>html</span> <span class="token attr-name">lang</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>en<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>head</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">charset</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>UTF-8<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">http-equiv</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>X-UA-Compatible<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>IE=edge<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">name</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>viewport<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>width=device-width, initial-scale=1.0<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>title</span><span class="token punctuation">></span></span>boost 搜索引擎<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>title</span><span class="token punctuation">></span></span>

  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>style</span><span class="token punctuation">></span></span><span class="token style"><span class="token language-css">
    <span class="token comment">/* Remove all default margins and padding on the page (box model reset) */</span>
    <span class="token selector">*</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Reset the margin */</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 0<span class="token punctuation">;</span>
      <span class="token comment">/* Reset the padding */</span>
      <span class="token property">padding</span><span class="token punctuation">:</span> 0<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Make html and body span 100% of the available height */</span>
    <span class="token selector">html,
    body</span> <span class="token punctuation">{</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Class selector: .container */</span>
    <span class="token selector">.container</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Set the width of the div */</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 800px<span class="token punctuation">;</span>
      <span class="token comment">/* Center the div horizontally with auto left and right margins */</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 0px auto<span class="token punctuation">;</span>
      <span class="token comment">/* Top margin: keep some distance between the element and the top of the page */</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Descendant selector: .search inside .container */</span>
    <span class="token selector">.container .search</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Same width as the parent element */</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
      <span class="token comment">/* Set the height to 52px */</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 52px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Select the input element before setting its properties; input is a type (tag) selector */</span>
    <span class="token comment">/* The height set on the input does not include its border: 50px + 2 x 1px = 52px */</span>
    <span class="token selector">.container .search input</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Float left */</span>
      <span class="token property">float</span><span class="token punctuation">:</span> left<span class="token punctuation">;</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 600px<span class="token punctuation">;</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 50px<span class="token punctuation">;</span>
      <span class="token comment">/* Border: width, style, color */</span>
      <span class="token property">border</span><span class="token punctuation">:</span> 1px solid black<span class="token punctuation">;</span>
      <span class="token comment">/* Remove the right border of the input */</span>
      <span class="token property">border-right</span><span class="token punctuation">:</span> none<span class="token punctuation">;</span>
      <span class="token comment">/* Left padding, so the text does not sit right against the left border */</span>
      <span class="token property">padding-left</span><span class="token punctuation">:</span> 10px<span class="token punctuation">;</span>
      <span class="token comment">/* Text color and font size inside the input */</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #CCC<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Select the button element before setting its properties; button is a type (tag) selector */</span>
    <span class="token selector">.container .search button</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Float left */</span>
      <span class="token property">float</span><span class="token punctuation">:</span> left<span class="token punctuation">;</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 150px<span class="token punctuation">;</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 52px<span class="token punctuation">;</span>
      <span class="token comment">/* Button background color: #4e6ef2 */</span>
      <span class="token property">background-color</span><span class="token punctuation">:</span> #4e6ef2<span class="token punctuation">;</span>
      <span class="token comment">/* Text color of the button */</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #FFF<span class="token punctuation">;</span>
      <span class="token comment">/* Font size */</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 19px<span class="token punctuation">;</span>
      <span class="token property">font-family</span><span class="token punctuation">:</span> Georgia<span class="token punctuation">,</span> <span class="token string">'Times New Roman'</span><span class="token punctuation">,</span> Times<span class="token punctuation">,</span> serif<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result</span> <span class="token punctuation">{</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item</span> <span class="token punctuation">{</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item a</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Block-level element, so it takes a line of its own */</span>
      <span class="token property">display</span><span class="token punctuation">:</span> block<span class="token punctuation">;</span>
      <span class="token comment">/* Remove the underline of the a element */</span>
      <span class="token property">text-decoration</span><span class="token punctuation">:</span> none<span class="token punctuation">;</span>
      <span class="token comment">/* Font size of the text in the a element */</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 20px<span class="token punctuation">;</span>
      <span class="token comment">/* Font color */</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #4e6ef2<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item a:hover</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Hover effect: underline the link when the mouse is over it */</span>
      <span class="token property">text-decoration</span><span class="token punctuation">:</span> underline<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item p</span> <span class="token punctuation">{</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 5px<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 16px<span class="token punctuation">;</span>
      <span class="token property">font-family</span><span class="token punctuation">:</span> <span class="token string">'Lucida Sans'</span><span class="token punctuation">,</span> <span class="token string">'Lucida Sans Regular'</span><span class="token punctuation">,</span> <span class="token string">'Lucida Grande'</span><span class="token punctuation">,</span> <span class="token string">'Lucida Sans Unicode'</span><span class="token punctuation">,</span> Geneva<span class="token punctuation">,</span> Verdana<span class="token punctuation">,</span> sans-serif<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item i</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Block-level element, so it takes a line of its own */</span>
      <span class="token property">display</span><span class="token punctuation">:</span> block<span class="token punctuation">;</span>
      <span class="token comment">/* Cancel the italic style */</span>
      <span class="token property">font-style</span><span class="token punctuation">:</span> normal<span class="token punctuation">;</span>
      <span class="token property">color</span><span class="token punctuation">:</span> green<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  </span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>style</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>head</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>body</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>container<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>search<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>input</span> <span class="token attr-name">type</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>text<span class="token punctuation">"</span></span> <span class="token attr-name">value</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>输入搜索关键字...<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>button</span><span class="token punctuation">></span></span>搜索一下<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>button</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>result<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>body</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>html</span><span class="token punctuation">></span></span>
</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/2d012732abf149d8a40e5ee0877a8411.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/2d012732abf149d8a40e5ee0877a8411.jpg" alt="Boost search engine, figure 22" width="650" height="251" style="border:1px solid black;"></a></p> 
  <h2>Front-End and Back-End Interaction</h2> 
  <p>Next we wire the front end to the back end. As before, the code is pasted in full.</p> 
  <pre><code class="prism language-html"><span class="token comment"><!-- Build the page skeleton --></span>
<span class="token doctype"><span class="token punctuation"><!</span><span class="token doctype-tag">DOCTYPE</span> <span class="token name">html</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>html</span> <span class="token attr-name">lang</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>en<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>head</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">charset</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>UTF-8<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">http-equiv</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>X-UA-Compatible<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>IE=edge<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">name</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>viewport<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>width=device-width, initial-scale=1.0<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>script</span> <span class="token attr-name">src</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>http://code.jquery.com/jquery-2.1.1.min.js<span class="token punctuation">"</span></span><span class="token punctuation">></span></span><span class="token script"></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>script</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>title</span><span class="token punctuation">></span></span>Boost Search Engine<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>title</span><span class="token punctuation">></span></span>
  <span class="token comment"><!-- Reset all margins and padding to zero --></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>style</span><span class="token punctuation">></span></span><span class="token style"><span class="token language-css">
    <span class="token selector">*</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Outer margin */</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 0<span class="token punctuation">;</span>
      <span class="token comment">/* Inner padding */</span>
      <span class="token property">padding</span><span class="token punctuation">:</span> 0<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">html,
    body</span> <span class="token punctuation">{</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Center the page. A selector that starts with a dot is a class selector. */</span>
    <span class="token selector">.container</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Outermost frame */</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 800px<span class="token punctuation">;</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 0px auto<span class="token punctuation">;</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Descendant (compound) selector */</span>
    <span class="token selector">.container .search</span> <span class="token punctuation">{</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
      <span class="token comment">/* 52px = 50px input height + 1px top border + 1px bottom border */</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 52px<span class="token punctuation">;</span>

    <span class="token punctuation">}</span>

    <span class="token selector">.container .search input</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Float left so the input and the button sit side by side */</span>
      <span class="token property">float</span><span class="token punctuation">:</span> left<span class="token punctuation">;</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 600px<span class="token punctuation">;</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 50px<span class="token punctuation">;</span>
      <span class="token comment">/* Set the border */</span>
      <span class="token property">border</span><span class="token punctuation">:</span> 1px solid black<span class="token punctuation">;</span>
      <span class="token comment">/* Remove the right border so it meets the button cleanly */</span>
      <span class="token property">border-right</span><span class="token punctuation">:</span> none<span class="token punctuation">;</span>
      <span class="token property">padding-left</span><span class="token punctuation">:</span> 10px<span class="token punctuation">;</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #ccc<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .search button</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Float left */</span>
      <span class="token property">float</span><span class="token punctuation">:</span> left<span class="token punctuation">;</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 120px<span class="token punctuation">;</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 52px<span class="token punctuation">;</span>

      <span class="token comment">/* Background color */</span>
      <span class="token property">background-color</span><span class="token punctuation">:</span> #4e6ef2<span class="token punctuation">;</span>
      <span class="token comment">/* Text color */</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #fff<span class="token punctuation">;</span>
      <span class="token comment">/* Font size */</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 19px<span class="token punctuation">;</span>
      <span class="token comment">/* Font family */</span>
      <span class="token property">font-family</span><span class="token punctuation">:</span> <span class="token string">'Times New Roman'</span><span class="token punctuation">,</span> Times<span class="token punctuation">,</span> serif<span class="token punctuation">;</span>

    <span class="token punctuation">}</span>


    <span class="token selector">.container .result</span> <span class="token punctuation">{</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item</span> <span class="token punctuation">{</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item a</span> <span class="token punctuation">{</span>
      <span class="token property">display</span><span class="token punctuation">:</span> block<span class="token punctuation">;</span>
      <span class="token comment">/* Remove the underline */</span>
      <span class="token property">text-decoration</span><span class="token punctuation">:</span> none<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 20px<span class="token punctuation">;</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #4e6ef2<span class="token punctuation">;</span>

    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item a:hover</span> <span class="token punctuation">{</span>
      <span class="token property">text-decoration</span><span class="token punctuation">:</span> underline<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item p</span> <span class="token punctuation">{</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 5px<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 16px<span class="token punctuation">;</span>
      <span class="token property">font-family</span><span class="token punctuation">:</span> <span class="token string">'Times New Roman'</span><span class="token punctuation">,</span> Times<span class="token punctuation">,</span> serif<span class="token punctuation">;</span>

    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item i</span> <span class="token punctuation">{</span>
      <span class="token property">display</span><span class="token punctuation">:</span> block<span class="token punctuation">;</span>

      <span class="token comment">/* Cancel the italic style */</span>
      <span class="token property">font-style</span><span class="token punctuation">:</span> normal<span class="token punctuation">;</span>
      <span class="token property">color</span><span class="token punctuation">:</span> green<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  </span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>style</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>head</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>body</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>container<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>search<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>input</span> <span class="token attr-name">type</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>text<span class="token punctuation">"</span></span> <span class="token attr-name">placeholder</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>Enter search keywords...<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>button</span> <span class="token special-attr"><span class="token attr-name">onclick</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span><span class="token value javascript language-javascript"><span class="token function">Search</span><span class="token punctuation">(</span><span class="token punctuation">)</span></span><span class="token punctuation">"</span></span></span><span class="token punctuation">></span></span>Search<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>button</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>result<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token comment"><!-- Results are generated dynamically --></span>

      <span class="token comment"><!-- Example of one generated item:
      <div class="item">
        <a href="#">This is the title</a>
        <p>This is the summary. This is the summary.</p>
        <i>https://www.bilibili.com/</i>
      </div> --></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>script</span><span class="token punctuation">></span></span><span class="token script"><span class="token language-javascript">
    <span class="token keyword">function</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
      <span class="token comment">// alert("hello js");</span>
      <span class="token comment">// 1. Read the query from the input box with jQuery</span>

      <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">".container .search input"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">val</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">if</span><span class="token punctuation">(</span>query <span class="token operator">==</span> <span class="token string">''</span> <span class="token operator">||</span> query <span class="token operator">==</span> <span class="token keyword">null</span><span class="token punctuation">)</span><span class="token punctuation">{</span>
        <span class="token keyword">return</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      console<span class="token punctuation">.</span><span class="token function">log</span><span class="token punctuation">(</span><span class="token string">"query = "</span> <span class="token operator">+</span> query<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token comment">// 2. Send the HTTP request</span>
      $<span class="token punctuation">.</span><span class="token function">ajax</span><span class="token punctuation">(</span><span class="token punctuation">{</span>
        <span class="token literal-property property">type</span><span class="token operator">:</span> <span class="token string">"GET"</span><span class="token punctuation">,</span>
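        <span class="token comment">// Encode the query so spaces, '&' and non-ASCII text survive in the URL</span>
        <span class="token literal-property property">url</span><span class="token operator">:</span> <span class="token string">"/s?word="</span> <span class="token operator">+</span> <span class="token function">encodeURIComponent</span><span class="token punctuation">(</span>query<span class="token punctuation">)</span><span class="token punctuation">,</span>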
        <span class="token function-variable function">success</span><span class="token operator">:</span> <span class="token keyword">function</span> <span class="token punctuation">(</span><span class="token parameter">data</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
          console<span class="token punctuation">.</span><span class="token function">log</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>
          <span class="token comment">// Build the result list dynamically</span>
          <span class="token function">BuildHtml</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
      <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>


    <span class="token keyword">function</span> <span class="token function">BuildHtml</span><span class="token punctuation">(</span><span class="token parameter">data</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>

      <span class="token keyword">if</span><span class="token punctuation">(</span>data <span class="token operator">==</span> <span class="token string">''</span> <span class="token operator">||</span> data <span class="token operator">==</span> <span class="token keyword">null</span><span class="token punctuation">)</span><span class="token punctuation">{</span>
        document<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token string">"No results found"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">return</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      <span class="token keyword">let</span> result_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">".container .result"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      result_lable<span class="token punctuation">.</span><span class="token function">empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">let</span> elem <span class="token keyword">of</span> data<span class="token punctuation">)</span> <span class="token punctuation">{</span>

        <span class="token comment">// console.log(elem.title);</span>
        <span class="token comment">// console.log(elem.url);</span>

        <span class="token keyword">let</span> a_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">"<a>"</span><span class="token punctuation">,</span> <span class="token punctuation">{</span>
          <span class="token literal-property property">text</span><span class="token operator">:</span> elem<span class="token punctuation">.</span>title<span class="token punctuation">,</span>
          <span class="token literal-property property">href</span><span class="token operator">:</span> elem<span class="token punctuation">.</span>url<span class="token punctuation">,</span>
          <span class="token literal-property property">target</span><span class="token operator">:</span> <span class="token string">"_blank"</span>
        <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> p_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">"<p>"</span><span class="token punctuation">,</span> <span class="token punctuation">{</span>
          <span class="token literal-property property">text</span><span class="token operator">:</span> elem<span class="token punctuation">.</span>desc
        <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> i_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">"<i>"</span><span class="token punctuation">,</span> <span class="token punctuation">{</span>
          <span class="token literal-property property">text</span><span class="token operator">:</span> elem<span class="token punctuation">.</span>url
        <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token keyword">let</span> div_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">"<div>"</span><span class="token punctuation">,</span> <span class="token punctuation">{</span>
          <span class="token keyword">class</span><span class="token operator">:</span> <span class="token string">"item"</span>
        <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>


        a_lable<span class="token punctuation">.</span><span class="token function">appendTo</span><span class="token punctuation">(</span>div_lable<span class="token punctuation">)</span><span class="token punctuation">;</span>
        p_lable<span class="token punctuation">.</span><span class="token function">appendTo</span><span class="token punctuation">(</span>div_lable<span class="token punctuation">)</span><span class="token punctuation">;</span>
        i_lable<span class="token punctuation">.</span><span class="token function">appendTo</span><span class="token punctuation">(</span>div_lable<span class="token punctuation">)</span><span class="token punctuation">;</span>
        div_lable<span class="token punctuation">.</span><span class="token function">appendTo</span><span class="token punctuation">(</span>result_lable<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

    <span class="token punctuation">}</span>

  </span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>script</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>body</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>html</span><span class="token punctuation">></span></span>
</code></pre> 
  <h1>Project Results</h1> 
  <p>With everything in place, the project can now serve search requests. Here is what it looks like.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/b4f8a1125b234116a9e415065b101dea.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/b4f8a1125b234116a9e415065b101dea.jpg" alt="Boost搜索引擎_第23张图片" width="650" height="204" style="border:1px solid black;"></a></p> 
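  <h1>Project Supplements</h1> 
  <p>A few small details have not been covered yet, so we add them here.</p> 
  <h2>Improving Deduplication</h2> 
  <p>As mentioned in the search-service section, a query is split into several keywords, and the same document id can show up in the results of more than one keyword. These duplicates have to be merged, which takes three steps.</p> 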
  <ul> 
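   <li>Find the ids that appear more than once</li> 
   <li>Add up the weights recorded for each id</li> 
   <li>Rebuild the result list, then look up the documents and build the JSON string</li> 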
  </ul> 
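<p>The three steps above can be sketched in isolation before reading the full <code>Searcher</code> code. This is a minimal sketch, not the project's implementation: <code>Posting</code>, <code>MergedHit</code> and <code>MergeByDocId</code> are hypothetical names used only for illustration, and breaking weight ties by ascending id is an assumption made so the output order is deterministic.</p>

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one entry of an inverted (posting) list.
struct Posting {
  uint64_t doc_id;
  std::string word;
  int weight;
};

// Hypothetical merged record: exactly one per document id.
struct MergedHit {
  uint64_t doc_id = 0;
  int weight = 0;                  // sum of the weights of all matching keywords
  std::vector<std::string> words;  // every keyword that hit this document
};

// Merge the posting lists of several keywords so each doc_id appears once,
// then sort by total weight in descending order.
std::vector<MergedHit> MergeByDocId(
    const std::vector<std::vector<Posting>>& lists) {
  std::unordered_map<uint64_t, MergedHit> by_id;
  for (const auto& list : lists) {
    for (const auto& p : list) {
      // operator[] default-constructs the record the first time an id is seen.
      MergedHit& hit = by_id[p.doc_id];
      hit.doc_id = p.doc_id;
      hit.weight += p.weight;
      hit.words.push_back(p.word);
    }
  }

  std::vector<MergedHit> merged;
  merged.reserve(by_id.size());
  for (auto& kv : by_id) {
    merged.push_back(std::move(kv.second));
  }

  // Assumption: ties on weight are broken by ascending doc_id.
  std::sort(merged.begin(), merged.end(),
            [](const MergedHit& a, const MergedHit& b) {
              if (a.weight != b.weight) return a.weight > b.weight;
              return a.doc_id < b.doc_id;
            });
  return merged;
}
```

<p>For example, if one keyword hits documents 1 and 2 and another hits documents 1 and 3, the merged result holds three entries, and document 1 carries both keywords with their weights added together.</p>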
  <p>Here is the situation we ran into.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/2c984779dfcb40fc8a4530bab48f38ce.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/2c984779dfcb40fc8a4530bab48f38ce.jpg" alt="Boost搜索引擎_第24张图片" width="650" height="245" style="border:1px solid black;"></a></p> 
  <p>This case needs to be handled.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">struct</span> <span class="token class-name">InvertedElemPrint</span>
  <span class="token punctuation">{</span>
    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span> <span class="token comment">// document id</span>

    <span class="token keyword">int</span> weight<span class="token punctuation">;</span>                     <span class="token comment">// weight, summed over all matching keywords</span>
    std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span> <span class="token comment">// one document id can correspond to several keywords</span>
    <span class="token function">InvertedElemPrint</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">doc_id</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token function">weight</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token keyword">class</span> <span class="token class-name">Searcher</span>
  <span class="token punctuation">{</span>
  <span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token function">Searcher</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>
    <span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// 1. Tokenize the query first; the lookups come afterwards</span>
      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span>
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>query<span class="token punctuation">,</span> <span class="token operator">&</span>words<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token comment">// 2. Run a lookup for each word produced by tokenization</span>
      std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span><span class="token keyword">uint64_t</span><span class="token punctuation">,</span> InvertedElemPrint<span class="token operator">></span> tokens_map<span class="token punctuation">;</span> <span class="token comment">// maps a doc_id to its InvertedElemPrint</span>

      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>InvertedElemPrint<span class="token operator">></span> inverted_list_all<span class="token punctuation">;</span> <span class="token comment">// holds the deduplicated entries</span>

      <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> words<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span> 
        <span class="token comment">// Look up the inverted index first</span>
        ns_index<span class="token double-colon punctuation">::</span>InvertedList <span class="token operator">*</span>inverted_list <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> inverted_list<span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token keyword">continue</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
       
        <span class="token comment">// Walk the inverted list to collect every matching document id</span>
        <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">auto</span> <span class="token operator">&</span>elem <span class="token operator">:</span> <span class="token operator">*</span>inverted_list<span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token comment">// operator[] returns the entry for this id, creating an empty InvertedElemPrint if it is not in the hash table yet</span>
          <span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">=</span> tokens_map<span class="token punctuation">[</span>elem<span class="token punctuation">.</span>doc_id<span class="token punctuation">]</span><span class="token punctuation">;</span> 
          item<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> elem<span class="token punctuation">.</span>doc_id<span class="token punctuation">;</span> 
          <span class="token comment">// Record the keyword that matched this document</span>
          item<span class="token punctuation">.</span>words<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>elem<span class="token punctuation">.</span>word<span class="token punctuation">)</span><span class="token punctuation">;</span>
          <span class="token comment">// Accumulate the weight</span>
          item<span class="token punctuation">.</span>weight <span class="token operator">+=</span> elem<span class="token punctuation">.</span>weight<span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
        <span class="token comment">// Entries that share a doc id have now been merged</span>
      <span class="token punctuation">}</span>
      <span class="token comment">// Copy the merged InvertedElemPrint entries into the result array</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">:</span> tokens_map<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        inverted_list_all<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>item<span class="token punctuation">.</span>second<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token comment">// 3. Merge and sort: descending by relevance, using the accumulated weight</span>
      std<span class="token double-colon punctuation">::</span><span class="token function">sort</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> InvertedElemPrint <span class="token operator">&</span>e1<span class="token punctuation">,</span> <span class="token keyword">const</span> InvertedElemPrint <span class="token operator">&</span>e2<span class="token punctuation">)</span>
                <span class="token punctuation">{</span>
                  <span class="token keyword">return</span> e1<span class="token punctuation">.</span>weight <span class="token operator">></span> e2<span class="token punctuation">.</span>weight<span class="token punctuation">;</span>
                <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>


      <span class="token comment">// 4. Build the JSON string (serialization)</span>
      Json<span class="token double-colon punctuation">::</span>Value root<span class="token punctuation">;</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">:</span> inverted_list_all<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        <span class="token comment">// Fetch the document through the forward index</span>
        ns_index<span class="token double-colon punctuation">::</span>DocInfo <span class="token operator">*</span>doc <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span>item<span class="token punctuation">.</span>doc_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> doc<span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token keyword">continue</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>

        <span class="token comment">// The document was found; fill in its fields</span>
        Json<span class="token double-colon punctuation">::</span>Value elem<span class="token punctuation">;</span>
        elem<span class="token punctuation">[</span><span class="token string">"title"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>title<span class="token punctuation">;</span>
        elem<span class="token punctuation">[</span><span class="token string">"desc"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token function">make_summary</span><span class="token punctuation">(</span>doc<span class="token operator">-></span>content<span class="token punctuation">,</span> item<span class="token punctuation">.</span>words<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the summary is extracted around the first matched keyword</span>
        elem<span class="token punctuation">[</span><span class="token string">"url"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>url<span class="token punctuation">;</span>

        <span class="token comment">// for debugging</span>
        <span class="token comment">//  elem["id"] = (int)item.doc_id;</span>
        <span class="token comment">//  elem["weight"] = item.weight; // converted to a Json::Value automatically</span>
        root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>elem<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// appended in sorted order</span>
      <span class="token punctuation">}</span>

      Json<span class="token double-colon punctuation">::</span>StyledWriter writer<span class="token punctuation">;</span> <span class="token comment">// use the human-readable format for now</span>
      <span class="token operator">*</span>json_string <span class="token operator">=</span> writer<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>root<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>
    ns_index<span class="token double-colon punctuation">::</span>Index <span class="token operator">*</span>index<span class="token punctuation">;</span> <span class="token comment">// the index used for lookups</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>
</code></pre> 
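  <p>The merge-then-sort step above can be isolated into a small, self-contained sketch. The types <code>InvertedElem</code> and <code>MergedHit</code> and the function <code>MergeAndRank</code> are hypothetical stand-ins for the article's <code>ns_index</code> types and <code>InvertedElemPrint</code>, not the project's actual definitions. The tie-break on <code>doc_id</code> is also an assumption, added only so that the order is deterministic.</p>

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one posting in an inverted list.
struct InvertedElem {
  uint64_t doc_id;
  std::string word;
  int weight;
};

// Hypothetical stand-in for InvertedElemPrint: one merged hit per document.
struct MergedHit {
  uint64_t doc_id = 0;
  int weight = 0;
  std::vector<std::string> words;
};

// Merge the posting lists of all query words by doc_id, summing weights,
// then sort by accumulated weight in descending order.
std::vector<MergedHit> MergeAndRank(const std::vector<std::vector<InvertedElem>> &lists) {
  std::unordered_map<uint64_t, MergedHit> by_doc;
  for (const auto &list : lists) {
    for (const auto &elem : list) {
      MergedHit &hit = by_doc[elem.doc_id]; // created on first sight of this id
      hit.doc_id = elem.doc_id;
      hit.weight += elem.weight;
      hit.words.push_back(elem.word);
    }
  }

  std::vector<MergedHit> hits;
  hits.reserve(by_doc.size());
  for (auto &kv : by_doc) {
    hits.push_back(std::move(kv.second));
  }

  std::sort(hits.begin(), hits.end(), [](const MergedHit &a, const MergedHit &b) {
    if (a.weight != b.weight) return a.weight > b.weight;
    return a.doc_id < b.doc_id; // assumed tie-break, keeps the order stable
  });
  return hits;
}
```

  <p>A document that matches several query words appears once in the result, with the weights of all its matches added together, which is why it tends to rank above documents that match only one word.</p>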
  <h2>Adding logging</h2> 
  <p>We create a new file to hold the logging code.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">touch</span> log.hpp
</code></pre> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">pragma</span> <span class="token expression">once</span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><iostream></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><string></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><ctime></span></span>

<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">NORMAL</span> <span class="token expression"><span class="token number">1</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">WARNING</span> <span class="token expression"><span class="token number">2</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">DEBUG</span> <span class="token expression"><span class="token number">3</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">FATAL</span> <span class="token expression"><span class="token number">4</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name function">LOG</span><span class="token expression"><span class="token punctuation">(</span>LEVEL<span class="token punctuation">,</span> MESSAGE<span class="token punctuation">)</span> <span class="token function">log</span><span class="token punctuation">(</span>#LEVEL<span class="token punctuation">,</span> MESSAGE<span class="token punctuation">,</span> <span class="token constant">__FILE__</span><span class="token punctuation">,</span> <span class="token constant">__LINE__</span><span class="token punctuation">)</span></span></span>

<span class="token keyword">inline</span> <span class="token keyword">void</span> <span class="token function">log</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string level<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string message<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string file<span class="token punctuation">,</span> <span class="token keyword">int</span> line<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"["</span> <span class="token operator"><<</span> level <span class="token operator"><<</span> <span class="token string">"]"</span>
            <span class="token operator"><<</span> <span class="token string">"["</span> <span class="token operator"><<</span> <span class="token function">time</span><span class="token punctuation">(</span><span class="token keyword">nullptr</span><span class="token punctuation">)</span> <span class="token operator"><<</span> <span class="token string">"]"</span>
            <span class="token operator"><<</span> <span class="token string">"["</span> <span class="token operator"><<</span> message <span class="token operator"><<</span> <span class="token string">"]"</span>
            <span class="token operator"><<</span> <span class="token string">"["</span> <span class="token operator"><<</span> file <span class="token operator"><<</span> <span class="token string">"]"</span>
            <span class="token operator"><<</span> <span class="token string">"[:"</span> <span class="token operator"><<</span> line <span class="token operator"><<</span> <span class="token string">"]"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
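  <p>The logger above prints the raw value of <code>time(nullptr)</code>, which is a count of seconds since the epoch. If a readable timestamp is preferred, one option is sketched below. The helpers <code>FormatTimestamp</code> and <code>FormatLogLine</code> are hypothetical names, not part of the original <code>log.hpp</code>. The sketch also assumes a POSIX system (for <code>gmtime_r</code>) and formats in UTC.</p>

```cpp
#include <ctime>
#include <sstream>
#include <string>

// Hypothetical helper: format a time_t as "YYYY-MM-DD HH:MM:SS" in UTC.
inline std::string FormatTimestamp(std::time_t t) {
  std::tm tm_buf{};
  gmtime_r(&t, &tm_buf); // POSIX, thread-safe variant of gmtime
  char buf[32];
  std::strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &tm_buf);
  return std::string(buf);
}

// Hypothetical helper: build one log line in the same bracketed layout,
// taking the time as a parameter so the output is easy to test.
inline std::string FormatLogLine(const std::string &level, const std::string &message,
                                 const std::string &file, int line, std::time_t now) {
  std::ostringstream out;
  out << "[" << level << "]"
      << "[" << FormatTimestamp(now) << "]"
      << "[" << message << "]"
      << "[" << file << ":" << line << "]";
  return out.str();
}
```

  <p>Passing the time in as an argument, instead of calling <code>time(nullptr)</code> inside the function, keeps the formatting logic deterministic.</p>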
  <h3>Logging in the index module</h3> 
  <p><a href="http://img.e-com-net.com/image/info8/749bad2f4ece4cbc97ab323c30631dd1.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/749bad2f4ece4cbc97ab323c30631dd1.jpg" alt="Boost搜索引擎_第25张图片" width="650" height="154" style="border:1px solid black;"></a></p> 
  <h3>Logging in the search module</h3> 
  <p><a href="http://img.e-com-net.com/image/info8/e2d2a083e9374e45b5f7a8f7f8e6e516.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/e2d2a083e9374e45b5f7a8f7f8e6e516.jpg" alt="Boost搜索引擎_第26张图片" width="650" height="224" style="border:1px solid black;"></a></p> 
  <h3>Logging in the server</h3> 
  <p><a href="http://img.e-com-net.com/image/info8/e6756302362a45b9922671004117a7de.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/e6756302362a45b9922671004117a7de.jpg" alt="Boost搜索引擎_第27张图片" width="650" height="316" style="border:1px solid black;"></a></p> 
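  <h1>Project extensions</h1> 
  <p>There are several ways to extend the project.</p> 
  <h2>Improving the summary: removing stop words</h2> 
  <p>Stop words can be dropped during tokenization. None of the code above does so, because filtering stop words while building the index raises the resource requirements considerably, so it is left as an extension. cppjieba ships with a stop-word list, which we use here.</p> 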
  <pre><code class="prism language-cpp"><span class="token keyword">class</span> <span class="token class-name">JiebaUtil</span>
  <span class="token punctuation">{</span>
  <span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token keyword">static</span> <span class="token keyword">void</span> <span class="token function">CutString</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> <span class="token operator">*</span>out<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token function">assert</span><span class="token punctuation">(</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">get_instance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">-></span><span class="token function">CutStringHelper</span><span class="token punctuation">(</span>src<span class="token punctuation">,</span> out<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">/// @brief Tokenize src and drop stop words</span>
    <span class="token comment">/// @param src input text</span>
    <span class="token comment">/// @param out receives the resulting tokens</span>
    <span class="token keyword">void</span> <span class="token function">CutStringHelper</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> <span class="token operator">*</span>out<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      jieba<span class="token punctuation">.</span><span class="token function">CutForSearch</span><span class="token punctuation">(</span>src<span class="token punctuation">,</span> <span class="token operator">*</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> iter <span class="token operator">=</span> out<span class="token operator">-></span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> iter <span class="token operator">!=</span> out<span class="token operator">-></span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        <span class="token keyword">auto</span> it <span class="token operator">=</span> stop_words<span class="token punctuation">.</span><span class="token function">find</span><span class="token punctuation">(</span><span class="token operator">*</span>iter<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span>it <span class="token operator">!=</span> stop_words<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token comment">// This token is a stop word: erase it</span>
          <span class="token comment">// erase() returns the next valid iterator, which avoids iterator invalidation</span>
          <span class="token comment">// std::cout << *iter << std::endl;</span>
          iter <span class="token operator">=</span> out<span class="token operator">-></span><span class="token function">erase</span><span class="token punctuation">(</span>iter<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
        <span class="token keyword">else</span>
        <span class="token punctuation">{</span>
          iter<span class="token operator">++</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
      <span class="token punctuation">}</span>
    <span class="token punctuation">}</span>
    <span class="token keyword">static</span> JiebaUtil <span class="token operator">*</span><span class="token function">get_instance</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">static</span> std<span class="token double-colon punctuation">::</span>mutex mtx<span class="token punctuation">;</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> instance<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        mtx<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> instance<span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          instance <span class="token operator">=</span> <span class="token keyword">new</span> JiebaUtil<span class="token punctuation">;</span>
          instance<span class="token operator">-></span><span class="token function">InitJiebaUtil</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
        mtx<span class="token punctuation">.</span><span class="token function">unlock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      <span class="token keyword">return</span> instance<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
    <span class="token comment">// Load the stop-word list into stop_words</span>

    <span class="token keyword">void</span> <span class="token function">InitJiebaUtil</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      std<span class="token double-colon punctuation">::</span>ifstream <span class="token function">in</span><span class="token punctuation">(</span>STOP_WORD_PATH<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span>in<span class="token punctuation">.</span><span class="token function">is_open</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        <span class="token function">LOG</span><span class="token punctuation">(</span>FATAL<span class="token punctuation">,</span> <span class="token string">"failed to load the stop-word file"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">return</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      std<span class="token double-colon punctuation">::</span>string line<span class="token punctuation">;</span>
      <span class="token keyword">while</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">getline</span><span class="token punctuation">(</span>in<span class="token punctuation">,</span> line<span class="token punctuation">)</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        stop_words<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">make_pair</span><span class="token punctuation">(</span>line<span class="token punctuation">,</span> <span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      in<span class="token punctuation">.</span><span class="token function">close</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token keyword">static</span> JiebaUtil <span class="token operator">*</span>instance<span class="token punctuation">;</span>

    cppjieba<span class="token double-colon punctuation">::</span>Jieba jieba<span class="token punctuation">;</span>
    std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> <span class="token keyword">bool</span><span class="token operator">></span> stop_words<span class="token punctuation">;</span>
    <span class="token function">JiebaUtil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">jieba</span><span class="token punctuation">(</span>DICT_PATH<span class="token punctuation">,</span> HMM_PATH<span class="token punctuation">,</span> USER_DICT_PATH<span class="token punctuation">,</span> IDF_PATH<span class="token punctuation">,</span> STOP_WORD_PATH<span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token comment">// the copy constructor, copy assignment, etc. should be deleted</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>
  JiebaUtil <span class="token operator">*</span>JiebaUtil<span class="token double-colon punctuation">::</span>instance <span class="token operator">=</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
</code></pre> 
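  <p>The loop in <code>CutStringHelper</code> erases one element at a time, and each <code>erase</code> shifts the rest of the vector. A common alternative is the erase-remove idiom, sketched below. The function name <code>RemoveStopWords</code> is hypothetical, and the sketch assumes the stop words are held in a <code>std::unordered_set</code> rather than the <code>unordered_map&lt;string, bool&gt;</code> used above.</p>

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: drop every token that appears in stop_words.
// remove_if compacts the kept tokens in one pass (preserving their order),
// then a single erase trims the leftover tail.
inline void RemoveStopWords(const std::unordered_set<std::string> &stop_words,
                            std::vector<std::string> *tokens) {
  tokens->erase(std::remove_if(tokens->begin(), tokens->end(),
                               [&stop_words](const std::string &word) {
                                 return stop_words.count(word) > 0;
                               }),
                tokens->end());
}
```

  <p>The result is the same as the iterator loop, but the work is linear in the number of tokens instead of potentially quadratic when many stop words are removed.</p>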
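  <h2>Deploying the service in the background</h2> 
  <p>We can run the server as a daemon process.</p> 
  <h3>The nohup command</h3> 
  <blockquote> 
   <p><strong>Running with nohup:</strong></p> 
   <p>nohup runs a command so that it ignores the hangup signal (SIGHUP), which means the service stays reachable after the XShell session is closed.</p> 
   <p>For example: nohup ./http_server</p> 
   <p>To also run the program in the background, append & to the command; it then no longer occupies the terminal.</p> 
   <p>For example: nohup ./http_server &</p> 
  </blockquote> 
  <blockquote> 
   <p><strong>The file nohup creates:</strong></p> 
   <p>When standard output is a terminal, nohup writes the program's output to a file named nohup.out. This file holds the log output and can be viewed with cat.</p> 
  </blockquote> 
  <h3>setsid</h3> 
  <p>The process can also daemonize itself, as follows.</p> 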
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">pragma</span> <span class="token expression">once</span></span>

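<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><cstdio></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><cstdlib></span></span>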
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><iostream></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><signal.h></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><unistd.h></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><sys/types.h></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><sys/stat.h></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><fcntl.h></span></span>

<span class="token keyword">void</span> <span class="token function">daemonize</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
    <span class="token keyword">int</span> fd <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
    <span class="token comment">// 1. Ignore SIGPIPE</span>
    <span class="token function">signal</span><span class="token punctuation">(</span>SIGPIPE<span class="token punctuation">,</span> SIG_IGN<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 2. Change the working directory (optional)</span>
    <span class="token comment">// chdir();</span>
    <span class="token comment">// 3. Fork and let the parent exit, so that the child is not a process group leader</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token function">fork</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">></span> <span class="token number">0</span><span class="token punctuation">)</span>
        <span class="token function">exit</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 4. Create a new session</span>
    <span class="token function">setsid</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 5. Redirect fds 0, 1 and 2 to /dev/null</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token punctuation">(</span>fd <span class="token operator">=</span> <span class="token function">open</span><span class="token punctuation">(</span><span class="token string">"/dev/null"</span><span class="token punctuation">,</span> O_RDWR<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token operator">-</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token comment">// fd is usually 3 here</span>
    <span class="token punctuation">{</span>
        <span class="token function">dup2</span><span class="token punctuation">(</span>fd<span class="token punctuation">,</span> STDIN_FILENO<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token function">dup2</span><span class="token punctuation">(</span>fd<span class="token punctuation">,</span> STDOUT_FILENO<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token function">dup2</span><span class="token punctuation">(</span>fd<span class="token punctuation">,</span> STDERR_FILENO<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token comment">// 6. Close the fd that is no longer needed</span>
        <span class="token keyword">if</span><span class="token punctuation">(</span>fd <span class="token operator">></span> STDERR_FILENO<span class="token punctuation">)</span> <span class="token function">close</span><span class="token punctuation">(</span>fd<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token comment">// Alternative: close fds 0, 1 and 2 directly; strongly discouraged</span>
    <span class="token punctuation">}</span>
<span class="token punctuation">}</span>
</code></pre> 
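  <h2>Other extensions</h2> 
  <ul> 
   <li>The order in which results are displayed is driven by the weight, and further algorithms can be layered on top of it, such as paid ranking, hot-topic statistics, or an extra weight boost for selected documents.</li> 
   <li>A database can be added to support user registration and login, which brings MySQL into the project.</li> 
  </ul> 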
 </div> 
</div>
                            </div>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
    <div class="container">
        <div id="paradigm-article-related">
            <div class="recommend-post mb30">
                <ul class="widget-links">
                                <li><a href="/article/2251.htm"
                                       title="[基因与生物]远古生物的基因可以嫁接到现代生物基因组中吗?" target="_blank">[基因与生物]远古生物的基因可以嫁接到现代生物基因组中吗?</a>
                                    <span class="text-muted">comsci</span>
<a class="tag" taget="_blank" href="/search/%E7%94%9F%E7%89%A9/1.htm">生物</a>
                                    <div> 
 
      大家仅仅把我说的事情当作一个IT行业的笑话来听吧..没有其它更多的意思 
 
 
    如果我们把大自然看成是一位伟大的程序员,专门为地球上的生态系统编制基因代码,并创造出各种不同的生物来,那么6500万年前的程序员开发的代码,是否兼容现代派的程序员的代码和架构呢? 
 
  </div>
                                </li>
                                <li><a href="/article/2378.htm"
                                       title="oracle 外部表" target="_blank">oracle 外部表</a>
                                    <span class="text-muted">daizj</span>
<a class="tag" taget="_blank" href="/search/oracle/1.htm">oracle</a><a class="tag" taget="_blank" href="/search/%E5%A4%96%E9%83%A8%E8%A1%A8/1.htm">外部表</a><a class="tag" taget="_blank" href="/search/external+tables/1.htm">external tables</a>
                                    <div>    oracle外部表是只允许只读访问,不能进行DML操作,不能创建索引,可以对外部表进行的查询,连接,排序,创建视图和创建同义词操作。 
you can select, join, or sort external table data. You can also create views and synonyms for external tables. Ho</div>
                                </li>
                                <li><a href="/article/2505.htm"
                                       title="aop相关的概念及配置" target="_blank">aop相关的概念及配置</a>
                                    <span class="text-muted">daysinsun</span>
<a class="tag" taget="_blank" href="/search/AOP/1.htm">AOP</a>
                                    <div>切面(Aspect): 
通常在目标方法执行前后需要执行的方法(如事务、日志、权限),这些方法我们封装到一个类里面,这个类就叫切面。 
 
 
连接点(joinpoint) 
spring里面的连接点指需要切入的方法,通常这个joinpoint可以作为一个参数传入到切面的方法里面(非常有用的一个东西)。 
 
 
通知(Advice) 
通知就是切面里面方法的具体实现,分为前置、后置、最终、异常环</div>
                                </li>
                                <li><a href="/article/2632.htm"
                                       title="初一上学期难记忆单词背诵第二课" target="_blank">初一上学期难记忆单词背诵第二课</a>
                                    <span class="text-muted">dcj3sjt126com</span>
<a class="tag" taget="_blank" href="/search/english/1.htm">english</a><a class="tag" taget="_blank" href="/search/word/1.htm">word</a>
                                    <div>middle 中间的,中级的 
well 喔,那么;好吧 
phone 电话,电话机 
policeman 警察 
ask 问 
take 拿到;带到 
address 地址 
glad 高兴的,乐意的 
why 为什么  
China 中国 
family 家庭 
grandmother (外)祖母 
grandfather (外)祖父 
wife 妻子 
husband 丈夫 
da</div>
                                </li>
                                <li><a href="/article/2759.htm"
                                       title="Linux日志分析常用命令" target="_blank">Linux日志分析常用命令</a>
                                    <span class="text-muted">dcj3sjt126com</span>
<a class="tag" taget="_blank" href="/search/linux/1.htm">linux</a><a class="tag" taget="_blank" href="/search/log/1.htm">log</a>
                                    <div>1.查看文件内容 
cat 
-n 显示行号 2.分页显示 
more 
Enter 显示下一行 
空格 显示下一页 
F 显示下一屏 
B 显示上一屏 
less 
/get 查询"get"字符串并高亮显示 3.显示文件尾 
tail 
-f 不退出持续显示 
-n 显示文件最后n行 4.显示头文件 
head 
-n 显示文件开始n行 5.内容排序 
sort 
-n 按照</div>
                                </li>
                                <li><a href="/article/2886.htm"
                                       title="JSONP 原理分析" target="_blank">JSONP 原理分析</a>
                                    <span class="text-muted">fantasy2005</span>
<a class="tag" taget="_blank" href="/search/JavaScript/1.htm">JavaScript</a><a class="tag" taget="_blank" href="/search/jsonp/1.htm">jsonp</a><a class="tag" taget="_blank" href="/search/jsonp+%E8%B7%A8%E5%9F%9F/1.htm">jsonp 跨域</a>
                                    <div>转自 http://www.nowamagic.net/librarys/veda/detail/224 
JavaScript是一种在Web开发中经常使用的前端动态脚本技术。在JavaScript中,有一个很重要的安全性限制,被称为“Same-Origin Policy”(同源策略)。这一策略对于JavaScript代码能够访问的页面内容做了很重要的限制,即JavaScript只能访问与包含它的</div>
                                </li>
                                <li><a href="/article/3013.htm"
                                       title="使用connect by进行级联查询" target="_blank">使用connect by进行级联查询</a>
                                    <span class="text-muted">234390216</span>
<a class="tag" taget="_blank" href="/search/oracle/1.htm">oracle</a><a class="tag" taget="_blank" href="/search/%E6%9F%A5%E8%AF%A2/1.htm">查询</a><a class="tag" taget="_blank" href="/search/%E7%88%B6%E5%AD%90/1.htm">父子</a><a class="tag" taget="_blank" href="/search/Connect+by/1.htm">Connect by</a><a class="tag" taget="_blank" href="/search/%E7%BA%A7%E8%81%94/1.htm">级联</a>
                                    <div>使用connect by进行级联查询 
  
       connect by可以用于级联查询,常用于对具有树状结构的记录查询某一节点的所有子孙节点或所有祖辈节点。 
  
       来看一个示例,现假设我们拥有一个菜单表t_menu,其中只有三个字段:</div>
                                </li>
                                <li><a href="/article/3140.htm"
                                       title="一个不错的能将HTML表格导出为excel,pdf等的jquery插件" target="_blank">一个不错的能将HTML表格导出为excel,pdf等的jquery插件</a>
                                    <span class="text-muted">jackyrong</span>
<a class="tag" taget="_blank" href="/search/jquery%E6%8F%92%E4%BB%B6/1.htm">jquery插件</a>
                                    <div>发现一个老外写的不错的jquery插件,可以实现将HTML 
表格导出为excel,pdf等格式, 
地址在: 
https://github.com/kayalshri/ 
 
下面看个例子,实现导出表格到excel,pdf 
 
 


<html>
			<head>
				<title>Export html table to excel an</div>
                                </li>
                                <li><a href="/article/3267.htm"
                                       title="UI设计中我们为什么需要设计动效" target="_blank">UI设计中我们为什么需要设计动效</a>
                                    <span class="text-muted">lampcy</span>
<a class="tag" taget="_blank" href="/search/UI/1.htm">UI</a><a class="tag" taget="_blank" href="/search/UI%E8%AE%BE%E8%AE%A1/1.htm">UI设计</a>
                                    <div>关于Unity3D中的Shader的知识 
首先先解释下Unity3D的Shader,Unity里面的Shaders是使用一种叫ShaderLab的语言编写的,它同微软的FX文件或者NVIDIA的CgFX有些类似。传统意义上的vertex shader和pixel shader还是使用标准的Cg/HLSL 编程语言编写的。因此Unity文档里面的Shader,都是指用ShaderLab编写的代码,</div>
                                </li>
                                <li><a href="/article/3394.htm"
                                       title="如何禁止页面缓存" target="_blank">如何禁止页面缓存</a>
                                    <span class="text-muted">nannan408</span>
<a class="tag" taget="_blank" href="/search/html/1.htm">html</a><a class="tag" taget="_blank" href="/search/jsp/1.htm">jsp</a><a class="tag" taget="_blank" href="/search/cache/1.htm">cache</a>
                                    <div>禁止页面使用缓存~ 
------------------------------------------------ 
jsp:页面no cache: 
 
response.setHeader("Pragma","No-cache"); 
response.setHeader("Cache-Control","no-cach</div>
                                </li>
                                <li><a href="/article/3521.htm"
                                       title="以代码的方式管理quartz定时任务的暂停、重启、删除、添加等" target="_blank">以代码的方式管理quartz定时任务的暂停、重启、删除、添加等</a>
                                    <span class="text-muted">Everyday都不同</span>
<a class="tag" taget="_blank" href="/search/%E5%AE%9A%E6%97%B6%E4%BB%BB%E5%8A%A1%E7%AE%A1%E7%90%86/1.htm">定时任务管理</a><a class="tag" taget="_blank" href="/search/spring-quartz/1.htm">spring-quartz</a>
                                    <div>      【前言】在项目的管理功能中,对定时任务的管理有时会很常见。因为我们不能指望只在配置文件中配置好定时任务就行了,因为如果要控制定时任务的 “暂停” 呢?暂停之后又要在某个时间点 “重启” 该定时任务呢?或者说直接 “删除” 该定时任务呢?要改变某定时任务的触发时间呢? “添加” 一个定时任务对于系统的使用者而言,是不太现实的,因为一个定时任务的处理逻辑他是不</div>
                                </li>
                                <li><a href="/article/3648.htm"
                                       title="EXT实例" target="_blank">EXT实例</a>
                                    <span class="text-muted">tntxia</span>
<a class="tag" taget="_blank" href="/search/ext/1.htm">ext</a>
                                    <div>  
(1) 增加一个按钮 
  
JSP: 
  
<%@ page language="java" import="java.util.*" pageEncoding="UTF-8"%>
<%
String path = request.getContextPath();
Stri</div>
                                </li>
                                <li><a href="/article/3775.htm"
                                       title="数学学习在计算机研究领域的作用和重要性" target="_blank">数学学习在计算机研究领域的作用和重要性</a>
                                    <span class="text-muted">xjnine</span>
<a class="tag" taget="_blank" href="/search/Math/1.htm">Math</a>
                                    <div>最近一直有师弟师妹和朋友问我数学和研究的关系,研一要去学什么数学课。毕竟在清华,衡量一个研究生最重要的指标之一就是paper,而没有数学,是肯定上不了世界顶级的期刊和会议的,这在计算机学界尤其重要!你会发现,不论哪个领域有价值的东西,都一定离不开数学!在这样一个信息时代,当google已经让世界没有秘密的时候,一种卓越的数学思维,绝对可以成为你的核心竞争力.  无奈本人实在见地</div>
                                </li>
                </ul>
            </div>
        </div>
    </div>

<footer id="footer" class="mb30 mt30">
    <div class="container">
        <div class="copyright">Copyright © 2000-2050 IT知识库 (E-COM-NET.COM). All Rights Reserved.
<!--            <a href="https://beian.miit.gov.cn/" rel="nofollow" target="_blank">京ICP备09083238号</a><br>-->
        </div>
    </div>
</footer>
<!-- 代码高亮 -->
<script type="text/javascript" src="/static/syntaxhighlighter/scripts/shCore.js"></script>
<script type="text/javascript" src="/static/syntaxhighlighter/scripts/shLegacy.js"></script>
<script type="text/javascript" src="/static/syntaxhighlighter/scripts/shAutoloader.js"></script>
<link type="text/css" rel="stylesheet" href="/static/syntaxhighlighter/styles/shCoreDefault.css"/>
<script type="text/javascript" src="/static/syntaxhighlighter/src/my_start_1.js"></script>





</body>

</html>