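Boost Search Engine

Project Background

First, what is a search engine? Simply put, it is something like Baidu, which we use every day: we type in what we want to search for, and Baidu returns related content. What does each returned result usually contain? Let's take a look.

Boost Search Engine (image 1)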

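How a Search Engine Works

Here is a brief outline of how our search engine works.

We send a request to the server, for example a search for the keyword "boost". After receiving the request, the server searches its own resources, builds a response from the results, and sends it back to us.

Boost Search Engine (image 2)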

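The Boost Library

Boost is a well-tested, portable, open-source collection of C++ libraries that serves as a complement to the standard library. It is very powerful, but it has one small shortcoming: its documentation has no search feature. If we want to look up a function on the cplusplus reference site, for example, search is supported there.

image-20230909141645320

The Boost documentation, however, does not support it, and we do not know whether it will in the future.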

image-20230909141829732

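Project Goal

The goal of the project is simple: add a search feature for Boost. As mentioned above, the server needs resources to search, and there are two ways to obtain them:

  • Search web resources hosted elsewhere: this requires a crawler, which raises the technical bar
  • Download Boost and search the resources locally

We take the second approach and download the Boost library.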

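High-Level Flow of the Boost Search Engine

Data Cleaning

After downloading Boost, we process every file with the .html suffix; this is the data-cleaning step. A sample HTML file is shown later. From each file we keep the title, content, and url.

Building the Index

We build an index over the cleaned data so that later lookups are fast. There are many details here, which we cover later.

Handling Requests

We parse the request and use the index to fetch the results. Since there can be many results, we sort them by weight before sending them to the client.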
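The sort-by-weight step can be sketched roughly as below. The Hit struct, the SortByWeight name, and the weight values are hypothetical; how weights are actually computed is not covered at this point, so this only shows the ordering step.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical search hit: a document ID plus a relevance weight.
struct Hit
{
  int doc_id;
  int weight;
};

// Orders hits so that the highest weight comes first.
void SortByWeight(std::vector<Hit> *hits)
{
  assert(hits);
  std::sort(hits->begin(), hits->end(),
            [](const Hit &a, const Hit &b) { return a.weight > b.weight; });
}
```

Calling SortByWeight on the hits for one query, just before building the response, is enough to return the most relevant documents first.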

Front-End Page

The front end renders the returned results, and with that the project is complete.

Boost Search Engine (image 3)

Tech Stack and Environment

Tech Stack

  • Back end: C/C++, C++11, STL, Boost, Jsoncpp, cppjieba, cpp-httplib
  • Front end: HTML5, CSS, JavaScript, jQuery, Ajax

Environment

  • CentOS 7 virtual machine, vim, gcc (g++), Makefile, VS Code

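Understanding Indexes

What is an index? The idea is simple: we give each file a number, and that number identifies exactly one file. Indexes come in two kinds, forward and inverted.

  • Forward index: look up a file by its ID; the result is unique
  • Inverted index: look up file IDs by keyword

This may sound abstract, so let's work through an example with two files.

Boost Search Engine (image 4)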

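Forward Index

We assign a number to each file.

Doc ID    Name          Content
1         Document A    你好,我是大学生 (Hello, I am a college student)
2         Document B    你好,我是社会人 (Hello, I am a working adult)

The forward index is straightforward: given a document ID, we can go straight to the document's content.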

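Inverted Index

We segment every document into words and collect the distinct words. Under each distinct word we list the IDs of the documents that contain it.

Keyword                     Doc IDs
你好 (hello)                 1, 2
我 (I)                       1, 2
是 (am)                      1, 2
大学生 (college student)     1
社会人 (working adult)       2

An inverted index, then, maps a keyword to document IDs.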
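The two tables above can be sketched as plain containers. This is a minimal illustration, not the project's real index classes: Doc, ForwardIndex, InvertedIndex, and BuildInvertedIndex are hypothetical names, and the words are assumed to be segmented already.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical document record for illustration.
struct Doc
{
  int id;
  std::string name;
  std::string content;
};

using ForwardIndex = std::vector<Doc>; // documents stored in ID order
using InvertedIndex = std::unordered_map<std::string, std::vector<int>>; // keyword -> doc IDs

// Builds the inverted index from documents that are already segmented.
// tokens[i] holds the words of forward[i].
InvertedIndex BuildInvertedIndex(const ForwardIndex &forward,
                                 const std::vector<std::vector<std::string>> &tokens)
{
  assert(forward.size() == tokens.size());
  InvertedIndex inverted;
  for (std::size_t i = 0; i < forward.size(); ++i)
  {
    for (const std::string &word : tokens[i])
    {
      std::vector<int> &ids = inverted[word];
      // Record each document ID once per keyword, even if the word repeats.
      if (ids.empty() || ids.back() != forward[i].id)
      {
        ids.push_back(forward[i].id);
      }
    }
  }
  return inverted;
}
```

With the two example documents segmented as hello / I / am / student and hello / I / am / worker (English stand-ins for the Chinese words), the map ends up with five keys, and "hello" maps to IDs 1 and 2.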

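How to Segment Words

Above we said that documents are segmented into words. Why? To make lookups efficient. How do we do it? We could segment by hand, but an existing library already does this well (cppjieba, listed in the tech stack), so we simply use it. If we did it by hand, the result would look like this:

  • 你好,我是大学生: 你好/我/是/大学生

  • 你好,我是社会人: 你好/我/是/社会人

Note that this segmentation is only illustrative; a real segmenter may split differently. There is also a way to improve efficiency: words such as "了", "从", "吗", "the", and "a" usually carry little meaning, so we can skip them during segmentation. Such words are called stop words.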
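Stop-word filtering can be sketched as a pass over the segmented words. RemoveStopWords and the stop-word set passed to it are assumptions for illustration; a real list would normally come from a dictionary file.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Copies every word that is not a stop word into *kept, preserving order.
void RemoveStopWords(const std::vector<std::string> &words,
                     const std::unordered_set<std::string> &stop_words,
                     std::vector<std::string> *kept)
{
  assert(kept);
  for (const std::string &word : words)
  {
    if (stop_words.count(word) == 0)
    {
      kept->push_back(word);
    }
  }
}
```

Filtering before the index is built keeps stop words out of the inverted index entirely, so they cost neither memory nor lookup time.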

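Simulating a Lookup

Let's walk through the lookup flow.

User enters "你好" -> look it up in the inverted index -> get the document IDs (1, 2) -> use the forward index -> fetch the document contents -> summarize each result as title + content (desc) + url -> build the response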
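The flow above can be sketched end to end with two maps. Search, SearchResult, and the desc_len parameter are hypothetical, and the summary here is simply the first desc_len bytes of the content (a byte-based cut, which could split a multi-byte UTF-8 character).

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

struct DocInfo
{
  std::string title;
  std::string content;
  std::string url;
};

// Hypothetical result record: title, short description, url.
struct SearchResult
{
  std::string title;
  std::string desc;
  std::string url;
};

// forward: doc ID -> document; inverted: keyword -> doc IDs.
std::vector<SearchResult> Search(
    const std::string &keyword,
    const std::unordered_map<int, DocInfo> &forward,
    const std::unordered_map<std::string, std::vector<int>> &inverted,
    std::size_t desc_len)
{
  std::vector<SearchResult> results;
  auto it = inverted.find(keyword); // step 1: inverted index lookup
  if (it == inverted.end())
  {
    return results; // unknown keyword, empty result
  }
  for (int id : it->second) // step 2: document IDs -> documents
  {
    auto doc_it = forward.find(id);
    if (doc_it == forward.end())
    {
      continue;
    }
    const DocInfo &doc = doc_it->second;
    // step 3: summarize as title + desc + url
    results.push_back({doc.title, doc.content.substr(0, desc_len), doc.url});
  }
  return results;
}
```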

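Data Cleaning

First we download the Boost library, using the latest version, which is 1.83.0 at the time of writing. We download it to the desktop, transfer it into the CentOS virtual machine with the rz command, and then extract it.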

image-20230909151825742

[qkj@localhost install]$ rz -E 

[qkj@localhost install]$ ll
total 141256
-rw-r--r--. 1 qkj qkj 144645738 Sep  9 00:15 boost_1_83_0.tar.gz
[qkj@localhost install]$ tar xzf boost_1_83_0.tar.gz 
[qkj@localhost install]$ ll
total 141260
drwxr-xr-x. 8 qkj qkj      4096 Aug  8 14:40 boost_1_83_0
-rw-r--r--. 1 qkj qkj 144645738 Sep  9 00:15 boost_1_83_0.tar.gz
[qkj@localhost install]$ 

Now let's look at what is inside.

[qkj@localhost install]$ cd boost_1_83_0/
[qkj@localhost boost_1_83_0]$ ll
total 112
drwxr-xr-x. 139 qkj qkj  8192 Aug  8 14:40 boost
-rw-r--r--.   1 qkj qkj   851 Aug  8 14:02 boost-build.jam
-rw-r--r--.   1 qkj qkj 20245 Aug  8 14:02 boostcpp.jam
-rw-r--r--.   1 qkj qkj   989 Aug  8 14:02 boost.css
-rw-r--r--.   1 qkj qkj  6308 Aug  8 14:02 boost.png
-rw-r--r--.   1 qkj qkj  2486 Aug  8 14:02 bootstrap.bat
-rwxr-xr-x.   1 qkj qkj 10811 Aug  8 14:02 bootstrap.sh
drwxr-xr-x.   7 qkj qkj   196 Aug  8 14:14 doc
-rw-r--r--.   1 qkj qkj   769 Aug  8 14:02 index.htm
-rw-r--r--.   1 qkj qkj  5418 Aug  8 14:40 index.html
-rw-r--r--.   1 qkj qkj   291 Aug  8 14:02 INSTALL
-rw-r--r--.   1 qkj qkj 11947 Aug  8 14:02 Jamroot
drwxr-xr-x. 148 qkj qkj  4096 Aug  8 14:40 libs
-rw-r--r--.   1 qkj qkj  1338 Aug  8 14:02 LICENSE_1_0.txt
drwxr-xr-x.   4 qkj qkj   159 Aug  8 14:02 more
-rw-r--r--.   1 qkj qkj   542 Aug  8 14:02 README.md
-rw-r--r--.   1 qkj qkj  2608 Aug  8 14:02 rst.css
drwxr-xr-x.   2 qkj qkj   171 Aug  8 14:02 status
drwxr-xr-x.  14 qkj qkj   256 Aug  8 14:02 tools
[qkj@localhost boost_1_83_0]$ 

This is everything in the Boost distribution. To keep the project simple, we only use the HTML files under doc/html. Indexing every HTML file in the distribution is left for later.

boost_1_83_0/doc/html
[qkj@localhost doc]$ cd html/
[qkj@localhost html]$ ll
total 2900
-rw-r--r--.  1 qkj qkj   3476 Aug  8 14:24 about.html
drwxr-xr-x.  2 qkj qkj     82 Aug  8 14:25 accumulators
-rw-r--r--.  1 qkj qkj   5858 Aug  8 14:25 accumulators.html
drwxr-xr-x.  2 qkj qkj    168 Aug  8 14:26 align
-rw-r--r--.  1 qkj qkj   4440 Aug  8 14:26 align.html
drwxr-xr-x.  2 qkj qkj     78 Aug  8 14:26 any
-rw-r--r--.  1 qkj qkj   9011 Aug  8 14:26 any.html
drwxr-xr-x.  3 qkj qkj     78 Aug  8 14:26 array
-rw-r--r--.  1 qkj qkj   8377 Aug  8 14:26 array.html
-rw-r--r--.  1 qkj qkj  36597 Aug  8 14:30 array_types.html
-rw-r--r--.  1 qkj qkj 286811 Aug  8 14:29 asio_HTML.manifest
-rw-r--r--.  1 qkj qkj   6685 Aug  8 14:35 Assignable.html
-rw-r--r--.  1 qkj qkj    700 Aug  8 14:02 atomic.html
-rw-r--r--.  1 qkj qkj  20627 Aug  8 14:30 auxiliary.html
drwxr-xr-x.  2 qkj qkj     31 Aug  8 14:02 bbv2
...

Next we copy everything under boost_1_83_0/doc/html into a directory of our own.

[qkj@localhost boost_searcher]$ mkdir data/input -p
[qkj@localhost boost_searcher]$ cp -rf ../../install/boost_1_83_0/doc/html/* data/input/

Let's check.

[qkj@localhost boost_searcher]$ cd data/input/
[qkj@localhost input]$ ll
total 2900
-rw-r--r--.  1 qkj qkj   3476 Sep  9 00:31 about.html
drwxr-xr-x.  2 qkj qkj     82 Sep  9 00:31 accumulators
-rw-r--r--.  1 qkj qkj   5858 Sep  9 00:31 accumulators.html
drwxr-xr-x.  2 qkj qkj    168 Sep  9 00:31 align
-rw-r--r--.  1 qkj qkj   4440 Sep  9 00:31 align.html
drwxr-xr-x.  2 qkj qkj     78 Sep  9 00:31 any
-rw-r--r--.  1 qkj qkj   9011 Sep  9 00:31 any.html
drwxr-xr-x.  3 qkj qkj     78 Sep  9 00:31 array
-rw-r--r--.  1 qkj qkj   8377 Sep  9 00:31 array.html

Now we can start stripping tags. First we create a source file.

[qkj@localhost boost_searcher]$ touch parser.cc

Understanding Tags

Before stripping tags, we need to know what they look like. Open any of the HTML files:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Chapter 45. Boost.YAP</title>
<link rel="stylesheet" href="../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
<link rel="up" href="libraries.html" title="Part I. The Boost C++ Libraries (BoostBook Subset)">
<link rel="prev" href="xpressive/appendices.html" title="Appendices">
<link rel="next" href="boost_yap/manual.html" title="Manual">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../boost.png"></td>
<td align="center"><a href="../../index.html">Home</a></td>
<td align="center"><a href="../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>

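Anything enclosed in <> like this is a tag, and tags generally come in pairs. They are of no value to us here, so we need to strip them. We also store the cleaned data in a directory of its own.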

[qkj@localhost boost_searcher]$ mkdir data/raw_html -p
[qkj@localhost boost_searcher]$ cd data/
[qkj@localhost data]$ ll
total 16
drwxrwxr-x. 58 qkj qkj 12288 Sep  9 00:31 input     // the source HTML files
drwxrwxr-x.  2 qkj qkj     6 Sep  9 00:44 raw_html  // the cleaned output
[qkj@localhost data]$  

Now, how should we store the cleaned documents? Let's see how many source HTML files there are.

[qkj@localhost input]$ ls -Rl | grep -E "*.html" | wc -l
8581
[qkj@localhost input]$

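We could create one output file per source HTML file, but that is a lot of files. Instead we put all cleaned documents into a single file, separating documents with '\3', like this:

XXXXXXXXXXXXXXXXX\3YYYYYYYYYYYYYYYYYYYYY\3ZZZZZZZZZZZZZZZZZZZZZZZZZ\3

Why '\3'? In the ASCII table, control characters are non-printable. The documents we collected (the HTML pages in data/input) consist almost entirely of printable characters and are very unlikely to contain control characters, so the separator will not collide with document content.

However, we do not use the format above as is. Instead we remove every '\n' from each document and use the following format:

title\3content\3url \n title\3content\3url \n title\3content\3url \n ...
This lets us call getline(ifstream, line) to read one whole document at a time: title\3content\3url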
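To make the record format concrete, here is a small round-trip sketch. SerializeDoc and ParseLine are hypothetical helper names, not functions from the project; the line handed to ParseLine is assumed to come from getline, so it no longer carries the trailing '\n'.

```cpp
#include <cassert>
#include <string>

struct DocInfo
{
  std::string title;
  std::string content;
  std::string url;
};

const char kSep = '\3'; // field separator inside one record

// Writes one document as title\3content\3url\n.
std::string SerializeDoc(const DocInfo &doc)
{
  return doc.title + kSep + doc.content + kSep + doc.url + '\n';
}

// Splits one line (without the trailing '\n') back into its three fields.
bool ParseLine(const std::string &line, DocInfo *doc)
{
  assert(doc);
  std::size_t first = line.find(kSep);
  if (first == std::string::npos)
  {
    return false;
  }
  std::size_t second = line.find(kSep, first + 1);
  if (second == std::string::npos)
  {
    return false;
  }
  doc->title = line.substr(0, first);
  doc->content = line.substr(first + 1, second - first - 1);
  doc->url = line.substr(second + 1);
  return true;
}
```

Because '\3' and '\n' never appear inside the fields, two find calls are enough to split a record.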

We create a file to hold the tag-stripped content.

drwxrwxr-x. 58 qkj qkj 12288 Sep  9 01:03 input
drwxrwxr-x.  2 qkj qkj     6 Sep  9 01:03 raw_html
[qkj@localhost data]$ 
[qkj@localhost data]$ cd raw_html/
[qkj@localhost raw_html]$ touch raw.txt
[qkj@localhost raw_html]$ ll
total 0
-rw-rw-r--. 1 qkj qkj 0 Sep  9 02:32 raw.txt

Skeleton of the Cleaning Program

Now we write a simple skeleton for parser.cc.

#include <cassert>
#include <iostream>
#include <string>
#include <vector>
// Directory that holds all the source HTML pages
const std::string src_path = "data/input";

// Text file that stores the cleaned data of every page
const std::string output = "data/raw_html/raw.txt";

// Parsed form of one page
typedef struct DocInfo
{
  std::string title;   // document title
  std::string content; // document content
  std::string url;     // url of the document on the official site
} DocInfo_t;

static bool EnumFile(const std::string &src_path, std::vector<std::string> *file_list);
static bool ParseHtml(const std::vector<std::string> &file_list, std::vector<DocInfo_t> *results);
static bool SaveHtml(const std::vector<DocInfo_t> &results, const std::string &output);

int main(void)
{
  // Paths of all the HTML files
  std::vector<std::string> file_list;

  // Step 1: EnumFile lists every file name (with path), HTML pages only, so the files can be read one by one later
  if (false == EnumFile(src_path, &file_list))
  {
    std::cerr << "failed to enumerate files" << std::endl;
    return 1;
  }

  // Step 2: read each file and parse it into a DocInfo_t
  std::vector<DocInfo_t> results;
  if (false == ParseHtml(file_list, &results))
  {
    std::cerr << "failed to parse files" << std::endl;
    return 2;
  }

  // Step 3: write the parsed documents to output, with '\3' between fields and '\n' between documents
  if (false == SaveHtml(results, output))
  {
    std::cerr << "failed to save file" << std::endl;
    return 3;
  }
  return 0;
}

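The basic idea is as follows.

  • Collect the names of all source HTML files and store them in an array
  • Walk the array, strip the tags from each file, and pack what remains into a DocInfo_t struct holding title, content, and url; the structs go into another array
  • Walk the array of structs and write the contents to the output file in the agreed format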

Installing Boost

Before implementing the functions above, we need to install the Boost development package, because we will use its functions.

[qkj@localhost BoostSearchEngine]$ sudo yum install -y boost-devel
[sudo] password for qkj: 

Here is a quick look at the Boost documentation.

Boost Search Engine (image 5)

We will use the file-system functions; let's look at them.

image-20230909162857940

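Implementing EnumFile

Now we implement EnumFile. It collects the names of all files with the .html suffix under the given src_path directory and stores them in the file_list array.

static bool EnumFile(const std::string &src_path, std::vector<std::string> *file_list)

The implementation is as follows.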

// Requires <cassert> and <boost/filesystem.hpp>.
static bool EnumFile(const std::string &src_path, std::vector<std::string> *file_list)
{
  assert(file_list);
  namespace fs = boost::filesystem; // namespace alias, a common convention
  fs::path root_path(src_path);     // path object for the root directory

  if (fs::exists(root_path) == false) // check that the path exists
  {
    std::cerr << src_path << " does not exist" << std::endl;
    return false;
  }

  // A default-constructed iterator marks the end of the recursive traversal
  fs::recursive_directory_iterator end;
  for (fs::recursive_directory_iterator iter(root_path); iter != end; ++iter)
  {
    // Skip anything that is not a regular file (directories and so on)
    if (fs::is_regular_file(*iter) == false)
    {
      continue;
    }

    // Regular files must have the .html extension
    if (iter->path().extension() != ".html")
    {
      continue;
    }

    std::cout << "debug: " << iter->path().string() << std::endl;

    // At this point the entry is a regular file with the .html extension
    file_list->push_back(iter->path().string());
  }

  return true;
}

Let's test it. First, a Makefile.

cc=g++
parser:parser.cc 
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem
.PHONY:clean
clean:
	rm parser

Run it, and it works.

[qkj@localhost BoostSearchEngine]$ make
g++ -o parser parser.cc -std=c++11 -lboost_system -lboost_filesystem
[qkj@localhost BoostSearchEngine]$ ll
total 104
drwxrwxr-x. 4 qkj qkj    35 Sep  9 01:03 data
-rw-rw-r--. 1 qkj qkj   117 Sep  9 01:41 Makefile
-rwxrwxr-x. 1 qkj qkj 89152 Sep  9 01:43 parser
-rw-rw-r--. 1 qkj qkj  8398 Sep  9 01:43 parser.cc
[qkj@localhost BoostSearchEngine]$ ./parser 
debug: data/input/about.html
debug: data/input/accumulators/user_s_guide.html
debug: data/input/accumulators/acknowledgements.html
debug: data/input/accumulators/reference.html
debug: data/input/accumulators.html
...

Implementing ParseHtml

Now we parse each HTML file.

static bool ParseHtml(const std::vector<std::string> &file_list, std::vector<DocInfo_t> *results)

Here is the skeleton.

static bool ParseTitle(const std::string &file, std::string *title);
static bool ParseContent(const std::string &file, std::string *content);
static bool ParseUrl(const std::string &file_path, std::string *url);

static bool ParseHtml(const std::vector<std::string> &file_list, std::vector<DocInfo_t> *results)
{
  assert(results);
  for (auto &file_path : file_list)
  {
    // 1. Read the file
    std::string result;
    if (false == ns_util::FileUtil::ReadFile(file_path, &result))
    {
      continue;
    }

    DocInfo_t doc;
    // 2. Extract the title
    if (false == ParseTitle(result, &doc.title))
    {
      continue;
    }
    // 3. Extract the content, which in essence means stripping the tags
    if (false == ParseContent(result, &doc.content))
    {
      continue;
    }
    // 4. Build the url
    if (false == ParseUrl(file_path, &doc.url))
    {
      continue;
    }
    // Parsing of this file is complete at this point
    results->push_back(std::move(doc)); // move instead of copying the strings
  }
  return true;
}

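The flow is as follows.

  • Read each file into a string
  • Extract the title from the string
  • Extract the content from the string
  • Build the url from the file path

Next we implement each of these functions.

Reading File Content

We put this function in a utility collection, since it may be needed again later.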

#pragma once
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>
// Utility collection
namespace ns_util
{
  /// @brief File helpers
  class FileUtil
  {
  public:
    /// @brief Read the content of a file into out
    /// @param file_path path of the file to read
    /// @param out receives the file content
    /// @return true on success, false if the file cannot be opened
    static bool ReadFile(const std::string &file_path, std::string *out)
    {
      assert(out);
      std::ifstream in(file_path, std::ios::in);
      if (in.is_open() == false)
      {
        std::cerr << file_path << " failed to open" << std::endl;
        return false;
      }

      std::string line;
      // Note: getline does not store the trailing '\n', so lines are joined without a separator
      while (std::getline(in, line))
      {
        *out += line;
      }

      in.close();
      return true;
    }
  };
}
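To see the effect of the getline loop without touching the disk, the same loop can be run over any std::istream. ReadAll is a hypothetical helper written only for this illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Same loop as ReadFile, but reading from any std::istream.
bool ReadAll(std::istream &in, std::string *out)
{
  assert(out);
  std::string line;
  while (std::getline(in, line))
  {
    *out += line; // the '\n' that ended the line is not part of `line`
  }
  return true;
}
```

Feeding it a std::istringstream holding three short lines of HTML returns them joined into one string with no line breaks, which is exactly what the one-record-per-line output format relies on.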

Extracting the Title

Looking at one of our HTML files again, the title sits inside a <title> tag.

image-20230909165910185

We extract the title from the string as follows.

static bool ParseTitle(const std::string &file, std::string *title)
{
  assert(title);
  // Locate the opening tag
  std::size_t begin = file.find("<title>");
  if (begin == std::string::npos)
  {
    return false;
  }

  // Locate the closing tag
  std::size_t end = file.find("</title>");
  if (end == std::string::npos)
  {
    return false;
  }

  // Move begin past the opening tag so that it points at the title text
  begin += std::string("<title>").size();
  if (begin > end)
  {
    return false;
  }
  *title = file.substr(begin, end - begin);
  return true;
}
Extracting the Content

Extracting the content does not mean copying everything; the tags have to be stripped, and for that we use a state machine.

Tags are delimited by < and >, so a small state machine is enough. We assume that the first character is <.

static bool ParseContent(const std::string &file, std::string *content)
{
  assert(content);
  // This is the core of tag stripping.
  // We use a simple state machine.
  enum status
  {
    LABLE,
    CONTENT
  };

  enum status s = LABLE; // assume the first character is '<'

  for (char ch : file) // taken by value on purpose: ch is modified below, and file is const
  {
    switch (s)
    {
    case LABLE:
      if (ch == '>')
      {
        // The current tag has been fully consumed
        s = CONTENT;
      }
      break;

    case CONTENT:
      if (ch == '<')
      {
        // A new tag starts; tags may follow each other directly, as in <a><b>
        s = LABLE;
      }
      else
      {
        // One detail: we do not keep '\n' in the content,
        // because '\n' is reserved as the separator between documents.
        // There should be no '\n' here anyway, since the file was read with getline,
        // but we should not rely on the caller for that.
        if (ch == '\n')
        {
          ch = ' ';
        }
        content->push_back(ch);
      }
      break;

    default:
      break;
    }
  }
  return true;
}
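One side effect of this state machine is that text from adjacent elements is joined directly, so two neighboring links labeled Home and Libraries come out as "HomeLibraries". If that matters for segmentation, a possible variation (not the author's code; StripTagsWithSpaces is a hypothetical name) emits a single space wherever a tag interrupts the text:

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of the tag stripper: same two states, but a single
// space is emitted where a tag interrupts the text.
bool StripTagsWithSpaces(const std::string &html, std::string *text)
{
  assert(text);
  enum State
  {
    IN_TAG,
    IN_TEXT
  };
  State s = IN_TAG; // like the original, assume the input starts with '<'

  for (char ch : html)
  {
    if (s == IN_TAG)
    {
      if (ch == '>')
      {
        s = IN_TEXT;
      }
      continue;
    }
    if (ch == '<')
    {
      s = IN_TAG;
      // Separate the text before the tag from the text after it.
      if (!text->empty() && text->back() != ' ')
      {
        text->push_back(' ');
      }
      continue;
    }
    text->push_back(ch == '\n' ? ' ' : ch);
  }
  return true;
}
```

The trade-off is a slightly larger output file and a possible trailing space at the end of the content, in exchange for word boundaries that survive the cleaning step.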
Extracting the URL

One thing needs explaining here: the url has to be assembled. Let's see how the official url relates to the local path.

Official url: https://www.boost.org/doc/libs/1_83_0/doc/html/accumulators.html
Local path:   data/input/accumulators.html                   // because we copied the contents of doc/html/ into data/input

// Assemble the url
url_head = "https://www.boost.org/doc/libs/1_83_0/doc/html";
url_tail = [data/input](removed) /accumulators.html
         => url_tail = /accumulators.html

url = url_head + url_tail ; this yields a link to the official site
Here is the code.

static bool ParseUrl(const std::string &file_path, std::string *url)
{
  assert(url);

  // Guard: file_path is expected to start with src_path (substr would throw on a shorter string)
  if (file_path.compare(0, src_path.size(), src_path) != 0)
  {
    return false;
  }

  //  url_head = "https://www.boost.org/doc/libs/1_83_0/doc/html"
  //  url_tail = "/accumulators.html"
  std::string url_head = "https://www.boost.org/doc/libs/1_83_0/doc/html";
  std::string url_tail = file_path.substr(src_path.size());
  *url = url_head + url_tail;

  return true;
}
Let's verify this with a small helper function.

void ShowDoc(const DocInfo_t &doc)
{
  std::cout << "title: " << doc.title << std::endl;
  std::cout << "content: " << doc.content << std::endl;
  std::cout << "url: " << doc.url << std::endl;
}
static bool ParseHtml(const std::vector<std::string> &file_list, std::vector<DocInfo_t> *results)
{
  assert(results);
  for (auto &file_path : file_list)
  {
    // 1. Read the file
    std::string result;
    if (false == ns_util::FileUtil::ReadFile(file_path, &result))
    {
      continue;
    }

    DocInfo_t doc;
    // 2. Extract the title
    if (false == ParseTitle(result, &doc.title))
    {
      continue;
    }
    // 3. Extract the content, which in essence means stripping the tags
    if (false == ParseContent(result, &doc.content))
    {
      continue;
    }
    // 4. Build the url
    if (false == ParseUrl(file_path, &doc.url))
    {
      continue;
    }
    // for debug
    ShowDoc(doc);
    // break;
    // Parsing of this file is complete at this point
    results->push_back(std::move(doc)); // move instead of copying the strings
  }

  return true;
}
This is the test output.

title: Struct template result<This(InputIterator, InputIterator)>
content: Struct template result<This(InputIterator, InputIterator)>HomeLibrariesPeopleFAQMoreStruct template result<This(InputIterator, InputIterator)>boost::proto::functional::distance::result<This(InputIterator, InputIterator)>Synopsis// In header: <boost/proto/functional/std/iterator.hpp>template<typename This, typename InputIterator> struct result<This(InputIterator, InputIterator)> {  // types  typedef typename std::iterator_traits<      typename boost::remove_const<        typename boost::remove_reference<InputIterator>::type      >::type    >::difference_type type;};Copyright © 2008 Eric Niebler        Distributed under the Boost Software License, Version 1.0. (See accompanying        file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
url: https://www.boost.org/doc/libs/1_83_0/doc/html/boost/proto/functional/distance/resu_1_3_32_5_26_2_1_1_2_4.html
Opening this url on the official site confirms that it points to the right page.

Boost Search Engine (image 6)
Implementing SaveHtml

static bool SaveHtml(const std::vector<DocInfo_t> &results, const std::string &output);

We now have a struct for every file, so the next step is to write them to the output file.
  <pre><code class="prism language-cpp"><span class="token keyword">static</span> <span class="token keyword">bool</span> <span class="token function">SaveHtml</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>DocInfo_t<span class="token operator">></span> <span class="token operator">&</span>results<span class="token punctuation">,</span> <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>output<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">SEP</span> <span class="token string">"\3"</span></span>
  <span class="token comment">// Each document is written as one record. '\n' was already stripped from the content,</span>
  <span class="token comment">// so '\n' can safely separate records: title\3content\3url\n title\3content\3url\n ...</span>

  <span class="token comment">// explicit basic_ofstream (const char* filename,</span>
  <span class="token comment">//                       ios_base::openmode mode = ios_base::out);</span>
  std<span class="token double-colon punctuation">::</span>ofstream <span class="token function">out</span><span class="token punctuation">(</span>output<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>out <span class="token operator">|</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>binary<span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token keyword">if</span> <span class="token punctuation">(</span>out<span class="token punctuation">.</span><span class="token function">is_open</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"failed to open file: "</span> <span class="token operator"><<</span> output <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>e <span class="token operator">:</span> results<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>string str <span class="token operator">=</span> e<span class="token punctuation">.</span>title<span class="token punctuation">;</span>
    str <span class="token operator">+=</span> SEP<span class="token punctuation">;</span>

    str <span class="token operator">+=</span> e<span class="token punctuation">.</span>content<span class="token punctuation">;</span>
    str <span class="token operator">+=</span> SEP<span class="token punctuation">;</span>

    str <span class="token operator">+=</span> e<span class="token punctuation">.</span>url<span class="token punctuation">;</span>
    str <span class="token operator">+=</span> <span class="token string">"\n"</span><span class="token punctuation">;</span>
    out<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>str<span class="token punctuation">.</span><span class="token function">c_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> str<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  out<span class="token punctuation">.</span><span class="token function">close</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Check that the output file was actually written.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/539b06fcc75847068f3982455af8a815.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/539b06fcc75847068f3982455af8a815.jpg" alt="Boost搜索引擎_第7张图片" width="650" height="106" style="border:1px solid black;"></a></p> 
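  <p>Next, check that every document was saved: the number of HTML files under the input directory should equal the number of lines in raw.txt.</p> 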
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">ls</span> ./data/input/ <span class="token parameter variable">-Rl</span> <span class="token operator">|</span> <span class="token function">grep</span> <span class="token parameter variable">-E</span> <span class="token string">"*.html"</span> <span class="token operator">|</span> <span class="token function">wc</span> <span class="token parameter variable">-l</span>
<span class="token number">8581</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">cat</span> ./data/raw_html/raw.txt <span class="token operator">|</span> <span class="token function">wc</span> <span class="token parameter variable">-l</span>
<span class="token number">8581</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
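  <p>The record layout written by <code>SaveHtml</code> (one <code>title\3content\3url</code> record per line) can be read back with a simple split on the <code>\3</code> separator. The following is only a minimal sketch under that assumption: the names <code>Record</code> and <code>ParseRecord</code> are hypothetical and do not appear in the project, and the line is expected to have its trailing <code>\n</code> already removed (for example by <code>std::getline</code>).</p> 

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the parser's DocInfo_t: one cleaned document.
struct Record
{
  std::string title;
  std::string content;
  std::string url;
};

// Split one line of the raw file (without the trailing '\n') on '\3'.
// Returns false unless the line contains exactly three fields.
static bool ParseRecord(const std::string &line, Record *out)
{
  assert(out != nullptr);
  const char sep = '\3';

  std::vector<std::string> fields;
  std::string::size_type start = 0;
  while (true)
  {
    std::string::size_type pos = line.find(sep, start);
    if (pos == std::string::npos)
    {
      // Last field: everything after the final separator.
      fields.push_back(line.substr(start));
      break;
    }
    fields.push_back(line.substr(start, pos - start));
    start = pos + 1;
  }

  if (fields.size() != 3)
    return false;

  out->title = fields[0];
  out->content = fields[1];
  out->url = fields[2];
  return true;
}
```

  <p>Calling <code>ParseRecord</code> on each line read with <code>std::getline</code> yields the three fields again. A line with a missing or extra separator is rejected rather than silently misparsed.</p> 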
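  <h1>Building the Index</h1> 
  <p>Now we build the index. Building an index really means constructing a data structure for storage and search, one that speeds up the lookup path keyword -&gt; document ID -&gt; document content. As discussed above, we build both a forward index and an inverted index.</p> 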
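  <p>To make the keyword -&gt; document ID -&gt; document content path concrete, here is a minimal, self-contained sketch of the two structures. It is not the project's implementation: the class name <code>MiniIndex</code> and its methods are hypothetical, documents are reduced to a single string, and whitespace splitting stands in for real word segmentation.</p> 

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical toy index: forward index as a vector, inverted index as a hash map.
class MiniIndex
{
public:
  // Store a document; its position in the vector becomes its doc ID.
  uint64_t Add(const std::string &content)
  {
    const uint64_t doc_id = forward_.size();
    forward_.push_back(content);

    // Assumption: whitespace splitting replaces a real segmenter.
    // Each distinct word records this doc ID once.
    std::unordered_set<std::string> seen;
    std::istringstream iss(content);
    std::string word;
    while (iss >> word)
    {
      if (seen.insert(word).second)
        inverted_[word].push_back(doc_id);
    }
    return doc_id;
  }

  // Forward lookup: doc ID -> content, or nullptr if the ID is out of range.
  const std::string *GetForward(uint64_t doc_id) const
  {
    if (doc_id >= forward_.size())
      return nullptr;
    return &forward_[doc_id];
  }

  // Inverted lookup: keyword -> list of doc IDs, or nullptr if the word is unknown.
  const std::vector<uint64_t> *GetInverted(const std::string &word) const
  {
    auto it = inverted_.find(word);
    if (it == inverted_.end())
      return nullptr;
    return &it->second;
  }

private:
  std::vector<std::string> forward_;
  std::unordered_map<std::string, std::vector<uint64_t>> inverted_;
};
```

  <p>A search then runs in two steps: <code>GetInverted</code> returns the IDs of the documents containing the keyword, and <code>GetForward</code> turns each ID into content. Using the vector position as the ID keeps the forward lookup a plain array access.</p> 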
  <h2>Installing and using jieba</h2> 
  <p>For word segmentation we use the cppjieba tool. Cloning it with the command below is enough.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">git</span> clone https://github.com/yanyiwu/cppjieba.git
</code></pre> 
  <p>Let's look at what the cppjieba repository contains.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ tree cppjieba/
cppjieba/
├── ChangeLog.md
├── CMakeLists.txt
├── deps
│   ├── CMakeLists.txt
│   ├── gtest
│   │   ├── CMakeLists.txt
│   │   ├── include
│   │   │   └── gtest
│   │   │       ├── gtest-death-test.h
│   │   │       ├── gtest.h
│   │   │       ├── gtest-message.h
│   │   │       ├── gtest-param-test.h
│   │   │       ├── gtest-param-test.h.pump
│   │   │       ├── gtest_pred_impl.h
│   │   │       ├── gtest-printers.h
│   │   │       ├── gtest_prod.h
│   │   │       ├── gtest-spi.h
│   │   │       ├── gtest-test-part.h
│   │   │       ├── gtest-typed-test.h
│   │   │       └── internal
│   │   │           ├── gtest-death-test-internal.h
│   │   │           ├── gtest-filepath.h
│   │   │           ├── gtest-internal.h
│   │   │           ├── gtest-linked_ptr.h
│   │   │           ├── gtest-param-util-generated.h
│   │   │           ├── gtest-param-util-generated.h.pump
│   │   │           ├── gtest-param-util.h
│   │   │           ├── gtest-port.h
│   │   │           ├── gtest-string.h
│   │   │           ├── gtest-tuple.h
│   │   │           ├── gtest-tuple.h.pump
│   │   │           ├── gtest-type-util.h
│   │   │           └── gtest-type-util.h.pump
│   │   └── src
│   │       ├── gtest-all.cc
│   │       ├── gtest.cc
│   │       ├── gtest-death-test.cc
│   │       ├── gtest-filepath.cc
│   │       ├── gtest-internal-inl.h
│   │       ├── gtest_main.cc
│   │       ├── gtest-port.cc
│   │       ├── gtest-printers.cc
│   │       ├── gtest-test-part.cc
│   │       └── gtest-typed-test.cc
│   └── limonp
├── dict
│   ├── hmm_model.utf8
│   ├── idf.utf8
│   ├── jieba.dict.utf8
│   ├── pos_dict
│   │   ├── char_state_tab.utf8
│   │   ├── prob_emit.utf8
│   │   ├── prob_start.utf8
│   │   └── prob_trans.utf8
│   ├── README.md
│   ├── stop_words.utf8
│   └── user.dict.utf8
├── include
│   └── cppjieba
│       ├── DictTrie.hpp
│       ├── FullSegment.hpp
│       ├── HMMModel.hpp
│       ├── HMMSegment.hpp
│       ├── Jieba.hpp
│       ├── KeywordExtractor.hpp
│       ├── MixSegment.hpp
│       ├── MPSegment.hpp
│       ├── PosTagger.hpp
│       ├── PreFilter.hpp
│       ├── QuerySegment.hpp
│       ├── SegmentBase.hpp
│       ├── SegmentTagged.hpp
│       ├── TextRankExtractor.hpp
│       ├── Trie.hpp
│       └── Unicode.hpp
├── LICENSE
├── README_EN.md
├── README.md
└── <span class="token builtin class-name">test</span>
    ├── CMakeLists.txt
    ├── demo.cpp
    ├── load_test.cpp
    ├── testdata
    │   ├── curl.res
    │   ├── extra_dict
    │   │   └── jieba.dict.small.utf8
    │   ├── gbk_dict
    │   │   ├── hmm_model.gbk
    │   │   └── jieba.dict.gbk
    │   ├── jieba.dict.0.1.utf8
    │   ├── jieba.dict.0.utf8
    │   ├── jieba.dict.1.utf8
    │   ├── jieba.dict.2.utf8
    │   ├── load_test.urls
    │   ├── review.100
    │   ├── review.100.res
    │   ├── server.conf
    │   ├── testlines.gbk
    │   ├── testlines.utf8
    │   ├── userdict.2.utf8
    │   ├── userdict.english
    │   ├── userdict.utf8
    │   └── weicheng.utf8
    └── unittest
        ├── CMakeLists.txt
        ├── gtest_main.cpp
        ├── jieba_test.cpp
        ├── keyword_extractor_test.cpp
        ├── pos_tagger_test.cpp
        ├── pre_filter_test.cpp
        ├── segments_test.cpp
        ├── textrank_test.cpp
        ├── trie_test.cpp
        └── unicode_test.cpp

<span class="token number">16</span> directories, <span class="token number">98</span> files
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>Only two directories matter to us here.</p> 
  <ul> 
   <li>cppjieba/include : the header files</li> 
   <li>cppjieba/dict : the dictionaries</li> 
  </ul> 
  <blockquote> 
   <p>Now let's try out cppjieba. The repository ships a demo.cpp file for testing; we copy it to a convenient location.</p> 
  </blockquote> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost test<span class="token punctuation">]</span>$ <span class="token builtin class-name">pwd</span>
/home/qkj/install/cppjieba/test
<span class="token punctuation">[</span>qkj@localhost test<span class="token punctuation">]</span>$ ll
total <span class="token number">16</span>
-rw-rw-r--. <span class="token number">1</span> qkj qkj  <span class="token number">148</span> Sep  <span class="token number">9</span> 03:38 CMakeLists.txt
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">2797</span> Sep  <span class="token number">9</span> 03:38 demo.cpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">1532</span> Sep  <span class="token number">9</span> 03:38 load_test.cpp
drwxrwxr-x. <span class="token number">4</span> qkj qkj <span class="token number">4096</span> Sep  <span class="token number">9</span> 03:38 testdata
drwxrwxr-x. <span class="token number">2</span> qkj qkj  <span class="token number">255</span> Sep  <span class="token number">9</span> 03:38 unittest
<span class="token punctuation">[</span>qkj@localhost test<span class="token punctuation">]</span>$ <span class="token function">cp</span> demo.cpp <span class="token punctuation">..</span>/<span class="token punctuation">..</span>
<span class="token punctuation">[</span>qkj@localhost test<span class="token punctuation">]</span>$ <span class="token builtin class-name">cd</span> <span class="token punctuation">..</span>/<span class="token punctuation">..</span>/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ll
total <span class="token number">8</span>
drwxr-xr-x. <span class="token number">8</span> qkj qkj <span class="token number">4096</span> Aug  <span class="token number">8</span> <span class="token number">14</span>:40 boost_1_83_0
drwxrwxr-x. <span class="token number">8</span> qkj qkj  <span class="token number">215</span> Sep  <span class="token number">9</span> 03:38 cppjieba
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">2797</span> Sep  <span class="token number">9</span> 03:49 demo.cpp
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>We cannot compile it directly yet; the build fails.</p> 
  <pre><code class="prism language-cpp"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ g<span class="token operator">++</span> demo<span class="token punctuation">.</span>cpp 
demo<span class="token punctuation">.</span>cpp<span class="token operator">:</span><span class="token number">1</span><span class="token operator">:</span><span class="token number">10</span><span class="token operator">:</span> fatal error<span class="token operator">:</span> cppjieba<span class="token operator">/</span>Jieba<span class="token punctuation">.</span>hpp<span class="token operator">:</span> No such file <span class="token operator">or</span> directory
 <span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"cppjieba/Jieba.hpp"</span></span>
          <span class="token operator">^</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span><span class="token operator">~</span>
compilation terminated<span class="token punctuation">.</span>
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>The header and dictionary paths do not resolve from this directory, so we add symbolic links.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span>  cppjieba/include/ inc
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span>  cppjieba/dict/ dict
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ll
total <span class="token number">8</span>
drwxr-xr-x. <span class="token number">8</span> qkj qkj <span class="token number">4096</span> Aug  <span class="token number">8</span> <span class="token number">14</span>:40 boost_1_83_0
drwxrwxr-x. <span class="token number">8</span> qkj qkj  <span class="token number">215</span> Sep  <span class="token number">9</span> 03:38 cppjieba
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">2797</span> Sep  <span class="token number">9</span> 03:49 demo.cpp
lrwxrwxrwx. <span class="token number">1</span> qkj qkj   <span class="token number">14</span> Sep  <span class="token number">9</span> 03:50 dict -<span class="token operator">></span> cppjieba/dict/
lrwxrwxrwx. <span class="token number">1</span> qkj qkj   <span class="token number">17</span> Sep  <span class="token number">9</span> 03:50 inc -<span class="token operator">></span> cppjieba/include/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">cp</span> <span class="token parameter variable">-rf</span> cppjieba/deps/limonp/ cppjieba/include/cppjieba/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>Next we need to modify demo.cpp.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/ac95d7d621a54572883c91ab6a6b566a.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/ac95d7d621a54572883c91ab6a6b566a.jpg" alt="Boost搜索引擎_第8张图片" width="650" height="229" style="border:1px solid black;"></a></p> 
  <p>Compiling again still produces an error.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ g++ demo.cpp 
In <span class="token function">file</span> included from inc/cppjieba/Jieba.hpp:4,
                 from demo.cpp:1:
inc/cppjieba/QuerySegment.hpp:7:10: fatal error: limonp/Logging.hpp: No such <span class="token function">file</span> or directory
 <span class="token comment">#include "limonp/Logging.hpp"</span>
          ^~~~~~~~~~~~~~~~~~~~
compilation terminated.
</code></pre> 
  <p>This is because cppjieba/deps/limonp is actually an empty directory, so the copy we made from it is empty too.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token builtin class-name">cd</span>  cppjieba/include/cppjieba/limonp/
<span class="token punctuation">[</span>qkj@localhost limonp<span class="token punctuation">]</span>$ ll
total <span class="token number">0</span>
<span class="token punctuation">[</span>qkj@localhost limonp<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>We have to download its contents manually.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">git</span> clone https://github.com/yanyiwu/limonp.git
</code></pre> 
  <p>Then copy the downloaded directory into cppjieba/deps/limonp, and copy that again into cppjieba/include/cppjieba/.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">cp</span> <span class="token parameter variable">-rf</span> limonp/include/limonp/ cppjieba/deps/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">cp</span> <span class="token parameter variable">-rf</span> cppjieba/deps/limonp/ cppjieba/include/cppjieba/
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>That is all that's needed. Now compile again.</p> 
  <pre><code class="prism language-cpp"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ g<span class="token operator">++</span> demo<span class="token punctuation">.</span>cpp <span class="token operator">-</span>std<span class="token operator">=</span>c<span class="token operator">++</span><span class="token number">11</span>
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ll
total <span class="token number">480</span>
<span class="token operator">-</span>rwxrwxr<span class="token operator">-</span>x<span class="token punctuation">.</span> <span class="token number">1</span> qkj qkj <span class="token number">482896</span> Sep  <span class="token number">9</span> <span class="token number">05</span><span class="token operator">:</span><span class="token number">50</span> a<span class="token punctuation">.</span>out
drwxr<span class="token operator">-</span>xr<span class="token operator">-</span>x<span class="token punctuation">.</span> <span class="token number">8</span> qkj qkj   <span class="token number">4096</span> Aug  <span class="token number">8</span> <span class="token number">14</span><span class="token operator">:</span><span class="token number">40</span> boost_1_83_0
drwxrwxr<span class="token operator">-</span>x<span class="token punctuation">.</span> <span class="token number">8</span> qkj qkj    <span class="token number">215</span> Sep  <span class="token number">9</span> <span class="token number">03</span><span class="token operator">:</span><span class="token number">38</span> cppjieba
<span class="token operator">-</span>rw<span class="token operator">-</span>rw<span class="token operator">-</span>r<span class="token operator">--</span><span class="token punctuation">.</span> <span class="token number">1</span> qkj qkj   <span class="token number">2852</span> Sep  <span class="token number">9</span> <span class="token number">05</span><span class="token operator">:</span><span class="token number">28</span> demo<span class="token punctuation">.</span>cpp
lrwxrwxrwx<span class="token punctuation">.</span> <span class="token number">1</span> qkj qkj     <span class="token number">14</span> Sep  <span class="token number">9</span> <span class="token number">03</span><span class="token operator">:</span><span class="token number">50</span> dict <span class="token operator">-></span> cppjieba<span class="token operator">/</span>dict<span class="token operator">/</span>
lrwxrwxrwx<span class="token punctuation">.</span> <span class="token number">1</span> qkj qkj     <span class="token number">17</span> Sep  <span class="token number">9</span> <span class="token number">03</span><span class="token operator">:</span><span class="token number">50</span> inc <span class="token operator">-></span> cppjieba<span class="token operator">/</span>include<span class="token operator">/</span>
drwxrwxr<span class="token operator">-</span>x<span class="token punctuation">.</span> <span class="token number">6</span> qkj qkj    <span class="token number">171</span> Sep  <span class="token number">9</span> <span class="token number">05</span><span class="token operator">:</span><span class="token number">46</span> limonp
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token punctuation">.</span><span class="token operator">/</span>a<span class="token punctuation">.</span>out 
他来到了网易杭研大厦
<span class="token punctuation">[</span>demo<span class="token punctuation">]</span> Cut With HMM
他<span class="token operator">/</span>来到<span class="token operator">/</span>了<span class="token operator">/</span>网易<span class="token operator">/</span>杭研<span class="token operator">/</span>大厦
<span class="token punctuation">[</span>demo<span class="token punctuation">]</span> Cut Without HMM 
他<span class="token operator">/</span>来到<span class="token operator">/</span>了<span class="token operator">/</span>网易<span class="token operator">/</span>杭<span class="token operator">/</span>研<span class="token operator">/</span>大厦
我来到北京清华大学
<span class="token punctuation">[</span>demo<span class="token punctuation">]</span> CutAll
我<span class="token operator">/</span>来到<span class="token operator">/</span>北京<span class="token operator">/</span>清华<span class="token operator">/</span>清华大学<span class="token operator">/</span>华大<span class="token operator">/</span>大学
小明硕士毕业于中国科学院计算所,后在日本京都大学深造
<span class="token punctuation">[</span>demo<span class="token punctuation">]</span> CutForSearch
小明<span class="token operator">/</span>硕士<span class="token operator">/</span>毕业<span class="token operator">/</span>于<span class="token operator">/</span>中国<span class="token operator">/</span>科学<span class="token operator">/</span>学院<span class="token operator">/</span>科学院<span class="token operator">/</span>中国科学院<span class="token operator">/</span>计算<span class="token operator">/</span>计算所<span class="token operator">/</span>,<span class="token operator">/</span>后<span class="token operator">/</span>在<span class="token operator">/</span>日本<span class="token operator">/</span>京都<span class="token operator">/</span>大学<span class="token operator">/</span>日本京都大学<span class="token operator">/</span>深造
</code></pre> 
  <h2>Index framework</h2> 
  <p>First we create a new file for the index.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">touch</span> index.hpp
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ ll
total <span class="token number">124</span>
drwxrwxr-x. <span class="token number">4</span> qkj qkj     <span class="token number">35</span> Sep  <span class="token number">9</span> 01:03 data
-rw-rw-r--. <span class="token number">1</span> qkj qkj      <span class="token number">0</span> Sep  <span class="token number">9</span> 02:48 index.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">117</span> Sep  <span class="token number">9</span> 01:41 Makefile
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">110008</span> Sep  <span class="token number">9</span> 02:48 parser
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6361</span> Sep  <span class="token number">9</span> 02:47 parser.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">783</span> Sep  <span class="token number">9</span> 02:48 util.hpp
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
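  <p>To be clear about the goal: we need to build both a forward index and an inverted index, and we also need to provide two lookup interfaces, one for each.</p> 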
  <pre><code class="prism language-cpp"><span class="token keyword">namespace</span> ns_index
<span class="token punctuation">{</span>
  <span class="token keyword">struct</span> <span class="token class-name">DocInfo</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>string title<span class="token punctuation">;</span>   <span class="token comment">// document title</span>
    std<span class="token double-colon punctuation">::</span>string content<span class="token punctuation">;</span> <span class="token comment">// document content</span>
    std<span class="token double-colon punctuation">::</span>string url<span class="token punctuation">;</span>     <span class="token comment">// URL on the official site</span>

    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span> <span class="token comment">// document ID; not explained for now</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token comment">/// @brief Helper element for the inverted index</span>
  <span class="token keyword">struct</span> <span class="token class-name">InvertedElem</span>
  <span class="token punctuation">{</span>
    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span>  <span class="token comment">// document ID</span>
    std<span class="token double-colon punctuation">::</span>string word<span class="token punctuation">;</span> <span class="token comment">// keyword</span>
    <span class="token keyword">int</span> weight<span class="token punctuation">;</span>       <span class="token comment">// weight, explained later</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token comment">// Inverted list (posting list): one keyword maps to a group of InvertedElem</span>
  <span class="token keyword">typedef</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span><span class="token keyword">struct</span> <span class="token class-name">InvertedElem</span><span class="token operator">></span> InvertedList<span class="token punctuation">;</span>

  <span class="token keyword">class</span> <span class="token class-name">Index</span>
  <span class="token punctuation">{</span>
  <span class="token keyword">public</span><span class="token operator">:</span>

    <span class="token comment">/// @brief Look up the forward index by doc_id, i.e. fetch the document content</span>
    <span class="token comment">/// @param doc_id  The document ID</span>
    <span class="token comment">/// @return Address of the document struct</span>
    <span class="token keyword">struct</span> <span class="token class-name">DocInfo</span> <span class="token operator">*</span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/// @brief Get the inverted list for a keyword</span>
    <span class="token comment">/// @param word The keyword</span>
    <span class="token comment">/// @return</span>
    InvertedList <span class="token operator">*</span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>word<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

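    <span class="token comment">/// @brief Build the forward and inverted indexes from the cleaned data file; this is the most important step</span>
    <span class="token comment">/// @param src_path Path of the file holding the tag-stripped documents</span>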
    <span class="token comment">/// @return</span>
    <span class="token keyword">bool</span> <span class="token function">BuildIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src_path<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// Build the forward index</span>
      <span class="token comment">// Build the inverted index</span>
      <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
      
  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">// These two helpers are not exposed outside the class</span>
    <span class="token comment">/// @brief Build a forward index entry from one string, so that a doc ID maps to its content</span>
    <span class="token comment">/// @param line A string holding the full cleaned content of one HTML document</span>
    <span class="token comment">/// @return</span>
    DocInfo <span class="token operator">*</span><span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>line<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/// @brief Build inverted index entries from one document struct; this requires word segmentation</span>
    <span class="token comment">/// @param doc  The document struct</span>
    <span class="token comment">/// @return</span>
    <span class="token keyword">bool</span> <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> DocInfo <span class="token operator">&</span>doc<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">// Forward index: the vector subscript serves as the doc ID, so lookup is a direct array access</span>
    std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span><span class="token keyword">struct</span> <span class="token class-name">DocInfo</span><span class="token operator">></span> forward_index<span class="token punctuation">;</span>
    <span class="token comment">// Inverted index: a keyword may appear in many documents, so each keyword maps to a group of InvertedElem</span>
    std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> InvertedList<span class="token operator">></span> inverted_index<span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Next we implement these functions one by one.</p> 
  <h3>BuildIndex: building the index</h3> 
  <pre><code class="prism language-cpp"><span class="token keyword">bool</span> <span class="token function">BuildIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src_path<span class="token punctuation">)</span><span class="token punctuation">;</span>
</code></pre> 
  <p>This function builds the index from the data we have already cleaned.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">bool</span> <span class="token function">BuildIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src_path<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  std<span class="token double-colon punctuation">::</span>ifstream <span class="token function">in</span><span class="token punctuation">(</span>src_path<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>in <span class="token operator">|</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>binary<span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token keyword">if</span> <span class="token punctuation">(</span>in<span class="token punctuation">.</span><span class="token function">is_open</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"Input file "</span> <span class="token operator"><<</span> src_path <span class="token operator"><<</span> <span class="token string">" is invalid"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">int</span> count <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span> <span class="token comment">// lets us watch the progress of index construction</span>
  std<span class="token double-colon punctuation">::</span>string line<span class="token punctuation">;</span> 
  <span class="token keyword">while</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">getline</span><span class="token punctuation">(</span>in<span class="token punctuation">,</span> line<span class="token punctuation">)</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    <span class="token comment">// At this point line holds the content of one cleaned HTML document</span>
    <span class="token comment">// Build the forward index entry</span>
    DocInfo <span class="token operator">*</span>doc <span class="token operator">=</span> <span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span>line<span class="token punctuation">)</span><span class="token punctuation">;</span> 
    
    <span class="token keyword">if</span> <span class="token punctuation">(</span>doc <span class="token operator">==</span> <span class="token keyword">nullptr</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"Failed to build a forward index entry: "</span> <span class="token operator"><<</span> line <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
      <span class="token keyword">continue</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Build the inverted index</span>
    <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token operator">*</span>doc<span class="token punctuation">)</span><span class="token punctuation">;</span>
    count<span class="token operator">++</span><span class="token punctuation">;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span>count <span class="token operator">%</span> <span class="token number">50</span> <span class="token operator">==</span> <span class="token number">0</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// A progress bar will replace this later</span>
       std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"Documents indexed so far: "</span> <span class="token operator"><<</span> count <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
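  <h4>Building the forward index</h4> 
  <p>This one is really easy to implement. The array subscript is naturally our document ID, so we only need to turn each cleaned document into a struct and append it to the array.</p> 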
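  <pre><code class="prism language-cpp"><span class="token comment">/// @brief Build a forward index entry from a string, so that a document ID leads to the document content</span>
<span class="token comment">/// @param line a string holding the full cleaned content of one HTML document</span>
<span class="token comment">/// @return pointer to the newly stored DocInfo, or nullptr if the line is malformed</span>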
DocInfo <span class="token operator">*</span><span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>line<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// title\3content\3url\n</span>

  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> results<span class="token punctuation">;</span>
  <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string sep <span class="token operator">=</span> <span class="token string">"\3"</span><span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">StringUtil</span><span class="token double-colon punctuation">::</span><span class="token function">Split</span><span class="token punctuation">(</span>line<span class="token punctuation">,</span> <span class="token operator">&</span>results<span class="token punctuation">,</span> sep<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// string splitting helper from our utility collection</span>

  <span class="token keyword">if</span> <span class="token punctuation">(</span>results<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token number">3</span><span class="token punctuation">)</span>
    <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>

  DocInfo doc<span class="token punctuation">;</span>
  doc<span class="token punctuation">.</span>title <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
  doc<span class="token punctuation">.</span>content <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
  doc<span class="token punctuation">.</span>url <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">2</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
  <span class="token comment">// The document ID is simply the array subscript</span>
  doc<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// note: size of the forward index, read before push_back</span>

  forward_index<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">move</span><span class="token punctuation">(</span>doc<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>forward_index<span class="token punctuation">[</span>forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
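  <p>One thing worth keeping in mind: a pointer into a <code>std::vector</code> is invalidated when a later <code>push_back</code> reallocates the storage. <code>BuildIndex</code> uses the returned pointer right away, so the code above is fine, but a caller that kept the pointer across later insertions would be in trouble. The following is a minimal sketch of an alternative that hands back the document ID instead. The names <code>MiniDoc</code> and <code>MiniForwardIndex</code> are hypothetical and not part of the project.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the article's DocInfo.
struct MiniDoc
{
  std::string title;
  std::string content;
  std::string url;
  std::uint64_t doc_id = 0;
};

// Hypothetical forward index: the vector subscript doubles as the document ID.
class MiniForwardIndex
{
public:
  // Stores the document and returns its ID rather than a pointer,
  // so the result stays meaningful after the vector grows.
  std::uint64_t Add(std::string title, std::string content, std::string url)
  {
    MiniDoc doc;
    doc.title = std::move(title);
    doc.content = std::move(content);
    doc.url = std::move(url);
    doc.doc_id = docs_.size(); // the slot this element is about to occupy
    docs_.push_back(std::move(doc));
    return docs_.back().doc_id;
  }

  // Returns nullptr when the ID is out of range.
  const MiniDoc *Get(std::uint64_t doc_id) const
  {
    if (doc_id >= docs_.size())
      return nullptr;
    return &docs_[doc_id];
  }

private:
  std::vector<MiniDoc> docs_;
};
```

  <p>With this shape, a caller looks the document up again through <code>Get</code> whenever it needs it, and the pointer it receives is only used for a short time.</p>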
  <p>Here is the corresponding code from the utility collection.</p> 
  <pre><code class="prism language-cpp"><span class="token comment">/// @brief String-splitting utility</span>
<span class="token keyword">class</span> <span class="token class-name">StringUtil</span>
<span class="token punctuation">{</span>
<span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token keyword">static</span> <span class="token keyword">void</span> <span class="token function">Split</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>target<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> <span class="token operator">*</span>out<span class="token punctuation">,</span> <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>sep<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token function">assert</span><span class="token punctuation">(</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token comment">// Use the ready-made split function from Boost</span>
      boost<span class="token double-colon punctuation">::</span><span class="token function">split</span><span class="token punctuation">(</span><span class="token operator">*</span>out<span class="token punctuation">,</span> target<span class="token punctuation">,</span> boost<span class="token double-colon punctuation">::</span><span class="token function">is_any_of</span><span class="token punctuation">(</span>sep<span class="token punctuation">)</span><span class="token punctuation">,</span>
                   boost<span class="token double-colon punctuation">::</span>token_compress_on<span class="token punctuation">)</span><span class="token punctuation">;</span>
	<span class="token punctuation">}</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>
</code></pre> 
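  <p>A detail to be aware of: with <code>boost::token_compress_on</code>, adjacent separators are merged into one, so a record with an empty field (for example an empty title) produces fewer than three tokens and is rejected by the <code>results.size() != 3</code> check. If such records should be kept, the fields must be split without compression. Below is a small standard-library sketch of that behavior. <code>SplitKeepEmpty</code> is a hypothetical helper, not part of the project or of Boost.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits on a single separator character and keeps
// empty fields, so every record yields the same number of columns.
std::vector<std::string> SplitKeepEmpty(const std::string &target, char sep)
{
  std::vector<std::string> out;
  std::string::size_type start = 0;
  while (true)
  {
    std::string::size_type pos = target.find(sep, start);
    if (pos == std::string::npos)
    {
      out.push_back(target.substr(start)); // last field, possibly empty
      break;
    }
    out.push_back(target.substr(start, pos - start));
    start = pos + 1; // skip exactly one separator
  }
  return out;
}
```

  <p>Whether empty fields should be accepted is a design choice. Dropping documents that have no title, as the compressed split effectively does, may be perfectly reasonable for a search engine.</p>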
  <h4>Building the inverted index</h4> 
  <p>Next we build the inverted index from the struct we just stored. This step requires word segmentation.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">struct</span> <span class="token class-name">word_cnt</span>
<span class="token punctuation">{</span>
  <span class="token keyword">int</span> title_cnt<span class="token punctuation">;</span>
  <span class="token keyword">int</span> content_cnt<span class="token punctuation">;</span>
  <span class="token function">word_cnt</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">title_cnt</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token function">content_cnt</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>

<span class="token keyword">bool</span> <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> DocInfo <span class="token operator">&</span>doc<span class="token punctuation">)</span>
<span class="token punctuation">{</span>

  <span class="token comment">// Temporary storage for word frequencies</span>
  std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> word_cnt<span class="token operator">></span> word_map<span class="token punctuation">;</span>
  
  <span class="token comment">// 1. Segment the title</span>
  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> title_words<span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>doc<span class="token punctuation">.</span>title<span class="token punctuation">,</span> <span class="token operator">&</span>title_words<span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token comment">// The index is case-insensitive,</span>
  <span class="token comment">// so user queries should be lowercased in the same way</span>
  <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> title_words<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span> 
    word_map<span class="token punctuation">[</span>s<span class="token punctuation">]</span><span class="token punctuation">.</span>title_cnt<span class="token operator">++</span><span class="token punctuation">;</span> <span class="token comment">// operator[] default-constructs a zeroed word_cnt on first use</span>
  <span class="token punctuation">}</span>

    
  <span class="token comment">// 2. Segment the document content</span>
  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> content_words<span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>doc<span class="token punctuation">.</span>content<span class="token punctuation">,</span> <span class="token operator">&</span>content_words<span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> s <span class="token operator">:</span> content_words<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
    word_map<span class="token punctuation">[</span>s<span class="token punctuation">]</span><span class="token punctuation">.</span>content_cnt<span class="token operator">++</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  
  <span class="token comment">// At this point every word has its occurrence counts for the title and for the content</span>
   
  <span class="token comment">// 3. Build the inverted lists</span>
  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>word_pair <span class="token operator">:</span> word_map<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    <span class="token comment">/*
    struct InvertedElem
    {
        uint64_t doc_id;  // document ID
        std::string word; // keyword
        int weight;       // weight --> explained later
    };
    */</span>
    
    InvertedElem item<span class="token punctuation">;</span> 
    
    item<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> doc<span class="token punctuation">.</span>doc_id<span class="token punctuation">;</span> <span class="token comment">// this is why we stored the ID in DocInfo earlier</span>
    item<span class="token punctuation">.</span>word <span class="token operator">=</span> word_pair<span class="token punctuation">.</span>first<span class="token punctuation">;</span>
    item<span class="token punctuation">.</span>weight <span class="token operator">=</span> <span class="token function">_build_relevance</span><span class="token punctuation">(</span>word_pair<span class="token punctuation">.</span>second<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// compute the weight</span>
    
    
    <span class="token comment">// Append to the inverted list</span>
    <span class="token comment">// typedef std::vector<struct InvertedElem> InvertedList;</span>
    <span class="token comment">// std::unordered_map<std::string, InvertedList> inverted_index;</span>
    InvertedList <span class="token operator">&</span>inverted_list <span class="token operator">=</span> inverted_index<span class="token punctuation">[</span>word_pair<span class="token punctuation">.</span>first<span class="token punctuation">]</span><span class="token punctuation">;</span>
    inverted_list<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">move</span><span class="token punctuation">(</span>item<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
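  <p>To see the whole flow without depending on cppjieba, here is a minimal self-contained sketch of the same idea. It assumes a whitespace tokenizer with ASCII lowercasing in place of jieba, and it hard-codes the 10:1 title/content weighting used in this article. <code>Posting</code>, <code>MiniInvertedIndex</code>, <code>Tokenize</code>, and <code>AddDocument</code> are hypothetical names chosen for illustration.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical counterpart of InvertedElem.
struct Posting
{
  std::uint64_t doc_id;
  std::string word;
  int weight;
};
using PostingList = std::vector<Posting>;
using MiniInvertedIndex = std::unordered_map<std::string, PostingList>;

// Assumption: whitespace splitting plus ASCII lowercasing stands in for jieba.
std::vector<std::string> Tokenize(const std::string &text)
{
  std::vector<std::string> words;
  std::istringstream in(text);
  std::string word;
  while (in >> word)
  {
    for (char &c : word)
      c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    words.push_back(word);
  }
  return words;
}

// Counts each word in the title and content, then appends one posting
// per distinct word to that word's inverted list.
void AddDocument(MiniInvertedIndex &inv, std::uint64_t doc_id,
                 const std::string &title, const std::string &content)
{
  struct Count
  {
    int title_cnt = 0;
    int content_cnt = 0;
  };
  std::unordered_map<std::string, Count> counts;
  for (const std::string &w : Tokenize(title))
    counts[w].title_cnt++;
  for (const std::string &w : Tokenize(content))
    counts[w].content_cnt++;

  for (const auto &entry : counts)
  {
    Posting p;
    p.doc_id = doc_id;
    p.word = entry.first;
    p.weight = 10 * entry.second.title_cnt + entry.second.content_cnt;
    inv[entry.first].push_back(p);
  }
}
```

  <p>Because documents are added one at a time, the postings in each list appear in document order. Sorting by weight is left to query time.</p>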
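  <h5>Bringing in jieba</h5> 
  <p>Because the inverted index needs word segmentation, we bring in cppjieba and wrap the segmentation call in a utility. We make the library available through symbolic links.</p> 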
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span> /home/qkj/install/cppjieba/include/cppjieba cppjieba
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span> /home/qkj/install/cppjieba/dict/ dict
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ ll
total <span class="token number">24</span>
lrwxrwxrwx. <span class="token number">1</span> qkj qkj   <span class="token number">43</span> Sep  <span class="token number">9</span> 06:00 cppjieba -<span class="token operator">></span> /home/qkj/install/cppjieba/include/cppjieba
drwxrwxr-x. <span class="token number">4</span> qkj qkj   <span class="token number">35</span> Sep  <span class="token number">9</span> 01:03 data
lrwxrwxrwx. <span class="token number">1</span> qkj qkj   <span class="token number">32</span> Sep  <span class="token number">9</span> 06:01 dict -<span class="token operator">></span> /home/qkj/install/cppjieba/dict/
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">6379</span> Sep  <span class="token number">9</span> 03:15 index.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj  <span class="token number">117</span> Sep  <span class="token number">9</span> 01:41 Makefile
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">6361</span> Sep  <span class="token number">9</span> 02:47 parser.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj <span class="token number">1199</span> Sep  <span class="token number">9</span> 03:15 util.hpp
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>Now we can write our segmentation utility.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> DICT_PATH <span class="token operator">=</span> <span class="token string">"./dict/jieba.dict.utf8"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> HMM_PATH <span class="token operator">=</span> <span class="token string">"./dict/hmm_model.utf8"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> USER_DICT_PATH <span class="token operator">=</span> <span class="token string">"./dict/user.dict.utf8"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> IDF_PATH <span class="token operator">=</span> <span class="token string">"./dict/idf.utf8"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> <span class="token keyword">char</span> <span class="token operator">*</span><span class="token keyword">const</span> STOP_WORD_PATH <span class="token operator">=</span> <span class="token string">"./dict/stop_words.utf8"</span><span class="token punctuation">;</span>

<span class="token comment">/// @brief Wrapper around jieba word segmentation</span>
<span class="token keyword">class</span> <span class="token class-name">JiebaUtil</span>
<span class="token punctuation">{</span>
<span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token keyword">static</span> <span class="token keyword">void</span> <span class="token function">CutString</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> <span class="token operator">*</span>out<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
    	<span class="token function">assert</span><span class="token punctuation">(</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
    	jieba<span class="token punctuation">.</span><span class="token function">CutForSearch</span><span class="token punctuation">(</span>src<span class="token punctuation">,</span> <span class="token operator">*</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
	<span class="token punctuation">}</span>
<span class="token keyword">private</span><span class="token operator">:</span>
	<span class="token keyword">static</span> cppjieba<span class="token double-colon punctuation">::</span>Jieba jieba<span class="token punctuation">;</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>
cppjieba<span class="token double-colon punctuation">::</span>Jieba <span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">jieba</span><span class="token punctuation">(</span>DICT_PATH<span class="token punctuation">,</span> HMM_PATH<span class="token punctuation">,</span> USER_DICT_PATH<span class="token punctuation">,</span> IDF_PATH<span class="token punctuation">,</span> STOP_WORD_PATH<span class="token punctuation">)</span><span class="token punctuation">;</span>
</code></pre> 
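  <h5>Computing the weight</h5> 
  <p>First, what is a weight? You can think of it this way: a word that is searched for often can be regarded as having a high weight, and for a given document, the more often a keyword appears in it, the higher that keyword's weight for the document. We keep the calculation simple here: each occurrence in the title counts for 10 and each occurrence in the content counts for 1.</p> 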
  <pre><code class="prism language-cpp">    <span class="token keyword">int</span> <span class="token function">_build_relevance</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">struct</span> <span class="token class-name">word_cnt</span> <span class="token operator">&</span>word<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">X</span> <span class="token expression"><span class="token number">10</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">Y</span> <span class="token expression"><span class="token number">1</span></span></span>
      <span class="token keyword">return</span> X <span class="token operator">*</span> word<span class="token punctuation">.</span>title_cnt <span class="token operator">+</span> Y <span class="token operator">*</span> word<span class="token punctuation">.</span>content_cnt<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
</code></pre> 
  <p>So what is the weight for? When we search, one keyword can correspond to many documents, and we can then put the documents with the higher weight first.</p> 
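  <p>As a rough illustration of how the weight could be used at query time, the sketch below computes the same 10:1 score and orders the hits for one keyword from highest to lowest weight. <code>Hit</code>, <code>Relevance</code>, and <code>SortByWeight</code> are hypothetical names, and the sorting step itself is an assumption about the later search code rather than something shown so far.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical search hit: a document and its weight for one keyword.
struct Hit
{
  std::uint64_t doc_id;
  int weight;
};

// Same formula as _build_relevance: title occurrences count ten times as much.
int Relevance(int title_cnt, int content_cnt)
{
  const int kTitleWeight = 10;
  const int kContentWeight = 1;
  return kTitleWeight * title_cnt + kContentWeight * content_cnt;
}

// Orders hits by descending weight; equal weights keep their original order.
void SortByWeight(std::vector<Hit> &hits)
{
  std::stable_sort(hits.begin(), hits.end(),
                   [](const Hit &a, const Hit &b) { return a.weight > b.weight; });
}
```

  <p>For example, a document with the keyword once in the title and twice in the content scores 12 and is ranked ahead of a document that mentions it eight times in the content only.</p>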
  <p>Our structure now looks like this.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/ccc26dd4848f41eb87fd09db422de84b.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/ccc26dd4848f41eb87fd09db422de84b.jpg" alt="Boost search engine, figure 9" width="650" height="250" style="border:1px solid black;"></a></p> 
  <h3><code>GetForwardIndex</code></h3> 
  <p>This function looks up a document's content by its document ID.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">struct</span> <span class="token class-name">DocInfo</span> <span class="token operator">*</span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// doc_id is unsigned, so only the upper bound needs checking</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>doc_id <span class="token operator">>=</span> forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"Document ID "</span> <span class="token operator"><<</span> doc_id <span class="token operator"><<</span> <span class="token string">" is out of range"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>forward_index<span class="token punctuation">[</span>doc_id<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h3><code>GetInvertedList</code></h3> 
  <p>This function returns the inverted list for a keyword.</p> 
  <pre><code class="prism language-cpp">InvertedList <span class="token operator">*</span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>word<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token keyword">auto</span> it <span class="token operator">=</span> inverted_index<span class="token punctuation">.</span><span class="token function">find</span><span class="token punctuation">(</span>word<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>it <span class="token operator">==</span> inverted_index<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"Keyword "</span> <span class="token operator"><<</span> word <span class="token operator"><<</span> <span class="token string">" not found"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
    <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>

  <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>it<span class="token operator">-></span>second<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
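  <p>Because the function returns a raw pointer, every caller has to handle the "keyword not found" case. The sketch below shows the same lookup pattern in isolation. It is only an illustration: <code>FindList</code>, <code>TotalWeight</code> and the <code>InvertedIndex</code> alias are hypothetical names that do not exist in the project, and the struct is repeated so the snippet compiles on its own.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-ins for the article's types, repeated so the sketch compiles on its own.
struct InvertedElem
{
  uint64_t doc_id;
  std::string word;
  int weight;
};
using InvertedList = std::vector<InvertedElem>;
using InvertedIndex = std::unordered_map<std::string, InvertedList>; // hypothetical alias

// Hypothetical free-function version of the lookup. find() never inserts,
// so a missing keyword leaves the index unchanged and yields nullptr.
inline const InvertedList *FindList(const InvertedIndex &index, const std::string &word)
{
  auto it = index.find(word);
  return it == index.end() ? nullptr : &it->second;
}

// Hypothetical caller: sums the weights for a keyword, 0 if the keyword is absent.
inline int TotalWeight(const InvertedIndex &index, const std::string &word)
{
  const InvertedList *list = FindList(index, word);
  if (list == nullptr)
    return 0; // the miss must be handled before dereferencing
  int total = 0;
  for (const InvertedElem &e : *list)
    total += e.weight;
  return total;
}
```

  <p>Using <code>find</code> rather than <code>operator[]</code> matters here: <code>operator[]</code> would silently insert an empty list for every keyword that a user searches for and that is not in the index.</p>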
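  <p>One small task remains: turning <code>Index</code> into a singleton.</p> 
  <h2>Making Index a singleton</h2> 
  <p>We make <code>Index</code> a singleton for two reasons. First, the Boost search engine never needs more than one <code>Index</code> object; a single index is enough to serve every query. Second, building an index is expensive: every page has to be tokenized, counted, and inserted into both indexes, so building it more than once would waste a lot of time and memory.</p> 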
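  <p>Before the full listing, here is the singleton idea on its own. This sketch uses a function-local static (often called a Meyers singleton), which is a different technique from the mutex-guarded pointer used in the project code; since C++11 the language guarantees that such a static is initialized exactly once even under concurrent calls. The class name <code>Registry</code> and its members are made up for illustration.</p>

```cpp
#include <cassert>

// Hypothetical class, used only for this sketch.
class Registry
{
public:
  // The function-local static is constructed on the first call only;
  // C++11 makes that initialization thread-safe.
  static Registry &Instance()
  {
    static Registry instance;
    return instance;
  }

  // Copying is disabled so no second object can ever exist.
  Registry(const Registry &) = delete;
  Registry &operator=(const Registry &) = delete;

  // Stands in for an expensive one-time step such as building the index.
  void Build() { ++build_count_; }
  int build_count() const { return build_count_; }

private:
  Registry() : build_count_(0) {} // private: only Instance() can construct
  int build_count_;
};
```

  <p>Every call to <code>Registry::Instance()</code> returns a reference to the same object, so state changed through one reference is visible through all of them.</p>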
  <pre><code class="prism language-cpp"><span class="token keyword">namespace</span> ns_index
<span class="token punctuation">{</span>
  <span class="token keyword">struct</span> <span class="token class-name">DocInfo</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>string title<span class="token punctuation">;</span>   <span class="token comment">// document title</span>
    std<span class="token double-colon punctuation">::</span>string content<span class="token punctuation">;</span> <span class="token comment">// document content</span>
    std<span class="token double-colon punctuation">::</span>string url<span class="token punctuation">;</span>     <span class="token comment">// url on the official site</span>

    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span> <span class="token comment">// document id, needed when building the inverted index</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token comment">/// @brief One entry of an inverted list</span>
  <span class="token keyword">struct</span> <span class="token class-name">InvertedElem</span>
  <span class="token punctuation">{</span>
    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span>  <span class="token comment">// document id</span>
    std<span class="token double-colon punctuation">::</span>string word<span class="token punctuation">;</span> <span class="token comment">// keyword</span>
    <span class="token keyword">int</span> weight<span class="token punctuation">;</span>       <span class="token comment">// weight</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token comment">// Inverted list -- all InvertedElem entries that belong to one keyword</span>
  <span class="token keyword">typedef</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span><span class="token keyword">struct</span> <span class="token class-name">InvertedElem</span><span class="token operator">></span> InvertedList<span class="token punctuation">;</span>

  <span class="token keyword">class</span> <span class="token class-name">Index</span>
  <span class="token punctuation">{</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token function">Index</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token function">Index</span><span class="token punctuation">(</span><span class="token keyword">const</span> Index <span class="token operator">&</span><span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">delete</span><span class="token punctuation">;</span>
    Index <span class="token operator">&</span><span class="token keyword">operator</span><span class="token operator">=</span><span class="token punctuation">(</span><span class="token keyword">const</span> Index <span class="token operator">&</span><span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">delete</span><span class="token punctuation">;</span>
    <span class="token keyword">static</span> Index <span class="token operator">*</span>instance<span class="token punctuation">;</span>
    <span class="token keyword">static</span> std<span class="token double-colon punctuation">::</span>mutex mtx<span class="token punctuation">;</span>

  <span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token operator">~</span><span class="token function">Index</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
    <span class="token punctuation">}</span>
    <span class="token keyword">static</span> Index <span class="token operator">*</span><span class="token function">GetInstance</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// Creating the instance is not thread-safe, so hold the mutex for the whole check.</span>
      <span class="token comment">// (An unlocked first check on a plain pointer would be a data race.)</span>
      std<span class="token double-colon punctuation">::</span>lock_guard<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>mutex<span class="token operator">></span> <span class="token function">lock</span><span class="token punctuation">(</span>mtx<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span>instance <span class="token operator">==</span> <span class="token keyword">nullptr</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        instance <span class="token operator">=</span> <span class="token keyword">new</span> Index<span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      <span class="token keyword">return</span> instance<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/// @brief Look up the forward index by doc_id, i.e. fetch the document itself</span>
    <span class="token comment">/// @param doc_id  document id</span>
    <span class="token comment">/// @return pointer to the document struct, or nullptr if doc_id is out of range</span>
    <span class="token keyword">struct</span> <span class="token class-name">DocInfo</span> <span class="token operator">*</span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// doc_id is unsigned, so only the upper bound needs checking</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span>doc_id <span class="token operator">>=</span> forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"doc_id "</span> <span class="token operator"><<</span> doc_id <span class="token operator"><<</span> <span class="token string">" is out of range"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>forward_index<span class="token punctuation">[</span>doc_id<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/// @brief Get the inverted list for a keyword</span>
    <span class="token comment">/// @param word keyword</span>
    <span class="token comment">/// @return pointer to the inverted list, or nullptr if the keyword is not indexed</span>
    InvertedList <span class="token operator">*</span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>word<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">auto</span> it <span class="token operator">=</span> inverted_index<span class="token punctuation">.</span><span class="token function">find</span><span class="token punctuation">(</span>word<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span>it <span class="token operator">==</span> inverted_index<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"keyword "</span> <span class="token operator"><<</span> word <span class="token operator"><<</span> <span class="token string">" not found"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">return</span> <span class="token operator">&</span><span class="token punctuation">(</span>it<span class="token operator">-></span>second<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/// @brief Build the forward and inverted indexes from the cleaned data file; this is the most expensive step</span>
    <span class="token comment">/// @param src_path path of the file produced by the tag-stripping step</span>
    <span class="token comment">/// @return false if the file cannot be opened, true otherwise</span>
    <span class="token keyword">bool</span> <span class="token function">BuildIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src_path<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      std<span class="token double-colon punctuation">::</span>ifstream <span class="token function">in</span><span class="token punctuation">(</span>src_path<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>in <span class="token operator">|</span> std<span class="token double-colon punctuation">::</span>ios<span class="token double-colon punctuation">::</span>binary<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token keyword">if</span> <span class="token punctuation">(</span>in<span class="token punctuation">.</span><span class="token function">is_open</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"input file "</span> <span class="token operator"><<</span> src_path <span class="token operator"><<</span> <span class="token string">" could not be opened"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
        <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">int</span> count <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
      std<span class="token double-colon punctuation">::</span>string line<span class="token punctuation">;</span>
      <span class="token keyword">while</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">getline</span><span class="token punctuation">(</span>in<span class="token punctuation">,</span> line<span class="token punctuation">)</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        <span class="token comment">// Each line holds the cleaned content of one html document</span>
        <span class="token comment">// Build the forward index entry</span>
        DocInfo <span class="token operator">*</span>doc <span class="token operator">=</span> <span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span>line<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span>doc <span class="token operator">==</span> <span class="token keyword">nullptr</span><span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          std<span class="token double-colon punctuation">::</span>cerr <span class="token operator"><<</span> <span class="token string">"failed to build a forward index entry for: "</span> <span class="token operator"><<</span> line <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
          <span class="token keyword">continue</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>

        <span class="token comment">// Build the inverted index entries for this document</span>
        <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token operator">*</span>doc<span class="token punctuation">)</span><span class="token punctuation">;</span>
        count<span class="token operator">++</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span>count <span class="token operator">%</span> <span class="token number">50</span> <span class="token operator">==</span> <span class="token number">0</span><span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token comment">// TODO: replace with a progress bar later</span>
          <span class="token comment">// LOG(NORMAL, "documents processed so far: " + std::to_string(count));</span>
          std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"documents indexed so far: "</span> <span class="token operator"><<</span> count <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
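    <span class="token comment">/// @brief Build one forward index entry from a line, so the document can later be found by its id</span>
    <span class="token comment">/// @param line a string holding the whole cleaned content of one html document</span>
    <span class="token comment">/// @return pointer to the new entry, or nullptr if the line is malformed</span>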
    DocInfo <span class="token operator">*</span><span class="token function">BuildForwardIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>line<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// title\3content\3url\n</span>

      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> results<span class="token punctuation">;</span>
      <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string sep <span class="token operator">=</span> <span class="token string">"\3"</span><span class="token punctuation">;</span>
      
     ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">StringUtil</span><span class="token double-colon punctuation">::</span><span class="token function">Split</span><span class="token punctuation">(</span>line<span class="token punctuation">,</span> <span class="token operator">&</span>results<span class="token punctuation">,</span> sep<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token keyword">if</span> <span class="token punctuation">(</span>results<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token number">3</span><span class="token punctuation">)</span>
        <span class="token keyword">return</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>

      DocInfo doc<span class="token punctuation">;</span>
      doc<span class="token punctuation">.</span>title <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
      doc<span class="token punctuation">.</span>content <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
      doc<span class="token punctuation">.</span>url <span class="token operator">=</span> results<span class="token punctuation">[</span><span class="token number">2</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
      doc<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> forward_index<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the id is the position this document will take in forward_index</span>

      forward_index<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">move</span><span class="token punctuation">(</span>doc<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">return</span> <span class="token operator">&</span>forward_index<span class="token punctuation">.</span><span class="token function">back</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// only valid until the next push_back</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Per-word occurrence counts, used to compute the weight</span>
    <span class="token keyword">struct</span> <span class="token class-name">word_cnt</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">int</span> title_cnt<span class="token punctuation">;</span>
      <span class="token keyword">int</span> content_cnt<span class="token punctuation">;</span>
      <span class="token function">word_cnt</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">title_cnt</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token function">content_cnt</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token punctuation">}</span><span class="token punctuation">;</span>

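    <span class="token comment">/// @brief Build the inverted index entries for one document; this requires tokenizing it</span>
    <span class="token comment">/// @param doc  the document struct returned by BuildForwardIndex</span>
    <span class="token comment">/// @return true once the document's words have been added</span>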
    <span class="token keyword">bool</span> <span class="token function">BuildInvertedIndex</span><span class="token punctuation">(</span><span class="token keyword">const</span> DocInfo <span class="token operator">&</span>doc<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>

      <span class="token comment">// Temporary word-frequency table for this document</span>
      std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> word_cnt<span class="token operator">></span> word_map<span class="token punctuation">;</span>
      <span class="token comment">// 1. Tokenize the title</span>
      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> title_words<span class="token punctuation">;</span>
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>doc<span class="token punctuation">.</span>title<span class="token punctuation">,</span> <span class="token operator">&</span>title_words<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token comment">// The index is case-insensitive: every word is stored in lower case,</span>
      <span class="token comment">// so the user's query must be lower-cased as well</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> title_words<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
        word_map<span class="token punctuation">[</span>s<span class="token punctuation">]</span><span class="token punctuation">.</span>title_cnt<span class="token operator">++</span><span class="token punctuation">;</span> <span class="token comment">// operator[] inserts a zeroed word_cnt the first time a word is seen</span>
      <span class="token punctuation">}</span>

      <span class="token comment">// 2. Tokenize the content</span>
      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> content_words<span class="token punctuation">;</span>
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>doc<span class="token punctuation">.</span>content<span class="token punctuation">,</span> <span class="token operator">&</span>content_words<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> s <span class="token operator">:</span> content_words<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
        word_map<span class="token punctuation">[</span>s<span class="token punctuation">]</span><span class="token punctuation">.</span>content_cnt<span class="token operator">++</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      <span class="token comment">// 3. Build the inverted lists</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>word_pair <span class="token operator">:</span> word_map<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        InvertedElem item<span class="token punctuation">;</span>
        item<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> doc<span class="token punctuation">.</span>doc_id<span class="token punctuation">;</span> <span class="token comment">// this is why DocInfo stores its own id</span>
        item<span class="token punctuation">.</span>word <span class="token operator">=</span> word_pair<span class="token punctuation">.</span>first<span class="token punctuation">;</span>
        item<span class="token punctuation">.</span>weight <span class="token operator">=</span> <span class="token function">_build_relevance</span><span class="token punctuation">(</span>word_pair<span class="token punctuation">.</span>second<span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment">// Append to this word's inverted list (created on first use)</span>
        InvertedList <span class="token operator">&</span>inverted_list <span class="token operator">=</span> inverted_index<span class="token punctuation">[</span>word_pair<span class="token punctuation">.</span>first<span class="token punctuation">]</span><span class="token punctuation">;</span>
        inverted_list<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">move</span><span class="token punctuation">(</span>item<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
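    <span class="token comment">/// @brief Compute the weight of a word in one document</span>
    <span class="token comment">/// @param word occurrence counts of the word in the title and in the content</span>
    <span class="token comment">/// @return X * title_cnt + Y * content_cnt, so a title hit counts ten times as much as a content hit</span>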
    <span class="token keyword">int</span> <span class="token function">_build_relevance</span><span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">struct</span> <span class="token class-name">word_cnt</span> <span class="token operator">&</span>word<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">X</span> <span class="token expression"><span class="token number">10</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">Y</span> <span class="token expression"><span class="token number">1</span></span></span>
      <span class="token keyword">return</span> X <span class="token operator">*</span> word<span class="token punctuation">.</span>title_cnt <span class="token operator">+</span> Y <span class="token operator">*</span> word<span class="token punctuation">.</span>content_cnt<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">// Forward index -- the vector subscript doubles as the document id, which makes lookup by id cheap</span>
    std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span><span class="token keyword">struct</span> <span class="token class-name">DocInfo</span><span class="token operator">></span>
        forward_index<span class="token punctuation">;</span>
    <span class="token comment">// Inverted index -- a keyword can appear in many documents, so each keyword maps to a list of InvertedElem</span>
    std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> InvertedList<span class="token operator">></span> inverted_index<span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  Index <span class="token operator">*</span>Index<span class="token double-colon punctuation">::</span>instance <span class="token operator">=</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>mutex Index<span class="token double-colon punctuation">::</span>mtx<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
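  <p>To see how the weights come out, the toy version below builds an inverted index for two tiny documents. It is a sketch under loud assumptions: whitespace splitting and ASCII lower-casing stand in for cppjieba and <code>boost::to_lower</code>, and <code>Posting</code>, <code>ToyIndex</code>, <code>Tokenize</code> and <code>AddDocument</code> are hypothetical names, not part of the project. Only the weighting rule (10 per title hit, 1 per content hit) is taken from the listing above.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified posting: document id plus weight.
struct Posting
{
  uint64_t doc_id;
  int weight;
};
using ToyIndex = std::unordered_map<std::string, std::vector<Posting>>;

// Assumption: split on whitespace and lower-case ASCII letters.
// The real project tokenizes with cppjieba instead.
inline std::vector<std::string> Tokenize(const std::string &text)
{
  std::vector<std::string> words;
  std::istringstream in(text);
  std::string w;
  while (in >> w)
  {
    std::transform(w.begin(), w.end(), w.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    words.push_back(w);
  }
  return words;
}

// Count each word in the title and in the content, then append one posting
// per distinct word with weight = 10 * title hits + 1 * content hits.
inline void AddDocument(ToyIndex &index, uint64_t doc_id,
                        const std::string &title, const std::string &content)
{
  struct Count
  {
    int title = 0;
    int content = 0;
  };
  std::unordered_map<std::string, Count> counts;
  for (const std::string &w : Tokenize(title))
    counts[w].title++;
  for (const std::string &w : Tokenize(content))
    counts[w].content++;
  for (const auto &kv : counts)
    index[kv.first].push_back({doc_id, 10 * kv.second.title + 1 * kv.second.content});
}
```

  <p>For example, after adding document 0 with title <code>"Boost Filesystem"</code> and content <code>"boost filesystem path path"</code>, the word <code>boost</code> gets weight 10 * 1 + 1 * 1 = 11 for that document, while <code>path</code>, which appears only in the content, gets weight 2.</p>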
  <h1>Search module</h1> 
  <p>Now we start on the search module. We begin with the basic code structure, and create a new file for it.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">touch</span> searcher.hpp 
</code></pre> 
  <p>Here is the skeleton.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">namespace</span> ns_searcher
<span class="token punctuation">{</span>

  <span class="token keyword">struct</span> <span class="token class-name">InvertedElemPrint</span>
  <span class="token punctuation">{</span>
    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span> <span class="token comment">// document id</span>

    <span class="token keyword">int</span> weight<span class="token punctuation">;</span>                     <span class="token comment">// weight</span>
    std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span> <span class="token comment">// keywords</span>
    <span class="token function">InvertedElemPrint</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">doc_id</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token function">weight</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token keyword">class</span> <span class="token class-name">Searcher</span>
  <span class="token punctuation">{</span>
  <span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token function">Searcher</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token operator">~</span><span class="token function">Searcher</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token comment">// input: path to the file produced by stripping the HTML tags</span>
    <span class="token keyword">void</span> <span class="token function">InitSearcher</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>input<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
        <span class="token comment">// 1. Get the index object</span>
        <span class="token comment">// 2. Build the index from the input file</span>
    <span class="token punctuation">}</span>
    <span class="token comment">// query: the word or sentence to search for</span>
    <span class="token comment">// json_string: the result, returned as a JSON string</span>
    <span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
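        <span class="token comment">//1. Tokenize the query, remembering to convert it to lowercase</span>
        <span class="token comment">//2. Fetch the inverted list for each keyword</span>
        <span class="token comment">//3. Merge and sort: order the results by weight, descending</span>
        <span class="token comment">//4. Build the JSON string</span>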
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    ns_index<span class="token double-colon punctuation">::</span>Index <span class="token operator">*</span>index<span class="token punctuation">;</span> <span class="token comment">// the index the searcher uses for lookups</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h2>InitSearcher</h2> 
  <p>This does the initialization work, which has two parts.</p> 
  <ul> 
   <li>Get the index object</li> 
   <li>Build the index through that object</li> 
  </ul> 
  <pre><code class="prism language-cpp"><span class="token keyword">void</span> <span class="token function">InitSearcher</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>input<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// Get (or create) the Index singleton</span>
  index <span class="token operator">=</span> ns_index<span class="token double-colon punctuation">::</span><span class="token class-name">Index</span><span class="token double-colon punctuation">::</span><span class="token function">GetInstance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// std::cout << "got the singleton" << std::endl;</span>
  <span class="token comment">// Build the index through the index object</span>
  index<span class="token operator">-></span><span class="token function">BuildIndex</span><span class="token punctuation">(</span>input<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// std::cout << "built the forward and inverted indexes" << std::endl;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h2>Search</h2> 
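  <p>This is the concrete lookup flow. We pass in what we want to search for, and the function proceeds as follows.</p> 
  <ul> 
   <li>Tokenize the input and store the lowercased words in an array</li> 
   <li>For each element of the array, fetch its inverted list, then append the contents of every inverted list to one combined list</li> 
   <li>Sort the combined list by weight in descending order</li> 
   <li>Use each document id in the list to fetch the document content and build the JSON string</li> 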
  </ul> 
  <pre><code class="prism language-cpp"><span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// 1 Tokenize the query first; the lookups come afterwards</span>
  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>query<span class="token punctuation">,</span> <span class="token operator">&</span>words<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// 2 Run a lookup for each token in turn</span>
  ns_index<span class="token double-colon punctuation">::</span>InvertedList inverted_list_all<span class="token punctuation">;</span> <span class="token comment">// holds the contents of every inverted list</span>

  <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> words<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the index was built case-insensitively, so the query must be lowercased too</span>

    <span class="token comment">// Look up the inverted index first</span>
    ns_index<span class="token double-colon punctuation">::</span>InvertedList <span class="token operator">*</span>inverted_list <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> inverted_list<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">continue</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Found: append everything in this inverted list</span>
    <span class="token comment">// Imperfect: one word may match several documents, and one document may match several words.</span>
    inverted_list_all<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list<span class="token operator">-></span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list<span class="token operator">-></span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token comment">// 3 Sort the combined list by weight, descending</span>
  std<span class="token double-colon punctuation">::</span><span class="token function">sort</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> ns_index<span class="token double-colon punctuation">::</span>InvertedElem <span class="token operator">&</span>e1<span class="token punctuation">,</span> <span class="token keyword">const</span> ns_index<span class="token double-colon punctuation">::</span>InvertedElem <span class="token operator">&</span>e2<span class="token punctuation">)</span>
            <span class="token punctuation">{</span>
              <span class="token keyword">return</span> e1<span class="token punctuation">.</span>weight <span class="token operator">></span> e2<span class="token punctuation">.</span>weight<span class="token punctuation">;</span>
            <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token comment">// 4 Build the JSON string (serialization); filled in after the jsoncpp section below</span>
<span class="token punctuation">}</span>
</code></pre> 
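  <p>Steps 1 to 3 above can be tried out without the real index. The following is only a minimal sketch on toy data: <code>ToyElem</code>, <code>ToyList</code> and <code>ToySearch</code> are hypothetical names, whitespace splitting stands in for the cppjieba tokenizer, and <code>std::tolower</code> stands in for <code>boost::to_lower</code>.</p>

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for ns_index::InvertedElem / ns_index::InvertedList.
struct ToyElem {
  uint64_t doc_id;
  std::string word;
  int weight;
};
using ToyList = std::vector<ToyElem>;

// Steps 1-3 of Search on a toy in-memory inverted index.
// Assumption: the query is split on whitespace instead of using cppjieba.
ToyList ToySearch(const std::unordered_map<std::string, ToyList> &inverted,
                  const std::string &query) {
  ToyList all;
  std::istringstream iss(query);
  std::string word;
  while (iss >> word) {
    // Lowercase the token so it matches how the index keys were stored.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    auto it = inverted.find(word);
    if (it == inverted.end()) continue; // unknown word: nothing to add
    all.insert(all.end(), it->second.begin(), it->second.end());
  }
  // Highest weight first; stable_sort keeps equal weights in insertion order.
  std::stable_sort(all.begin(), all.end(),
                   [](const ToyElem &a, const ToyElem &b) { return a.weight > b.weight; });
  return all;
}
```

  <p>Note that a document matched by two query words shows up twice in the returned list, because the lists are simply concatenated.</p>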
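  <p>The implementation above has one imperfection. One word can map to several document ids, so the document ids returned for different keywords can overlap. Consider the following example.</p> 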
  <table> 
   <thead> 
    <tr> 
     <th>Keyword</th> 
     <th>Document ID</th> 
    </tr> 
   </thead> 
   <tbody> 
    <tr> 
     <td>你好 (hello)</td> 
     <td>1, 2</td> 
    </tr> 
    <tr> 
     <td>我 (I)</td> 
     <td>1, 2</td> 
    </tr> 
    <tr> 
     <td>是 (am)</td> 
     <td>1, 2</td> 
    </tr> 
    <tr> 
     <td>大学生 (college student)</td> 
     <td>1</td> 
    </tr> 
    <tr> 
     <td>社会人 (working adult)</td> 
     <td>2</td> 
    </tr> 
   </tbody> 
  </table> 
  <blockquote> 
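   <p>If we tokenize "你好,我" and append each word's inverted list to the combined list, we get [doc 1, doc 2, doc 1, doc 2], so every document appears twice. We fix this later.</p> 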
  </blockquote> 
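  <p>One possible way to remove such duplicates is to group the combined list by document id before sorting, in the spirit of the <code>InvertedElemPrint</code> struct from the skeleton. The sketch below is only an illustration under stated assumptions: <code>Posting</code>, <code>MergedHit</code> and <code>MergeByDoc</code> are hypothetical names, and adding up the weights of all matching words is an assumed scoring rule, not necessarily the one used later.</p>

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical simplified posting: one (document, word, weight) triple.
struct Posting {
  uint64_t doc_id;
  std::string word;
  int weight;
};

// One merged result per document, similar in shape to InvertedElemPrint.
struct MergedHit {
  uint64_t doc_id = 0;
  int weight = 0;
  std::vector<std::string> words;
};

// Collapses postings that share a doc_id into a single hit per document.
std::vector<MergedHit> MergeByDoc(const std::vector<Posting> &postings) {
  std::unordered_map<uint64_t, MergedHit> by_doc;
  for (const Posting &p : postings) {
    MergedHit &hit = by_doc[p.doc_id]; // created on first sight of this doc
    hit.doc_id = p.doc_id;
    hit.weight += p.weight;            // assumption: weights simply add up
    hit.words.push_back(p.word);
  }
  std::vector<MergedHit> merged;
  merged.reserve(by_doc.size());
  for (auto &kv : by_doc) merged.push_back(std::move(kv.second));
  // Highest total weight first; ties broken by doc_id for a stable order.
  std::sort(merged.begin(), merged.end(),
            [](const MergedHit &a, const MergedHit &b) -> bool {
              if (a.weight != b.weight) return a.weight > b.weight;
              return a.doc_id < b.doc_id;
            });
  return merged;
}
```

  <p>With this shape, each document is emitted once, and the list of matched words is still available if the result needs to show why a document matched.</p>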
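  <h3>Installing and Using jsoncpp</h3> 
  <p>Next, a few words on installing and using <code>jsoncpp</code>, since we need to build a JSON string here. We use JSON to serialize and deserialize the results.</p> 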
  <pre><code class="prism language-cpp"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ sudo yum install <span class="token operator">-</span>y jsoncpp<span class="token operator">-</span>devel
</code></pre> 
  <p>Let's try it out.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">touch</span> test.cc
</code></pre> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><iostream></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><string></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><jsoncpp/json/json.h></span></span>

<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  Json<span class="token double-colon punctuation">::</span>Value root<span class="token punctuation">;</span>
  
  Json<span class="token double-colon punctuation">::</span>Value item1<span class="token punctuation">;</span>
  item1<span class="token punctuation">[</span><span class="token string">"key1"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token string">"value11"</span><span class="token punctuation">;</span>
  item1<span class="token punctuation">[</span><span class="token string">"key2"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token string">"value22"</span><span class="token punctuation">;</span>

  Json<span class="token double-colon punctuation">::</span>Value item2<span class="token punctuation">;</span>
  item2<span class="token punctuation">[</span><span class="token string">"key1"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token string">"value1"</span><span class="token punctuation">;</span>
  item2<span class="token punctuation">[</span><span class="token string">"key2"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token string">"value2"</span><span class="token punctuation">;</span>

  root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>item1<span class="token punctuation">)</span><span class="token punctuation">;</span>
  root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>item2<span class="token punctuation">)</span><span class="token punctuation">;</span>

  Json<span class="token double-colon punctuation">::</span>StyledWriter writer<span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>string s <span class="token operator">=</span> writer<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>root<span class="token punctuation">)</span><span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> s <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Here is the result.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ g++ test.cc  <span class="token parameter variable">-ljsoncpp</span>
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ./a.out 
<span class="token punctuation">[</span>
   <span class="token punctuation">{</span>
      <span class="token string">"key1"</span> <span class="token builtin class-name">:</span> <span class="token string">"value11"</span>,
      <span class="token string">"key2"</span> <span class="token builtin class-name">:</span> <span class="token string">"value22"</span>
   <span class="token punctuation">}</span>,
   <span class="token punctuation">{</span>
      <span class="token string">"key1"</span> <span class="token builtin class-name">:</span> <span class="token string">"value1"</span>,
      <span class="token string">"key2"</span> <span class="token builtin class-name">:</span> <span class="token string">"value2"</span>
   <span class="token punctuation">}</span>
<span class="token punctuation">]</span>

<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>Now we continue with the <code>Search</code> code.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// 1 Tokenize the query first; the lookups come afterwards</span>
  std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span>
  ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>query<span class="token punctuation">,</span> <span class="token operator">&</span>words<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// 2 Run a lookup for each token in turn</span>
  ns_index<span class="token double-colon punctuation">::</span>InvertedList inverted_list_all<span class="token punctuation">;</span> <span class="token comment">// holds the contents of every inverted list</span>

  <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> words<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the index was built case-insensitively, so the query must be lowercased too</span>

    <span class="token comment">// Look up the inverted index first</span>
    ns_index<span class="token double-colon punctuation">::</span>InvertedList <span class="token operator">*</span>inverted_list <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> inverted_list<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">continue</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// Found: append everything in this inverted list</span>
    <span class="token comment">// Imperfect: one word may match several documents, and one document may match several words.</span>
    inverted_list_all<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list<span class="token operator">-></span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list<span class="token operator">-></span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token comment">// 3 Sort the combined list by weight, descending</span>
  std<span class="token double-colon punctuation">::</span><span class="token function">sort</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> ns_index<span class="token double-colon punctuation">::</span>InvertedElem <span class="token operator">&</span>e1<span class="token punctuation">,</span> <span class="token keyword">const</span> ns_index<span class="token double-colon punctuation">::</span>InvertedElem <span class="token operator">&</span>e2<span class="token punctuation">)</span>
            <span class="token punctuation">{</span>
              <span class="token keyword">return</span> e1<span class="token punctuation">.</span>weight <span class="token operator">></span> e2<span class="token punctuation">.</span>weight<span class="token punctuation">;</span>
            <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token comment">// 4 Build the JSON string (serialization)</span>
  Json<span class="token double-colon punctuation">::</span>Value root<span class="token punctuation">;</span>
  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">:</span> inverted_list_all<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    <span class="token comment">// Fetch the document from the forward index</span>
    ns_index<span class="token double-colon punctuation">::</span>DocInfo <span class="token operator">*</span>doc <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span>item<span class="token punctuation">.</span>doc_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> doc<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">continue</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">// We now have the document content</span>
    Json<span class="token double-colon punctuation">::</span>Value elem<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"title"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>title<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"desc"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>content<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"url"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>url<span class="token punctuation">;</span>

    root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>elem<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// appended in sorted order</span>
  <span class="token punctuation">}</span>

  Json<span class="token double-colon punctuation">::</span>StyledWriter writer<span class="token punctuation">;</span> <span class="token comment">// use the pretty-printing writer for now</span>
  <span class="token operator">*</span>json_string <span class="token operator">=</span> writer<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>root<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h2>搜索测试</h2> 
  <p>Now let's run an end-to-end search test.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"searcher.hpp"</span></span>
<span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string input <span class="token operator">=</span> <span class="token string">"data/raw_html/raw.txt"</span><span class="token punctuation">;</span>
<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  ns_searcher<span class="token double-colon punctuation">::</span>Searcher <span class="token operator">*</span>search <span class="token operator">=</span> <span class="token keyword">new</span> ns_searcher<span class="token double-colon punctuation">::</span><span class="token function">Searcher</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  search<span class="token operator">-></span><span class="token function">InitSearcher</span><span class="token punctuation">(</span>input<span class="token punctuation">)</span><span class="token punctuation">;</span>

  std<span class="token double-colon punctuation">::</span>string query<span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>string json_string<span class="token punctuation">;</span>
  
  <span class="token keyword">while</span> <span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"Enter a keyword# "</span><span class="token punctuation">;</span>
    <span class="token comment">// Read the whole line so multi-word queries work; stop at end of input</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token operator">!</span>std<span class="token double-colon punctuation">::</span><span class="token function">getline</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>cin<span class="token punctuation">,</span> query<span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">break</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
    search<span class="token operator">-></span><span class="token function">Search</span><span class="token punctuation">(</span>query<span class="token punctuation">,</span> <span class="token operator">&</span>json_string<span class="token punctuation">)</span><span class="token punctuation">;</span>
    std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> json_string <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Here is the Makefile.</p> 
  <pre><code class="prism language-makefile">cc=g++
PARSER=parser
SSVR=search_server 

.PHONY:all
all:$(PARSER) $(SSVR)

$(SSVR):server.cc
	$(cc) -o $@ $^ -std=c++11  -lboost_system -lboost_filesystem -ljsoncpp

$(PARSER):parser.cc
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem

.PHONY:clean
clean:
	rm -f $(PARSER) $(SSVR)
</code></pre> 
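  <p>Let's test it. The <code>desc</code> field below holds the content of a single HTML document, which is far too long to show as a search result. We should cut out just a part of it instead.</p> 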
  <pre><code>{
      "desc" : "Struct template bound_launcherHomeLibrariesPeopleFAQMoreStruct template bound_launcherboost::process::v2::bound_launcher — Utility class to bind initializers to a launcher. Synopsis// In header: <boost/process/v2/bind_launcher.hpp>template<typename Launcher, typename ... Init> struct bound_launcher {  // construct/copy/destruct  template<typename Launcher_, typename ... Init_>     bound_launcher(Launcher_ &&, Init_ &&...);  // public member functions  template<typename ExecutionContext, typename Args, typename ... Inits>     auto operator()(ExecutionContext &,                     const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  template<typename ExecutionContext, typename Args, typename ... Inits>     auto operator()(ExecutionContext &, error_code &,                     const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  template<typename Executor, typename Args, typename ... Inits>     auto operator()(Executor,                     const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  template<typename Executor, typename Args, typename ... Inits>     auto operator()(Executor, error_code &,                     const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  // private member functions  template<std::size_t ... Idx, typename ExecutionContext, typename Args,            typename ... 
Inits>     auto invoke(unspecified, ExecutionContext &,                 const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type &,                 Args &&, Inits &&...);  template<std::size_t ... Idx, typename ExecutionContext, typename Args,            typename ... Inits>     auto invoke(unspecified, ExecutionContext &, error_code &,                 const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type &,                 Args &&, Inits &&...);  template<std::size_t ... Idx, typename Executor, typename Args,            typename ... Inits>     auto invoke(unspecified, Executor,                 const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type &,                 Args &&, Inits &&...);  template<std::size_t ... Idx, typename Executor, typename Args,            typename ... Inits>     auto invoke(unspecified, Executor, error_code &,                 const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type &,                 Args &&, Inits &&...);};DescriptionThis can be used when multiple processes shared some settings, e.g. Template Parameterstypename LauncherThe inner launcher to be used typename ... Initbound_launcher         public       construct/copy/destructtemplate<typename Launcher_, typename ... Init_>   bound_launcher(Launcher_ && l, Init_ &&... init);bound_launcher public member functionstemplate<typename ExecutionContext, typename Args, typename ... 
Inits>   auto operator()(ExecutionContext & context,                   const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type & executable,                   Args && args, Inits &&... inits);template<typename ExecutionContext, typename Args, typename ... Inits>   auto operator()(ExecutionContext & context, error_code & ec,                   const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type & executable,                   Args && args, Inits &&... inits);template<typename Executor, typename Args, typename ... Inits>   auto operator()(Executor exec,                   const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type & executable,                   Args && args, Inits &&... inits);template<typename Executor, typename Args, typename ... Inits>   auto operator()(Executor exec, error_code & ec,                   const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type & executable,                   Args && args, Inits &&... inits);bound_launcher private member functionstemplate<std::size_t ... Idx, typename ExecutionContext, typename Args,          typename ... Inits>   auto invoke(unspecified, ExecutionContext & context,               const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type & executable,               Args && args, Inits &&... inits);template<std::size_t ... Idx, typename ExecutionContext, typename Args,          typename ... 
Inits>   auto invoke(unspecified, ExecutionContext & context, error_code & ec,               const typename std::enable_if< std::is_convertible< ExecutionContext &, boost::asio::execution_context & >::value, filesystem::path >::type & executable,               Args && args, Inits &&... inits);template<std::size_t ... Idx, typename Executor, typename Args,          typename ... Inits>   auto invoke(unspecified, Executor exec,               const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type & executable,               Args && args, Inits &&... inits);template<std::size_t ... Idx, typename Executor, typename Args,          typename ... Inits>   auto invoke(unspecified, Executor exec, error_code & ec,               const typename std::enable_if< boost::asio::execution::is_executor< Executor >::value||boost::asio::is_executor< Executor >::value, filesystem::path >::type & executable,               Args && args, Inits &&... inits);Copyright © 2006-2012 Julio M. Merino Vidal, Ilya Sokolov,      Felipe Tanus, Jeff Flinn, Boris SchaelingCopyright © 2016 Klemens D. Morgenstern        Distributed under the Boost Software License, Version 1.0. (See accompanying        file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)      ",
      "title" : "Struct template bound_launcher",
      "url" : "https://www.boost.org/doc/libs/1_83_0/doc/html/boost/process/v2/bound_launcher.html"
   },
</code></pre> 
  <h2>Generating the summary</h2> 
  <pre><code class="prism language-cpp"><span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// ...</span>
  <span class="token comment">// 4. Build the JSON string (serialization with Jsoncpp)</span>
  Json<span class="token double-colon punctuation">::</span>Value root<span class="token punctuation">;</span>
  <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">:</span> inverted_list_all<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    <span class="token comment">// ....</span>
    <span class="token comment">// the document content has been retrieved into doc</span>
    Json<span class="token double-colon punctuation">::</span>Value elem<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"title"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>title<span class="token punctuation">;</span>
    elem<span class="token punctuation">[</span><span class="token string">"desc"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token function">make_summary</span><span class="token punctuation">(</span>doc<span class="token operator">-></span>content<span class="token punctuation">,</span> item<span class="token punctuation">.</span>word<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// extract the summary around the keyword</span>
    elem<span class="token punctuation">[</span><span class="token string">"url"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>url<span class="token punctuation">;</span>

    root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>elem<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// append keeps the insertion order</span>
  <span class="token punctuation">}</span>

  Json<span class="token double-colon punctuation">::</span>StyledWriter writer<span class="token punctuation">;</span> <span class="token comment">// use the styled (pretty-printed) format for now</span>
  <span class="token operator">*</span>json_string <span class="token operator">=</span> writer<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>root<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>We could cut the summary from anywhere in the content, but normally we want the part that is related to the search keyword.</p> 
  <pre><code class="prism language-cpp">std<span class="token double-colon punctuation">::</span>string <span class="token function">make_summary</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>content<span class="token punctuation">,</span> <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>word<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// Caveat: content comes from the forward index and keeps the document's original case, while word has already been lowercased</span>
  <span class="token comment">// Caveat: the keyword is not guaranteed to appear in content, although that is very unlikely</span>
  <span class="token comment">// std::size_t pos = content.find(word);</span>
  <span class="token comment">// if (pos == std::string::npos)</span>
  <span class="token comment">//   return "Node";</span>

  <span class="token keyword">auto</span> item <span class="token operator">=</span> std<span class="token double-colon punctuation">::</span><span class="token function">search</span><span class="token punctuation">(</span>content<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> content<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> word<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> word<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                          <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">unsigned</span> <span class="token keyword">char</span> x<span class="token punctuation">,</span> <span class="token keyword">unsigned</span> <span class="token keyword">char</span> y<span class="token punctuation">)</span>
                          <span class="token punctuation">{</span>
                            <span class="token keyword">return</span> std<span class="token double-colon punctuation">::</span><span class="token function">tolower</span><span class="token punctuation">(</span>x<span class="token punctuation">)</span> <span class="token operator">==</span> std<span class="token double-colon punctuation">::</span><span class="token function">tolower</span><span class="token punctuation">(</span>y<span class="token punctuation">)</span><span class="token punctuation">;</span>
                          <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>item <span class="token operator">==</span> content<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token keyword">return</span> <span class="token string">"Node"</span><span class="token punctuation">;</span>

  <span class="token comment">// Found: compute the distance from begin to the iterator</span>
  std<span class="token double-colon punctuation">::</span>size_t pos <span class="token operator">=</span> std<span class="token double-colon punctuation">::</span><span class="token function">distance</span><span class="token punctuation">(</span>content<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> item<span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>size_t prev_step <span class="token operator">=</span> <span class="token number">50</span><span class="token punctuation">;</span>
  <span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>size_t next_step <span class="token operator">=</span> <span class="token number">100</span><span class="token punctuation">;</span>
  <span class="token comment">// Take 50 bytes before the keyword and 100 bytes after it</span>
  std<span class="token double-colon punctuation">::</span>size_t begin <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
  <span class="token comment">// Note: size_t is unsigned, so subtracting without this check would wrap around</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>pos <span class="token operator">></span> prev_step<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    begin <span class="token operator">=</span> pos <span class="token operator">-</span> prev_step<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
   
  std<span class="token double-colon punctuation">::</span>size_t end <span class="token operator">=</span> pos <span class="token operator">+</span> next_step<span class="token punctuation">;</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>end <span class="token operator">></span> content<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    end <span class="token operator">=</span> content<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token comment">// Guard: only build the summary when the range [begin, end) is non-empty</span>
  <span class="token keyword">if</span> <span class="token punctuation">(</span>end <span class="token operator">></span> begin<span class="token punctuation">)</span>
  <span class="token punctuation">{</span>
    std<span class="token double-colon punctuation">::</span>string desc <span class="token operator">=</span> content<span class="token punctuation">.</span><span class="token function">substr</span><span class="token punctuation">(</span>begin<span class="token punctuation">,</span> end <span class="token operator">-</span> begin<span class="token punctuation">)</span><span class="token punctuation">;</span>
    desc <span class="token operator">+=</span> <span class="token string">"...."</span><span class="token punctuation">;</span>
    <span class="token keyword">return</span> desc<span class="token punctuation">;</span>
  <span class="token punctuation">}</span>
  <span class="token keyword">else</span>
    <span class="token keyword">return</span> <span class="token string">"Node"</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
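  <p>The windowing logic above can be isolated into two small helpers. The following is a minimal sketch, not the project's code: <code>find_ignore_case</code> and <code>summary_window</code> are hypothetical names introduced here for illustration, and all offsets are byte offsets, so a window may cut through a multi-byte UTF-8 character.</p>

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: case-insensitive position of `word` in `content`,
// or std::string::npos when the word does not occur.
std::size_t find_ignore_case(const std::string &content, const std::string &word)
{
  auto it = std::search(content.begin(), content.end(), word.begin(), word.end(),
                        [](unsigned char x, unsigned char y)
                        {
                          // unsigned char keeps std::tolower well-defined for non-ASCII bytes
                          return std::tolower(x) == std::tolower(y);
                        });
  if (it == content.end())
    return std::string::npos;
  return static_cast<std::size_t>(std::distance(content.begin(), it));
}

// Hypothetical helper: return the bytes in [pos - prev, pos + next),
// clamped to the bounds of `content`. Assumes pos <= content.size().
std::string summary_window(const std::string &content, std::size_t pos,
                           std::size_t prev, std::size_t next)
{
  // size_t is unsigned, so only subtract when the result cannot wrap around
  std::size_t begin = pos > prev ? pos - prev : 0;
  std::size_t end = std::min(pos + next, content.size());
  return content.substr(begin, end - begin);
}
```

  <p>For example, <code>summary_window("0123456789", 5, 2, 3)</code> yields <code>"34567"</code>, and a keyword close to the start of the text simply produces a window that begins at offset 0.</p>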
  <p><a href="http://img.e-com-net.com/image/info8/d80e8e59b357470cbc2ff9a84b4aaaf1.png" target="_blank"><img src="http://img.e-com-net.com/image/info8/d80e8e59b357470cbc2ff9a84b4aaaf1.png" alt="Boost搜索引擎_第10张图片" width="851" height="283" style="border:1px solid black;"></a></p> 
  <p>Let's test it.</p> 
  <pre><code class="prism language-shell">请输入关键字<span class="token comment"># filesystem</span>
<span class="token punctuation">[</span>
   <span class="token punctuation">{</span>
      <span class="token string">"desc"</span> <span class="token builtin class-name">:</span> <span class="token string">"boost::asio::execution_context & >::value, filesystem::path >::type &,                     Args &&, Inits &&...);  templ...."</span>,
      <span class="token string">"title"</span> <span class="token builtin class-name">:</span> <span class="token string">"Struct template bound_launcher"</span>,
      <span class="token string">"url"</span> <span class="token builtin class-name">:</span> <span class="token string">"https://www.boost.org/doc/libs/1_83_0/doc/html/boost/process/v2/bound_launcher.html"</span>
   <span class="token punctuation">}</span>,
   <span class="token punctuation">..</span><span class="token punctuation">..</span>.
<span class="token punctuation">]</span>
</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/22f678bdad044a91850ba63bf18cfd9c.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/22f678bdad044a91850ba63bf18cfd9c.jpg" alt="Boost搜索引擎_第11张图片" width="650" height="217" style="border:1px solid black;"></a></p> 
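  <h2>Integrated debugging</h2> 
  <p>Next we check whether the results produced by the code above are really sorted by weight in descending order. We test this at the point where the JSON string is built, by also outputting the weight of each result.</p> 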
  <p><a href="http://img.e-com-net.com/image/info8/278d900e1927448b98a58688e78b2216.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/278d900e1927448b98a58688e78b2216.jpg" alt="Boost搜索引擎_第12张图片" width="650" height="325" style="border:1px solid black;"></a></p> 
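  <p>The idea is as follows: we collect all the elements of the inverted lists and use each document ID to look up the document. The elements of the inverted lists also carry the weight, so we can output it directly.</p> 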
  <pre><code class="prism language-shell">请输入关键字<span class="token comment"># split</span>
<span class="token punctuation">[</span>
   <span class="token punctuation">{</span>
      <span class="token string">"desc"</span> <span class="token builtin class-name">:</span> <span class="token string">"Class template split_iteratorHomeLibrariesPeopleFAQMoreClass template split_iteratorboost::algorithm::split_iterato...."</span>,
      <span class="token string">"title"</span> <span class="token builtin class-name">:</span> <span class="token string">"Class template split_iterator"</span>,
      <span class="token string">"url"</span> <span class="token builtin class-name">:</span> <span class="token string">"https://www.boost.org/doc/libs/1_83_0/doc/html/boost/algorithm/split_iterator.html"</span>,
      <span class="token string">"weight"</span> <span class="token builtin class-name">:</span> <span class="token number">37</span>
   <span class="token punctuation">}</span>,
   <span class="token punctuation">{</span>
      <span class="token string">"desc"</span> <span class="token builtin class-name">:</span> <span class="token string">"ual, BucketTraits, SizeType, BoolFlags >::type split_bucket_hash_equal_t;  typedef split_bucket_hash_equal_t::key_equal                            ...."</span>,
      <span class="token string">"title"</span> <span class="token builtin class-name">:</span> <span class="token string">"Struct template hashdata_internal"</span>,
      <span class="token string">"url"</span> <span class="token builtin class-name">:</span> <span class="token string">"https://www.boost.org/doc/libs/1_83_0/doc/html/boost/intrusive/hashdata_internal.html"</span>,
      <span class="token string">"weight"</span> <span class="token builtin class-name">:</span> <span class="token number">20</span>
   <span class="token punctuation">}</span>,
   <span class="token punctuation">..</span><span class="token punctuation">..</span>.
<span class="token punctuation">]</span>
</code></pre> 
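  <p>The ordering seen in this output (weight 37 before weight 20) can be obtained with a descending sort over the collected inverted-list elements. The sketch below assumes that each element carries a document ID, the matched word and a weight; the struct layout and the name <code>sort_by_weight_desc</code> are illustrative assumptions and may differ from the project's <code>index.hpp</code>.</p>

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Assumed shape of one element of an inverted list (illustrative only).
struct InvertedElem
{
  std::uint64_t doc_id;
  std::string word;
  int weight;
};

// Sort the elements so that the highest weight comes first.
void sort_by_weight_desc(std::vector<InvertedElem> &list)
{
  std::sort(list.begin(), list.end(),
            [](const InvertedElem &a, const InvertedElem &b)
            {
              return a.weight > b.weight;
            });
}
```

  <p>Because <code>std::sort</code> is not stable, elements with equal weight may end up in any relative order; <code>std::stable_sort</code> would be the choice if that order matters.</p>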
  <p>A few takeaways from debugging:</p> 
  <ul> 
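   <li>When computing the weight we first count the word in the title, but the content we count afterwards is the whole document text, which contains the title again. A word in the title is therefore counted twice, so one occurrence in the title contributes 11 to the weight.</li> 
   <li>We do not know the exact rules the tokenizer uses for word segmentation, but we do not need to care about them here.</li> 
   <li>One issue from above is still open: duplicate documents in the results.</li> 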
  </ul> 
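  <p>The value 11 in the first point can be reproduced with a small calculation. This is only a sketch under two assumptions: a hit in the title is worth 10 and a hit in the content is worth 1 (which is what the value 11 suggests), and plain substring counting stands in for the cppjieba word segmentation the project actually uses. The function names are hypothetical.</p>

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping, case-sensitive occurrences of `word` in `text`.
std::size_t count_occurrences(const std::string &text, const std::string &word)
{
  if (word.empty())
    return 0;
  std::size_t count = 0;
  for (std::size_t pos = text.find(word); pos != std::string::npos;
       pos = text.find(word, pos + word.size()))
  {
    ++count;
  }
  return count;
}

// Assumed weighting: 10 per hit in the title, 1 per hit in the content.
std::size_t weight_of(const std::string &title, const std::string &content,
                      const std::string &word)
{
  return 10 * count_occurrences(title, word) + count_occurrences(content, word);
}
```

  <p>With the title <code>"split"</code> and the content <code>"split algorithm"</code> (the content repeats the title), <code>weight_of</code> returns 10 + 1 = 11 for the word <code>"split"</code>.</p>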
  <p>After debugging, we rename the file.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">mv</span> server.cc debug.cc
</code></pre> 
  <p>We also update the Makefile accordingly.</p> 
  <pre><code class="prism language-makefile">cc=g++
PARSER=parser
DUG=debug

.PHONY:all
all:$(PARSER) $(DUG)

$(DUG):debug.cc
	$(cc) -o $@ $^ -std=c++11  -lboost_system -lboost_filesystem -ljsoncpp

$(PARSER):parser.cc
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem

.PHONY:clean
clean:
	rm -f $(PARSER) $(DUG)
</code></pre> 
  <h1>Search server</h1> 
  <p>Now we start writing the networked version of the server. First we create the file.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">touch</span> http_server.cc
</code></pre> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"searcher.hpp"</span></span>
<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>We update the Makefile here as well.</p> 
  <pre><code class="prism language-makefile">cc=g++
PARSER=parser
DUG=debug
HTTP_SERVER=http_server 
.PHONY:all
all:$(PARSER) $(DUG) $(HTTP_SERVER)

$(DUG):debug.cc
	$(cc) -o $@ $^ -std=c++11  -lboost_system -lboost_filesystem -ljsoncpp

$(PARSER):parser.cc
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem

$(HTTP_SERVER):http_server.cc
	$(cc) -o $@ $^ -std=c++11 -lpthread -lboost_system -lboost_filesystem -ljsoncpp

.PHONY:clean
clean:
	rm -f $(PARSER) $(DUG) $(HTTP_SERVER)

</code></pre> 
  <p>Let's build it to test.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">make</span>
g++ <span class="token parameter variable">-o</span> parser parser.cc <span class="token parameter variable">-std</span><span class="token operator">=</span>c++11 <span class="token parameter variable">-lboost_system</span> <span class="token parameter variable">-lboost_filesystem</span>
g++ <span class="token parameter variable">-o</span> debug debug.cc <span class="token parameter variable">-std</span><span class="token operator">=</span>c++11  <span class="token parameter variable">-lboost_system</span> <span class="token parameter variable">-lboost_filesystem</span> <span class="token parameter variable">-ljsoncpp</span>
g++ <span class="token parameter variable">-o</span> http_server http_server.cc <span class="token parameter variable">-std</span><span class="token operator">=</span>c++11 <span class="token parameter variable">-lpthread</span> <span class="token parameter variable">-lboost_system</span> <span class="token parameter variable">-lboost_filesystem</span> <span class="token parameter variable">-ljsoncpp</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ ll
total <span class="token number">1548</span>
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">43</span> Sep  <span class="token number">9</span> 06:00 cppjieba -<span class="token operator">></span> /home/qkj/install/cppjieba/include/cppjieba
drwxrwxr-x. <span class="token number">4</span> qkj qkj     <span class="token number">35</span> Sep  <span class="token number">9</span> 01:03 data
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">658128</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 debug
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">483</span> Sep  <span class="token number">9</span> 09:16 debug.cc
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">32</span> Sep  <span class="token number">9</span> 06:01 dict -<span class="token operator">></span> /home/qkj/install/cppjieba/dict/
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">401400</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 http_server
-rw-rw-r--. <span class="token number">1</span> qkj qkj     <span class="token number">51</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 http_server.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6102</span> Sep  <span class="token number">9</span> 08:33 index.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">446</span> Sep  <span class="token number">9</span> <span class="token number">19</span>:58 Makefile
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">481760</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 parser
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6361</span> Sep  <span class="token number">9</span> 02:47 parser.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">4626</span> Sep  <span class="token number">9</span> <span class="token number">19</span>:42 searcher.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">1779</span> Sep  <span class="token number">9</span> 08:27 util.hpp
</code></pre> 
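  <h2>Upgrading gcc</h2> 
  <p>We could write the network communication ourselves, and we may upgrade to that later, but here we use the cpp-httplib library, which is simple to use. There is one catch: cpp-httplib needs a reasonably recent compiler, otherwise it either fails to compile or fails at run time. The current gcc version is:</p> 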
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ gcc <span class="token parameter variable">-v</span>
Using built-in specs.
<span class="token assign-left variable">COLLECT_GCC</span><span class="token operator">=</span>gcc
<span class="token assign-left variable">COLLECT_LTO_WRAPPER</span><span class="token operator">=</span>/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: <span class="token punctuation">..</span>/configure <span class="token parameter variable">--prefix</span><span class="token operator">=</span>/usr <span class="token parameter variable">--mandir</span><span class="token operator">=</span>/usr/share/man --
<span class="token assign-left variable">infodir</span><span class="token operator">=</span>/usr/share/info --with-bugurl<span class="token operator">=</span>http://bugzilla.redhat.com/bugzilla <span class="token parameter variable">--enablebootstrap</span>
--enable-shared --enable-threads<span class="token operator">=</span>posix --enable-checking<span class="token operator">=</span>release --with-systemzlib
--enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --
enable-linker-build-id --with-linker-hash-style<span class="token operator">=</span>gnu --enable-languages<span class="token operator">=</span>c,c++,objc,objc++,
java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --
with-isl<span class="token operator">=</span>/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --
with-cloog<span class="token operator">=</span>/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install -
-enable-gnu-indirect-function --with-tune<span class="token operator">=</span>generic --with-arch_32<span class="token operator">=</span>x86-64 <span class="token parameter variable">--build</span><span class="token operator">=</span>x86_64-
redhat-linux
Thread model: posix
gcc version <span class="token number">4.8</span>.5 <span class="token number">20150623</span> <span class="token punctuation">(</span>Red Hat <span class="token number">4.8</span>.5-44<span class="token punctuation">)</span> <span class="token punctuation">(</span>GCC<span class="token punctuation">)</span>
</code></pre> 
  <p>So we upgrade it directly.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> yum <span class="token function">install</span> centos-release-scl
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> yum <span class="token function">install</span> devtoolset-8-gcc*
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ scl <span class="token builtin class-name">enable</span> devtoolset-8 <span class="token function">bash</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token builtin class-name">source</span> /opt/rh/devtoolset-8/enable
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">mv</span> /usr/bin/gcc /usr/bin/gcc-4.8.5
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">ln</span> <span class="token parameter variable">-s</span> /opt/rh/devtoolset-8/root/bin/gcc /usr/bin/gcc
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">mv</span> /usr/bin/g++ /usr/bin/g++-4.8.5
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">ln</span> <span class="token parameter variable">-s</span> /opt/rh/devtoolset-8/root/bin/g++ /usr/bin/g++
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">mv</span> /usr/bin/c++ /usr/bin/c++-4.8.5
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">sudo</span> <span class="token function">ln</span> <span class="token parameter variable">-s</span> /opt/rh/devtoolset-8/root/bin/c++ /usr/bin/c++
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ gcc <span class="token parameter variable">-v</span>
Using built-in specs.
<span class="token assign-left variable">COLLECT_GCC</span><span class="token operator">=</span>gcc
<span class="token assign-left variable">COLLECT_LTO_WRAPPER</span><span class="token operator">=</span>/opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper
Target: x86_64-redhat-linux
Configured with: <span class="token punctuation">..</span>/configure --enable-bootstrap --enable-languages<span class="token operator">=</span>c,c++,fortran,lto <span class="token parameter variable">--prefix</span><span class="token operator">=</span>/opt/rh/devtoolset-8/root/usr <span class="token parameter variable">--mandir</span><span class="token operator">=</span>/opt/rh/devtoolset-8/root/usr/share/man <span class="token parameter variable">--infodir</span><span class="token operator">=</span>/opt/rh/devtoolset-8/root/usr/share/info --with-bugurl<span class="token operator">=</span>http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads<span class="token operator">=</span>posix --enable-checking<span class="token operator">=</span>release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style<span class="token operator">=</span>gnu --with-default-libstdcxx-abi<span class="token operator">=</span>gcc4-compatible --enable-plugin --enable-initfini-array --with-isl<span class="token operator">=</span>/builddir/build/BUILD/gcc-8.3.1-20190311/obj-x86_64-redhat-linux/isl-install --disable-libmpx --enable-gnu-indirect-function --with-tune<span class="token operator">=</span>generic --with-arch_32<span class="token operator">=</span>x86-64 <span class="token parameter variable">--build</span><span class="token operator">=</span>x86_64-redhat-linux
Thread model: posix
gcc version <span class="token number">8.3</span>.1 <span class="token number">20190311</span> <span class="token punctuation">(</span>Red Hat <span class="token number">8.3</span>.1-3<span class="token punctuation">)</span> <span class="token punctuation">(</span>GCC<span class="token punctuation">)</span> 
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
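  <h2>Adding the cpp-httplib library</h2> 
  <p>We download version 0.7.15, because newer versions may report errors at run time. We download it to the desktop first and then transfer it to the virtual machine, for example by dragging it in or with <code>rz</code>; try these methods until one works.</p> 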
  <p><a href="http://img.e-com-net.com/image/info8/3149b9de962e420ea7b8ede3f5549547.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/3149b9de962e420ea7b8ede3f5549547.jpg" alt="Boost搜索引擎_第13张图片" width="650" height="120" style="border:1px solid black;"></a></p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ rz <span class="token parameter variable">-E</span> 

<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ ll
total <span class="token number">596</span>
-rwxrwxr-x. <span class="token number">1</span> qkj qkj  <span class="token number">15424</span> Sep  <span class="token number">9</span> 09:09 a.out
drwxr-xr-x. <span class="token number">8</span> qkj qkj   <span class="token number">4096</span> Aug  <span class="token number">8</span> <span class="token number">14</span>:40 boost_1_83_0
-rw-r--r--. <span class="token number">1</span> qkj qkj <span class="token number">584053</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:23 cpp-httplib-v0.7.15.zip
drwxrwxr-x. <span class="token number">8</span> qkj qkj    <span class="token number">215</span> Sep  <span class="token number">9</span> 03:38 cppjieba
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">421</span> Sep  <span class="token number">9</span> 09:09 test.cc
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 
</code></pre> 
  <p>Then, once the archive has been extracted, we create a symbolic link to it in our project.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">ln</span> <span class="token parameter variable">-s</span> /home/qkj/install/cpp-httplib-v0.7.15/ cpp-httplib
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ ll
total <span class="token number">1548</span>
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">38</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:30 cpp-httplib -<span class="token operator">></span> /home/qkj/install/cpp-httplib-v0.7.15/
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">43</span> Sep  <span class="token number">9</span> 06:00 cppjieba -<span class="token operator">></span> /home/qkj/install/cppjieba/include/cppjieba
drwxrwxr-x. <span class="token number">4</span> qkj qkj     <span class="token number">35</span> Sep  <span class="token number">9</span> 01:03 data
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">658128</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 debug
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">483</span> Sep  <span class="token number">9</span> 09:16 debug.cc
lrwxrwxrwx. <span class="token number">1</span> qkj qkj     <span class="token number">32</span> Sep  <span class="token number">9</span> 06:01 dict -<span class="token operator">></span> /home/qkj/install/cppjieba/dict/
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">401400</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 http_server
-rw-rw-r--. <span class="token number">1</span> qkj qkj     <span class="token number">51</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 http_server.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6102</span> Sep  <span class="token number">9</span> 08:33 index.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">446</span> Sep  <span class="token number">9</span> <span class="token number">19</span>:58 Makefile
-rwxrwxr-x. <span class="token number">1</span> qkj qkj <span class="token number">481760</span> Sep  <span class="token number">9</span> <span class="token number">20</span>:02 parser
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">6361</span> Sep  <span class="token number">9</span> 02:47 parser.cc
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">4626</span> Sep  <span class="token number">9</span> <span class="token number">19</span>:42 searcher.hpp
-rw-rw-r--. <span class="token number">1</span> qkj qkj   <span class="token number">1779</span> Sep  <span class="token number">9</span> 08:27 util.hpp
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
  <h3>Testing cpp-httplib</h3> 
  <p>Now let's test the httplib library.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/3bf94b148acd4275b7ef33a80f22dc28.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/3bf94b148acd4275b7ef33a80f22dc28.jpg" alt="Boost搜索引擎_第14张图片" width="650" height="135" style="border:1px solid black;"></a></p> 
  <p>We start by trying to build it.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">make</span>
g++ <span class="token parameter variable">-o</span> http_server http_server.cc <span class="token parameter variable">-std</span><span class="token operator">=</span>c++11  <span class="token parameter variable">-lboost_system</span> <span class="token parameter variable">-lboost_filesystem</span> <span class="token parameter variable">-ljsoncpp</span>
/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/libstdc++_nonshared.a<span class="token punctuation">(</span>thread48.o<span class="token punctuation">)</span>: In <span class="token keyword">function</span> <span class="token variable"><span class="token variable">`</span>std::thread::_M_start_thread<span class="token punctuation">(</span>std::unique_ptr<span class="token operator"><</span>std::thread::_State, std::default_delete<span class="token operator"><</span>std::thread::_State<span class="token operator">></span> <span class="token operator">></span>, void <span class="token punctuation">(</span>*<span class="token punctuation">)</span><span class="token punctuation">(</span><span class="token punctuation">))</span>':
<span class="token punctuation">(</span>.text._ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE+0x11<span class="token punctuation">)</span>: undefined reference to <span class="token variable">`</span></span>pthread_create<span class="token string">'
/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8/libstdc++_nonshared.a(thread48.o): In function `std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)())'</span><span class="token builtin class-name">:</span>
<span class="token punctuation">(</span>.text._ZNSt6thread15_M_start_threadESt10shared_ptrINS_10_Impl_baseEEPFvvE+0x60<span class="token punctuation">)</span>: undefined reference to <span class="token variable"><span class="token variable">`</span>pthread_create'
/tmp/ccGWpu61.o: In <span class="token keyword">function</span> <span class="token variable">`</span></span>std::thread::thread<span class="token operator"><</span>httplib::ThreadPool::worker, , void<span class="token operator">></span><span class="token punctuation">(</span>httplib::ThreadPool::worker<span class="token operator">&&</span><span class="token punctuation">)</span><span class="token string">':
http_server.cc:(.text._ZNSt6threadC2IN7httplib10ThreadPool6workerEJEvEEOT_DpOT0_[_ZNSt6threadC5IN7httplib10ThreadPool6workerEJEvEEOT_DpOT0_]+0x21): undefined reference to `pthread_create'</span>
collect2: error: ld returned <span class="token number">1</span> <span class="token builtin class-name">exit</span> status
make: *** <span class="token punctuation">[</span>http_server<span class="token punctuation">]</span> Error <span class="token number">1</span>
<span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ 
</code></pre> 
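  <p>The link step fails with undefined references to <code>pthread_create</code> because httplib uses threads internally, so we need to link against the pthread library.</p> 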
  <pre><code class="prism language-makefile">cc=g++
PARSER=parser
DUG=debug
HTTP_SERVER=http_server 
.PHONY:all
all:$(PARSER) $(DUG) $(HTTP_SERVER)

$(DUG):debug.cc
	$(cc) -o $@ $^ -std=c++11  -lboost_system -lboost_filesystem -ljsoncpp

$(PARSER):parser.cc
	$(cc) -o $@ $^ -std=c++11 -lboost_system -lboost_filesystem

$(HTTP_SERVER):http_server.cc
	$(cc) -o $@ $^ -std=c++11 -lpthread -lboost_system -lboost_filesystem -ljsoncpp

.PHONY:clean
clean:
	rm -f $(PARSER) $(DUG) $(HTTP_SERVER)

</code></pre> 
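  <p>The same link requirement can be seen in isolation, without httplib. The sketch below is only an illustration: <code>SumWithThreads</code> is a hypothetical helper that is not part of the project. On a toolchain like the one above, we would expect it to need <code>-lpthread</code> (or <code>-pthread</code>) on the g++ command line as well.</p>

```cpp
#include <thread>
#include <vector>

// Hypothetical helper (not part of the project): each worker thread writes
// one slot, and the caller adds the slots up after joining every thread.
int SumWithThreads(int thread_count)
{
  std::vector<int> slots(thread_count, 0);
  std::vector<std::thread> workers;
  for (int i = 0; i < thread_count; ++i)
  {
    // std::thread is built on pthread_create here, which is why the linker
    // asks for the pthread library.
    workers.emplace_back([&slots, i]() { slots[i] = i + 1; });
  }
  for (auto &t : workers)
    t.join();

  int total = 0;
  for (int v : slots)
    total += v;
  return total;
}
```

  <p>Calling <code>SumWithThreads(4)</code> starts four threads that store 1, 2, 3 and 4, so the result is 10.</p>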
  <p><a href="http://img.e-com-net.com/image/info8/288805c2e0964ba894e7e50fa48c349b.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/288805c2e0964ba894e7e50fa48c349b.jpg" alt="image-20230910113735355" width="650" height="44"></a></p> 
  <p>Let's continue testing by building a simple feature first. The library is pleasant to use.<a href="http://img.e-com-net.com/image/info8/3e6fbcd8f074418c8c5f6a90aecadcbf.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/3e6fbcd8f074418c8c5f6a90aecadcbf.jpg" alt="image-20230910113849136" width="650" height="69"></a></p> 
  <p>Here is our code.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"cpp-httplib/httplib.h"</span></span>
<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  httplib<span class="token double-colon punctuation">::</span>Server svr<span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">Get</span><span class="token punctuation">(</span><span class="token string">"/hi"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> httplib<span class="token double-colon punctuation">::</span>Request<span class="token operator">&</span> req<span class="token punctuation">,</span> httplib<span class="token double-colon punctuation">::</span>Response<span class="token operator">&</span> rsp<span class="token punctuation">)</span><span class="token punctuation">{</span>
    rsp<span class="token punctuation">.</span><span class="token function">set_content</span><span class="token punctuation">(</span><span class="token string">"hello world!"</span><span class="token punctuation">,</span> <span class="token string">"text/plain; charset=utf-8"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">listen</span><span class="token punctuation">(</span><span class="token string">"0.0.0.0"</span><span class="token punctuation">,</span> <span class="token number">8081</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
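  <p>A note on the route string passed to <code>Get</code>: as far as we can tell, cpp-httplib treats it as a regular expression that must match the whole request path, and the path always starts with <code>/</code>. The sketch below models that rule with plain <code>std::regex</code>. <code>RouteMatches</code> is a hypothetical helper for illustration, not a function from the library.</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true only when the whole request path
// matches the route pattern, the way a regex-based router would decide.
bool RouteMatches(const std::string &pattern, const std::string &path)
{
  return std::regex_match(path, std::regex(pattern));
}
```

  <p>Under this model, the pattern <code>"/hi"</code> matches the path <code>/hi</code>, while a pattern written without the leading slash does not match it, and neither pattern matches a longer path such as <code>/hi/there</code>.</p>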
  <p><a href="http://img.e-com-net.com/image/info8/b75b10d0462c441587bf4c17f9e22730.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/b75b10d0462c441587bf4c17f9e22730.jpg" alt="image-20230910114502524" width="650" height="39"></a></p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ <span class="token function">netstat</span> <span class="token parameter variable">-ntlp</span>
<span class="token punctuation">(</span>Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.<span class="token punctuation">)</span>
Active Internet connections <span class="token punctuation">(</span>only servers<span class="token punctuation">)</span>
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">127.0</span>.0.1:44227         <span class="token number">0.0</span>.0.0:*               LISTEN      <span class="token number">1903</span>/node           
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">0.0</span>.0.0:111             <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">0.0</span>.0.0:8081            <span class="token number">0.0</span>.0.0:*               LISTEN      <span class="token number">4191</span>/./http_server  
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">192.168</span>.122.1:53        <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">0.0</span>.0.0:22              <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">127.0</span>.0.1:631           <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp        <span class="token number">0</span>      <span class="token number">0</span> <span class="token number">127.0</span>.0.1:25            <span class="token number">0.0</span>.0.0:*               LISTEN      -                   
tcp6       <span class="token number">0</span>      <span class="token number">0</span> :::111                  :::*                    LISTEN      -                   
tcp6       <span class="token number">0</span>      <span class="token number">0</span> :::22                   :::*                    LISTEN      -                   
tcp6       <span class="token number">0</span>      <span class="token number">0</span> ::1:631                 :::*                    LISTEN      -                   
tcp6       <span class="token number">0</span>      <span class="token number">0</span> ::1:25                  :::*                    LISTEN      -                   
<span class="token punctuation">[</span>qkj@localhost install<span class="token punctuation">]</span>$ 

</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/871601ba555c4c8b9390360e7fd4bba4.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/871601ba555c4c8b9390360e7fd4bba4.jpg" alt="Boost搜索引擎_第15张图片" width="650" height="229" style="border:1px solid black;"></a></p> 
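  <h3>Opening the port</h3> 
  <p>The request fails because the virtual machine does not yet allow outside networks to reach this port, so we need to open it. Let's check which ports are currently open; the rule for opening one is shown below.</p> 
  <p>Opening a port on CentOS</p> 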
  <p><a href="http://img.e-com-net.com/image/info8/c47d79b3f3c543b88e120f52e58730a3.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/c47d79b3f3c543b88e120f52e58730a3.jpg" alt="image-20230910120405201" width="650" height="91"></a></p> 
  <h2>Setting the root directory</h2> 
  <p>A web server normally has a root directory for its static files. Creating one is all it takes.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">mkdir</span> wwwroot
</code></pre> 
  <p>Then we set the root directory on the server.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"cpp-httplib/httplib.h"</span></span>
<span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string root_path <span class="token operator">=</span> <span class="token string">"./wwwroot"</span><span class="token punctuation">;</span>

<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  httplib<span class="token double-colon punctuation">::</span>Server svr<span class="token punctuation">;</span>
  <span class="token comment">// Set the root directory</span>
  svr<span class="token punctuation">.</span><span class="token function">set_base_dir</span><span class="token punctuation">(</span>root_path<span class="token punctuation">.</span><span class="token function">c_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">Get</span><span class="token punctuation">(</span><span class="token string">"/hi"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> httplib<span class="token double-colon punctuation">::</span>Request<span class="token operator">&</span> req<span class="token punctuation">,</span> httplib<span class="token double-colon punctuation">::</span>Response<span class="token operator">&</span> rsp<span class="token punctuation">)</span><span class="token punctuation">{</span>
    rsp<span class="token punctuation">.</span><span class="token function">set_content</span><span class="token punctuation">(</span><span class="token string">"hello world!"</span><span class="token punctuation">,</span> <span class="token string">"text/plain; charset=utf-8"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">listen</span><span class="token punctuation">(</span><span class="token string">"0.0.0.0"</span><span class="token punctuation">,</span> <span class="token number">8080</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
  <p>Let's test again.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/62122d8ad57748468646e5fc00205b35.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/62122d8ad57748468646e5fc00205b35.jpg" alt="Boost搜索引擎_第16张图片" width="650" height="192" style="border:1px solid black;"></a></p> 
  <p>Note that this happens because the root directory is still empty. By convention, the default page is a file named index.html, so let's create one.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost wwwroot<span class="token punctuation">]</span>$ <span class="token function">touch</span> index.html
<span class="token punctuation">[</span>qkj@localhost wwwroot<span class="token punctuation">]</span>$ ll
total <span class="token number">8</span>
-rw-rw-r--. <span class="token number">1</span> qkj qkj    <span class="token number">0</span> Sep  <span class="token number">9</span> <span class="token number">21</span>:10 index.html
</code></pre> 
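  <p>To see why an empty root directory gives nothing back and why <code>index.html</code> is the conventional name, here is a simplified model of how a static file server might turn a request path into a file path. <code>ResolveStaticPath</code> is a hypothetical helper written for this explanation. It is an assumption about the general technique, not the library's actual code.</p>

```cpp
#include <string>

// Hypothetical helper: maps a request path onto a file under the root
// directory, falling back to index.html for directory-style paths.
std::string ResolveStaticPath(const std::string &root, const std::string &request_path)
{
  std::string file_path = root + request_path;
  // A path ending in '/' names a directory, so serve its default page.
  if (!file_path.empty() && file_path.back() == '/')
    file_path += "index.html";
  return file_path;
}
```

  <p>With the root set to <code>./wwwroot</code>, a request for <code>/</code> resolves to <code>./wwwroot/index.html</code>, which is why that file has to exist before the home page can be served.</p>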
  <p><a href="http://img.e-com-net.com/image/info8/f72e574431ef41f19d914fb8ee44dcee.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/f72e574431ef41f19d914fb8ee44dcee.jpg" alt="Boost搜索引擎_第17张图片" width="650" height="157" style="border:1px solid black;"></a></p> 
  <p><a href="http://img.e-com-net.com/image/info8/9df106701fb7466c92dca77cfffb6cbd.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/9df106701fb7466c92dca77cfffb6cbd.jpg" alt="Boost搜索引擎_第18张图片" width="650" height="146" style="border:1px solid black;"></a></p> 
  <h2>Writing the search server</h2> 
  <p>Now we can write our server. This part is quite simple.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"cpp-httplib/httplib.h"</span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string">"searcher.hpp"</span></span>

<span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string root_path <span class="token operator">=</span> <span class="token string">"./wwwroot"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string input <span class="token operator">=</span> <span class="token string">"data/raw_html/raw.txt"</span><span class="token punctuation">;</span>
<span class="token keyword">int</span> <span class="token function">main</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  <span class="token comment">// Initialize the searcher</span>
  ns_searcher<span class="token double-colon punctuation">::</span>Searcher search<span class="token punctuation">;</span>
  search<span class="token punctuation">.</span><span class="token function">InitSearcher</span><span class="token punctuation">(</span>input<span class="token punctuation">)</span><span class="token punctuation">;</span>

  httplib<span class="token double-colon punctuation">::</span>Server svr<span class="token punctuation">;</span>
  svr<span class="token punctuation">.</span><span class="token function">set_base_dir</span><span class="token punctuation">(</span>root_path<span class="token punctuation">.</span><span class="token function">c_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// Set the root directory</span>

  svr<span class="token punctuation">.</span><span class="token function">Get</span><span class="token punctuation">(</span><span class="token string">"/s"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token operator">&</span>search<span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> httplib<span class="token double-colon punctuation">::</span>Request <span class="token operator">&</span>req<span class="token punctuation">,</span> httplib<span class="token double-colon punctuation">::</span>Response <span class="token operator">&</span>rsp<span class="token punctuation">)</span>
          <span class="token punctuation">{</span>
            <span class="token keyword">if</span> <span class="token punctuation">(</span>req<span class="token punctuation">.</span><span class="token function">has_param</span><span class="token punctuation">(</span><span class="token string">"word"</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
            <span class="token punctuation">{</span>
              rsp<span class="token punctuation">.</span><span class="token function">set_content</span><span class="token punctuation">(</span><span class="token string">"必须要搜索关键字"</span><span class="token punctuation">,</span> <span class="token string">"text/plain; charset=utf-8"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
              <span class="token keyword">return</span><span class="token punctuation">;</span>
            <span class="token punctuation">}</span>

            std<span class="token double-colon punctuation">::</span>string word <span class="token operator">=</span> req<span class="token punctuation">.</span><span class="token function">get_param_value</span><span class="token punctuation">(</span><span class="token string">"word"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"用户搜索的: "</span> <span class="token operator"><<</span> word <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>

            std<span class="token double-colon punctuation">::</span>string json_string<span class="token punctuation">;</span>
            search<span class="token punctuation">.</span><span class="token function">Search</span><span class="token punctuation">(</span>word<span class="token punctuation">,</span> <span class="token operator">&</span>json_string<span class="token punctuation">)</span><span class="token punctuation">;</span>
            rsp<span class="token punctuation">.</span><span class="token function">set_content</span><span class="token punctuation">(</span>json_string<span class="token punctuation">,</span> <span class="token string">"application/json"</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"服务器启动成功"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>

  svr<span class="token punctuation">.</span><span class="token function">listen</span><span class="token punctuation">(</span><span class="token string">"0.0.0.0"</span><span class="token punctuation">,</span> <span class="token number">8081</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

  <span class="token keyword">return</span> <span class="token number">0</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
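  <p>The handler above relies on the library to pull <code>word</code> out of a URL such as <code>/s?word=boost</code>. As a rough illustration of what that lookup involves, the sketch below scans a raw query string by hand. <code>GetQueryParam</code> is a hypothetical helper, not part of cpp-httplib, and it skips percent-decoding to stay short.</p>

```cpp
#include <string>

// Hypothetical helper: finds `key` in a raw query string such as
// "word=boost&page=2". Returns true and fills *value when the key is present.
// Percent-decoding is deliberately left out to keep the sketch short.
bool GetQueryParam(const std::string &query, const std::string &key, std::string *value)
{
  std::size_t pos = 0;
  while (pos <= query.size())
  {
    // Each key=value pair ends at the next '&' or at the end of the string.
    std::size_t end = query.find('&', pos);
    if (end == std::string::npos)
      end = query.size();

    const std::string pair = query.substr(pos, end - pos);
    const std::size_t eq = pair.find('=');
    const std::string name = pair.substr(0, eq);
    if (name == key)
    {
      *value = (eq == std::string::npos) ? "" : pair.substr(eq + 1);
      return true;
    }
    pos = end + 1;
  }
  return false;
}
```

  <p>For the query string <code>word=boost&amp;page=2</code>, looking up <code>word</code> succeeds and yields <code>boost</code>. When the key is missing, the function returns false, which corresponds to the branch where our handler replies that a search keyword is required.</p>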
  <p><a href="http://img.e-com-net.com/image/info8/72a6333e42a44e988154a299c28b3a05.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/72a6333e42a44e988154a299c28b3a05.jpg" alt="image-20230910122016183" width="650" height="86"></a></p> 
  <p><a href="http://img.e-com-net.com/image/info8/9038f34ecdc945a09d5ddde6402678f4.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/9038f34ecdc945a09d5ddde6402678f4.jpg" alt="Boost搜索引擎_第19张图片" width="650" height="113" style="border:1px solid black;"></a></p> 
  <h1>Front-end code</h1> 
  <p>The front end is optional study material, and we won't go into detail here. If you want to learn it, try the site below.</p> 
  <ul> 
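   <li>HTML: builds the page structure, the skeleton of the page</li> 
   <li>CSS: styles the page, its flesh and skin</li> 
   <li>JS: handles front-end/back-end interaction, the soul of the page</li> 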
  </ul> 
  <blockquote> 
   <p>Recommended site for learning front-end development: http://www.w3school.com.cn</p> 
  </blockquote> 
  <h2>Page structure</h2> 
  <p>The page structure we designed looks like this.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/52dd3d674c194fe7b117abe5ea57554c.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/52dd3d674c194fe7b117abe5ea57554c.jpg" alt="Boost搜索引擎_第20张图片" width="650" height="460" style="border:1px solid black;"></a></p> 
  <p>Following the layout above, our HTML can be written like this.</p> 
  <pre><code class="prism language-html"><span class="token doctype"><span class="token punctuation"><!</span><span class="token doctype-tag">DOCTYPE</span> <span class="token name">html</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>html</span> <span class="token attr-name">lang</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>en<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>head</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">charset</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>UTF-8<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">http-equiv</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>X-UA-Compatible<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>IE=edge<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">name</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>viewport<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>width=device-width, initial-scale=1.0<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>title</span><span class="token punctuation">></span></span>boost 搜索引擎<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>title</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>head</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>body</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>container<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>search<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>input</span> <span class="token attr-name">type</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>text<span class="token punctuation">"</span></span> <span class="token attr-name">value</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>输入搜索关键字...<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>button</span><span class="token punctuation">></span></span>搜索一下<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>button</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>result<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>这是标题<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘要这是摘
          要这是摘要这是摘要这是摘要这是摘要<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>body</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>html</span><span class="token punctuation">></span></span>
</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/42e984ae2ffb4c8fa41abacfcdc5bf5b.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/42e984ae2ffb4c8fa41abacfcdc5bf5b.jpg" alt="Boost search engine, image 21" width="650" height="288" style="border:1px solid black;"></a></p> 
  <h2>Page Styling</h2> 
  <p>The bare skeleton above looks rather plain, so next we dress it up with some CSS.</p> 
  <pre><code class="prism language-html"><span class="token doctype"><span class="token punctuation"><!</span><span class="token doctype-tag">DOCTYPE</span> <span class="token name">html</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>html</span> <span class="token attr-name">lang</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>en<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>head</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">charset</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>UTF-8<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">http-equiv</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>X-UA-Compatible<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>IE=edge<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">name</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>viewport<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>width=device-width, initial-scale=1.0<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>title</span><span class="token punctuation">></span></span>Boost Search Engine<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>title</span><span class="token punctuation">></span></span>

  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>style</span><span class="token punctuation">></span></span><span class="token style"><span class="token language-css">
    <span class="token comment">/* Remove every default margin and padding on the page (HTML box model) */</span>
    <span class="token selector">*</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Outer margin */</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 0<span class="token punctuation">;</span>
      <span class="token comment">/* Inner padding */</span>
      <span class="token property">padding</span><span class="token punctuation">:</span> 0<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Let html and body take up 100% of the available height */</span>
    <span class="token selector">html,
    body</span> <span class="token punctuation">{</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Class selector: .container */</span>
    <span class="token selector">.container</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Width of the div */</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 800px<span class="token punctuation">;</span>
      <span class="token comment">/* Center the div horizontally with auto left/right margins */</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 0px auto<span class="token punctuation">;</span>
      <span class="token comment">/* Top margin, to keep some distance from the top of the page */</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Descendant selector: .search inside .container */</span>
    <span class="token selector">.container .search</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Same width as the parent element */</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
      <span class="token comment">/* Height of 52px */</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 52px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Select the input tag first, then set its properties; input is a type (tag) selector */</span>
    <span class="token comment">/* The height set on the input does not include its border: 50px + 2 x 1px border = 52px */</span>
    <span class="token selector">.container .search input</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Float left */</span>
      <span class="token property">float</span><span class="token punctuation">:</span> left<span class="token punctuation">;</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 600px<span class="token punctuation">;</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 50px<span class="token punctuation">;</span>
      <span class="token comment">/* Border: width, style, color */</span>
      <span class="token property">border</span><span class="token punctuation">:</span> 1px solid black<span class="token punctuation">;</span>
      <span class="token comment">/* Remove the right border of the input box */</span>
      <span class="token property">border-right</span><span class="token punctuation">:</span> none<span class="token punctuation">;</span>
      <span class="token comment">/* Left padding, so the text does not sit right against the left border */</span>
      <span class="token property">padding-left</span><span class="token punctuation">:</span> 10px<span class="token punctuation">;</span>
      <span class="token comment">/* Color and size of the text inside the input */</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #CCC<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Select the button tag first, then set its properties; button is a type (tag) selector */</span>
    <span class="token selector">.container .search button</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Float left */</span>
      <span class="token property">float</span><span class="token punctuation">:</span> left<span class="token punctuation">;</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 150px<span class="token punctuation">;</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 52px<span class="token punctuation">;</span>
      <span class="token comment">/* Button background color, #4e6ef2 */</span>
      <span class="token property">background-color</span><span class="token punctuation">:</span> #4e6ef2<span class="token punctuation">;</span>
      <span class="token comment">/* Text color inside the button */</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #FFF<span class="token punctuation">;</span>
      <span class="token comment">/* Font size */</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 19px<span class="token punctuation">;</span>
      <span class="token property">font-family</span><span class="token punctuation">:</span> Georgia<span class="token punctuation">,</span> <span class="token string">'Times New Roman'</span><span class="token punctuation">,</span> Times<span class="token punctuation">,</span> serif<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result</span> <span class="token punctuation">{</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item</span> <span class="token punctuation">{</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item a</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Make it a block-level element so it occupies its own line */</span>
      <span class="token property">display</span><span class="token punctuation">:</span> block<span class="token punctuation">;</span>
      <span class="token comment">/* Remove the underline from the a tag */</span>
      <span class="token property">text-decoration</span><span class="token punctuation">:</span> none<span class="token punctuation">;</span>
      <span class="token comment">/* Font size of the text inside the a tag */</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 20px<span class="token punctuation">;</span>
      <span class="token comment">/* Text color */</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #4e6ef2<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item a:hover</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Hover effect shown while the mouse is over the link */</span>
      <span class="token property">text-decoration</span><span class="token punctuation">:</span> underline<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item p</span> <span class="token punctuation">{</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 5px<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 16px<span class="token punctuation">;</span>
      <span class="token property">font-family</span><span class="token punctuation">:</span> <span class="token string">'Lucida Sans'</span><span class="token punctuation">,</span> <span class="token string">'Lucida Sans Regular'</span><span class="token punctuation">,</span> <span class="token string">'Lucida Grande'</span><span class="token punctuation">,</span> <span class="token string">'Lucida Sans Unicode'</span><span class="token punctuation">,</span> Geneva<span class="token punctuation">,</span> Verdana<span class="token punctuation">,</span> sans-serif<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item i</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Make it a block-level element so it occupies its own line */</span>
      <span class="token property">display</span><span class="token punctuation">:</span> block<span class="token punctuation">;</span>
      <span class="token comment">/* Turn off the default italic style */</span>
      <span class="token property">font-style</span><span class="token punctuation">:</span> normal<span class="token punctuation">;</span>
      <span class="token property">color</span><span class="token punctuation">:</span> green<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  </span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>style</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>head</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>body</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>container<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>search<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>input</span> <span class="token attr-name">type</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>text<span class="token punctuation">"</span></span> <span class="token attr-name">value</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>Enter search keywords...<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>button</span><span class="token punctuation">></span></span>Search<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>button</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>result<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>item<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>This is the title<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>p</span><span class="token punctuation">></span></span>This is the summary. This is the summary. This is the summary. This is the summary. This is the summary.
          This is the summary. This is the summary. This is the summary.<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>p</span><span class="token punctuation">></span></span>
        <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>i</span><span class="token punctuation">></span></span>https://search.gitee.com/?skin=rec&type=repository&q=cpp-httplib<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>i</span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>body</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>html</span><span class="token punctuation">></span></span>
</code></pre> 
  <p><a href="http://img.e-com-net.com/image/info8/2d012732abf149d8a40e5ee0877a8411.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/2d012732abf149d8a40e5ee0877a8411.jpg" alt="Boost search engine, image 22" width="650" height="251" style="border:1px solid black;"></a></p> 
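  <p>The input and button sizes in the stylesheet above only line up once padding and borders are counted. Here is a small worked calculation, assuming the input uses the default <code>content-box</code> sizing (its declared width and height exclude padding and border) and that the button's declared 150px is its full outer width, as is typical of browser default button sizing. The helper name <code>outerSize</code> is made up for this illustration.</p>

```javascript
// Outer size along one axis of a content-box element:
// content + padding on both sides + border on both sides.
function outerSize(content, paddingStart, paddingEnd, borderStart, borderEnd) {
  return content + paddingStart + paddingEnd + borderStart + borderEnd;
}

// input: height 50px, no vertical padding, 1px top and bottom border.
const inputHeight = outerSize(50, 0, 0, 1, 1);

// input: width 600px, padding-left 10px, 1px left border, right border removed.
const inputWidth = outerSize(600, 10, 0, 1, 0);

// button: assumed to be 150px wide in total (see the assumption above).
const buttonWidth = 150;
const rowWidth = inputWidth + buttonWidth;

console.log(inputHeight); // 52, the same height as the button and the .search row
console.log(inputWidth);  // 611
console.log(rowWidth);    // 761, which fits inside the 800px .container
```

  <p>This is why the input is given a height of 50px while the button and the <code>.search</code> row use 52px.</p>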
  <h2>前后端交互</h2> 
  <p>下面我们继续使用前后端交互.也是直接贴代码.</p> 
  <pre><code class="prism language-html"><span class="token comment"><!-- Build the page skeleton --></span>
<span class="token doctype"><span class="token punctuation"><!</span><span class="token doctype-tag">DOCTYPE</span> <span class="token name">html</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>html</span> <span class="token attr-name">lang</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>en<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>head</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">charset</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>UTF-8<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">http-equiv</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>X-UA-Compatible<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>IE=edge<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>meta</span> <span class="token attr-name">name</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>viewport<span class="token punctuation">"</span></span> <span class="token attr-name">content</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>width=device-width, initial-scale=1.0<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>script</span> <span class="token attr-name">src</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>http://code.jquery.com/jquery-2.1.1.min.js<span class="token punctuation">"</span></span><span class="token punctuation">></span></span><span class="token script"></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>script</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>title</span><span class="token punctuation">></span></span>boost 搜索引擎<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>title</span><span class="token punctuation">></span></span>
  <span class="token comment"><!-- Reset margins and padding to zero --></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>style</span><span class="token punctuation">></span></span><span class="token style"><span class="token language-css">
    <span class="token selector">*</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Outer margin */</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 0<span class="token punctuation">;</span>
      <span class="token comment">/* Padding */</span>
      <span class="token property">padding</span><span class="token punctuation">:</span> 0<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">html,
    body</span> <span class="token punctuation">{</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Center the page; a selector that starts with a dot is a class selector */</span>
    <span class="token selector">.container</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Outermost container */</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 800px<span class="token punctuation">;</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 0px auto<span class="token punctuation">;</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token comment">/* Compound (descendant) selector */</span>
    <span class="token selector">.container .search</span> <span class="token punctuation">{</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
      <span class="token comment">/* 52px matches the input: 50px height plus 1px top and bottom borders */</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 52px<span class="token punctuation">;</span>

    <span class="token punctuation">}</span>

    <span class="token selector">.container .search input</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Float left */</span>
      <span class="token property">float</span><span class="token punctuation">:</span> left<span class="token punctuation">;</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 600px<span class="token punctuation">;</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 50px<span class="token punctuation">;</span>
      <span class="token comment">/* Border */</span>
      <span class="token property">border</span><span class="token punctuation">:</span> 1px solid black<span class="token punctuation">;</span>
      <span class="token comment">/* Remove the right border */</span>
      <span class="token property">border-right</span><span class="token punctuation">:</span> none<span class="token punctuation">;</span>
      <span class="token property">padding-left</span><span class="token punctuation">:</span> 10px<span class="token punctuation">;</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #ccc<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .search button</span> <span class="token punctuation">{</span>
      <span class="token comment">/* Float left */</span>
      <span class="token property">float</span><span class="token punctuation">:</span> left<span class="token punctuation">;</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 120px<span class="token punctuation">;</span>
      <span class="token property">height</span><span class="token punctuation">:</span> 52px<span class="token punctuation">;</span>

      <span class="token comment">/* Background color */</span>
      <span class="token property">background-color</span><span class="token punctuation">:</span> #4e6ef2<span class="token punctuation">;</span>
      <span class="token comment">/* Text color */</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #fff<span class="token punctuation">;</span>
      <span class="token comment">/* Font size */</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 19px<span class="token punctuation">;</span>
      <span class="token comment">/* Font family */</span>
      <span class="token property">font-family</span><span class="token punctuation">:</span> <span class="token string">'Times New Roman'</span><span class="token punctuation">,</span> Times<span class="token punctuation">,</span> serif<span class="token punctuation">;</span>

    <span class="token punctuation">}</span>


    <span class="token selector">.container .result</span> <span class="token punctuation">{</span>
      <span class="token property">width</span><span class="token punctuation">:</span> 100%<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item</span> <span class="token punctuation">{</span>
      <span class="token property">margin-top</span><span class="token punctuation">:</span> 15px<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item a</span> <span class="token punctuation">{</span>
      <span class="token property">display</span><span class="token punctuation">:</span> block<span class="token punctuation">;</span>
      <span class="token comment">/* Remove the underline */</span>
      <span class="token property">text-decoration</span><span class="token punctuation">:</span> none<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 20px<span class="token punctuation">;</span>
      <span class="token property">color</span><span class="token punctuation">:</span> #4e6ef2<span class="token punctuation">;</span>

    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item a:hover</span> <span class="token punctuation">{</span>
      <span class="token property">text-decoration</span><span class="token punctuation">:</span> underline<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item p</span> <span class="token punctuation">{</span>
      <span class="token property">margin</span><span class="token punctuation">:</span> 5px<span class="token punctuation">;</span>
      <span class="token property">font-size</span><span class="token punctuation">:</span> 16px<span class="token punctuation">;</span>
      <span class="token property">font-family</span><span class="token punctuation">:</span> <span class="token string">'Times New Roman'</span><span class="token punctuation">,</span> Times<span class="token punctuation">,</span> serif<span class="token punctuation">;</span>

    <span class="token punctuation">}</span>

    <span class="token selector">.container .result .item i</span> <span class="token punctuation">{</span>
      <span class="token property">display</span><span class="token punctuation">:</span> block<span class="token punctuation">;</span>

      <span class="token comment">/* Disable italics */</span>
      <span class="token property">font-style</span><span class="token punctuation">:</span> normal<span class="token punctuation">;</span>
      <span class="token property">color</span><span class="token punctuation">:</span> green<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
  </span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>style</span><span class="token punctuation">></span></span>
<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>head</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"><</span>body</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>container<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>search<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>input</span> <span class="token attr-name">type</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>text<span class="token punctuation">"</span></span> <span class="token attr-name">value</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>Enter search keywords...<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>button</span> <span class="token special-attr"><span class="token attr-name">onclick</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span><span class="token value javascript language-javascript"><span class="token function">Search</span><span class="token punctuation">(</span><span class="token punctuation">)</span></span><span class="token punctuation">"</span></span></span><span class="token punctuation">></span></span>Search<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>button</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">class</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>result<span class="token punctuation">"</span></span><span class="token punctuation">></span></span>
      <span class="token comment"><!-- The result items are generated dynamically --></span>

      <span class="token comment"><!-- <div class="item">
        <a href="#">This is the title</a>
        <p>This is the summary, this is the summary, this is the summary, this is the summary</p>
        <i>https://www.bilibili.com/</i>
      </div>
      <div class="item">
        <a href="#">This is the title</a>
        <p>This is the summary</p>
        <i>https://www.bilibili.com/</i>
      </div>
      <div class="item">
        <a href="#">This is the title</a>
        <p>This is the summary</p>
        <i>https://www.bilibili.com/</i>
      </div>
      <div class="item">
        <a href="#">This is the title</a>
        <p>This is the summary</p>
        <i>https://www.bilibili.com/</i>
      </div>
      <div class="item">
        <a href="#">This is the title</a>
        <p>This is the summary</p>
        <i>https://www.bilibili.com/</i>
      </div>
      <div class="item">
        <a href="#">This is the title</a>
        <p>This is the summary</p>
        <i>https://www.bilibili.com/</i>
      </div>
      <div class="item">
        <a href="#">This is the title</a>
        <p>This is the summary</p>
        <i>https://www.bilibili.com/</i>
      </div> --></span>
    <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span>
  <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>script</span><span class="token punctuation">></span></span><span class="token script"><span class="token language-javascript">
    <span class="token keyword">function</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
      <span class="token comment">// alert("hello js");</span>
      <span class="token comment">// 1. Read the query from the input box with jQuery</span>

      <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">".container .search input"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">val</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">if</span><span class="token punctuation">(</span>query <span class="token operator">==</span> <span class="token string">''</span> <span class="token operator">||</span> query <span class="token operator">==</span> <span class="token keyword">null</span><span class="token punctuation">)</span><span class="token punctuation">{</span>
        <span class="token keyword">return</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      console<span class="token punctuation">.</span><span class="token function">log</span><span class="token punctuation">(</span><span class="token string">"query = "</span> <span class="token operator">+</span> query<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token comment">// 2. Send the HTTP request</span>
      $<span class="token punctuation">.</span><span class="token function">ajax</span><span class="token punctuation">(</span><span class="token punctuation">{</span>
        <span class="token literal-property property">type</span><span class="token operator">:</span> <span class="token string">"GET"</span><span class="token punctuation">,</span>
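        <span class="token literal-property property">url</span><span class="token operator">:</span> <span class="token string">"/s?word="</span> <span class="token operator">+</span> <span class="token function">encodeURIComponent</span><span class="token punctuation">(</span>query<span class="token punctuation">)</span><span class="token punctuation">,</span>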
        <span class="token function-variable function">success</span><span class="token operator">:</span> <span class="token keyword">function</span> <span class="token punctuation">(</span><span class="token parameter">data</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
          console<span class="token punctuation">.</span><span class="token function">log</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>
          <span class="token comment">// Build the result list dynamically</span>
          <span class="token function">BuildHtml</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
      <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>


    <span class="token keyword">function</span> <span class="token function">BuildHtml</span><span class="token punctuation">(</span><span class="token parameter">data</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>

      <span class="token keyword">if</span><span class="token punctuation">(</span>data <span class="token operator">==</span> <span class="token string">''</span> <span class="token operator">||</span> data <span class="token operator">==</span> <span class="token keyword">null</span><span class="token punctuation">)</span><span class="token punctuation">{</span>
        document<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token string">"No results found"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">return</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      <span class="token keyword">let</span> result_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">".container .result"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      result_lable<span class="token punctuation">.</span><span class="token function">empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">let</span> elem <span class="token keyword">of</span> data<span class="token punctuation">)</span> <span class="token punctuation">{</span>

        <span class="token comment">// console.log(elem.title);</span>
        <span class="token comment">// console.log(elem.url);</span>

        <span class="token keyword">let</span> a_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">"<a>"</span><span class="token punctuation">,</span> <span class="token punctuation">{</span>
          <span class="token literal-property property">text</span><span class="token operator">:</span> elem<span class="token punctuation">.</span>title<span class="token punctuation">,</span>
          <span class="token literal-property property">href</span><span class="token operator">:</span> elem<span class="token punctuation">.</span>url<span class="token punctuation">,</span>
          <span class="token literal-property property">target</span><span class="token operator">:</span> <span class="token string">"_blank"</span>
        <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> p_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">"<p>"</span><span class="token punctuation">,</span> <span class="token punctuation">{</span>
          <span class="token literal-property property">text</span><span class="token operator">:</span> elem<span class="token punctuation">.</span>desc
        <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> i_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">"<i>"</span><span class="token punctuation">,</span> <span class="token punctuation">{</span>
          <span class="token literal-property property">text</span><span class="token operator">:</span> elem<span class="token punctuation">.</span>url
        <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token keyword">let</span> div_lable <span class="token operator">=</span> <span class="token function">$</span><span class="token punctuation">(</span><span class="token string">"<div>"</span><span class="token punctuation">,</span> <span class="token punctuation">{</span>
          <span class="token keyword">class</span><span class="token operator">:</span> <span class="token string">"item"</span>
        <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>


        a_lable<span class="token punctuation">.</span><span class="token function">appendTo</span><span class="token punctuation">(</span>div_lable<span class="token punctuation">)</span><span class="token punctuation">;</span>
        p_lable<span class="token punctuation">.</span><span class="token function">appendTo</span><span class="token punctuation">(</span>div_lable<span class="token punctuation">)</span><span class="token punctuation">;</span>
        i_lable<span class="token punctuation">.</span><span class="token function">appendTo</span><span class="token punctuation">(</span>div_lable<span class="token punctuation">)</span><span class="token punctuation">;</span>
        div_lable<span class="token punctuation">.</span><span class="token function">appendTo</span><span class="token punctuation">(</span>result_lable<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

    <span class="token punctuation">}</span>

  </span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>script</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>body</span><span class="token punctuation">></span></span>

<span class="token tag"><span class="token tag"><span class="token punctuation"></</span>html</span><span class="token punctuation">></span></span>
</code></pre> 
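  <p>The page above reads <code>title</code>, <code>desc</code>, and <code>url</code> from every element of the response, so the <code>/s</code> endpoint is expected to return a JSON array of objects with those three fields. The tech stack lists Jsoncpp, which is presumably what builds that string on the real server. The C++ below is only a dependency-free sketch of the response shape: <code>Hit</code>, <code>JsonEscape</code>, and <code>ToJsonArray</code> are hypothetical names introduced for illustration and are not part of the project.</p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one search hit; the field names mirror what
// BuildHtml() reads on the front end (elem.title, elem.desc, elem.url).
struct Hit {
  std::string title;
  std::string desc;
  std::string url;
};

// Escape the characters that may not appear raw inside a JSON string.
// Bytes >= 0x80 (UTF-8 sequences) are passed through unchanged.
std::string JsonEscape(const std::string &s) {
  static const char *hex = "0123456789abcdef";
  std::string out;
  for (unsigned char c : s) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20) {  // remaining control characters become \u00XX
          out += "\\u00";
          out += hex[c >> 4];
          out += hex[c & 0xF];
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}

// Serialize the hits as a JSON array of {"title","desc","url"} objects.
std::string ToJsonArray(const std::vector<Hit> &hits) {
  std::string out = "[";
  for (std::size_t i = 0; i < hits.size(); ++i) {
    if (i > 0) out += ",";
    out += "{\"title\":\"" + JsonEscape(hits[i].title) + "\"";
    out += ",\"desc\":\"" + JsonEscape(hits[i].desc) + "\"";
    out += ",\"url\":\"" + JsonEscape(hits[i].url) + "\"}";
  }
  out += "]";
  return out;
}
```

  <p>An empty result list serializes to <code>[]</code>, which the <code>data == ''</code> check in <code>BuildHtml</code> treats as "no results". Note that jQuery only hands <code>BuildHtml</code> a parsed array when the response is served with a JSON content type or the request sets <code>dataType: "json"</code>; otherwise <code>data</code> arrives as a plain string.</p>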
  <h1>Project results</h1> 
  <p>With everything in place, the project can now serve searches. Here is the result.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/b4f8a1125b234116a9e415065b101dea.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/b4f8a1125b234116a9e415065b101dea.jpg" alt="Boost search engine, figure 23" width="650" height="204" style="border:1px solid black;"></a></p> 
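  <h1>Additional notes</h1> 
  <p>A few small details have not been covered yet, so we add them here.</p> 
  <h2>Improving deduplication</h2> 
  <p>As mentioned in the search-service section, when a query is split into several keywords, the same document ID can show up in the results of more than one keyword. We therefore need to deduplicate, which takes three steps.</p> 
  <ul> 
   <li>Find the duplicated document IDs</li> 
   <li>Add up the weights recorded for each duplicated ID</li> 
   <li>Rebuild the merged list, then look up the documents and build the JSON string</li> 
  </ul> 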
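  <p>The three steps above can be sketched in isolation as follows. This is a simplified stand-alone version, not the project's code: <code>Posting</code>, <code>Merged</code>, and <code>MergeByDocId</code> are hypothetical names, and sorting by descending total weight (with ties broken by document ID) is an assumption made so the output order is deterministic.</p>

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one entry of an inverted list.
struct Posting {
  uint64_t doc_id;
  std::string word;
  int weight;
};

// Hypothetical merged record: exactly one per document ID.
struct Merged {
  uint64_t doc_id = 0;
  int weight = 0;
  std::vector<std::string> words;  // every keyword that hit this document
};

// Merge postings that share a doc_id by summing their weights, then
// sort by descending total weight (assumed tie-break: ascending doc_id).
std::vector<Merged> MergeByDocId(const std::vector<Posting> &postings) {
  std::unordered_map<uint64_t, Merged> by_id;
  for (const auto &p : postings) {
    // operator[] default-constructs the entry the first time an ID is seen.
    Merged &m = by_id[p.doc_id];
    m.doc_id = p.doc_id;
    m.weight += p.weight;
    m.words.push_back(p.word);
  }

  std::vector<Merged> out;
  out.reserve(by_id.size());
  for (auto &kv : by_id) {
    out.push_back(std::move(kv.second));
  }
  std::sort(out.begin(), out.end(), [](const Merged &a, const Merged &b) {
    if (a.weight != b.weight) return a.weight > b.weight;
    return a.doc_id < b.doc_id;
  });
  return out;
}
```

  <p>For instance, if document 1 is hit by two keywords with weights 3 and 4 and document 2 is hit once with weight 5, the merged list holds two entries, with document 1 first at a combined weight of 7.</p>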
  <p>Here is the situation we ran into.</p> 
  <p><a href="http://img.e-com-net.com/image/info8/2c984779dfcb40fc8a4530bab48f38ce.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/2c984779dfcb40fc8a4530bab48f38ce.jpg" alt="Boost search engine, figure 24" width="650" height="245" style="border:1px solid black;"></a></p> 
  <p>This needs to be handled, as follows.</p> 
  <pre><code class="prism language-cpp"><span class="token keyword">struct</span> <span class="token class-name">InvertedElemPrint</span>
  <span class="token punctuation">{</span>
    <span class="token keyword">uint64_t</span> doc_id<span class="token punctuation">;</span> <span class="token comment">// document ID</span>

    <span class="token keyword">int</span> weight<span class="token punctuation">;</span>                     <span class="token comment">// accumulated weight</span>
    std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span> <span class="token comment">// one document ID can match several words</span>
    <span class="token function">InvertedElemPrint</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">doc_id</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token function">weight</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>

  <span class="token keyword">class</span> <span class="token class-name">Searcher</span>
  <span class="token punctuation">{</span>
  <span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token function">Searcher</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>
    <span class="token keyword">void</span> <span class="token function">Search</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>query<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">*</span>json_string<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token comment">// 1. Tokenize the query first, then look up each word</span>
      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> words<span class="token punctuation">;</span>
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">CutString</span><span class="token punctuation">(</span>query<span class="token punctuation">,</span> <span class="token operator">&</span>words<span class="token punctuation">)</span><span class="token punctuation">;</span>

      <span class="token comment">// 2. Search with each token in turn</span>
      std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span><span class="token keyword">uint64_t</span><span class="token punctuation">,</span> InvertedElemPrint<span class="token operator">></span> tokens_map<span class="token punctuation">;</span> <span class="token comment">// maps a doc_id to its InvertedElemPrint</span>
      
      std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>InvertedElemPrint<span class="token operator">></span> inverted_list_all<span class="token punctuation">;</span> <span class="token comment">// merged, deduplicated results</span>

      <span class="token keyword">for</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string s <span class="token operator">:</span> words<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        boost<span class="token double-colon punctuation">::</span><span class="token function">to_lower</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span> 
        <span class="token comment">// look up the inverted index first</span>
        ns_index<span class="token double-colon punctuation">::</span>InvertedList <span class="token operator">*</span>inverted_list <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetInvertedList</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> inverted_list<span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token keyword">continue</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
       
        <span class="token comment">// walk the inverted list to collect every document ID</span>
        <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">auto</span> <span class="token operator">&</span>elem <span class="token operator">:</span> <span class="token operator">*</span>inverted_list<span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token comment">// operator[] returns the InvertedElemPrint already stored for this id, or creates an empty one</span>
          <span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">=</span> tokens_map<span class="token punctuation">[</span>elem<span class="token punctuation">.</span>doc_id<span class="token punctuation">]</span><span class="token punctuation">;</span> 
          item<span class="token punctuation">.</span>doc_id <span class="token operator">=</span> elem<span class="token punctuation">.</span>doc_id<span class="token punctuation">;</span> 
          <span class="token comment">// record the matching keyword as well</span>
          item<span class="token punctuation">.</span>words<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>elem<span class="token punctuation">.</span>word<span class="token punctuation">)</span><span class="token punctuation">;</span>
          <span class="token comment">// accumulate the weight</span>
          item<span class="token punctuation">.</span>weight <span class="token operator">+=</span> elem<span class="token punctuation">.</span>weight<span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
        <span class="token comment">// entries that share a document id have now been merged</span>
      <span class="token punctuation">}</span>
      <span class="token comment">// copy the merged InvertedElemPrint entries into the result vector</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">const</span> <span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">:</span> tokens_map<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        inverted_list_all<span class="token punctuation">.</span><span class="token function">push_back</span><span class="token punctuation">(</span>item<span class="token punctuation">.</span>second<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>

      <span class="token comment">// 3. Merge and sort: descending order of relevance, using the accumulated weight</span>
      std<span class="token double-colon punctuation">::</span><span class="token function">sort</span><span class="token punctuation">(</span>inverted_list_all<span class="token punctuation">.</span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> inverted_list_all<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">(</span><span class="token keyword">const</span> InvertedElemPrint <span class="token operator">&</span>e1<span class="token punctuation">,</span> <span class="token keyword">const</span> InvertedElemPrint <span class="token operator">&</span>e2<span class="token punctuation">)</span>
                <span class="token punctuation">{</span>
                  <span class="token keyword">return</span> e1<span class="token punctuation">.</span>weight <span class="token operator">></span> e2<span class="token punctuation">.</span>weight<span class="token punctuation">;</span>
                <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span>


      <span class="token comment">// 4. Build the JSON string (serialization with Jsoncpp)</span>
      Json<span class="token double-colon punctuation">::</span>Value root<span class="token punctuation">;</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> <span class="token operator">&</span>item <span class="token operator">:</span> inverted_list_all<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        <span class="token comment">// look up the document in the forward index</span>
        ns_index<span class="token double-colon punctuation">::</span>DocInfo <span class="token operator">*</span>doc <span class="token operator">=</span> index<span class="token operator">-></span><span class="token function">GetForwardIndex</span><span class="token punctuation">(</span>item<span class="token punctuation">.</span>doc_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> doc<span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token keyword">continue</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>

        <span class="token comment">// the document content is now available</span>
        Json<span class="token double-colon punctuation">::</span>Value elem<span class="token punctuation">;</span>
        elem<span class="token punctuation">[</span><span class="token string">"title"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>title<span class="token punctuation">;</span>
        elem<span class="token punctuation">[</span><span class="token string">"desc"</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token function">make_summary</span><span class="token punctuation">(</span>doc<span class="token operator">-></span>content<span class="token punctuation">,</span> item<span class="token punctuation">.</span>words<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the summary is extracted around the first matched keyword</span>
        elem<span class="token punctuation">[</span><span class="token string">"url"</span><span class="token punctuation">]</span> <span class="token operator">=</span> doc<span class="token operator">-></span>url<span class="token punctuation">;</span>

        <span class="token comment">// for debugging</span>
        <span class="token comment">//  elem["id"] = (int)item.doc_id;</span>
        <span class="token comment">//  elem["weight"] = item.weight; // serialized automatically by the writer</span>
        root<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>elem<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// appended in sorted order</span>
      <span class="token punctuation">}</span>

      Json<span class="token double-colon punctuation">::</span>StyledWriter writer<span class="token punctuation">;</span> <span class="token comment">// StyledWriter for now; its output is indented and easy to read</span>
      <span class="token operator">*</span>json_string <span class="token operator">=</span> writer<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>root<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span><span class="token punctuation">.</span>
    ns_index<span class="token double-colon punctuation">::</span>Index <span class="token operator">*</span>index<span class="token punctuation">;</span> <span class="token comment">// the index used for lookups</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>
</code></pre> 
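  <p>The merge step above can be tried in isolation. The following is a minimal, self-contained sketch rather than the project's actual code: <code>Posting</code>, <code>Merged</code> and <code>MergeAndRank</code> are hypothetical stand-ins for the inverted-list element, <code>InvertedElemPrint</code> and the loops inside the search function. It shows how postings that share a <code>doc_id</code> are folded into one entry whose weight is the sum, and how the entries are then sorted by descending weight.</p>
  <pre><code class="prism language-cpp">#include &lt;algorithm&gt;
#include &lt;cstdint&gt;
#include &lt;string&gt;
#include &lt;unordered_map&gt;
#include &lt;vector&gt;

// Hypothetical stand-in for one element of an inverted list.
struct Posting {
  std::uint64_t doc_id;
  std::string word;
  int weight;
};

// Hypothetical stand-in for InvertedElemPrint: one entry per document.
struct Merged {
  std::uint64_t doc_id = 0;
  int weight = 0;
  std::vector&lt;std::string&gt; words;
};

// Fold postings with the same doc_id together, then rank by total weight.
std::vector&lt;Merged&gt; MergeAndRank(const std::vector&lt;Posting&gt; &amp;postings) {
  std::unordered_map&lt;std::uint64_t, Merged&gt; by_doc;
  for (const auto &amp;p : postings) {
    Merged &amp;m = by_doc[p.doc_id];  // default-constructed on first access
    m.doc_id = p.doc_id;
    m.words.push_back(p.word);
    m.weight += p.weight;
  }

  std::vector&lt;Merged&gt; result;
  result.reserve(by_doc.size());
  for (const auto &amp;kv : by_doc) {
    result.push_back(kv.second);
  }
  std::sort(result.begin(), result.end(),
            [](const Merged &amp;a, const Merged &amp;b) { return a.weight &gt; b.weight; });
  return result;
}
</code></pre>
  <p>A document that matches two query words ends up with a single entry carrying both words and the combined weight, so it is listed once and ranked above documents that matched only one of the words with a smaller weight.</p>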
  <h2>添加日志</h2> 
  <p>To add logging, we first create a new header file.</p> 
  <pre><code class="prism language-shell"><span class="token punctuation">[</span>qkj@localhost BoostSearchEngine<span class="token punctuation">]</span>$ <span class="token function">touch</span> log.hpp
</code></pre> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">pragma</span> <span class="token expression">once</span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><iostream></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><string></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><ctime></span></span>

<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">NORMAL</span> <span class="token expression"><span class="token number">1</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">WARNING</span> <span class="token expression"><span class="token number">2</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">DEBUG</span> <span class="token expression"><span class="token number">3</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name">FATAL</span> <span class="token expression"><span class="token number">4</span></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">define</span> <span class="token macro-name function">LOG</span><span class="token expression"><span class="token punctuation">(</span>LEVEL<span class="token punctuation">,</span> MESSAGE<span class="token punctuation">)</span> <span class="token function">log</span><span class="token punctuation">(</span>#LEVEL<span class="token punctuation">,</span> MESSAGE<span class="token punctuation">,</span> <span class="token constant">__FILE__</span><span class="token punctuation">,</span> <span class="token constant">__LINE__</span><span class="token punctuation">)</span></span></span>

<span class="token comment">// inline: log() is defined in a header, so including log.hpp from several .cc files must not produce duplicate definitions</span>
<span class="token keyword">inline</span> <span class="token keyword">void</span> <span class="token function">log</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span>string level<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string message<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>string file<span class="token punctuation">,</span> <span class="token keyword">int</span> line<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  std<span class="token double-colon punctuation">::</span>cout <span class="token operator"><<</span> <span class="token string">"["</span> <span class="token operator"><<</span> level <span class="token operator"><<</span> <span class="token string">"]"</span>
            <span class="token operator"><<</span> <span class="token string">"["</span> <span class="token operator"><<</span> <span class="token function">time</span><span class="token punctuation">(</span><span class="token keyword">nullptr</span><span class="token punctuation">)</span> <span class="token operator"><<</span> <span class="token string">"]"</span>
            <span class="token operator"><<</span> <span class="token string">"["</span> <span class="token operator"><<</span> message <span class="token operator"><<</span> <span class="token string">"]"</span>
            <span class="token operator"><<</span> <span class="token string">"["</span> <span class="token operator"><<</span> file <span class="token operator"><<</span> <span class="token string">"]"</span>
            <span class="token operator"><<</span> <span class="token string">"[:"</span> <span class="token operator"><<</span> line <span class="token operator"><<</span> <span class="token string">"]"</span> <span class="token operator"><<</span> std<span class="token double-colon punctuation">::</span>endl<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code></pre> 
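  <p>The <code>log</code> function above prints <code>time(nullptr)</code> as raw seconds since the epoch, which is hard to read in a log file. As a possible refinement, the sketch below formats the timestamp in UTC. <code>FormatTimestamp</code> is a hypothetical helper that is not part of the original project, and <code>gmtime_r</code> is a POSIX function, so this assumes a Linux environment such as the CentOS machine used here.</p>
  <pre><code class="prism language-cpp">#include &lt;ctime&gt;
#include &lt;string&gt;

// Hypothetical helper: render a time_t as "YYYY-MM-DD HH:MM:SS" in UTC.
// gmtime_r is POSIX, not standard C++; it is the thread-safe variant of gmtime.
inline std::string FormatTimestamp(std::time_t t) {
  std::tm tm_utc{};
  if (gmtime_r(&amp;t, &amp;tm_utc) == nullptr) {
    return "unknown-time";
  }
  char buf[32];
  std::size_t n = std::strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &amp;tm_utc);
  return std::string(buf, n);
}
</code></pre>
  <p>Inside <code>log</code>, <code>time(nullptr)</code> could then be replaced by <code>FormatTimestamp(time(nullptr))</code>.</p>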
  <h3>Logging in the index module</h3> 
  <p><a href="http://img.e-com-net.com/image/info8/749bad2f4ece4cbc97ab323c30631dd1.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/749bad2f4ece4cbc97ab323c30631dd1.jpg" alt="Boost搜索引擎_第25张图片" width="650" height="154" style="border:1px solid black;"></a></p> 
  <h3>Logging in the search module</h3> 
  <p><a href="http://img.e-com-net.com/image/info8/e2d2a083e9374e45b5f7a8f7f8e6e516.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/e2d2a083e9374e45b5f7a8f7f8e6e516.jpg" alt="Boost搜索引擎_第26张图片" width="650" height="224" style="border:1px solid black;"></a></p> 
  <h3>Logging in the server</h3> 
  <p><a href="http://img.e-com-net.com/image/info8/e6756302362a45b9922671004117a7de.jpg" target="_blank"><img src="http://img.e-com-net.com/image/info8/e6756302362a45b9922671004117a7de.jpg" alt="Boost搜索引擎_第27张图片" width="650" height="316" style="border:1px solid black;"></a></p> 
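  <h1>Extending the project</h1> 
  <p>The project can be extended in several directions.</p> 
  <h2>Refining the summary</h2> 
  <p>As mentioned earlier, stop words can be dropped during tokenization. None of the code above does this, because filtering stop words makes building the index considerably more resource-intensive. It is therefore left as an extension. cppjieba ships with a stop-word list, and the code below loads it and uses it to filter the tokens.</p> 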
  <pre><code class="prism language-cpp"><span class="token keyword">class</span> <span class="token class-name">JiebaUtil</span>
  <span class="token punctuation">{</span>
  <span class="token keyword">public</span><span class="token operator">:</span>
    <span class="token keyword">static</span> <span class="token keyword">void</span> <span class="token function">CutString</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> <span class="token operator">*</span>out<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token function">assert</span><span class="token punctuation">(</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
      ns_util<span class="token double-colon punctuation">::</span><span class="token class-name">JiebaUtil</span><span class="token double-colon punctuation">::</span><span class="token function">get_instance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">-></span><span class="token function">CutStringHelper</span><span class="token punctuation">(</span>src<span class="token punctuation">,</span> out<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

<span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token comment">/// @brief Tokenize src, then drop the stop words</span>
    <span class="token comment">/// @param src</span>
    <span class="token comment">/// @param out</span>
    <span class="token keyword">void</span> <span class="token function">CutStringHelper</span><span class="token punctuation">(</span><span class="token keyword">const</span> std<span class="token double-colon punctuation">::</span>string <span class="token operator">&</span>src<span class="token punctuation">,</span> std<span class="token double-colon punctuation">::</span>vector<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token operator">></span> <span class="token operator">*</span>out<span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      jieba<span class="token punctuation">.</span><span class="token function">CutForSearch</span><span class="token punctuation">(</span>src<span class="token punctuation">,</span> <span class="token operator">*</span>out<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">for</span> <span class="token punctuation">(</span><span class="token keyword">auto</span> iter <span class="token operator">=</span> out<span class="token operator">-></span><span class="token function">begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> iter <span class="token operator">!=</span> out<span class="token operator">-></span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        <span class="token keyword">auto</span> it <span class="token operator">=</span> stop_words<span class="token punctuation">.</span><span class="token function">find</span><span class="token punctuation">(</span><span class="token operator">*</span>iter<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span>it <span class="token operator">!=</span> stop_words<span class="token punctuation">.</span><span class="token function">end</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          <span class="token comment">// this token is a stop word: remove it</span>
          <span class="token comment">// erase returns the next valid iterator, which avoids iterator invalidation</span>
          <span class="token comment">// std::cout << *iter << std::endl;</span>
          iter <span class="token operator">=</span> out<span class="token operator">-></span><span class="token function">erase</span><span class="token punctuation">(</span>iter<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
        <span class="token keyword">else</span>
        <span class="token punctuation">{</span>
          iter<span class="token operator">++</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
      <span class="token punctuation">}</span>
    <span class="token punctuation">}</span>
    <span class="token keyword">static</span> JiebaUtil <span class="token operator">*</span><span class="token function">get_instance</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      <span class="token keyword">static</span> std<span class="token double-colon punctuation">::</span>mutex mtx<span class="token punctuation">;</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> instance<span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        mtx<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token keyword">nullptr</span> <span class="token operator">==</span> instance<span class="token punctuation">)</span>
        <span class="token punctuation">{</span>
          instance <span class="token operator">=</span> <span class="token keyword">new</span> JiebaUtil<span class="token punctuation">;</span>
          instance<span class="token operator">-></span><span class="token function">InitJiebaUtil</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
        mtx<span class="token punctuation">.</span><span class="token function">unlock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      <span class="token keyword">return</span> instance<span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
    <span class="token comment">// load the stop-word list into stop_words</span>

    <span class="token keyword">void</span> <span class="token function">InitJiebaUtil</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">{</span>
      std<span class="token double-colon punctuation">::</span>ifstream <span class="token function">in</span><span class="token punctuation">(</span>STOP_WORD_PATH<span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token keyword">if</span> <span class="token punctuation">(</span>in<span class="token punctuation">.</span><span class="token function">is_open</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token boolean">false</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        <span class="token function">LOG</span><span class="token punctuation">(</span>FATAL<span class="token punctuation">,</span> <span class="token string">"failed to load the stop-word list"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">return</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      std<span class="token double-colon punctuation">::</span>string line<span class="token punctuation">;</span>
      <span class="token keyword">while</span> <span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">getline</span><span class="token punctuation">(</span>in<span class="token punctuation">,</span> line<span class="token punctuation">)</span><span class="token punctuation">)</span>
      <span class="token punctuation">{</span>
        stop_words<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>std<span class="token double-colon punctuation">::</span><span class="token function">make_pair</span><span class="token punctuation">(</span>line<span class="token punctuation">,</span> <span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
      <span class="token punctuation">}</span>
      in<span class="token punctuation">.</span><span class="token function">close</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>

  <span class="token keyword">private</span><span class="token operator">:</span>
    <span class="token keyword">static</span> JiebaUtil <span class="token operator">*</span>instance<span class="token punctuation">;</span>

    cppjieba<span class="token double-colon punctuation">::</span>Jieba jieba<span class="token punctuation">;</span>
    std<span class="token double-colon punctuation">::</span>unordered_map<span class="token operator"><</span>std<span class="token double-colon punctuation">::</span>string<span class="token punctuation">,</span> <span class="token keyword">bool</span><span class="token operator">></span> stop_words<span class="token punctuation">;</span>
    <span class="token function">JiebaUtil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">:</span> <span class="token function">jieba</span><span class="token punctuation">(</span>DICT_PATH<span class="token punctuation">,</span> HMM_PATH<span class="token punctuation">,</span> USER_DICT_PATH<span class="token punctuation">,</span> IDF_PATH<span class="token punctuation">,</span> STOP_WORD_PATH<span class="token punctuation">)</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token comment">// the copy constructor and copy assignment should be deleted (omitted here)</span>
  <span class="token punctuation">}</span><span class="token punctuation">;</span>
  JiebaUtil <span class="token operator">*</span>JiebaUtil<span class="token double-colon punctuation">::</span>instance <span class="token operator">=</span> <span class="token keyword">nullptr</span><span class="token punctuation">;</span>
</code></pre> 
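  <p>Erasing elements one at a time from a <code>std::vector</code> shifts the remaining tail on every call, so the loop in <code>CutStringHelper</code> is quadratic in the worst case, which may contribute to the cost noted above. One alternative is the erase-remove idiom, which filters in a single pass. The sketch below assumes the stop words are kept in a <code>std::unordered_set</code> instead of the <code>unordered_map</code> used above; <code>RemoveStopWords</code> is a hypothetical free function, not part of the original class.</p>
  <pre><code class="prism language-cpp">#include &lt;algorithm&gt;
#include &lt;string&gt;
#include &lt;unordered_set&gt;
#include &lt;vector&gt;

// Hypothetical free function; in the class above this logic would run in
// CutStringHelper, right after jieba.CutForSearch.
inline void RemoveStopWords(const std::unordered_set&lt;std::string&gt; &amp;stop_words,
                            std::vector&lt;std::string&gt; *out) {
  // std::remove_if moves the kept words to the front in one pass and returns
  // the new logical end; erase then drops the leftover tail in a single call.
  auto new_end = std::remove_if(out-&gt;begin(), out-&gt;end(),
                                [&amp;stop_words](const std::string &amp;w) {
                                  return stop_words.count(w) != 0;
                                });
  out-&gt;erase(new_end, out-&gt;end());
}
</code></pre>
  <p>The relative order of the remaining tokens is preserved, so the result matches what the iterator loop produces.</p>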
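  <h2>Deploying the service in the background</h2> 
  <p>The server can be run as a daemon process so that it keeps serving after we log out.</p> 
  <h3>The nohup command</h3> 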
  <blockquote> 
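   <p><strong>Running with nohup:</strong></p> 
   <p>The nohup command makes the service ignore the hangup signal (SIGHUP), so it keeps running and stays reachable after the XShell session is closed.</p> 
   <p>For example: nohup ./http_server</p> 
   <p>To run the program in the background as well, append & to the command; it then no longer occupies the terminal.</p> 
   <p>For example: nohup ./http_server &</p> 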
  </blockquote> 
  <blockquote> 
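   <p><strong>The file created by nohup:</strong></p> 
   <p>After the nohup command above is run from a terminal, the program's output is written to a file named nohup.out in the current directory. It holds the log messages and can be viewed with cat.</p> 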
  </blockquote> 
  <h3>setsid</h3> 
  <p>We can also daemonize the process ourselves, as follows.</p> 
  <pre><code class="prism language-cpp"><span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">pragma</span> <span class="token expression">once</span></span>

<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><cstdio></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><cstdlib></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><iostream></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><signal.h></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><unistd.h></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><sys/types.h></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><sys/stat.h></span></span>
<span class="token macro property"><span class="token directive-hash">#</span><span class="token directive keyword">include</span> <span class="token string"><fcntl.h></span></span>

<span class="token keyword">void</span> <span class="token function">daemonize</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
    <span class="token keyword">int</span> fd <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>
    <span class="token comment">// 1. Ignore SIGPIPE</span>
    <span class="token function">signal</span><span class="token punctuation">(</span>SIGPIPE<span class="token punctuation">,</span> SIG_IGN<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 2. Change the working directory of the process (left disabled here)</span>
    <span class="token comment">// chdir();</span>
    <span class="token comment">// 3. Fork and let the parent exit, so the child is not a process group leader (setsid fails for a group leader)</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token function">fork</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">></span> <span class="token number">0</span><span class="token punctuation">)</span>
        <span class="token function">exit</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 4. Start a new, independent session</span>
    <span class="token function">setsid</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 5. Redirect file descriptors 0, 1 and 2 to /dev/null</span>
    <span class="token keyword">if</span> <span class="token punctuation">(</span><span class="token punctuation">(</span>fd <span class="token operator">=</span> <span class="token function">open</span><span class="token punctuation">(</span><span class="token string">"/dev/null"</span><span class="token punctuation">,</span> O_RDWR<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token operator">-</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token comment">// fd == 3</span>
    <span class="token punctuation">{</span>
        <span class="token function">dup2</span><span class="token punctuation">(</span>fd<span class="token punctuation">,</span> STDIN_FILENO<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token function">dup2</span><span class="token punctuation">(</span>fd<span class="token punctuation">,</span> STDOUT_FILENO<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token function">dup2</span><span class="token punctuation">(</span>fd<span class="token punctuation">,</span> STDERR_FILENO<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token comment">// 6. Close the descriptor that is no longer needed</span>
        <span class="token keyword">if</span><span class="token punctuation">(</span>fd <span class="token operator">></span> STDERR_FILENO<span class="token punctuation">)</span> <span class="token function">close</span><span class="token punctuation">(</span>fd<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token comment">// Alternative: close descriptors 0, 1 and 2 directly; strongly discouraged</span>
    <span class="token punctuation">}</span>
<span class="token punctuation">}</span>
</code></pre> 
  <h2>Other extensions</h2> 
  <ul> 
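   <li>The display order is currently decided by the keyword weight alone. Further ranking signals could be layered on top, for example paid placement (bid ranking) or hot-topic statistics, to raise the weight of selected documents.</li> 
   <li>A database could be added to support user registration and login, for example with MySQL.</li> 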
  </ul> 
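  <p>The first idea, raising the weight of selected documents, could be sketched as follows. This is only an illustration under assumed names: <code>RankedDoc</code>, <code>ApplyBoost</code> and the bonus table are hypothetical and do not exist in the project. The bonus would come from whatever signal is chosen, such as a bid or a click count.</p>
  <pre><code class="prism language-cpp">#include &lt;algorithm&gt;
#include &lt;cstdint&gt;
#include &lt;unordered_map&gt;
#include &lt;vector&gt;

// Hypothetical result entry: a document id and its accumulated weight.
struct RankedDoc {
  std::uint64_t doc_id;
  int weight;
};

// Add a per-document bonus (for example a paid placement or a hot-topic
// score) to the weight, then re-sort in descending order of weight.
inline void ApplyBoost(const std::unordered_map&lt;std::uint64_t, int&gt; &amp;bonus,
                       std::vector&lt;RankedDoc&gt; *docs) {
  for (auto &amp;d : *docs) {
    auto it = bonus.find(d.doc_id);
    if (it != bonus.end()) {
      d.weight += it-&gt;second;
    }
  }
  // stable_sort keeps the previous relative order of documents whose
  // boosted weights are equal.
  std::stable_sort(docs-&gt;begin(), docs-&gt;end(),
                   [](const RankedDoc &amp;a, const RankedDoc &amp;b) { return a.weight &gt; b.weight; });
}
</code></pre>
  <p>In the searcher, a step like this would sit between merging the inverted lists and building the JSON response.</p>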
 </div> 
</div>
                            </div>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
    <div class="container">
        <div id="paradigm-article-related">
            <div class="recommend-post mb30">
                <ul class="widget-links">
                                <li><a href="/article/2505.htm"
                                       title="aop相关的概念及配置" target="_blank">aop相关的概念及配置</a>
                                    <span class="text-muted">daysinsun</span>
<a class="tag" taget="_blank" href="/search/AOP/1.htm">AOP</a>
                                    <div>切面(Aspect): 
通常在目标方法执行前后需要执行的方法(如事务、日志、权限),这些方法我们封装到一个类里面,这个类就叫切面。 
 
 
连接点(joinpoint) 
spring里面的连接点指需要切入的方法,通常这个joinpoint可以作为一个参数传入到切面的方法里面(非常有用的一个东西)。 
 
 
通知(Advice) 
通知就是切面里面方法的具体实现,分为前置、后置、最终、异常环</div>
                                </li>
                                <li><a href="/article/2632.htm"
                                       title="初一上学期难记忆单词背诵第二课" target="_blank">初一上学期难记忆单词背诵第二课</a>
                                    <span class="text-muted">dcj3sjt126com</span>
<a class="tag" taget="_blank" href="/search/english/1.htm">english</a><a class="tag" taget="_blank" href="/search/word/1.htm">word</a>
                                    <div>middle 中间的,中级的 
well 喔,那么;好吧 
phone 电话,电话机 
policeman 警察 
ask 问 
take 拿到;带到 
address 地址 
glad 高兴的,乐意的 
why 为什么  
China 中国 
family 家庭 
grandmother (外)祖母 
grandfather (外)祖父 
wife 妻子 
husband 丈夫 
da</div>
                                </li>
                                <li><a href="/article/2759.htm"
                                       title="Linux日志分析常用命令" target="_blank">Linux日志分析常用命令</a>
                                    <span class="text-muted">dcj3sjt126com</span>
<a class="tag" taget="_blank" href="/search/linux/1.htm">linux</a><a class="tag" taget="_blank" href="/search/log/1.htm">log</a>
                                    <div>1.查看文件内容 
cat 
-n 显示行号 2.分页显示 
more 
Enter 显示下一行 
空格 显示下一页 
F 显示下一屏 
B 显示上一屏 
less 
/get 查询"get"字符串并高亮显示 3.显示文件尾 
tail 
-f 不退出持续显示 
-n 显示文件最后n行 4.显示头文件 
head 
-n 显示文件开始n行 5.内容排序 
sort 
-n 按照</div>
                                </li>
                                <li><a href="/article/2886.htm"
                                       title="JSONP 原理分析" target="_blank">JSONP 原理分析</a>
                                    <span class="text-muted">fantasy2005</span>
<a class="tag" taget="_blank" href="/search/JavaScript/1.htm">JavaScript</a><a class="tag" taget="_blank" href="/search/jsonp/1.htm">jsonp</a><a class="tag" taget="_blank" href="/search/jsonp+%E8%B7%A8%E5%9F%9F/1.htm">jsonp 跨域</a>
                                    <div>转自 http://www.nowamagic.net/librarys/veda/detail/224 
JavaScript是一种在Web开发中经常使用的前端动态脚本技术。在JavaScript中,有一个很重要的安全性限制,被称为“Same-Origin Policy”(同源策略)。这一策略对于JavaScript代码能够访问的页面内容做了很重要的限制,即JavaScript只能访问与包含它的</div>
                                </li>
                                <li><a href="/article/3013.htm"
                                       title="使用connect by进行级联查询" target="_blank">使用connect by进行级联查询</a>
                                    <span class="text-muted">234390216</span>
<a class="tag" taget="_blank" href="/search/oracle/1.htm">oracle</a><a class="tag" taget="_blank" href="/search/%E6%9F%A5%E8%AF%A2/1.htm">查询</a><a class="tag" taget="_blank" href="/search/%E7%88%B6%E5%AD%90/1.htm">父子</a><a class="tag" taget="_blank" href="/search/Connect+by/1.htm">Connect by</a><a class="tag" taget="_blank" href="/search/%E7%BA%A7%E8%81%94/1.htm">级联</a>
                                    <div>使用connect by进行级联查询 
  
       connect by可以用于级联查询,常用于对具有树状结构的记录查询某一节点的所有子孙节点或所有祖辈节点。 
  
       来看一个示例,现假设我们拥有一个菜单表t_menu,其中只有三个字段:</div>
                                </li>
                                <li><a href="/article/3140.htm"
                                       title="一个不错的能将HTML表格导出为excel,pdf等的jquery插件" target="_blank">一个不错的能将HTML表格导出为excel,pdf等的jquery插件</a>
                                    <span class="text-muted">jackyrong</span>
<a class="tag" taget="_blank" href="/search/jquery%E6%8F%92%E4%BB%B6/1.htm">jquery插件</a>
                                    <div>发现一个老外写的不错的jquery插件,可以实现将HTML 
表格导出为excel,pdf等格式, 
地址在: 
https://github.com/kayalshri/ 
 
下面看个例子,实现导出表格到excel,pdf 
 
 


<html>
			<head>
				<title>Export html table to excel an</div>
                                </li>
                                <li><a href="/article/3267.htm"
                                       title="UI设计中我们为什么需要设计动效" target="_blank">UI设计中我们为什么需要设计动效</a>
                                    <span class="text-muted">lampcy</span>
<a class="tag" taget="_blank" href="/search/UI/1.htm">UI</a><a class="tag" taget="_blank" href="/search/UI%E8%AE%BE%E8%AE%A1/1.htm">UI设计</a>
                                    <div>关于Unity3D中的Shader的知识 
首先先解释下Unity3D的Shader,Unity里面的Shaders是使用一种叫ShaderLab的语言编写的,它同微软的FX文件或者NVIDIA的CgFX有些类似。传统意义上的vertex shader和pixel shader还是使用标准的Cg/HLSL 编程语言编写的。因此Unity文档里面的Shader,都是指用ShaderLab编写的代码,</div>
                                </li>
                                <li><a href="/article/3394.htm"
                                       title="如何禁止页面缓存" target="_blank">如何禁止页面缓存</a>
                                    <span class="text-muted">nannan408</span>
<a class="tag" taget="_blank" href="/search/html/1.htm">html</a><a class="tag" taget="_blank" href="/search/jsp/1.htm">jsp</a><a class="tag" taget="_blank" href="/search/cache/1.htm">cache</a>
                                    <div>禁止页面使用缓存~ 
------------------------------------------------ 
jsp:页面no cache: 
 
response.setHeader("Pragma","No-cache"); 
response.setHeader("Cache-Control","no-cach</div>
                                </li>
                                <li><a href="/article/3521.htm"
                                       title="以代码的方式管理quartz定时任务的暂停、重启、删除、添加等" target="_blank">以代码的方式管理quartz定时任务的暂停、重启、删除、添加等</a>
                                    <span class="text-muted">Everyday都不同</span>
<a class="tag" taget="_blank" href="/search/%E5%AE%9A%E6%97%B6%E4%BB%BB%E5%8A%A1%E7%AE%A1%E7%90%86/1.htm">定时任务管理</a><a class="tag" taget="_blank" href="/search/spring-quartz/1.htm">spring-quartz</a>
                                    <div>      【前言】在项目的管理功能中,对定时任务的管理有时会很常见。因为我们不能指望只在配置文件中配置好定时任务就行了,因为如果要控制定时任务的 “暂停” 呢?暂停之后又要在某个时间点 “重启” 该定时任务呢?或者说直接 “删除” 该定时任务呢?要改变某定时任务的触发时间呢? “添加” 一个定时任务对于系统的使用者而言,是不太现实的,因为一个定时任务的处理逻辑他是不</div>
                                </li>
                                <li><a href="/article/3648.htm"
                                       title="EXT实例" target="_blank">EXT实例</a>
                                    <span class="text-muted">tntxia</span>
<a class="tag" taget="_blank" href="/search/ext/1.htm">ext</a>
                                    <div>  
(1) 增加一个按钮 
  
JSP: 
  
<%@ page language="java" import="java.util.*" pageEncoding="UTF-8"%>
<%
String path = request.getContextPath();
Stri</div>
                                </li>
                                <li><a href="/article/3775.htm"
                                       title="数学学习在计算机研究领域的作用和重要性" target="_blank">数学学习在计算机研究领域的作用和重要性</a>
                                    <span class="text-muted">xjnine</span>
<a class="tag" taget="_blank" href="/search/Math/1.htm">Math</a>
                                    <div>最近一直有师弟师妹和朋友问我数学和研究的关系,研一要去学什么数学课。毕竟在清华,衡量一个研究生最重要的指标之一就是paper,而没有数学,是肯定上不了世界顶级的期刊和会议的,这在计算机学界尤其重要!你会发现,不论哪个领域有价值的东西,都一定离不开数学!在这样一个信息时代,当google已经让世界没有秘密的时候,一种卓越的数学思维,绝对可以成为你的核心竞争力.  无奈本人实在见地</div>
                                </li>
                </ul>
            </div>
        </div>
    </div>

<footer id="footer" class="mb30 mt30">
    <div class="container">
        <div class="copyright">Copyright © 2000-2050 IT知识库 (E-COM-NET.COM). All Rights Reserved.
        </div>
    </div>
</footer>
<!-- Code highlighting -->
<script type="text/javascript" src="/static/syntaxhighlighter/scripts/shCore.js"></script>
<script type="text/javascript" src="/static/syntaxhighlighter/scripts/shLegacy.js"></script>
<script type="text/javascript" src="/static/syntaxhighlighter/scripts/shAutoloader.js"></script>
<link type="text/css" rel="stylesheet" href="/static/syntaxhighlighter/styles/shCoreDefault.css"/>
<script type="text/javascript" src="/static/syntaxhighlighter/src/my_start_1.js"></script>





</body>

</html>