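Contents
1. Introduction to Node.js
2. Installing Node.js
3. Building a first server with Node.js
3.1 A first look at Node.js
3.2 Ports that Google Chrome treats as unsafe by default (avoid these)
3.3 nodemon: restarting the server automatically on changes
4. Modules: Node.js uses the CommonJS specification
4.1 Creating custom modules (requiring files and folders, selective exports)
4.1.1 Requiring a file
4.1.2 Requiring a folder
4.1.3 Requiring a folder inside node_modules
4.1.4 The package.json descriptor file
4.1.5 Selective exports from a custom module
4.2 Built-in modules
5. The npm package manager
5.1 Common npm commands
5.2 Registering and publishing an npm package
5.2.1 Registering an account
5.2.2 Publishing a package
5.2.3 Installing and using cnpm
6. The fs module (built-in)
6.1 Writing a file: writeFile()
6.2 Deleting a file
6.3 Renaming a file
6.4 Reading a file
6.5 Copying a file
6.6 How a custom file copy works
6.7 Creating a directory
6.8 Renaming a directory
6.9 Reading the files and subdirectories in a directory
6.10 Deleting a directory
6.11 Checking whether a file or directory exists
6.12 Getting details about a file or directory
6.13 Deleting a non-empty directory
7. Buffer
7.1 Creating a Buffer
8. Streams
8.1 data: receives the data as it is read, split into chunks
8.2 end: signals that the file has been read completely
8.3 pipe: writes the data that was read through a pipe (fs.createWriteStream())
9. The yarn package manager
10. NVM, the Node.js version manager
10.1 Downloading NVM
10.2 Installing NVM
10.3 Common NVM commands
11. Serving pages with the fs module
11.1 Loading pages the ordinary way
11.2 Loading pages with streams
12. Scraping data with Node.js and cheerio
13. Building the news list page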
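Key points in this section
- Node.js first appeared in 2009. The runtime is written largely in C++ and is built on Chrome's V8 engine. It lets JavaScript run outside the browser, so server-side code can be written in JavaScript.
Download a stable release from the [Node.js website](https://nodejs.org). Even-numbered major versions become long-term support (LTS) releases and are the stable choice; odd-numbered versions are short-lived and are not meant for production.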
// Load the built-in http module
let http = require("http");
// Create a server
let serve = http.createServer((req, res) => {
  console.log("hello");
  res.end("hello world");
})
// Listen on port 3000
serve.listen(3000);
Result: the browser displays hello world, and the server console prints hello.
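Google Chrome blocks the following ports as unsafe by default, so avoid them when choosing a port for your server:
1, // tcpmux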
7, // echo
9, // discard
11, // systat
13, // daytime
15, // netstat
17, // qotd
19, // chargen
20, // ftp data
21, // ftp access
22, // ssh
23, // telnet
25, // smtp
37, // time
42, // name
43, // nicname
53, // domain
77, // priv-rjs
79, // finger
87, // ttylink
95, // supdup
101, // hostname
102, // iso-tsap
103, // gppitnp
104, // acr-nema
109, // pop2
110, // pop3
111, // sunrpc
113, // auth
115, // sftp
117, // uucp-path
119, // nntp
123, // NTP
135, // loc-srv /epmap
139, // netbios
143, // imap2
179, // BGP
389, // ldap
465, // smtp+ssl
512, // print / exec
513, // login
514, // shell
515, // printer
526, // tempo
530, // courier
531, // chat
532, // netnews
540, // uucp
556, // remotefs
563, // nntp+ssl
587, // smtp (mail submission)
601, // syslog-conn
636, // ldap+ssl
993, // imap+ssl
995, // pop3+ssl
2049, // nfs
3659, // apple-sasl / PasswordServer
4045, // lockd
6000, // X11
6665, // Alternate IRC [Apple addition]
6666, // Alternate IRC [Apple addition]
6667, // Standard IRC [Apple addition]
6668, // Alternate IRC [Apple addition]
6669, // Alternate IRC [Apple addition]
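When a script is started with node followed by the file name, the server has to be restarted by hand every time the JS file changes.
Solution: nodemon watches the server-side JS files and restarts the process when they change.
Install nodemon globally: npm i nodemon -g
Start with nodemon: run nodemon followed by the file name, for example nodemon index.js. It then watches the files and restarts the server automatically.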
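Why modules exist:
- JavaScript was originally designed for simple page interactions. As front-end code bases grew, its role as a small embedded scripting language no longer held, yet the language offered no real help for organizing code; its minimal conventions could not cope with code at that scale.
Each module has its own scope, which prevents variables in one module from polluting another.
Modules in Node.js, the CommonJS specification:
- CommonJS defines conventions for JavaScript outside the browser. JavaScript had no module system, so CommonJS was created with the goal of letting JavaScript run anywhere, not only in browsers.
Front-end module specifications: AMD (implemented by require.js) and CMD (implemented by sea.js).
CommonJS module support is built into Node.js; nothing extra needs to be installed.
A module can be loaded from a file, from a folder, or from a custom folder under node_modules.
Note: when requiring a file or a custom folder, the path must start with "./", as in require("./mydir"). When requiring a package from node_modules, omit the "./" and use the bare folder name, as in require("mytest").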
moduleA.js:
console.log("This is moduleA.js");
home.js (the entry file), run with: nodemon home.js
console.log("This is the home page JS file");
require("./moduleA.js");
Result: running home.js also runs the code in the required file moduleA.js.
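In the following example home.js is the entry file. Besides running its own code and requiring moduleA.js, it requires the folder mydir. Node.js looks for index.js inside mydir by default; index.js requires a.js, and a.js requires b.js, so running home.js runs the code in index.js, a.js, and b.js.
home.js (entry file):
console.log("This is the home page JS file");
require("./moduleA.js"); // require the file moduleA.js
require("./mydir"); // require the folder mydir; Node.js looks for index.js inside it
After home.js requires the mydir folder, Node.js finds and runs index.js in that folder:
index.js:
console.log("This is index.js");
require("./a");
index.js requires a.js:
console.log("This is a.js");
require("./b");
a.js in turn requires b.js:
console.log("This is b.js");
Result: every JS file under mydir runs: index.js, a.js, and b.js.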
{
"name":"aModule",
"version":"1.0.0",
"main":"test.js"
}
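node_modules is mainly for managing third-party JS packages.
When requiring a folder inside node_modules, do not prefix the name with "./"; for example, require("folder-name").
package.json is the descriptor file of a package. It can set the default entry file, the version, the package name, and other settings for a folder in node_modules.
Example:
Directory layout: home.js sits next to node_modules, and node_modules/mytest contains index.js, a.js, and b.js.
home.js is the entry file and requires the folder mytest from node_modules:
require("mytest");
index.js:
console.log("This is index.js");
require("./a");
a.js:
console.log("This is a.js");
require("./b");
b.js:
console.log("This is b.js");
Result: running home.js prints the messages from index.js, a.js, and b.js in turn.
When a project is deployed or moved, the node_modules folder is not shipped with it; the third-party modules are downloaded again from the dependencies recorded in package.json.
node_modules lookup rule: search upward. Node.js first looks for node_modules in the current folder, then in each parent folder in turn up to the file system root, and finally falls back to a few global folders (for example $HOME/.node_modules).
When a custom folder under node_modules is required, its index.js runs by default.
To make a different file the default entry instead of index.js, declare it in the folder's package.json: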
package.json:
{
"name":"mytest",
"version":"1.0.0",
"main":"test.js"
}
test.js:
console.log("package.json is in use, so test.js is loaded by default");
Result: requiring mytest now runs test.js instead of index.js.
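The lookup described above can be tried without touching a real project. The sketch below is only an illustration under stated assumptions: it builds a throwaway folder in the system temp directory (the names main-demo- and projectDir are made up here), and it uses createRequire from the built-in module package to get a require() that behaves as if it were called from home.js. It needs a reasonably recent Node.js (fs.rmSync was added in 14.14).

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const { createRequire } = require("module");

// Throwaway project: TMP/node_modules/mytest/{package.json,index.js,test.js}
const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), "main-demo-"));
const pkgDir = path.join(projectDir, "node_modules", "mytest");
fs.mkdirSync(pkgDir, { recursive: true });
fs.writeFileSync(path.join(pkgDir, "index.js"), 'module.exports = "index.js";');
fs.writeFileSync(path.join(pkgDir, "test.js"), 'module.exports = "test.js";');
fs.writeFileSync(
  path.join(pkgDir, "package.json"),
  JSON.stringify({ name: "mytest", version: "1.0.0", main: "test.js" })
);

// A require() that resolves bare names as if called from TMP/home.js
const requireFromHome = createRequire(path.join(projectDir, "home.js"));
const loadedEntry = requireFromHome("mytest");
console.log(loadedEntry); // test.js

// Clean up the temporary project
fs.rmSync(projectDir, { recursive: true, force: true });
```

Removing the "main" field from the generated package.json should make the same call load index.js instead.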
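Export with module.exports; __dirname and __filename
exports is a reference to module.exports. Exporting with module.exports = {} works, but exports = {} does not; to export through exports, assign properties such as exports.a, because exports only points at the same object as module.exports.
__dirname: the absolute path of the directory that contains the current file
__filename: the absolute path of the current file, including the file name
process.cwd(): the directory from which the node command was run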
Ma.js: defines a variable a and a class Person, and exports both with module.exports = {}
console.log("This is Ma.js");
let a = 10;
class Person {
  constructor() {
    this.name = "zhangsan";
  }
  hobby() {
    console.log("Likes basketball");
  }
}
module.exports = {
  a,
  Person
}
Mb.js: loads the file with require("./Ma"), stores the result, and reads the variable a and the class Person from it
console.log("This is Mb.js");
let Ma = require("./Ma");
console.log(Ma.a);
let p = new Ma.Person();
p.hobby();
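exports is a reference to module.exports. Do not export with exports = {}: that only rebinds the local exports variable and leaves module.exports unchanged, so nothing is exported. Use exports.a and exports.Person instead.
The example above can therefore be rewritten as follows:
Ma.js:
console.log("This is Ma.js");
let a = 10;
class Person {
  constructor() {
    this.name = "zhangsan";
  }
  hobby() {
    console.log("Likes basketball");
  }
}
// Export through exports. Do not also reassign module.exports in this file,
// otherwise these properties land on an object that is no longer exported.
exports.a = a;
exports.Person = Person;
// Export a single function so that it can be destructured on import
// exports.hobby = new Person().hobby;
Mb.js: note that when destructuring, the name inside { hobby } must match the exported property name exports.hobby
console.log("This is Mb.js");
let Ma = require("./Ma");
console.log(Ma.a);
let p = new Ma.Person();
p.hobby();
// The value can also be obtained by destructuring
// let { hobby } = require("./Ma");
// hobby();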
Built-in modules ship with Node.js and need no installation; external (third-party) modules must be installed.
Node.js built-in modules include: Buffer, C/C++ Addons, Child Processes, Cluster, Console, Crypto, Debugger, DNS, Domain, Errors, Events, File System, Globals, HTTP, HTTPS, Modules, Net, OS, Path, Process, Punycode, Query Strings, Readline, REPL, Stream, String Decoder, Timers, TLS/SSL, TTY, UDP/Datagram, URL, Utilities, V8, VM, ZLIB.
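npm is installed together with Node.js, and each Node.js release bundles a specific npm version. If npm is too old, upgrading Node.js is usually enough.
NPM (Node Package Manager) lives at the [npm website](https://www.npmjs.com), where you can register an account and publish packages.
Node.js projects stay lightweight: a module is installed only when it is actually needed.
Example: install the cookie module with npm i cookie:
Running this command in a folder named npm, for example, automatically creates a node_modules folder together with package.json and package-lock.json, and installs the cookie module.
In "cookie": "^0.4.0", the ^ lets npm pick 0.4.0 or any later compatible version. For a 0.x version that means the same minor line (at least 0.4.0 and below 0.5.0); from 1.0.0 onward it means the same major version.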
Use npm init to create package.json interactively:
{
"name": "test",
"version": "1.0.0",
"description": "",
"main": "main.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"author": "",
"license": "ISC"
}
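Register at [https://www.npmjs.com/](https://www.npmjs.com/) (email verification required)
- npm adduser: enter the username and password you just registered;
If the registry address is not the official one, switch it back before publishing. (The Taobao mirror is only faster for downloads.)
Check the current registry: npm config list
Create the project: index.js is the default entry file; npm init creates package.json, and npm start starts the project
cnpm is provided by Taobao. It downloads directly from the Taobao registry mirror.
Using the cnpm command is generally not recommended: the way it lays out downloads is unusual and it can cause unexpected problems.
Instead, keep using the npm command and point it at the Taobao registry, so that downloads come from a server in China.
Install command (downloads and installs from the Taobao mirror in China):
$ npm install -g cnpm --registry=https://registry.npm.taobao.org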
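Built-in modules only need to be required; they do not need to be installed.
The fs module creates, deletes, modifies, and reads files and directories.
- fs is the file system module. Every file operation comes in a synchronous and an asynchronous form; the synchronous one has "Sync" appended. For example, readFile reads a file asynchronously and readFileSync reads it synchronously.
File operations:
let fs = require("fs");
fs.writeFile(filename, data, [options], callback) writes data to a file.
If data is a buffer, the encoding option is ignored. If options is a string, it specifies the character encoding. Calling fs.writeFile() several times on the same file without waiting for the callback is unsafe; fs.createWriteStream() is recommended in that case.
Asynchronous write:
let fs = require("fs");
// Asynchronous write
fs.writeFile("1.txt", "323", { flag: 'w' }, function (err) {
  // On failure, log the error and stop
  if (err) {
    return console.log(err);
  }
  // Write succeeded
  console.log("Write succeeded");
});
Synchronous write: there is no error callback; a failure throws an exception instead
// The option is named flag (singular); 'a' appends instead of overwriting
fs.writeFileSync("2.txt", "sdfa", { flag: 'a' });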
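The effect of the flag option is easiest to see side by side. The sketch below assumes nothing beyond the built-in fs, os, and path modules; the temporary folder name flag-demo- is made up for the illustration, and fs.rmSync needs Node.js 14.14 or later.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Work in a throwaway folder so no real file is touched
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "flag-demo-"));
const file = path.join(dir, "2.txt");

fs.writeFileSync(file, "first");                 // default flag "w": create or truncate
fs.writeFileSync(file, "second", { flag: "a" }); // "a": append to the existing content
const afterAppend = fs.readFileSync(file, "utf8");

fs.writeFileSync(file, "third", { flag: "w" });  // "w" again: the old content is discarded
const afterOverwrite = fs.readFileSync(file, "utf8");

console.log(afterAppend);    // firstsecond
console.log(afterOverwrite); // third

fs.rmSync(dir, { recursive: true, force: true });
```

Note that a misspelled option such as flags is silently ignored, so the write falls back to the default "w" and overwrites the file.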
Asynchronous delete:
// Delete a file asynchronously
fs.unlink("2.txt", function (err) {
  // Deletion failed
  if (err) {
    return console.log(err);
  }
  console.log("Deleted");
});
Synchronous delete:
// Delete a file synchronously
fs.unlinkSync("1.txt");
Asynchronous rename:
fs.rename(oldFile, newFile, callback)
// Rename a file asynchronously
fs.rename("1.txt", "11.txt", err => {
  if (err) {
    return console.log(err);
  }
  console.log("File renamed");
});
Synchronous rename:
// Rename a file synchronously
fs.renameSync("11.txt", "2.txt");
Asynchronous read:
fs.readFile(path: string | number | Buffer | URL, callback)
// Read a file asynchronously
fs.readFile("1.txt", (err, data) => {
  if (err) {
    return console.log(err);
  }
  console.log("Read succeeded: " + data);
});
Synchronous read:
readFileSync(path: string | number | Buffer | URL, options?: {
  encoding?: null;
  flag?: string;
})
The data read this way is a Buffer; call toString() to convert it to a string
// Read a file synchronously
let data = fs.readFileSync("1.txt").toString();
console.log(data);
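Asynchronous copy:
copyFile(src: PathLike, dest: PathLike, callback)
// Copy a file asynchronously
fs.copyFile("1.txt", "3.txt", err => {
  if (err) {
    return console.log(err);
  }
  console.log("Copied");
});
Synchronous copy:
// Copy a file synchronously
fs.copyFileSync("3.txt", "4.txt");
How it works: read the source file first, then write what was read to the destination
// How copying a file works
function copyFile(src, dest) {
  // Keep the data as a Buffer; converting to a string would corrupt binary files
  let data = fs.readFileSync(src);
  fs.writeFileSync(dest, data);
}
copyFile("1.txt", "5.txt");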
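Asynchronous directory creation:
// Create a directory
fs.mkdir("11", err => {
  if (err) {
    return console.log(err);
  }
  console.log("Directory created");
});
Synchronous directory creation:
fs.mkdirSync("22");
Asynchronous directory rename:
// Rename a directory asynchronously
fs.rename("11", "44", err => {
  if (err) {
    return console.log(err);
  }
  console.log("Directory renamed");
});
Synchronous directory rename:
fs.renameSync("44", "33");
Reading a directory returns the names of all its files and subdirectories in an array
// Read the files and subdirectories of a directory asynchronously
fs.readdir("33", (err, files) => {
  if (err) {
    return console.log(err);
  }
  console.log(files); // [ '1.txt', '2.txt', '22' ]
})
// Read the files and subdirectories of a directory synchronously
let data = fs.readdirSync("33");
console.log(data); // [ '1.txt', '2.txt', '22' ]
Only an empty directory can be deleted this way; otherwise the call fails
// Delete an empty directory
fs.rmdir("./33/22", err => {
  if (err) {
    return console.log(err);
  }
  console.log("Deleted");
});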
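// Check whether a file or directory exists
// (fs.exists is deprecated; prefer fs.existsSync, fs.access, or fs.stat in new code)
fs.exists("33", exists => {
  console.log(exists);
});
// Check synchronously whether a file or directory exists
let flag = fs.existsSync("33");
console.log(flag);
// Get details about a file or directory
fs.stat("33", (err, stats) => {
  if (err) {
    return console.log(err);
  }
  console.log(stats);
  // stats.isFile() tells whether it is a file
  console.log(stats.isFile());
  // stats.isDirectory() tells whether it is a directory
  console.log(stats.isDirectory());
});
Result: for the directory 33, isFile() prints false and isDirectory() prints true.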
// Custom function that deletes a non-empty directory
function removeDir(dir) {
  let data = fs.readdirSync(dir);
  data.forEach(item => {
    // readdirSync returns names relative to dir, so prepend dir to get a usable path
    let url = dir + "/" + item;
    let stats = fs.statSync(url);
    if (stats.isDirectory()) {
      // A directory: recurse into it
      removeDir(url);
    } else {
      // A file: delete it
      console.log("is a file");
      fs.unlinkSync(url);
    }
  });
  // After the loop the directory is empty and can be removed
  fs.rmdirSync(dir);
}
removeDir("33");
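On newer Node.js versions the same result is available from a built-in call: fs.rmSync (added in Node.js 14.14) accepts a recursive option. The sketch below is an alternative to the hand-written recursion, not a replacement the original text prescribes; it builds its own small tree in the system temp directory, and the names rm-demo- and target are made up for the illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Build a small non-empty tree: TMP/33/1.txt and TMP/33/22/2.txt
const base = fs.mkdtempSync(path.join(os.tmpdir(), "rm-demo-"));
const target = path.join(base, "33");
fs.mkdirSync(path.join(target, "22"), { recursive: true });
fs.writeFileSync(path.join(target, "1.txt"), "a");
fs.writeFileSync(path.join(target, "22", "2.txt"), "b");

// recursive: delete the contents too; force: do not throw if the path is missing
fs.rmSync(target, { recursive: true, force: true });
const stillExists = fs.existsSync(target);
console.log(stillExists); // false

fs.rmSync(base, { recursive: true, force: true });
```

The hand-written removeDir is still worth reading, because it shows the depth-first order (files first, then the emptied directory) that any recursive delete has to follow.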
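A Buffer stores data in binary form, but it is displayed in hexadecimal.
Method 1 (legacy): new Buffer(). This constructor has been deprecated since Node.js 6; use Buffer.alloc() or Buffer.from() instead.
// Legacy way of creating a buffer (deprecated)
let buffer = new Buffer("大家好");
console.log(buffer); // bytes: e5 a4 a7 e5 ae b6 e5 a5 bd
Method 2: let buffer = Buffer.alloc(10);
// Method 2:
let buffer = Buffer.alloc(10); // creates a zero-filled buffer of 10 bytes
console.log(buffer); // bytes: 00 00 00 00 00 00 00 00 00 00
Method 3: Buffer.from()
// Method 3:
let buffer = Buffer.from("大家好");
console.log(buffer); // bytes: e5 a4 a7 e5 ae b6 e5 a5 bd
Method 4: create from an array of bytes
// Method 4:
let buffer = Buffer.from([0xe5, 0xa4, 0xa7, 0xe5, 0xae, 0xb6, 0xe5, 0xa5, 0xbd]);
console.log(buffer.toString()); // 大家好
In UTF-8 each of these Chinese characters takes 3 bytes; if a buffer is cut in the middle of a character, the output is garbled:
// Each Chinese character takes 3 bytes in UTF-8; cutting inside a character garbles the text
let buffer1 = Buffer.from([0xe5, 0xa4, 0xa7, 0xe5]);
let buffer2 = Buffer.from([ 0xae, 0xb6, 0xe5, 0xa5, 0xbd]);
console.log(buffer1.toString()); // 大�
console.log(buffer2.toString()); // ��好
Fix 1: join the buffers with Buffer.concat([buffer1, buffer2]) (the argument must be an array), then call toString() on the result
// Fix 1: join the buffers with concat (the argument must be an array), then print
console.log(Buffer.concat([buffer1,buffer2]).toString());
Fix 2 (better performance): load the string_decoder module and decode with its write() method, which holds back incomplete characters until the next chunk arrives
// Fix 2:
let { StringDecoder } = require("string_decoder");
let decoder = new StringDecoder();
let buf1 = decoder.write(buffer1);
let buf2 = decoder.write(buffer2);
console.log(buf1); // 大
console.log(buf2); // 家好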
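The StringDecoder approach extends naturally to any number of chunks. The following sketch wraps it in a small function; decodeChunks is a hypothetical helper name chosen for this note, and the split points are arbitrary so that two of the three characters are cut in the middle.

```javascript
const { StringDecoder } = require("string_decoder");

// Hypothetical helper: decode a list of Buffer chunks into one string,
// holding back incomplete multi-byte characters until the next chunk arrives.
function decodeChunks(chunks, encoding = "utf8") {
  const decoder = new StringDecoder(encoding);
  let text = "";
  for (const chunk of chunks) {
    text += decoder.write(chunk);
  }
  // end() flushes any bytes that are still buffered
  return text + decoder.end();
}

const bytes = Buffer.from("大家好"); // 9 bytes in UTF-8
const pieces = [bytes.subarray(0, 4), bytes.subarray(4, 5), bytes.subarray(5)];
const decoded = decodeChunks(pieces);
console.log(decoded); // 大家好
```

Unlike Buffer.concat, this never needs all chunks in memory at once, which is why it suits data arriving from a stream.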
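If a large amount of data is loaded or transferred in one piece, it can exhaust memory.
A stream splits the data into chunks and passes them along one after another.
File uploads, file reads, and similar operations all involve streams.
Example:
Reading the contents of home.js:
home.js:
// console.log("This is the home page JS file");
// require("./moduleA.js"); // require the file moduleA.js
// require("./mydir"); // require the folder mydir; Node.js looks for index.js inside it
// require("mytest");
stream.js reads the file:
Method 1: fs.readFileSync("home.js") reads the whole file in one go; if the file is very large, this can exhaust memory.
// Read the contents of home.js
const fs = require("fs");
let data = fs.readFileSync("home.js");
console.log(data); // a Buffer holding the whole file
Method 2: fs.createReadStream("home.js") reads the file in chunks; listen for the data event with on() to receive each chunk
// Read with a readable stream
let re = fs.createReadStream("home.js");
re.on("data", chunk => {
  console.log(chunk.toString());
});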
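Create a 65 KB file and read it with createReadStream(): the chunk is printed twice, because a file stream delivers at most 64 KB per chunk by default (its highWaterMark)
// Create a 65 KB buffer, write it to a file, then read it with createReadStream()
let buffer = Buffer.alloc(65 * 1024);
// Write synchronously so that the file is complete before reading starts
fs.writeFileSync("65kb.txt", buffer);
let res = fs.createReadStream("65kb.txt");
let num = 0;
res.on("data", chunk => {
  num++;
  console.log(chunk); // printed twice
  console.log(num);
  /*
  1
  2
  */
});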
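The chunk boundary comes from the stream's highWaterMark option, which for file streams defaults to 64 KB. The sketch below makes that explicit by passing the value itself; chunkSizes is a hypothetical helper written for this note, and the file lives in a throwaway temp folder so nothing in the project is touched.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "chunk-demo-"));
const file = path.join(dir, "65kb.txt");
fs.writeFileSync(file, Buffer.alloc(65 * 1024)); // fully written before reading starts

// Hypothetical helper: resolve with the size of every chunk the stream emits
function chunkSizes(filePath, highWaterMark) {
  return new Promise((resolve, reject) => {
    const sizes = [];
    fs.createReadStream(filePath, { highWaterMark })
      .on("data", (chunk) => sizes.push(chunk.length))
      .on("end", () => resolve(sizes))
      .on("error", reject);
  });
}

const sizesPromise = chunkSizes(file, 64 * 1024).then((sizes) => {
  // Typically one full 64 KB chunk followed by the remaining 1 KB
  console.log(sizes);
  fs.rmSync(dir, { recursive: true, force: true });
  return sizes;
});
```

Passing a smaller highWaterMark, for example 16 * 1024, should produce correspondingly more and smaller chunks.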
With a 64 KB file, createReadStream() emits the data event only once:
// Reading a 64 KB file: the data event fires only once
let res = fs.createReadStream("64kb.txt");
let num = 0;
let str = "";
res.on("data", chunk => {
  num++;
  str += chunk;
  console.log(chunk); // printed once
  console.log(num);
  /*
  1
  */
});
// Reading home.js, a small file: the data event also fires only once
let res = fs.createReadStream("home.js");
let num = 0;
let str = "";
res.on("data", chunk => {
  num++;
  str += chunk;
  console.log(chunk); // printed once
  console.log(num);
  /*
  1
  */
});
// "end" fires when the file has been read completely; the collected data is available here
res.on("end", () => {
  console.log(str);
});
Result: all of the data that was read is printed.
let res = fs.createReadStream("home.js");
let num = 0;
let str = "";
res.on("data", chunk => {
  num++;
  str += chunk;
  console.log(chunk); // printed once
  console.log(num);
  /*
  1
  */
});
// pipe: send the data that was read into a writable stream, which writes it to a file
let pipeTxt = fs.createWriteStream("pipe.txt");
res.pipe(pipeTxt);
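pipe() on its own does not report failures of the source stream to the caller. One common alternative, shown here as a sketch rather than as the method the text above uses, is pipeline() from the built-in stream module: it connects the streams the same way and calls back exactly once, with an error if any stream failed. The file names mirror the example above but live in a throwaway temp folder.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const { pipeline } = require("stream");

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "pipe-demo-"));
const source = path.join(dir, "home.js");
const target = path.join(dir, "pipe.txt");
fs.writeFileSync(source, '// console.log("home");\n');

// pipeline() wires source into target and calls back once:
// with an error if any stream failed, or with nothing when all data is written.
const copied = new Promise((resolve, reject) => {
  pipeline(fs.createReadStream(source), fs.createWriteStream(target), (err) => {
    if (err) return reject(err);
    resolve(fs.readFileSync(target, "utf8"));
  });
});

copied.then((text) => {
  console.log(text === '// console.log("home");\n'); // true
  fs.rmSync(dir, { recursive: true, force: true });
});
```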
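npm install -g yarn
yarn is somewhat simpler to use, but npm remains the recommended default.
- Use NVM (Node Version Manager) to manage Node.js versions
Installing NVM on macOS:
- NVM [GitHub page](
On Windows, run the installer and click Next through each step;
Use the route paths /index and /product to serve different pages
index.html:
This is the home page........
product.html:
This is the product page........
style.css:
body {
background: red;
}
index.js: creates the server and loads the different pages with the fs module. Start it with nodemon index.js so that it restarts automatically whenever the files change.
An HTTP exchange carries the page content together with request and response headers; the header information is essential.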
// Load pages with the fs module
const fs = require("fs");
const http = require("http");
const url = require("url");
const path = require("path");
// Lookup table from file extension to MIME type
const mime = require("./mime.json");
let server = http.createServer((req, res) => {
  if (req.url === "/index") {
    // Without this response header the HTML may display as garbled text.
    // Headers can be written only once per response, so each branch sets its own.
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    // Method 1: read the page with fs, then write its content to the response
    let index = fs.readFileSync("index.html");
    res.write(index);
    // End the response; without this the browser keeps waiting
    res.end();
  } else if (req.url === "/product") {
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    // Read the page with fs, then write its content to the response
    let product = fs.readFileSync("product.html");
    res.write(product);
    res.end();
  } else {
    // Other static files have different extensions; look the type up in mime.json
    // path.extname extracts the extension, e.g. /style.css => .css
    let extname = path.extname(req.url);
    // The response header is required, otherwise the file is not rendered correctly
    res.writeHead(200, { 'Content-Type': mime[extname] || 'application/octet-stream' });
    // Read the static file as a stream and pipe it to the response
    let staticFile = fs.createReadStream("." + req.url);
    staticFile.pipe(res);
  }
});
server.listen(8000);
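mime.json: static files come with many different extensions, which is tedious to handle by hand, so a mime.json lookup table is used.
For example, get the extension with the path module (/style.css => .css): let extname = path.extname(req.url); then set the response header, which is required for the file to be rendered correctly: res.writeHead(200, { 'Content-Type': mime[extname]});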
{ ".323":"text/h323" ,
".3gp":"video/3gpp" ,
".aab":"application/x-authoware-bin" ,
".aam":"application/x-authoware-map" ,
".aas":"application/x-authoware-seg" ,
".acx":"application/internet-property-stream" ,
".ai":"application/postscript" ,
".aif":"audio/x-aiff" ,
".aifc":"audio/x-aiff" ,
".aiff":"audio/x-aiff" ,
".als":"audio/X-Alpha5" ,
".amc":"application/x-mpeg" ,
".ani":"application/octet-stream" ,
".apk":"application/vnd.android.package-archive" ,
".asc":"text/plain" ,
".asd":"application/astound" ,
".asf":"video/x-ms-asf" ,
".asn":"application/astound" ,
".asp":"application/x-asap" ,
".asr":"video/x-ms-asf" ,
".asx":"video/x-ms-asf" ,
".au":"audio/basic" ,
".avb":"application/octet-stream" ,
".avi":"video/x-msvideo" ,
".awb":"audio/amr-wb" ,
".axs":"application/olescript" ,
".bas":"text/plain" ,
".bcpio":"application/x-bcpio" ,
".bin ":"application/octet-stream" ,
".bld":"application/bld" ,
".bld2":"application/bld2" ,
".bmp":"image/bmp" ,
".bpk":"application/octet-stream" ,
".bz2":"application/x-bzip2" ,
".c":"text/plain" ,
".cal":"image/x-cals" ,
".cat":"application/vnd.ms-pkiseccat" ,
".ccn":"application/x-cnc" ,
".cco":"application/x-cocoa" ,
".cdf":"application/x-cdf" ,
".cer":"application/x-x509-ca-cert" ,
".cgi":"magnus-internal/cgi" ,
".chat":"application/x-chat" ,
".class":"application/octet-stream" ,
".clp":"application/x-msclip" ,
".cmx":"image/x-cmx" ,
".co":"application/x-cult3d-object" ,
".cod":"image/cis-cod" ,
".conf":"text/plain" ,
".cpio":"application/x-cpio" ,
".cpp":"text/plain" ,
".cpt":"application/mac-compactpro" ,
".crd":"application/x-mscardfile" ,
".crl":"application/pkix-crl" ,
".crt":"application/x-x509-ca-cert" ,
".csh":"application/x-csh" ,
".csm":"chemical/x-csml" ,
".csml":"chemical/x-csml" ,
".css":"text/css" ,
".cur":"application/octet-stream" ,
".dcm":"x-lml/x-evm" ,
".dcr":"application/x-director" ,
".dcx":"image/x-dcx" ,
".der":"application/x-x509-ca-cert" ,
".dhtml":"text/html" ,
".dir":"application/x-director" ,
".dll":"application/x-msdownload" ,
".dmg":"application/octet-stream" ,
".dms":"application/octet-stream" ,
".doc":"application/msword" ,
".docx":"application/vnd.openxmlformats-officedocument.wordprocessingml.document" ,
".dot":"application/msword" ,
".dvi":"application/x-dvi" ,
".dwf":"drawing/x-dwf" ,
".dwg":"application/x-autocad" ,
".dxf":"application/x-autocad" ,
".dxr":"application/x-director" ,
".ebk":"application/x-expandedbook" ,
".emb":"chemical/x-embl-dl-nucleotide" ,
".embl":"chemical/x-embl-dl-nucleotide" ,
".eps":"application/postscript" ,
".epub":"application/epub+zip" ,
".eri":"image/x-eri" ,
".es":"audio/echospeech" ,
".esl":"audio/echospeech" ,
".etc":"application/x-earthtime" ,
".etx":"text/x-setext" ,
".evm":"x-lml/x-evm" ,
".evy":"application/envoy" ,
".exe":"application/octet-stream" ,
".fh4":"image/x-freehand" ,
".fh5":"image/x-freehand" ,
".fhc":"image/x-freehand" ,
".fif":"application/fractals" ,
".flr":"x-world/x-vrml" ,
".flv":"flv-application/octet-stream" ,
".fm":"application/x-maker" ,
".fpx":"image/x-fpx" ,
".fvi":"video/isivideo" ,
".gau":"chemical/x-gaussian-input" ,
".gca":"application/x-gca-compressed" ,
".gdb":"x-lml/x-gdb" ,
".gif":"image/gif" ,
".gps":"application/x-gps" ,
".gtar":"application/x-gtar" ,
".gz":"application/x-gzip" ,
".h":"text/plain" ,
".hdf":"application/x-hdf" ,
".hdm":"text/x-hdml" ,
".hdml":"text/x-hdml" ,
".hlp":"application/winhlp" ,
".hqx":"application/mac-binhex40" ,
".hta":"application/hta" ,
".htc":"text/x-component" ,
".htm":"text/html" ,
".html":"text/html" ,
".hts":"text/html" ,
".htt":"text/webviewhtml" ,
".ice":"x-conference/x-cooltalk" ,
".ico":"image/x-icon" ,
".ief":"image/ief" ,
".ifm":"image/gif" ,
".ifs":"image/ifs" ,
".iii":"application/x-iphone" ,
".imy":"audio/melody" ,
".ins":"application/x-internet-signup" ,
".ips":"application/x-ipscript" ,
".ipx":"application/x-ipix" ,
".isp":"application/x-internet-signup" ,
".it":"audio/x-mod" ,
".itz":"audio/x-mod" ,
".ivr":"i-world/i-vrml" ,
".j2k":"image/j2k" ,
".jad":"text/vnd.sun.j2me.app-descriptor" ,
".jam":"application/x-jam" ,
".jar":"application/java-archive" ,
".java":"text/plain" ,
".jfif":"image/pipeg" ,
".jnlp":"application/x-java-jnlp-file" ,
".jpe":"image/jpeg" ,
".jpeg":"image/jpeg" ,
".jpg":"image/jpeg" ,
".jpz":"image/jpeg" ,
".js":"application/x-javascript" ,
".jwc":"application/jwc" ,
".kjx":"application/x-kjx" ,
".lak":"x-lml/x-lak" ,
".latex":"application/x-latex" ,
".lcc":"application/fastman" ,
".lcl":"application/x-digitalloca" ,
".lcr":"application/x-digitalloca" ,
".lgh":"application/lgh" ,
".lha":"application/octet-stream" ,
".lml":"x-lml/x-lml" ,
".lmlpack":"x-lml/x-lmlpack" ,
".log":"text/plain" ,
".lsf":"video/x-la-asf" ,
".lsx":"video/x-la-asf" ,
".lzh":"application/octet-stream" ,
".m13":"application/x-msmediaview" ,
".m14":"application/x-msmediaview" ,
".m15":"audio/x-mod" ,
".m3u":"audio/x-mpegurl" ,
".m3url":"audio/x-mpegurl" ,
".m4a":"audio/mp4a-latm" ,
".m4b":"audio/mp4a-latm" ,
".m4p":"audio/mp4a-latm" ,
".m4u":"video/vnd.mpegurl" ,
".m4v":"video/x-m4v" ,
".ma1":"audio/ma1" ,
".ma2":"audio/ma2" ,
".ma3":"audio/ma3" ,
".ma5":"audio/ma5" ,
".man":"application/x-troff-man" ,
".map":"magnus-internal/imagemap" ,
".mbd":"application/mbedlet" ,
".mct":"application/x-mascot" ,
".mdb":"application/x-msaccess" ,
".mdz":"audio/x-mod" ,
".me":"application/x-troff-me" ,
".mel":"text/x-vmel" ,
".mht":"message/rfc822" ,
".mhtml":"message/rfc822" ,
".mi":"application/x-mif" ,
".mid":"audio/mid" ,
".midi":"audio/midi" ,
".mif":"application/x-mif" ,
".mil":"image/x-cals" ,
".mio":"audio/x-mio" ,
".mmf":"application/x-skt-lbs" ,
".mng":"video/x-mng" ,
".mny":"application/x-msmoney" ,
".moc":"application/x-mocha" ,
".mocha":"application/x-mocha" ,
".mod":"audio/x-mod" ,
".mof":"application/x-yumekara" ,
".mol":"chemical/x-mdl-molfile" ,
".mop":"chemical/x-mopac-input" ,
".mov":"video/quicktime" ,
".movie":"video/x-sgi-movie" ,
".mp2":"video/mpeg" ,
".mp3":"audio/mpeg" ,
".mp4":"video/mp4" ,
".mpa":"video/mpeg" ,
".mpc":"application/vnd.mpohun.certificate" ,
".mpe":"video/mpeg" ,
".mpeg":"video/mpeg" ,
".mpg":"video/mpeg" ,
".mpg4":"video/mp4" ,
".mpga":"audio/mpeg" ,
".mpn":"application/vnd.mophun.application" ,
".mpp":"application/vnd.ms-project" ,
".mps":"application/x-mapserver" ,
".mpv2":"video/mpeg" ,
".mrl":"text/x-mrml" ,
".mrm":"application/x-mrm" ,
".ms":"application/x-troff-ms" ,
".msg":"application/vnd.ms-outlook" ,
".mts":"application/metastream" ,
".mtx":"application/metastream" ,
".mtz":"application/metastream" ,
".mvb":"application/x-msmediaview" ,
".mzv":"application/metastream" ,
".nar":"application/zip" ,
".nbmp":"image/nbmp" ,
".nc":"application/x-netcdf" ,
".ndb":"x-lml/x-ndb" ,
".ndwn":"application/ndwn" ,
".nif":"application/x-nif" ,
".nmz":"application/x-scream" ,
".nokia-op-logo":"image/vnd.nok-oplogo-color" ,
".npx":"application/x-netfpx" ,
".nsnd":"audio/nsnd" ,
".nva":"application/x-neva1" ,
".nws":"message/rfc822" ,
".oda":"application/oda" ,
".ogg":"audio/ogg" ,
".oom":"application/x-AtlasMate-Plugin" ,
".p10":"application/pkcs10" ,
".p12":"application/x-pkcs12" ,
".p7b":"application/x-pkcs7-certificates" ,
".p7c":"application/x-pkcs7-mime" ,
".p7m":"application/x-pkcs7-mime" ,
".p7r":"application/x-pkcs7-certreqresp" ,
".p7s":"application/x-pkcs7-signature" ,
".pac":"audio/x-pac" ,
".pae":"audio/x-epac" ,
".pan":"application/x-pan" ,
".pbm":"image/x-portable-bitmap" ,
".pcx":"image/x-pcx" ,
".pda":"image/x-pda" ,
".pdb":"chemical/x-pdb" ,
".pdf":"application/pdf" ,
".pfr":"application/font-tdpfr" ,
".pfx":"application/x-pkcs12" ,
".pgm":"image/x-portable-graymap" ,
".pict":"image/x-pict" ,
".pko":"application/ynd.ms-pkipko" ,
".pm":"application/x-perl" ,
".pma":"application/x-perfmon" ,
".pmc":"application/x-perfmon" ,
".pmd":"application/x-pmd" ,
".pml":"application/x-perfmon" ,
".pmr":"application/x-perfmon" ,
".pmw":"application/x-perfmon" ,
".png":"image/png" ,
".pnm":"image/x-portable-anymap" ,
".pnz":"image/png" ,
".pot,":"application/vnd.ms-powerpoint" ,
".ppm":"image/x-portable-pixmap" ,
".pps":"application/vnd.ms-powerpoint" ,
".ppt":"application/vnd.ms-powerpoint" ,
".pptx":"application/vnd.openxmlformats-officedocument.presentationml.presentation" ,
".pqf":"application/x-cprplayer" ,
".pqi":"application/cprplayer" ,
".prc":"application/x-prc" ,
".prf":"application/pics-rules" ,
".prop":"text/plain" ,
".proxy":"application/x-ns-proxy-autoconfig" ,
".ps":"application/postscript" ,
".ptlk":"application/listenup" ,
".pub":"application/x-mspublisher" ,
".pvx":"video/x-pv-pvx" ,
".qcp":"audio/vnd.qcelp" ,
".qt":"video/quicktime" ,
".qti":"image/x-quicktime" ,
".qtif":"image/x-quicktime" ,
".r3t":"text/vnd.rn-realtext3d" ,
".ra":"audio/x-pn-realaudio" ,
".ram":"audio/x-pn-realaudio" ,
".rar":"application/octet-stream" ,
".ras":"image/x-cmu-raster" ,
".rc":"text/plain" ,
".rdf":"application/rdf+xml" ,
".rf":"image/vnd.rn-realflash" ,
".rgb":"image/x-rgb" ,
".rlf":"application/x-richlink" ,
".rm":"audio/x-pn-realaudio" ,
".rmf":"audio/x-rmf" ,
".rmi":"audio/mid" ,
".rmm":"audio/x-pn-realaudio" ,
".rmvb":"audio/x-pn-realaudio" ,
".rnx":"application/vnd.rn-realplayer" ,
".roff":"application/x-troff" ,
".rp":"image/vnd.rn-realpix" ,
".rpm":"audio/x-pn-realaudio-plugin" ,
".rt":"text/vnd.rn-realtext" ,
".rte":"x-lml/x-gps" ,
".rtf":"application/rtf" ,
".rtg":"application/metastream" ,
".rtx":"text/richtext" ,
".rv":"video/vnd.rn-realvideo" ,
".rwc":"application/x-rogerwilco" ,
".s3m":"audio/x-mod" ,
".s3z":"audio/x-mod" ,
".sca":"application/x-supercard" ,
".scd":"application/x-msschedule" ,
".sct":"text/scriptlet" ,
".sdf":"application/e-score" ,
".sea":"application/x-stuffit" ,
".setpay":"application/set-payment-initiation" ,
".setreg":"application/set-registration-initiation" ,
".sgm":"text/x-sgml" ,
".sgml":"text/x-sgml" ,
".sh":"application/x-sh" ,
".shar":"application/x-shar" ,
".shtml":"magnus-internal/parsed-html" ,
".shw":"application/presentations" ,
".si6":"image/si6" ,
".si7":"image/vnd.stiwap.sis" ,
".si9":"image/vnd.lgtwap.sis" ,
".sis":"application/vnd.symbian.install" ,
".sit":"application/x-stuffit" ,
".skd":"application/x-Koan" ,
".skm":"application/x-Koan" ,
".skp":"application/x-Koan" ,
".skt":"application/x-Koan" ,
".slc":"application/x-salsa" ,
".smd":"audio/x-smd" ,
".smi":"application/smil" ,
".smil":"application/smil" ,
".smp":"application/studiom" ,
".smz":"audio/x-smd" ,
".snd":"audio/basic" ,
".spc":"application/x-pkcs7-certificates" ,
".spl":"application/futuresplash" ,
".spr":"application/x-sprite" ,
".sprite":"application/x-sprite" ,
".sdp":"application/sdp" ,
".spt":"application/x-spt" ,
".src":"application/x-wais-source" ,
".sst":"application/vnd.ms-pkicertstore" ,
".stk":"application/hyperstudio" ,
".stl":"application/vnd.ms-pkistl" ,
".stm":"text/html" ,
".svg":"image/svg+xml" ,
".sv4cpio":"application/x-sv4cpio" ,
".sv4crc":"application/x-sv4crc" ,
".svf":"image/vnd" ,
".svg":"image/svg+xml" ,
".svh":"image/svh" ,
".svr":"x-world/x-svr" ,
".swf":"application/x-shockwave-flash" ,
".swfl":"application/x-shockwave-flash" ,
".t":"application/x-troff" ,
".tad":"application/octet-stream" ,
".talk":"text/x-speech" ,
".tar":"application/x-tar" ,
".taz":"application/x-tar" ,
".tbp":"application/x-timbuktu" ,
".tbt":"application/x-timbuktu" ,
".tcl":"application/x-tcl" ,
".tex":"application/x-tex" ,
".texi":"application/x-texinfo" ,
".texinfo":"application/x-texinfo" ,
".tgz":"application/x-compressed" ,
".thm":"application/vnd.eri.thm" ,
".tif":"image/tiff" ,
".tiff":"image/tiff" ,
".tki":"application/x-tkined" ,
".tkined":"application/x-tkined" ,
".toc":"application/toc" ,
".toy":"image/toy" ,
".tr":"application/x-troff" ,
".trk":"x-lml/x-gps" ,
".trm":"application/x-msterminal" ,
".tsi":"audio/tsplayer" ,
".tsp":"application/dsptype" ,
".tsv":"text/tab-separated-values" ,
".ttf":"application/octet-stream" ,
".ttz":"application/t-time" ,
".txt":"text/plain" ,
".uls":"text/iuls" ,
".ult":"audio/x-mod" ,
".ustar":"application/x-ustar" ,
".uu":"application/x-uuencode" ,
".uue":"application/x-uuencode" ,
".vcd":"application/x-cdlink" ,
".vcf":"text/x-vcard" ,
".vdo":"video/vdo" ,
".vib":"audio/vib" ,
".viv":"video/vivo" ,
".vivo":"video/vivo" ,
".vmd":"application/vocaltec-media-desc" ,
".vmf":"application/vocaltec-media-file" ,
".vmi":"application/x-dreamcast-vms-info" ,
".vms":"application/x-dreamcast-vms" ,
".vox":"audio/voxware" ,
".vqe":"audio/x-twinvq-plugin" ,
".vqf":"audio/x-twinvq" ,
".vql":"audio/x-twinvq" ,
".vre":"x-world/x-vream" ,
".vrml":"x-world/x-vrml" ,
".vrt":"x-world/x-vrt" ,
".vrw":"x-world/x-vream" ,
".vts":"workbook/formulaone" ,
".wav":"audio/x-wav" ,
".wax":"audio/x-ms-wax" ,
".wbmp":"image/vnd.wap.wbmp" ,
".wcm":"application/vnd.ms-works" ,
".wdb":"application/vnd.ms-works" ,
".web":"application/vnd.xara" ,
".wi":"image/wavelet" ,
".wis":"application/x-InstallShield" ,
".wks":"application/vnd.ms-works" ,
".wm":"video/x-ms-wm" ,
".wma":"audio/x-ms-wma" ,
".wmd":"application/x-ms-wmd" ,
".wmf":"application/x-msmetafile" ,
".wml":"text/vnd.wap.wml" ,
".wmlc":"application/vnd.wap.wmlc" ,
".wmls":"text/vnd.wap.wmlscript" ,
".wmlsc":"application/vnd.wap.wmlscriptc" ,
".wmlscript":"text/vnd.wap.wmlscript" ,
".wmv":"audio/x-ms-wmv" ,
".wmx":"video/x-ms-wmx" ,
".wmz":"application/x-ms-wmz" ,
".wpng":"image/x-up-wpng" ,
".wps":"application/vnd.ms-works" ,
".wpt":"x-lml/x-gps" ,
".wri":"application/x-mswrite" ,
".wrl":"x-world/x-vrml" ,
".wrz":"x-world/x-vrml" ,
".ws":"text/vnd.wap.wmlscript" ,
".wsc":"application/vnd.wap.wmlscriptc" ,
".wv":"video/wavelet" ,
".wvx":"video/x-ms-wvx" ,
".wxl":"application/x-wxl" ,
".x-gzip":"application/x-gzip" ,
".xaf":"x-world/x-vrml" ,
".xar":"application/vnd.xara" ,
".xbm":"image/x-xbitmap" ,
".xdm":"application/x-xdma" ,
".xdma":"application/x-xdma" ,
".xdw":"application/vnd.fujixerox.docuworks" ,
".xht":"application/xhtml+xml" ,
".xhtm":"application/xhtml+xml" ,
".xhtml":"application/xhtml+xml" ,
".xla":"application/vnd.ms-excel" ,
".xlc":"application/vnd.ms-excel" ,
".xll":"application/x-excel" ,
".xlm":"application/vnd.ms-excel" ,
".xls":"application/vnd.ms-excel" ,
".xlsx":"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" ,
".xlt":"application/vnd.ms-excel" ,
".xlw":"application/vnd.ms-excel" ,
".xm":"audio/x-mod" ,
".xml":"text/plain",
".xml":"application/xml",
".xmz":"audio/x-mod" ,
".xof":"x-world/x-vrml" ,
".xpi":"application/x-xpinstall" ,
".xpm":"image/x-xpixmap" ,
".xsit":"text/xml" ,
".xsl":"text/xml" ,
".xul":"text/xul" ,
".xwd":"image/x-xwindowdump" ,
".xyz":"chemical/x-pdb" ,
".yz1":"application/x-yz1" ,
".z":"application/x-compress" ,
".zac":"application/x-zaurus-zac" ,
".zip":"application/zip" ,
".json":"application/json"
}
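Reading files with fs streams: a large file is read piece by piece, which is efficient and avoids exhausting memory.
Note: with createReadStream a route only needs to read the file and pipe it to the response; there is no need to call res.write() or res.end(), because pipe() ends the response once the file has been sent.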
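// Load pages with the fs module
const fs = require("fs");
const http = require("http");
const url = require("url");
const path = require("path");
// Lookup table from file extension to MIME type
const mime = require("./mime.json");
let server = http.createServer((req, res) => {
  if (req.url === "/index") {
    // Without this response header the HTML may display as garbled text.
    // Headers can be written only once per response, so each branch sets its own.
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    // Method 2: read the file as a stream; efficient for large files and easy on memory
    let resIndex = fs.createReadStream("index.html");
    // pipe: send the data that was read to the response
    resIndex.pipe(res);
  } else if (req.url === "/product") {
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    // With createReadStream the route only reads and pipes; no write() or end() is needed
    let product = fs.createReadStream("product.html");
    product.pipe(res);
  } else {
    // Other static files have different extensions; look the type up in mime.json
    // path.extname extracts the extension, e.g. /style.css => .css
    let extname = path.extname(req.url);
    // The response header is required, otherwise the file is not rendered correctly
    res.writeHead(200, { 'Content-Type': mime[extname] || 'application/octet-stream' });
    // Read the static file as a stream and pipe it to the response
    let staticFile = fs.createReadStream("." + req.url);
    staticFile.pipe(res);
  }
});
server.listen(8000);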
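Building the file path as "." + req.url trusts whatever the client sends, so a URL containing ".." could reach files outside the project folder. A common precaution, sketched here under assumptions, is to resolve the path first and refuse anything that leaves the static root. resolveStaticPath and the folder name public are hypothetical names introduced for this illustration; they do not appear in the server above.

```javascript
const path = require("path");

// Hypothetical helper: map a request URL to a file path inside rootDir,
// or return null when the URL tries to escape it (for example with "..").
function resolveStaticPath(rootDir, requestUrl) {
  const root = path.resolve(rootDir);
  let pathname;
  try {
    // Drop the query string and decode %xx escapes before building the path
    pathname = decodeURIComponent(requestUrl.split("?")[0]);
  } catch (err) {
    return null; // malformed escape sequence
  }
  const filePath = path.resolve(root, "." + pathname);
  const insideRoot = filePath === root || filePath.startsWith(root + path.sep);
  return insideRoot ? filePath : null;
}

const root = path.resolve("public");
console.log(resolveStaticPath("public", "/style.css") === path.join(root, "style.css")); // true
console.log(resolveStaticPath("public", "/../index.js")); // null
console.log(resolveStaticPath("public", "/%2e%2e/secret.txt")); // null
```

In the server, a null result would typically be answered with a 404 or 403 status instead of opening a read stream.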
cheerio: a Node.js module similar to jQuery on the front end. Its usage is largely the same as jQuery's, but there is no window object, so browser objects such as location and history do not exist.
crawlerData.js:
const http = require("http");
// http.get returns the whole page, so cheerio (similar to jQuery) is used to pick out the data
const cheerio = require("cheerio"); // install it first with: npm i cheerio -S
const fs = require("fs");
let newsData = '';
// http.get() calls req.end() automatically
http.get("http://news.ifeng.com/", (res) => {
  res.on("data", chunk => {
    newsData += chunk;
  });
  res.on("end", () => {
    // Process the data inside "end"; before that the page is incomplete
    // Load the whole page into cheerio
    let $ = cheerio.load(newsData);
    // Collected records, later saved as JSON
    let data = [];
    // News titles
    let titles = $(".news-stream-newsStream-news-item-infor h2 a");
    titles.each((index,ele)=>{
      data.push({
        "id":index+1,
        "title":ele.attribs['title']
      });
    });
    // Publisher
    let publisher = $(".news-stream-newsStream-news-item-infor .clearfix span");
    publisher.each((index,ele)=>{
      data[index].publisher = ele.children[0]['data'];
    });
    // Publication time
    let time = $(".news-stream-newsStream-news-item-infor .clearfix time");
    time.each((index,ele)=>{
      data[index].time = ele.children[0]['data'];
    });
    // Write the collected data to a JSON file
    let dataJson = fs.createWriteStream("data.json");
    // The data must be converted to a string first; end() writes it and closes the file
    dataJson.end(JSON.stringify(data));
  });
});
The resulting JSON data file (edited somewhat):
[
{
"id": 1,
"title": "13岁少年成社会的灾难",
"publisher": "海外网",
"time": "今天 17:08"
},
{
"id": 2,
"title": "(全文实录)",
"publisher": "中国网",
"time": "今天 16:59"
},
{
"id": 3,
"title": "禁读《哈利·波特》 称咒语召唤邪灵",
"publisher": "澎湃新闻",
"time": "今天 16:56"
},
{
"id": 4,
"title": "XXXXXXXXXXX",
"publisher": "海外网",
"time": "今天 16:54"
},
{
"id": 5,
"title": "空缺190天后,迎来王浩",
"publisher": "上游新闻",
"time": "今天 16:54"
},
{
"id": 6,
"title": "中国游客在日本突然昏迷 2名韩国消防员及时相救",
"publisher": "海外网",
"time": "今天 16:53"
},
{
"id": 7,
"title": "零售雪上加霜 奢侈品牌普拉达将关闭在港最大门店",
"publisher": "观察者网",
"time": "今天 16:46"
},
{
"id": 8,
"title": "拖了18年?",
"publisher": "海外网",
"time": "今天 16:41"
},
{
"id": 9,
"title": "误读",
"publisher": "环球网",
"time": "今天 16:37"
},
{
"id": 10,
"title": "被男议员骂“不生孩子没尽国家责任” 韩国55岁女学者懵了",
"publisher": "海外网",
"time": "今天 16:29"
},
{
"id": 11,
"title": "陈刚被提起公诉",
"publisher": "海外网",
"time": "今天 16:29"
},
{
"id": 12,
"title": "台假装尿急逃出营区 10天后在网吧被抓",
"publisher": "海外网",
"time": "今天 16:22"
},
{
"id": 13,
"title": "出糗:成语连说了3遍都没对",
"publisher": "海外网",
"time": "今天 16:21"
},
{
"id": 14,
"title": "XXXX",
"publisher": "澎湃新闻网",
"time": "今天 16:09"
},
{
"id": 15,
"title": "XXXX权",
"publisher": "新京报网",
"time": "今天 16:00"
},
{
"id": 16,
"title": "特征",
"publisher": "新京报即时新闻",
"time": "今天 15:53"
},
{
"id": 17,
"title": "XXXXXX",
"publisher": "新京报即时新闻",
"time": "今天 15:46"
},
{
"id": 18,
"title": "XXXXXX",
"publisher": "中国网",
"time": "今天 15:41"
},
{
"id": 19,
"title": "XXXX",
"publisher": "新京报即时新闻",
"time": "今天 15:35"
},
{
"id": 20,
"title": "XXXXXX",
"publisher": "新京报即时新闻",
"time": "今天 15:18"
},
{
"id": 21,
"title": "商务部:上周猪肉批发价格上涨8.9%",
"publisher": "新京报即时新闻",
"time": "今天 15:14"
},
{
"id": 22,
"title": "出糗:这个成语连说三遍都没对",
"publisher": "海外网",
"time": "今天 14:54"
},
{
"id": 23,
"title": "两高:高考等4类考试组织作弊属犯罪 最高判7年",
"publisher": "新京报即时新闻",
"time": "今天 14:33"
}
]
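The crawler fills in publisher and time by index, which assumes that all three selectors match the same number of elements. A minimal sketch of a merge step that tolerates shorter lists is shown below, assuming the three fields have already been extracted into plain string arrays. The function name `mergeNewsFields` and the sample values are hypothetical. It only guards against missing entries; it cannot detect values that have shifted to the wrong row.

```javascript
// Combine parallel arrays into news records. Missing publisher or time values
// become null instead of throwing, and surplus publisher/time entries are ignored.
function mergeNewsFields(titles, publishers, times) {
    return titles.map((title, index) => ({
        id: index + 1,
        title: title,
        publisher: index < publishers.length ? publishers[index] : null,
        time: index < times.length ? times[index] : null
    }));
}

// Example: the second item has no publisher.
const records = mergeNewsFields(
    ["Title A", "Title B"],
    ["Publisher A"],
    ["Today 17:08", "Today 16:59"]
);
console.log(JSON.stringify(records, null, 2));
```

A sturdier approach would be to iterate over each news item container and query the title, publisher, and time inside it, so that the fields of one item can never be paired with another.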
index.js:
//Render the crawled data into the page; this requires page routing
const http = require("http");
const fs = require("fs");
const cheerio = require("cheerio");
const url = require("url");//parses the request URL
const path = require("path");//extracts the file extension
const mime = require("./mime.json");//maps file extensions to Content-Type values
const dataJson = require("./data.json");//news data
let server = http.createServer((req, res) => {
    if (req.url === "/" || req.url === "/index") {
        //Set the header inside the branch: calling writeHead twice for one response throws an error
        res.writeHead(200, { 'Content-Type': 'text/html;charset=utf-8' });
        //Method 1: read index.html synchronously, select the list element, and set its content
        let index = fs.readFileSync("index.html");
        let $ = cheerio.load(index);
        //Build the HTML for the ul from the news data in the JSON file
        let ulHtml = '';
        dataJson.forEach(item => {
            ulHtml += `
${item.title}
纵火逮捕
| ${item.time}
`;
        });
        $(".news-list").html(ulHtml);
        res.end($.html());
        // //Method 2: when reading with createReadStream(), collect the chunks in on("data") and only touch the DOM in on("end"), once the whole file has been read
        // let index = fs.createReadStream("index.html");
        // let oldIndex = '';
        // index.on("data", chunk => {
        //     oldIndex += chunk;
        // });
        // index.on('end', () => {
        //     console.log(oldIndex);
        //     let $ = cheerio.load(oldIndex);
        //     let ulHtml = '';
        //     dataJson.forEach(item => {
        //         ulHtml += `
        // ${item.title}
        // 纵火逮捕
        // | ${item.time}
        // `;
        //     });
        //     $(".news-list").html(ulHtml);
        //     res.end($.html());
        // });
    } else if (req.url !== "/favicon.ico") {
        let extname = path.extname(req.url);
        res.writeHead(200, { 'Content-Type': mime[extname] });
        let staticFile = fs.createReadStream("." + req.url);
        staticFile.pipe(res);
    } else {
        //No favicon is provided; end the response so the request does not hang
        res.writeHead(404);
        res.end();
    }
});
server.listen(4000);
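The titles and times rendered above come from a third-party page and are inserted into the HTML as-is. A minimal sketch of escaping them before building the list follows. The helper names `escapeHtml` and `renderNewsList` are hypothetical, and the `<li>` markup is an assumption rather than the article's actual template.

```javascript
// Escape the characters that are significant in HTML text and attribute values.
// "&" is replaced first so the entities produced below are not escaped again.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}

// Build the list markup from news records.
// Assumed markup: one <li> per record holding the title and the time.
function renderNewsList(items) {
    return items
        .map(item => `<li>${escapeHtml(item.title)} | ${escapeHtml(item.time)}</li>`)
        .join("\n");
}
```

The returned string could then be passed to `$(".news-list").html(...)` in place of the hand-built `ulHtml`.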
index.html:
文章信息展示
-
18人死伤!韩国一男子纵火后持凶器伤害避险邻居
纵火韩国逮捕
| 1小时前
-
18人死伤!韩国一男子纵火后持凶器伤害避险邻居
纵火韩国逮捕
| 1小时前
-
18人死伤!韩国一男子纵火后持凶器伤害避险邻居
纵火韩国逮捕
| 1小时前
-
18人死伤!韩国一男子纵火后持凶器伤害避险邻居
纵火韩国逮捕
| 1小时前
-
18人死伤!韩国一男子纵火后持凶器伤害避险邻居
纵火韩国逮捕
| 1小时前
detail.html:
Document
新闻标题
类型:纵火 时间:2019-6-18
新闻内容新闻内容
新闻内容新闻内容
新闻内容新闻内容
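data.json: same as above, the data collected by the crawler
Implementing the detail page and pagination:
Rename index.js to indexPager.js. The other files stay unchanged; the detail content is written directly over the content of detail.html.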
//Render the crawled data into the page; this requires page routing
const http = require("http");
const fs = require("fs");
const cheerio = require("cheerio");
const url = require("url");//parses the request URL
const path = require("path");//extracts the file extension
const mime = require("./mime.json");//maps file extensions to Content-Type values
const dataJson = require("./data.json");//news data
let server = http.createServer((req, res) => {
    //With a page number in the address bar the URL carries a query string, e.g. pathname: '/index'; search: '?pageNum=2'; query: 'pageNum=2'
    let pathname = url.parse(req.url).pathname;
    if (pathname === "/" || pathname === "/index") {
        res.writeHead(200, { 'Content-Type': 'text/html;charset=utf-8' });
        //Method 1: read index.html synchronously, select the list element, and set its content
        let index = fs.readFileSync("index.html");
        let $ = cheerio.load(index);
        //Pagination: total pages = total items / items per page, rounded up
        //When a page link is clicked, read the requested page number from the query string (via the url module)
        //url.parse(req.url, true).query is [Object: null prototype] {} without parameters and [Object: null prototype] { pageNum: '2' } with them
        let pageNum = parseInt(url.parse(req.url, true).query.pageNum, 10) || 1;//current page as a Number (defaults to page 1)
        let pageSize = 5;//items per page
        let pageTotal = Math.ceil(dataJson.length/pageSize);//total number of pages
        //Paginate with the array slice method (with a database this would be a query): indexes 0-4, 5-9, 10-14 => slice((pageNum-1)*pageSize, pageNum*pageSize)
        let pageData = dataJson.slice((pageNum-1)*pageSize,pageNum*pageSize);
        //Build the HTML for the ul from the news data of the current page
        let ulHtml = '';
        pageData.forEach(item => {
            ulHtml += `
${item.title}
纵火逮捕
| ${item.time}
`;
        });
        $(".news-list").html(ulHtml);
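        //Render the pagination links
        //cheerio has no events, so previous/next cannot use click handlers; they are plain links built from pageNum, which must be a Number first
        let p = parseInt(pageNum, 10);
        //Assumed markup: every control is an <a> linking to /index?pageNum=..., with the previous-page link first
        let pagerHtml = `<a href="/index?pageNum=${Math.max(p - 1, 1)}">⌜</a>`;
        for(let i=1;i<=pageTotal;i++){
            //Each page link points back to this same page; only the page number changes
            pagerHtml += `<a href="/index?pageNum=${i}">${i}</a>`;
        }
        pagerHtml += `<a href="/index?pageNum=${Math.min(p + 1, pageTotal)}">⌝</a>`;
        $(".pagination").html(pagerHtml);
        //Highlight the current page: index 0 is the previous-page link, so page p sits at index p
        $(".pagination a").eq(p).addClass('active');
        res.end($.html());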
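    }else if(pathname === "/detail"){
        //Detail page: look up the matching record by the id in the query string
        let id = url.parse(req.url, true).query.id;
        let detailData = dataJson.find(item=>id==item.id);
        if (!detailData) {
            //Unknown or missing id: answer 404 instead of throwing on detailData.title
            res.writeHead(404, { 'Content-Type': 'text/plain;charset=utf-8' });
            res.end("Not found");
            return;
        }
        res.writeHead(200, { 'Content-Type': 'text/html;charset=utf-8' });
        let detail = fs.readFileSync("detail.html");
        let $ = cheerio.load(detail);
        let detailHtml = `
${detailData.title}
类型:纵火 时间:${detailData.time}
${detailData.title}
`;
        $(".text").html(detailHtml);
        //Always finish with res.end() so the response is completed
        res.end($.html());
    } else if (pathname !== "/favicon.ico") {
        //Use pathname rather than req.url so a query string does not break the extension lookup
        let extname = path.extname(pathname);
        res.writeHead(200, { 'Content-Type': mime[extname] });
        let staticFile = fs.createReadStream("." + pathname);
        staticFile.pipe(res);
    } else {
        //No favicon is provided; end the response so the request does not hang
        res.writeHead(404);
        res.end();
    }
});
server.listen(3000);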
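The pagination arithmetic used in indexPager.js can be isolated into a small pure function, which also makes it easy to clamp page numbers that are missing, non-numeric, or out of range. This is a minimal sketch; the function name `paginate` and the shape of its return value are hypothetical.

```javascript
// Return one page of items together with the clamped page number and page count.
// rawPageNum may be a string from the query string, a number, or undefined.
function paginate(items, rawPageNum, pageSize) {
    const pageTotal = Math.max(1, Math.ceil(items.length / pageSize));
    let pageNum = parseInt(rawPageNum, 10);
    if (Number.isNaN(pageNum) || pageNum < 1) pageNum = 1;
    if (pageNum > pageTotal) pageNum = pageTotal;
    const start = (pageNum - 1) * pageSize;
    return {
        pageNum: pageNum,
        pageTotal: pageTotal,
        pageData: items.slice(start, start + pageSize)
    };
}

// Example with 23 items (the size of data.json above) and 5 items per page.
const sampleItems = Array.from({ length: 23 }, (_, i) => i + 1);
const secondPage = paginate(sampleItems, "2", 5);
console.log(secondPage.pageTotal); // 5
console.log(secondPage.pageData);  // [ 6, 7, 8, 9, 10 ]
```

With 23 records and 5 per page, the last page holds only three items, and a request such as `?pageNum=99` falls back to that last page instead of rendering an empty list.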
Result: