Node.js subtitle parser with TypeScript support, for LRC and SRT

Subtitle Parser

Introduction

A parser for subtitle text.

Features

  1. You get the exact second and millisecond values of each timestamp
  2. Two text formats are supported: LRC and SRT (more formats are planned)

TODO

  1. Support more text formats
  2. Handle zeros when parsing SRT timestamps
  3. Post-process the results so they are more convenient to use

Links

Gitee
GitHub

Usage

Install the library


npm install subtitle-parser

Or with yarn

yarn add subtitle-parser

Example

const subtitleParser = require("subtitle-parser");
const contentSRT = 
`
1
00:00:01,410 --> 00:00:04,220
我们已经走得太远,以至于忘记了为什么出发。
We already walked too far, down to we had forgotten why embarked.
2
00:00:04,251 --> 00:00:06,234
天下没有不散的筵席
There is no never-ending feast
`;
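// The first five lines (the metadata tags) are optional
// The timestamps below do not match the real song; they are for testing only
/**
 * ar: artist
 * ti: title
 * al: album
 * by: creator of the LRC file
 * offset: time offset (milliseconds)
 */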
const contentLRC = 
`
[ar:A lot]
[ti:Something just like this]
[al:Memories... Do Not Open]
[by:RotCool]
[offset:0]
[00:05:50]I've been reading books of old
[00:10:50][00:30:51]I want to something just like this
`;
// The second argument is optional
console.log(subtitleParser.parse(contentSRT, "SRT"));
console.log(JSON.stringify(subtitleParser.parse(contentLRC, "LRC")));

This prints:

[
  {
    id: '1',
    startTime: { baseForm: '00:00:01,410', htmlCurrentTime: 1.41 },
    endTime: { baseForm: '00:00:04,220', htmlCurrentTime: 4.22 },
    content: '我们已经走得太远,以至于忘记了为什么出发。\n' +
      'We already walked too far, down to we had forgotten why embarked.'
  },
  {
    id: '2',
    startTime: { baseForm: '00:00:04,251', htmlCurrentTime: 4.251 },
    endTime: { baseForm: '00:00:06,234', htmlCurrentTime: 6.234 },
    content: '天下没有不散的筵席\nThere is no never-ending feast'
  }
]
{
    "content": [{
        "time": {
            "baseForm": "00:05:50",
            "minute": 0,
            "second": 5,
            "millisecond": 50,
            "htmlCurrentTime": 5.05
        },
        "content": "I've been reading books of old"
    }, {
        "time": {
            "baseForm": "00:10:50",
            "minute": 0,
            "second": 10,
            "millisecond": 50,
            "htmlCurrentTime": 10.05
        },
        "content": "I want to something just like this"
    }, {
        "time": {
            "baseForm": "00:30:51",
            "minute": 0,
            "second": 30,
            "millisecond": 51,
            "htmlCurrentTime": 30.051
        },
        "content": "I want to something just like this"
    }],
    "ar": "A lot",
    "ti": "Something just like this",
    "al": "Memories... Do Not Open",
    "by": "RotCool",
    "offset": "0"
}
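
The SRT result above is an array of cues whose htmlCurrentTime values are in seconds, the same unit as the currentTime property of an HTML media element. As a rough sketch of how that output might be consumed, the helper below looks up the cue that should be on screen at a given playback time. findActiveCue is a hypothetical name and is not part of subtitle-parser; the sample array is copied from the output shown above.

```javascript
// Hypothetical helper (not part of subtitle-parser): return the cue whose
// [start, end) interval contains `currentTime` (in seconds), or null.
function findActiveCue(cues, currentTime) {
  for (const cue of cues) {
    const start = cue.startTime.htmlCurrentTime;
    const end = cue.endTime.htmlCurrentTime;
    if (currentTime >= start && currentTime < end) {
      return cue;
    }
  }
  return null;
}

// Sample data in the shape of the SRT output shown above.
const cues = [
  {
    id: "1",
    startTime: { baseForm: "00:00:01,410", htmlCurrentTime: 1.41 },
    endTime: { baseForm: "00:00:04,220", htmlCurrentTime: 4.22 },
    content: "We already walked too far, down to we had forgotten why embarked.",
  },
  {
    id: "2",
    startTime: { baseForm: "00:00:04,251", htmlCurrentTime: 4.251 },
    endTime: { baseForm: "00:00:06,234", htmlCurrentTime: 6.234 },
    content: "There is no never-ending feast",
  },
];

console.log(findActiveCue(cues, 2.0).id); // prints 1
console.log(findActiveCue(cues, 4.23)); // prints null (gap between the two cues)
```

In a browser this could be called from a timeupdate handler with the media element's currentTime; a linear scan is enough for typical subtitle files.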

Formats

LRC

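  1. Time tags may be written as [minute:second:millisecond], [minute:second.millisecond],
    [hour:minute:second:millisecond], or [hour:minute:second.millisecond]
  2. You can also add your own tags, for example
    [copyright:RotCool]
    which is parsed into { ... copyright: "RotCool" ... }
  3. Every line must contain a "[xxx]" tag; lines without one are ignored
  4. Content must not contain "[" or "]"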
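
To make the accepted time tag shapes concrete, here is a minimal sketch of converting one LRC time tag to seconds. It is not the library's source code: lrcTagToSeconds is a hypothetical name, and the choice to read the last field as a whole number of milliseconds (so 50 means 0.050 s) is an assumption taken from the sample output above, where [00:05:50] yields 5.05.

```javascript
// Hypothetical helper (not part of subtitle-parser): convert a single LRC
// time tag to seconds. Accepts [m:s:ms], [m:s.ms], [h:m:s:ms] and [h:m:s.ms].
// Assumption: the last field is an integer number of milliseconds.
function lrcTagToSeconds(tag) {
  const match = /^\[(?:(\d+):)?(\d+):(\d+)[:.](\d+)\]$/.exec(tag);
  if (!match) {
    return null; // not a time tag, e.g. a metadata tag such as [ar:A lot]
  }
  const [, hour = "0", minute, second, millisecond] = match;
  // Sum in integer milliseconds first to avoid floating-point drift.
  const totalMs =
    Number(hour) * 3600000 +
    Number(minute) * 60000 +
    Number(second) * 1000 +
    Number(millisecond);
  return totalMs / 1000;
}

console.log(lrcTagToSeconds("[00:05:50]")); // prints 5.05
console.log(lrcTagToSeconds("[01:02:03.004]")); // prints 3723.004
console.log(lrcTagToSeconds("[ar:A lot]")); // prints null
```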

SRT

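  1. The start time and end time must be separated by "-->"
  2. Any number of spaces may appear before and after "-->"
  3. Start and end times are supported only in the form "hour:minute:second,millisecond";
    "hour:minute:second:millisecond" and "hour:minute:second.millisecond" cannot be parsed
  4. Zeros are not normalized, however many there are; for example
    0:0:1,4
    is parsed into { baseForm: "0:0:1,4", htmlCurrentTime: 1.004 }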
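
The SRT rules above can be illustrated with a small sketch that parses one timing line. This is an illustration under stated assumptions, not the library's implementation: parseSrtTimingLine and toTime are hypothetical names, and treating "any number of spaces" as including zero spaces is a guess.

```javascript
// Hypothetical helpers (not part of subtitle-parser).

// Convert "hour:minute:second,millisecond" into the shape used in the output
// above. baseForm is kept exactly as written; zeros are not normalized.
function toTime(baseForm) {
  const [hour, minute, second, millisecond] = baseForm
    .split(/[:,]/)
    .map(Number);
  const totalMs =
    hour * 3600000 + minute * 60000 + second * 1000 + millisecond;
  return { baseForm, htmlCurrentTime: totalMs / 1000 };
}

// Parse a timing line such as "00:00:01,410 --> 00:00:04,220".
// Only the comma form is accepted; whitespace around "-->" is free.
function parseSrtTimingLine(line) {
  const pattern = /^\s*(\d+:\d+:\d+,\d+)\s*-->\s*(\d+:\d+:\d+,\d+)\s*$/;
  const match = pattern.exec(line);
  if (!match) {
    return null; // e.g. "00:00:01.410 --> 00:00:04.220" uses "." and is rejected
  }
  return { startTime: toTime(match[1]), endTime: toTime(match[2]) };
}

console.log(parseSrtTimingLine("00:00:01,410   -->00:00:04,220"));
console.log(parseSrtTimingLine("0:0:1,4 --> 0:0:2,0").startTime);
// The second call yields { baseForm: '0:0:1,4', htmlCurrentTime: 1.004 }
```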
