sed, a stream editor (gnu.org)
Offline documentation: man sed and info sed
GNU 'sed'
*********
This file documents version 4.8 of GNU 'sed', a stream editor.
Copyright (C) 1998-2020 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation; with no Invariant Sections, no Front-Cover Texts, and
no Back-Cover Texts. A copy of the license is included in the
section entitled "GNU Free Documentation License".
* Menu:
* Introduction:: Introduction
* Invoking sed:: Invocation
* sed scripts:: 'sed' scripts
* sed addresses:: Addresses: selecting lines
* sed regular expressions:: Regular expressions: selecting text
* advanced sed:: Advanced 'sed': cycles and buffers
* Examples:: Some sample scripts
* Limitations:: Limitations and (non-)limitations of GNU 'sed'
* Other Resources:: Other resources for learning about 'sed'
* Reporting Bugs:: Reporting bugs
* GNU Free Documentation License:: Copying and sharing this manual
* Concept Index:: A menu with all the topics in this manual.
* Command and Option Index:: A menu with all 'sed' commands and
command-line options.
➜ ~ sed
Usage: sed [OPTION]... {script-only-if-no-other-script} [input-file]...
-n, --quiet, --silent
suppress automatic printing of pattern space
--debug
annotate program execution
-e script, --expression=script
add the script to the commands to be executed
-f script-file, --file=script-file
add the contents of script-file to the commands to be executed
--follow-symlinks
follow symlinks when processing in place
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)
-l N, --line-length=N
specify the desired line-wrap length for the `l' command
--posix
disable all GNU extensions.
-E, -r, --regexp-extended
use extended regular expressions in the script
(for portability use POSIX -E).
-s, --separate
consider files as separate rather than as a single,
continuous long stream.
--sandbox
operate in sandbox mode (disable e/r/w commands).
-u, --unbuffered
load minimal amounts of data from the input files and flush
the output buffers more often
-z, --null-data
separate lines by NUL characters
--help display this help and exit
--version output version information and exit
If no -e, --expression, -f, or --file option is given, then the first
non-option argument is taken as the sed script to interpret.
All remaining arguments are names of input files; if no input files are
specified, then the standard input is read.
GNU sed home page: <https://www.gnu.org/software/sed/>.
General help using GNU software: <https://www.gnu.org/gethelp/>.
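When a sed command line uses neither -e (--expression) nor -f (--file), the first non-option argument is taken as the script.
The arguments on a command line (such as a sed invocation) fall into two classes:
- non-option parameters (also called command parameters)
- option parameters (also called command option parameters)
The options in the help text above come in two forms, short and long:
- -s and --separate are the same option
- --version has no corresponding short option
An option that takes no argument can be understood as a boolean option, a switch that turns a feature on:
-s --separate
Other options take an argument; when you use one, you must also supply the argument that tells the option what to act on.
For example: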
-e script, --expression=script
add the script to the commands to be executed
-f script-file, --file=script-file
add the contents of script-file to the commands to be executed
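Both -e and -f require an argument.
With a long option, the value is given after an = sign, for example --expression=script.
With a short option, the argument usually follows after a space, but sometimes there is no space and the argument is attached directly to the option.
For example: -i[SUFFIX], --in-place[=SUFFIX]
Another example is 7zip's options: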
-o{Directory} : set Output directory
-p{Password} : set Password
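Options whose argument is optional (the option has a default behavior)
For example:
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)
The option's argument is wrapped in square brackets [], which marks it as optional.
Repeated parameters of the same kind
[input-file]...
- input-file is optional
- ... means that several parameters of this kind may be given
- script, i.e. the sed script, is the string that holds the actions sed is to perform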
Normally sed is invoked like this:
sed SCRIPT INPUTFILE...
The full format for invoking sed is:
sed OPTIONS... [SCRIPT] [INPUTFILE...]
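For example, to replace ‘hello’ with ‘world’ in the file input.txt (without the g flag, only the first occurrence on each line is replaced):
sed 's/hello/world/' input.txt > output.txt
- 's/hello/world/' is the script
- input.txt is the input file
- > output.txt is not part of the sed command itself; it is a shell redirection that handles sed's output afterwards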
If you do not specify INPUTFILE, or if INPUTFILE is -, sed filters the contents of the standard input.
The following commands are equivalent:
sed 's/hello/world/' input.txt > output.txt
sed 's/hello/world/' < input.txt > output.txt
cat input.txt | sed 's/hello/world/' - > output.txt
cat input.txt | sed 's/hello/world/' > output.txt
As you can see, input.txt can be supplied in several different ways, and all of them work.
In addition, there can be more than one input file.
sed writes output to standard output.
Use -i to edit files in-place instead of printing to standard output.
See also the W and s///w commands for writing output to other files.
The following command modifies file.txt and does not produce any output:
sed -i 's/hello/world/' file.txt
The -i option is not a required part of a sed command.
-i[SUFFIX]
--in-place[=SUFFIX]
This option specifies that files are to be edited in-place.
GNU sed does this by creating a temporary file and sending output to this file rather than to the standard output.
When the end of the file is reached, the temporary file is renamed to the output file’s original name.
The extension, if supplied, is used to modify the name of the old file before renaming the temporary file, thereby making a backup copy.
In other words, [SUFFIX] here plays the role of an [Extension]: it determines the name of the backup of the modified file, normally the original file name with the suffix appended.
This rule is followed:
- if the extension doesn’t contain a *, then it is appended to the end of the current filename as a suffix;
- if the extension does contain one or more * characters, then each asterisk is replaced with the current filename.
Because -i takes an optional argument, it should not be followed by other short options:
sed -Ei '...' FILE
Same as -E -i with no backup suffix - FILE will be edited in-place without creating a backup.
sed -iE '...' FILE
This is equivalent to --in-place=E, creating FILEE as backup of FILE
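The two suffix rules can be tried in a scratch directory. This is a minimal sketch, assuming GNU sed and mktemp are available; the file name demo.txt and the suffixes .bak and orig_* are only illustrative choices.

```shell
# Assumes GNU sed; everything happens in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'hello\n' > demo.txt

# Suffix without '*': appended to the file name, so the backup is demo.txt.bak
sed -i.bak 's/hello/world/' demo.txt

# Suffix containing '*': each '*' is replaced by the file name,
# so the backup is orig_demo.txt
sed -i'orig_*' 's/world/there/' demo.txt

ls
cat demo.txt          # current contents
cat demo.txt.bak      # contents before the first edit
cat orig_demo.txt     # contents before the second edit
```

Quoting the suffix keeps the shell from expanding the * before sed sees it.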
sed -i.bak 's/BE/bbee/' demo.txt
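- the backup suffix is .bak and the input file is demo.txt
- demo.txt.bak is the backup file
- the contents of demo.txt are modified according to the rule given by the script
- BE is replaced with bbee
Complete demonstration: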
➜ ~ cat demo.txt
1 #!/bin/sh
2 #
3 # This script should BE run via curl:
4 # sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
5 # or via wget:
➜ ~ sed -i.bak 's/BE/bbee/' demo.txt
➜ ~ ls -1l
total 28
-rw-r--r-- 1 cxxu_u22 cxxu_u22 202 Jan 1 16:42 demo.txt
-rw-r--r-- 1 cxxu_u22 cxxu_u22 200 Jan 1 16:36 demo.txt.bak
-rw-r--r-- 1 root root 7 Jan 1 13:52 etcvimrc
➜ ~ cat demo.txt.bak
1 #!/bin/sh
2 #
3 # This script should BE run via curl:
4 # sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
5 # or via wget:
➜ ~ cat demo.txt
1 #!/bin/sh
2 #
3 # This script should bbee run via curl:
4 # sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
5 # or via wget:
By default sed prints all processed input (except input that has been modified/deleted by commands such as d).
Use -n to suppress output, and the p command to print specific lines.
-n, --quiet, --silent
suppress automatic printing of pattern space
- Here p belongs to the sed script syntax; it is not a sed option, since options start with -.
- p can also serve as a flag, one of the characters in the script string that carry a special meaning.
The following command prints only line 45 of the input file:
sed -n '45p' file.txt
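The effect of -n together with p is easy to see on generated input. A small sketch, assuming seq is available to produce numbered lines (it stands in for file.txt, which is not provided here):

```shell
# Without -n, every line is auto-printed and p prints it a second time.
all=$(seq 3 | sed 'p')
# With -n, auto-printing is off, so only the line selected for p appears.
only2=$(seq 3 | sed -n '2p')

printf '%s\n' "$all"
printf '%s\n' "$only2"
```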
sed treats multiple input files as one long stream.
The following example prints the first line of the first file (one.txt) and the last line of the last file (three.txt).
Use -s to reverse this behavior.
sed -n '1p ; $p' one.txt two.txt three.txt
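To compare the two behaviors side by side, here is a minimal sketch assuming GNU sed (-s is a GNU option); the three files are created in a temporary directory with made-up contents.

```shell
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/one.txt"
printf 'b1\nb2\n' > "$dir/two.txt"
printf 'c1\nc2\n' > "$dir/three.txt"

# One long stream: line 1 is a1, the last line ($) is c2.
joined=$(sed -n '1p ; $p' "$dir/one.txt" "$dir/two.txt" "$dir/three.txt")

# -s: line numbers and $ restart for every file.
separate=$(sed -s -n '1p ; $p' "$dir/one.txt" "$dir/two.txt" "$dir/three.txt")

printf '%s\n' "$joined"
printf '%s\n' "$separate"
```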
Without -e or -f options, sed uses the first non-option parameter as the script, and the following non-option parameters as input files.
If -e or -f options are used to specify a script, all non-option parameters are taken as input files.
sed 's/hello/world/' input.txt > output.txt
sed -e 's/hello/world/' input.txt > output.txt
sed --expression='s/hello/world/' input.txt > output.txt
echo 's/hello/world/' > myscript.sed
sed -f myscript.sed input.txt > output.txt
sed --file=myscript.sed input.txt > output.txt
-E
-r
--regexp-extended
-E, -r, --regexp-extended
use extended regular expressions in the script
(for portability use POSIX -E).
Use extended regular expressions rather than basic regular expressions.
Extended regexps are those that egrep accepts; they can be clearer because they usually have fewer backslashes.
Historically this was a GNU extension, but the -E extension has since been added to the POSIX standard (http://austingroupbugs.net/view.php?id=528), so use -E for portability.
Scripts that use -E might not port to other older systems.
sed scripts
• sed script overview: | sed script overview |
---|---|
• sed commands list: | sed commands summary |
• The “s” Command: | sed’s Swiss Army Knife |
• Common Commands: | Often used commands |
• Other Commands: | Less frequently used commands |
• Programming Commands: | Commands for sed gurus |
• Extended Commands: | Commands specific of GNU sed |
• Multiple commands syntax: | Extension for easier scripting |
s, which can handle …
sed script overview
A sed program consists of one or more sed commands, passed in by one or more of the -e, -f, --expression, and --file options, or the first non-option argument if zero of these options are used.
This document will refer to “the” sed script; this is understood to mean the in-order concatenation of all of the scripts and script-files passed in.
sed commands follow this syntax:
[addr]X[options]
X is a single-letter sed command.
[addr] is an optional line address (or range).
If [addr] is specified, the command X will be executed only on the matched lines.
[addr] can be a single line number, a regular expression, or a range of lines.
Additional [options] are used for some sed commands.
The following example deletes lines 30 to 35 in the input.
sed '30,35d' input.txt > output.txt
30,35 is an address range.
d is the delete command.
The following example prints all input until a line starting with the word ‘foo’ is found.
sed '/^foo/q42' input.txt > output.txt
If such line is found, sed will terminate with exit status 42.
If such line was not found (and no other error occurred), sed will exit with status 0.
/^foo/ is a regular-expression address.
q is the quit command.
42 is the command option.
Syntax and effect of the q command:
q[exit-code]
(quit) Exit sed without processing any more commands or input.
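The exit status can be checked directly from the shell. A minimal sketch, assuming GNU sed (the exit-code argument to q is a GNU extension); the sample input lines are made up.

```shell
# Case 1: a line starting with foo exists, so sed quits there with status 42.
status_found=0
out=$(printf 'one\nfoo two\nthree\n' | sed '/^foo/q42') || status_found=$?

# Case 2: no such line, so sed reads all input and exits with status 0.
status_missing=0
printf 'one\ntwo\n' | sed '/^foo/q42' > /dev/null || status_missing=$?

printf '%s\n' "$out"
echo "found: $status_found, missing: $status_missing"
```

Note that the matching line itself is still printed before sed quits, because auto-print is not disabled here.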
Commands within a script or script-file can be separated by semicolons (;) or newlines (ASCII 10).
Multiple scripts can be specified with -e or -f options.
The following examples are all equivalent.
They perform two sed operations: deleting any lines matching the regular expression /^foo/, and replacing the string ‘hello’ with ‘world’:
sed '/^foo/d ; s/hello/world/' input.txt > output.txt
#
sed -e '/^foo/d' -e 's/hello/world/' input.txt > output.txt
#
echo '/^foo/d' > script.sed
echo 's/hello/world/' >> script.sed
sed -f script.sed input.txt > output.txt
# -e and -f can be mixed
echo 's/hello/world/' > script2.sed
sed -e '/^foo/d' -f script2.sed input.txt > output.txt
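The claim that these forms are equivalent can be checked on a small input. A sketch under stated assumptions: the input lines are made up, and the files live in a temporary directory rather than the current one.

```shell
dir=$(mktemp -d)
printf 'foo line\nsay hello\nbye\n' > "$dir/input.txt"
printf '/^foo/d\ns/hello/world/\n' > "$dir/script.sed"
printf 's/hello/world/\n' > "$dir/script2.sed"

r1=$(sed '/^foo/d ; s/hello/world/' "$dir/input.txt")           # one script, semicolon
r2=$(sed -e '/^foo/d' -e 's/hello/world/' "$dir/input.txt")     # two -e scripts
r3=$(sed -f "$dir/script.sed" "$dir/input.txt")                 # script file
r4=$(sed -e '/^foo/d' -f "$dir/script2.sed" "$dir/input.txt")   # -e and -f mixed

printf '%s\n' "$r1"
```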
Commands a, c, i, due to their syntax, cannot be followed by semicolons working as command separators and thus should be terminated with newlines or be placed at the end of a script or script-file.
sed commands summary
The following commands are supported in GNU sed. Some are standard POSIX commands, while others are GNU extensions. Details and examples for each command are in the following sections. (Mnemonics) are shown in parentheses.
a\
text
Append text after a line.
a text
Append text after a line (alternative syntax).
b label
Branch unconditionally to label. The label may be omitted, in which case the next cycle is started.
c\
text
Replace (change) lines with text.
.....
See info sed or the online documentation for the complete list.
The s Command
The s command (as in substitute) is probably the most important in sed and has a lot of different options.
The syntax of the s command is
's/regexp/replacement/flags'
Its basic concept is simple: the s command attempts to match the pattern space against the supplied regular expression regexp; if the match is successful, then that portion of the pattern space which was matched is replaced with replacement.
The matched part of the pattern space (the input) can be split into several groups (at most 9):
The replacement can contain \n (n being a number from 1 to 9, inclusive) references, which refer to the portion of the match which is contained between the nth \( and its matching \).
➜ ~ echo "a-b-"| sed 's/\(b\)-/x\1/'
a-xb
➜ ~ echo "a-b-"| sed 's/\(b\)-/x\1\1/'
a-xbb
➜ ~ echo "a-b-"| sed 's/\(b\)-/x\1\1\1/'
a-xbbb
Examples: swapping groups and wrapping group contents in brackets
➜ ~ echo "a-b"| sed 's/\(a\)-\(b\)/\2-\1/'
b-a
➜ ~ echo "aaa-bbb"| sed 's/\(a\+\)-\(b\+\)/\2-\1/'
bbb-aaa
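# here group 1 (written \1) matches the aaa part of the pattern space
# group 2 (written \2) matches the bbb part
# the regexp as a whole matches the entire string in the pattern space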
➜ ~ echo "aaabbb"| sed 's/\(a\+\)\(b\+\)/\2-\1/'
bbb-aaa
➜ ~ echo "aaabbb"| sed 's/\(a\+\)\(b\+\)/\2\1/'
bbbaaa
➜ ~ echo "aaabbb"| sed 's/\(a\+\)\(b\+\)/[\2](\1)/'
[bbb](aaa)
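Using sed's extended regular expression option
Note that extended syntax only applies to the regexp part of the sed script.
The replacement part keeps its usual, non-extended syntax (back-references are still written \1, \2, ...), which is usually more convenient than switching syntax in both parts.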
➜ ~ echo "aaabbb"| sed -E 's/(a+)(b+)/{\2}(\1)/'
{bbb}(aaa)
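Note that the number of times the regexp matches in the pattern space is independent of the group numbering: group numbers are counted afresh for each match.
By default the s command only processes the first match on a line.
To process later matches on the same line as well, use the g flag.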
➜ ~ echo "aaabbbaabb"| sed 's/\(a\+\)\(b\+\)/[\2](\1)/'
[bbb](aaa)aabb
➜ ~ echo "aaabbbaabb"| sed 's/\(a\+\)\(b\+\)/[\2](\1)/g'
[bbb](aaa)[bb](aa)
Also, the replacement can contain unescaped & characters which reference the whole matched portion of the pattern space.
In other words, each & that appears in the replacement is replaced by the part of the pattern space that the regexp matched (the whole match, all groups included).
➜ ~ echo "aaabbb"| sed -E 's/(a+)(b+)/{\2}(\1)&/'
{bbb}(aaa)aaabbb
➜ ~ echo "aaabbbCCC"| sed -E 's/(a+)(b+)/{\2}&(\1)/'
{bbb}aaabbb(aaa)CCC
➜ ~ echo "aaabbb"| sed -E 's/(a+)(b+)/{\2}&(\1)/'
{bbb}aaabbb(aaa)
The / characters may be uniformly replaced by any other single character within any given s command.
The / character (or whatever other character is used in its stead) can appear in the regexp or replacement only if it is preceded by a \ character.
For example, with the default delimiter a literal / has to be escaped as \/.
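Choosing another delimiter avoids the escaping altogether when the text is full of slashes. A minimal sketch; the path /usr/local/bin and the replacement /opt are arbitrary sample values.

```shell
path=/usr/local/bin

# Default delimiter: every literal / must be written as \/
escaped=$(echo "$path" | sed 's/\/usr\/local/\/opt/')

# Alternative delimiters: | or , so the slashes need no escaping
with_pipe=$(echo "$path" | sed 's|/usr/local|/opt|')
with_comma=$(echo "$path" | sed 's,/usr/local,/opt,')

echo "$escaped" "$with_pipe" "$with_comma"
```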
Finally, as a GNU sed extension, you can include a special sequence made of a backslash and one of the letters L, l, U, u, or E.
The meaning is as follows (these sequences act on the characters of the replacement):
\L
Turn the replacement to lowercase until a \U or \E is found,
\l
Turn the next character to lowercase,
\U
Turn the replacement to uppercase until a \L or \E is found,
\u
Turn the next character to uppercase,
\E
Stop case conversion started by \L or \U.
➜ ~ echo "aabb"| sed -E 's/(a+)(b+)/{\2\U}(\1)/'
{bb}(AA)
➜ ~ echo "aabb"| sed -E 's/(a+)(b+)/\1\u\2/'
aaBb
# The first character after \1 (i.e. aa) is uppercased (if it is a letter); here that is the first letter of group \2 (i.e. bb), so b becomes B
➜ ~ echo "aabb"| sed -E 's/(a+)(b+)/\2\u\1/'
bbAa
# \u and friends never change the length of the text, only the case of letters; a \n reference (n=1,2,...,9) may change it
➜ ~ echo "aabb"| sed -E 's/(a+)(b+)/\2\ut\1/'
bbTaa
➜ ~ echo "aabb"| sed -E 's/(a+)(b+)/\1\u\u\2/'
aaBb
➜ ~ echo "aabbbcccddd"| sed -E 's/(a+)(b+)(.*)/\1\U\2\l\3/'
aaBBBcCCDDD
➜ ~ echo "aabbbcccddd"| sed -E 's/(a+)(b+)(.*)/\1\U\2\E\3/'
aaBBBcccddd
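When the g flag is being used, case conversion does not propagate from one occurrence of the regular expression to another.
For example, when the following command is executed with a-b- in pattern space:
(that is, the pattern space holds the string a-b- and is processed with this sed script)
s/\(b\?\)-/x\u\1/g
- The regexp is \(b\?\)-
- It matches b- or -; the b is optional.
- Seen with their groups, the two kinds of match are (b)- and ()-. The part inside the parentheses is marked as a group and can be referred to later.
In this example the pattern space a-b- is matched twice, by - and by b- (the first character a is not matched and is output unchanged).
Character | a | - | b | - |
---|---|---|---|---|
Index i (the i-th character) | 1 | 2 | 3 | 4 |
- First match: -. The group inside the parentheses is empty.
- Second match: b-. The group contains the character b.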
➜ ~ echo "a-b-"| sed 's/\(b\?\)-/x\u\1/'
axb-
➜ ~ echo "a-b-"| sed 's/\(b\?\)-/x\u/'
axb-
# the same command in extended regex syntax
➜ ~ echo "a-b-"| sed -E 's/(b?)-/x\u\1/'
axb-
➜ ~ echo "a-b-"| sed 's/\(b\?\)-/x\u\1/g'
axxB
# with extended regex enabled, this can be shortened to:
➜ ~ echo "a-b-"| sed -E 's/(b?)-/x\u\1/g'
axxB
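the output is ‘axxB’.
When replacing the first ‘-’, the ‘\u\1’ sequence only affects the empty replacement of ‘\1’.
It does not affect the x character that is added to pattern space when replacing b- with xB.
Step by step:
- First match (-): the \1 group is the empty string. \u uppercases the first character that follows it (the first character of \1, which is empty this time, so it has no effect). - is replaced with x plus the empty string.
- Second match (b-): the \1 group is b. \u uppercases the first character that follows it (the first character of \1, here b, which becomes B). b- is replaced with x plus B, i.e. xB.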
On the other hand, \l and \u do affect the remainder of the replacement text if they are followed by an empty substitution.
➜ ~ echo "a-b-"| sed 's/\(b\?\)-/\u\1x/g'
aXBx
# wrap each replacement in parentheses to make the two matches visible
➜ ~ echo "a-b-"| sed 's/\(b\?\)-/(\u\1x)/g'
a(X)(Bx)
With ‘a-b-’ in pattern space, the following command:
s/\(b\?\)-/\u\1x/g
will replace ‘-’ with ‘X’ (uppercase) and ‘b-’ with ‘Bx’. If this behavior is undesirable, you can prevent it by adding a ‘\E’ sequence, after ‘\1’ in this case.
To include a literal \, &, or newline in the final replacement, be sure to precede the desired \, &, or newline in the replacement with a \.
Flags of the s command
The s command can be followed by zero or more of the following flags:
g
Apply the replacement to all matches to the regexp, not just the first.
number
Only replace the numberth match of the regexp.
Note: the POSIX standard does not specify what should happen when you mix the g and number modifiers, and currently there is no widely agreed upon meaning across sed implementations. For GNU sed, the interaction is defined to be: ignore matches before the numberth, and then match and replace all matches from the numberth on.
p
If the substitution was made, then print the new pattern space.
Note: when both the p and e options are specified, the relative ordering of the two produces very different results. In general, ep (evaluate then print) is what you want, but operating the other way round can be useful for debugging. For this reason, the current version of GNU sed interprets specially the presence of p options both before and after e, printing the pattern space before and after evaluation, while in general flags for the s command show their effect just once. This behavior, although documented, might change in future versions.
w filename
If the substitution was made, then write out the result to the named file. As a GNU sed extension, two special values of filename are supported: /dev/stderr, which writes the result to the standard error, and /dev/stdout, which writes to the standard output.
e
This command allows one to pipe input from a shell command into pattern space. If a substitution was made, the command that is found in pattern space is executed and pattern space is replaced with its output. A trailing newline is suppressed; results are undefined if the command to be executed contains a NUL character. This is a GNU sed extension.
I
i
The I modifier to regular-expression matching is a GNU extension which makes sed match regexp in a case-insensitive manner.
M
m
The M modifier to regular-expression matching is a GNU sed extension which directs GNU sed to match the regular expression in multi-line mode. The modifier causes ^ and $ to match respectively (in addition to the normal behavior) the empty string after a newline, and the empty string before a newline. There are special character sequences (\` and \') which always match the beginning or the end of the buffer. In addition, the period character does not match a new-line character in multi-line mode.
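A few of these flags can be compared on one short input. A minimal sketch, assuming GNU sed (the combined number plus g behavior and the I flag are GNU-specific); the input strings are arbitrary.

```shell
first=$(echo 'aaaa' | sed 's/a/X/')           # no flag: first match only
all=$(echo 'aaaa' | sed 's/a/X/g')            # g: every match
second=$(echo 'aaaa' | sed 's/a/X/2')         # number: only the 2nd match
from2=$(echo 'aaaa' | sed 's/a/X/2g')         # GNU: 2nd match and all later ones
nocase=$(echo 'Hello' | sed 's/hello/bye/I')  # I: case-insensitive match
# p with -n: print only the lines where a substitution was made
printed=$(printf 'a\nb\n' | sed -n 's/a/X/p')

echo "$first $all $second $from2 $nocase $printed"
```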
\%regexp%
(The % may be replaced by any other single character.)
This also matches the regular expression regexp, but allows one to use a different delimiter than /. This is particularly useful if the regexp itself contains a lot of slashes, since it avoids the tedious escaping of every /.
If regexp itself includes any delimiter characters, each must be escaped by a backslash (\).
The following commands are equivalent.
They print lines which start with ‘/home/alice/documents/’:
sed -n '/^\/home\/alice\/documents\//p'
sed -n '\%^/home/alice/documents/%p'
sed -n '\;^/home/alice/documents/;p'
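The commands above read standard input, so they can be tried on a few made-up path lines. A small sketch; the three sample paths are hypothetical.

```shell
input='/home/alice/documents/a.txt
/home/bob/documents/b.txt
/home/alice/documents/c.txt'

# Default delimiter with escaped slashes
slashes=$(printf '%s\n' "$input" | sed -n '/^\/home\/alice\/documents\//p')
# Custom delimiter %: introduced by a backslash, no escaping of / needed
percent=$(printf '%s\n' "$input" | sed -n '\%^/home/alice/documents/%p')

printf '%s\n' "$percent"
```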
sed --debug
Print the input sed program in canonical form, and annotate program execution.
$ echo 1 | sed '\%1%s21232'
3
$ echo 1 | sed --debug '\%1%s21232'
SED PROGRAM:
/1/ s/1/3/
INPUT: 'STDIN' line 1
PATTERN: 1
COMMAND: /1/ s/1/3/
PATTERN: 3
END-OF-CYCLE:
3
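- Learn to comment out specified lines
- Insert some text (one or more lines) below a specified line
If you have a concrete requirement, break it down into smaller parts and search for a solution to each one; this is likely to solve it in the least time.
The code below installs the oh my zsh shell framework for you.
The sed operations involved are:
- comment out the specific lines that match (match and substitute)
- add text below a specified line (the a command)
- save the result in place, making a backup before the changes are saved (the backup file name ends in E)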
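# set the working directory to the user's home directory
cd ~
sudo apt update
sudo apt install zsh curl git man wget -y
wget https://gitee.com/mirrors/oh-my-zsh/raw/master/tools/install.sh
# Because of network problems in mainland China, the source command below may need several attempts before the install succeeds (I commented it out and run it after switching the clone source instead)
#source install.sh
# This part changes the source that install.sh pulls from, so that the required files can be cloned from gitee.
# A backup is made before the file is modified (the backup file is named install.shE)
sed '/(^remote)|(^repo)/I s/^#*/#/ ;
/^#*remote/I a\
REPO=${REPO:-mirrors/oh-my-zsh}\
REMOTE=${REMOTE:-https://gitee.com/${REPO}.git} ' -r -iE ~/install.sh
# run the installer
source install.sh
# return to the previous directory (so that further scripts can be run from there)
cd -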
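The two sed commands in the script above can be tried safely on a small sample before touching a real file. A minimal sketch, assuming GNU sed (the I address modifier is a GNU extension); the three sample lines and the values x and y are made up and are not the real contents of install.sh.

```shell
# Hypothetical sample standing in for the relevant lines of install.sh
sample='REPO=a
REMOTE=b
BRANCH=c'

result=$(printf '%s\n' "$sample" | sed -E '
# comment out lines starting with remote or repo, ignoring case
/^(remote|repo)/I s/^#*/#/
# after the commented remote line, append two new lines
/^#*remote/I a\
REPO=x\
REMOTE=y
')

printf '%s\n' "$result"
```

Without -i nothing is written to disk, so the result can be inspected first and -i added once it looks right.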
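The sed command n (next) moves past the current line and goes straight on to the next line of input.
sed reads its input line by line; by default it processes each line (runs the sed script on it) as soon as the line is read.
With n; you can skip lines: each n; increases the stride by 1, and the default stride is 1.
(; marks the end of a command in a sed script.)
- no n;: every line is processed, with 0 lines in between (the default behavior)
- n;: one line is skipped between processed lines
- n;n;: two lines are skipped between processed lines
If auto-print is not disabled, print the pattern space, then, regardless, replace the pattern space with the next line of input. If there is no more input then sed exits without processing any more commands.
This command is useful to skip lines (e.g. process every Nth line).
Example: perform substitution on every 3rd line (i.e. two n commands skip two lines):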
$ seq 6 | sed 'n;n;s/./x/'
1
2
x
4
5
x
GNU sed provides an extension address syntax of first~step to achieve the same result:
$ seq 6 | sed '0~3s/./x/'
1
2
x
4
5
x
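The equivalence of the two approaches can be verified by comparing their output. A short sketch, assuming GNU sed (first~step is a GNU extension) and seq for sample input; the 1~2 line is an extra illustration of the same address form.

```shell
with_n=$(seq 6 | sed 'n;n;s/./x/')       # skip two lines, edit the third
with_step=$(seq 6 | sed '0~3s/./x/')     # edit lines 3, 6, 9, ...
every_other=$(seq 6 | sed -n '1~2p')     # print lines 1, 3, 5, ...

printf '%s\n' "$with_step"
printf '%s\n' "$every_other"
```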