Some examples of using regular expressions

1. Extract the base address of a URL

$ url="http://www.flickr.com/search/?q=linux"
$ echo "$url" | grep -Eo "https?://[a-z.]+"
http://www.flickr.com
$ echo "$url" | grep -Eo "https?://[a-z\.]+"
http://www.flickr.com
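Inside a bracket expression [], a dot (.) no longer matches any single character; it matches a literal dot. The backslash in the second command is therefore unnecessary (in a POSIX bracket expression the backslash is itself a literal character, so that class would also match a backslash).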
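
To make the difference concrete, here is a minimal sketch that runs the same made-up sample text through both forms. The sample string and the variable names are assumptions for illustration only.

```shell
#!/bin/sh
# Made-up sample text: one word with a real dot, one with an "x" in its place.
sample="a.b axb"

# Outside brackets, "." matches any single character, so both words match.
any=$(echo "$sample" | grep -o "a.b" | paste -sd' ' -)

# Inside brackets, "." is a literal dot, so only "a.b" matches.
literal=$(echo "$sample" | grep -o "a[.]b" | paste -sd' ' -)

echo "any:     $any"
echo "literal: $literal"
```

The first pattern reports both words, the second only the one that really contains a dot.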

2. Sort a list of actresses taken from a web page:

$ lynx -dump http://www.johntorres.net/BoxOfficefemaleList.html | grep -o "Rank-.*" | sed 's/Rank-//; s/\[[0-9]\+\]//' | sort -nk 1
1   Keira Knightley
2   Natalie Portman
3   Monica Bellucci
4   Bonnie Hunt
5   Cameron Diaz
6   Annie Potts
7   Liv Tyler
8   Julie Andrews
9   Lindsay Lohan
10   Catherine Zeta-Jones
11   Cate Blanchett
12   Sarah Michelle Gellar
13   Carrie Fisher
14   Shannon Elizabeth
15   Julia Roberts
16   Sally Field
17   Téa Leoni
18   Kirsten Dunst
19   Rene Russo
20   Jada Pinkett
21   Helen Hunt
22   Halle Berry
23   Kate Winslet
24   Margot Kidder
25   Elizabeth Perkins
26   Lucy Liu
27   Geena Davis
28   Rosie O'Donnell
29   Drew Barrymore
30   Sandra Bullock
31   Tia Carrere
32   Julia Stiles
33   Jane Fonda
34   Renée Zellweger
35   Demi Moore
36   Kathy Bates
37   Kate Beckinsale
38   Lea Thompson
39   Talia Shire
40   Queen Latifah
41   Denise Richards
42   Glenn Close
43   Meg Ryan
44   Whoopi Goldberg
45   Nicole Kidman
46   Jennifer Lopez
47   Jennifer Love Hewitt
48   Laura Dern
49   Mary Elizabeth Mastrantonio
50   Jennifer Aniston
51   Alicia Silverstone
52   Laura Linney
53   Elizabeth Hurley
54   Ashley Judd
55   Michelle Pfeiffer
56   Bette Midler
57   Diane Keaton
58   Sigourney Weaver
59   Jennifer Tilly
60   Jodie Foster
61   Courteney Cox
62   Angelina Jolie
63   Neve Campbell
64   Meryl Streep
65   Julianne Moore
66   Goldie Hawn
67   Linda Hamilton
68   Elisabeth Shue
69   Tara Reid
70   Kim Basinger
71   Annette Bening
72   Kristin Scott Thomas
73   Jeanne Tripplehorn
74   Rachel Weisz
75   Gwyneth Paltrow
76   Teri Garr
77   Jamie Lee Curtis
78   Nia Long
79   Madonna
80   Madeleine Stowe
81   Angela Bassett
82   Reese Witherspoon
83   Selma Blair
84   Kirstie Alley
85   Kathleen Quinlan
86   Susan Sarandon
87   Salma Hayek
88   Debra Winger
89   Winona Ryder
90   Charlize Theron
91   Valeria Golino
92   Sharon Stone
93   Jami Gertz
94   Christina Ricci
95   Marisa Tomei
96   Uma Thurman
97   Diane Lane
98   Jennifer Connelly
99   Nancy Travis
100   Heather Graham
101   Sophie Marceau
102   Jessica Lange
103   Kate Hudson
104   Andie MacDowell
105   Naomi Watts
106   Jennifer Jason Leigh
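The command above can also be written as follows. Here [[0-9]\+] is parsed as the bracket expression [[0-9] (a literal [ or a digit) repeated one or more times, followed by a literal ], so it still matches the bracketed numbers in this output:

$ lynx -dump http://www.johntorres.net/BoxOfficefemaleList.html | grep -o "Rank-.*" | sed 's/Rank-//; s/[[0-9]\+]//' | sort -nk 1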
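
The filter chain can be tried without fetching the page. The sketch below feeds it three hand-written lines; their layout (a Rank-N prefix followed by a bracketed number) is an assumption about what lynx prints for this page, and the function names are hypothetical. It also assumes GNU sed, since \+ in a basic regular expression is a GNU extension.

```shell
#!/bin/sh
# Hypothetical stand-in for the lynx output; the real page layout may differ.
sample_dump() {
    printf '%s\n' \
        '   Rank-10 [25]Catherine Zeta-Jones' \
        '   Rank-2 [17]Natalie Portman' \
        '   Rank-1 [16]Keira Knightley'
}

# Same filters as the command above, minus the network fetch:
# keep everything from "Rank-" on, strip the prefix and the [N] marker,
# then sort numerically on the first field.
rank_list() {
    sample_dump |
        grep -o "Rank-.*" |
        sed 's/Rank-//; s/\[[0-9]\+\]//' |
        sort -nk 1
}

rank_list
```

The -n flag matters here: a plain lexical sort would place 10 ahead of 2.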


3. Extract the title information from a web page:

$ echo "<title>Taiwan becomes the third Chinese special administration area following Hongkong and Macau</title><title>Chinese forces destroy Japan's most powerful fleet in South Pacific</title>" |
> sed 's:</title>:&\n:' | sed 's:.*<title>\([^<]*\).*:\1:'
Taiwan becomes the third Chinese special administration area following Hongkong and Macau
Chinese forces destroy Japan's most powerful fleet in South Pacific


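In the second sed, the capture group \([^<]*\) cannot be loosened to use .* (for example \(.*\)): .* is greedy and would run past the closing </title> to the end of the line, so the tag would be captured as well.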
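
A small sketch of that difference, using a made-up one-line input and hypothetical variable names:

```shell
#!/bin/sh
# Made-up input with a single title element.
line='<title>First headline</title>'

# [^<]* stops at the first "<", so only the title text is captured.
strict=$(echo "$line" | sed 's:.*<title>\([^<]*\).*:\1:')

# .* is greedy and runs to the end of the line, dragging </title> along.
greedy=$(echo "$line" | sed 's:.*<title>\(.*\).*:\1:')

echo "strict: $strict"
echo "greedy: $greedy"
```

The strict form yields just the headline, while the greedy form leaves the closing tag attached.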

4. Extract an IP address:

$ ifconfig wlan0 | grep -Eo "inet addr:[^ ]*" | grep -o "[0-9.]*"
192.168.1.3
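
To see what each stage contributes without depending on a live interface, the sketch below runs the same two greps over a hand-written sample. The sample lines (including the hardware address) and the function names are assumptions, modelled on the older ifconfig layout that prints "inet addr:"; GNU grep is assumed, which does not print empty matches for "[0-9.]*".

```shell
#!/bin/sh
# Hypothetical sample in the older ifconfig layout; not real data from any host.
sample_ifconfig() {
    printf '%s\n' \
        'wlan0     Link encap:Ethernet  HWaddr 00:11:22:33:44:55' \
        '          inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0'
}

# Stage 1 keeps "inet addr:<ip>" (everything up to the next space);
# stage 2 keeps only the run of digits and dots.
extract_ip() {
    sample_ifconfig | grep -Eo "inet addr:[^ ]*" | grep -o "[0-9.]*"
}

extract_ip
```

If ifconfig on a given system prints a different layout, the first pattern would need adjusting to match it.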





   
