笔记一

笔记源自 RegexOne

需要转义的特殊字符: * . ? + $ ^ [ ] ( ) { } | \ /

  • \s 匹配所有空格字符,等同于: [\t\n\f\r\p{Z}]
  • \f 换页符
  • \n 换行符
  • \r 回车符
  • \t 制表符
  • \p 等同于\r\n, CRLF DOS 行终止符

Tutorials

Lesson 1 The 123s

\d any single digit character

\D any single non-digit character

Lesson 2: The Dot

. any single character

\. period

Lesson 3: Matching specific characters

[abc] only a, b, c single character

Lesson 4: Excluding specific characters

[^abc] not a, b or c

Lesson 5: Character ranges

[a-z] characters a to z

[0-9] number 0 to 9

\w 字母,数字,下划线。等价于[A-Za-z0-9_]

\W 等价于\[^A-Za-z0-9_]

Lesson 6: Catching some zzz’s

{m} m repetitions

{m, n} m to n repetitions

Examples:

w{3} (three w)

[wxy]{5} (five characters, each of which can be a w, x, or y)

Lesson 7: Mr. Kleene, Mr. Kleene

* zero or more repetitions *前的字符可以重复 0 次或者更多次

+ one or more repetitions +前的字符可以重复 1 次或者更多次

Match aaaabcc

Match aabbbbc

Match aacc

Skip a

answer: aa+b*c+

Lesson 8: Characters optional

? optional character ?前的字符可以出现 0 次或者 1 次

ab?c will match either the strings “abc” or “ac

Lesson 9: All this whitespace

\s 代替 any whitespace 包括 space, tab(\t), new line(\n), return(\r)

\S 相反

Lesson 10: Starting and ending

^...$ starts and ends

^Mission: successful$ 文本必须以 Mission: 开头,successful 结尾

Lesson 11: Match groups

捕获组:

(...):匹配字符并创建捕获组

(a(bc)) capture sub-group

(.*) capture all

非捕获组:

(?:...): 匹配字符但不创建捕获组

Lesson 14: It’s all conditional

(abc|def) matches abc or def

笔记二

下面的笔记源自 github 正则表达式教程

4. 零宽度断言

正先行断言 (positive lookahead):(?=...),匹配位置的后面有指定模式

负先行断言 (negative lookahead):(?!...),匹配位置的后面没有指定模式

正后发断言 (positive lookbehind):(?<=...) 匹配位置的前面有指定模式

负后发断言 (negative lookbehind):(?<!...) 匹配位置的前面没有指定模式

比如对于字符串 foobarbarfoo

bar(?=bar)     finds the 1st bar ("bar" which has "bar" after it)
bar(?!bar)     finds the 2nd bar ("bar" which does not have "bar" after it)
(?<=foo)bar    finds the 1st bar ("bar" which has "foo" before it)
(?<!foo)bar    finds the 2nd bar ("bar" which does not have "foo" before it)
(?<=foo)bar(?=bar)    finds the 1st bar ("bar" with "foo" before it and "bar" after it)

5. 标志

/pattern/flags, 其中 flags 有:

  • i 忽略大小写。
  • g 全局搜索,返回全部匹配。
  • m 多行修饰符:锚点元字符 ^ $ 工作范围在每行的起始。

6. 贪婪匹配和惰性匹配

默认是贪婪匹配,使用?转化为惰性匹配。

"/(.*at)/" => The fat cat sat on the mat. "/(.*?at)/" => The fat cat sat on the mat.