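Notes, Part 1
These notes are based on RegexOne.
Special characters that must be escaped to match literally (/ only where it delimits the pattern, as in /pattern/flags):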
* . ? + $ ^ [ ] ( ) { } | \ /
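To make the escaping rule concrete, here is a minimal sketch in Python's `re` module (the choice of Python and the sample string are assumptions for illustration, not part of the original notes). It shows an unescaped `+` acting as a quantifier, and `re.escape` producing a pattern that matches the text literally.

```python
import re

literal = "1+1=2"  # hypothetical sample text

# Unescaped, "+" is a quantifier: the pattern means "one or more 1s, then 1=2".
unescaped = re.search("1+1=2", literal)

# re.escape backslash-escapes the special characters so they match literally.
escaped = re.search(re.escape(literal), literal)

print(unescaped)         # None
print(escaped.group(0))  # 1+1=2
```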
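\s
matches any whitespace character; equivalent to [\t\n\f\r\p{Z}]
\f
form feed
\n
line feed (newline)
\r
carriage return
\t
tab
\p
equivalent to \r\n (CRLF), the DOS line terminator, according to the source tutorial; in most engines \p{...} instead introduces a Unicode property class, as in \p{Z} above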
Tutorials
Lesson 1 The 123s
\d
any single digit character
\D
any single non-digit character
Lesson 2: The Dot
.
any single character
\.
period
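A short sketch of Lessons 1 and 2 in Python's `re` module (Python and the sample file name are assumptions made for illustration). It contrasts `\d` with `\D`, and the wildcard `.` with the escaped `\.`.

```python
import re

text = "file1.txt"  # hypothetical sample text

digits = re.findall(r"\d", text)      # every single digit character
non_digits = re.findall(r"\D", text)  # every single non-digit character

# "." matches any single character; "\." matches only a literal period.
wildcard = re.fullmatch(r"file.\.txt", text)
literal_dot = re.fullmatch(r"file1\.txt", "file1xtxt")

print(digits)                # ['1']
print(wildcard is not None)  # True
print(literal_dot)           # None
```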
Lesson 3: Matching specific characters
[abc]
a single character that is a, b, or c
Lesson 4: Excluding specific characters
[^abc]
a single character that is not a, b, or c
Lesson 5: Character ranges
[a-z]
a single character from a to z
[0-9]
a single digit from 0 to 9
\w
letters, digits, and underscore; equivalent to [A-Za-z0-9_]
\W
any character other than those; equivalent to [^A-Za-z0-9_]
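The character classes from Lessons 3 to 5 can be tried out with a small sketch like the following. Python is an assumed choice, and the word list is a hypothetical sample rather than data from the notes. Note that in Python 3 `\w` on `str` patterns also matches non-ASCII letters and digits unless `re.ASCII` is passed.

```python
import re

# Hypothetical sample words, chosen only for illustration.
words = ["can", "man", "fan", "dan", "ran", "pan"]

only_cmf = [w for w in words if re.fullmatch(r"[cmf]an", w)]  # first letter must be c, m, or f
not_drp = [w for w in words if re.fullmatch(r"[^drp]an", w)]  # first letter must not be d, r, or p

lower_az = re.findall(r"[a-z]", "aB3_z")   # lowercase range only
word_chars = re.findall(r"\w", "a-B_3!")   # letters, digits, underscore
non_word = re.findall(r"\W", "a-B_3!")     # everything else

print(only_cmf)    # ['can', 'man', 'fan']
print(word_chars)  # ['a', 'B', '_', '3']
```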
Lesson 6: Catching some zzz’s
{m}
m repetitions
{m,n}
m to n repetitions (written without a space after the comma)
Examples:
w{3}
(three w)
[wxy]{5}
(five characters, each of which can be a w, x, or y)
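A minimal sketch of the repetition quantifiers, assuming Python's `re` module; the test strings are made up for illustration. `re.fullmatch` is used so that the whole string has to fit the pattern.

```python
import re

three_w = re.fullmatch(r"w{3}", "www")          # exactly three w
two_w = re.fullmatch(r"w{3}", "ww")             # too short, no match
five_wxy = re.fullmatch(r"[wxy]{5}", "wxyxw")   # five characters, each w, x, or y
two_to_four = re.fullmatch(r"\d{2,4}", "123")   # between 2 and 4 digits
too_many = re.fullmatch(r"\d{2,4}", "12345")    # 5 digits, no match

print(three_w is not None)  # True
print(two_w)                # None
```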
Lesson 7: Mr. Kleene, Mr. Kleene
*
zero or more repetitions: the character before * may repeat 0 or more times
+
one or more repetitions: the character before + may repeat 1 or more times
Match aaaabcc
Match aabbbbc
Match aacc
Skip a
answer: aa+b*c+
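The answer can be checked against the four strings listed above with a short Python sketch (Python is an assumed choice; the strings and the pattern come from the exercise itself).

```python
import re

pattern = r"aa+b*c+"  # two or more a, zero or more b, one or more c

should_match = ["aaaabcc", "aabbbbc", "aacc"]
should_skip = ["a"]

matched = [s for s in should_match if re.fullmatch(pattern, s)]
skipped = [s for s in should_skip if not re.fullmatch(pattern, s)]

print(matched)  # ['aaaabcc', 'aabbbbc', 'aacc']
print(skipped)  # ['a']
```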
Lesson 8: Characters optional
?
optional character: the character before ? may appear 0 or 1 times
ab?c
will match either of the strings “abc” or “ac”
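As a quick check of the optional quantifier, here is a sketch assuming Python's `re` module. The third string, "abbc", is an added counterexample that is not in the notes.

```python
import re

pattern = r"ab?c"  # a, then an optional b, then c

with_b = re.fullmatch(pattern, "abc")
without_b = re.fullmatch(pattern, "ac")
two_b = re.fullmatch(pattern, "abbc")  # ? allows at most one b

print(with_b is not None, without_b is not None)  # True True
print(two_b)                                       # None
```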
Lesson 9: All this whitespace
\s
any whitespace character, including space, tab (\t), newline (\n), and carriage return (\r)
\S
the opposite: any non-whitespace character
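A small sketch of the two whitespace classes, assuming Python's `re` module and a made-up sample string. It lists the individual whitespace characters and then uses runs of whitespace and non-whitespace to tokenize the text.

```python
import re

text = "a b\tc\nd"  # hypothetical sample text: space, tab, newline

spaces = re.findall(r"\s", text)    # each whitespace character
tokens = re.findall(r"\S+", text)   # runs of non-whitespace characters
pieces = re.split(r"\s+", "a  b\t\nc")  # split on runs of whitespace

print(spaces)  # [' ', '\t', '\n']
print(tokens)  # ['a', 'b', 'c', 'd']
```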
Lesson 10: Starting and ending
^...$
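anchors the pattern to the start (^) and the end ($) of the text
^Mission: successful$
the text must start with Mission: and end with successful, so the whole text has to be exactly "Mission: successful"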
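The effect of the anchors can be seen in a sketch like this one, assuming Python's `re` module. The second and third strings are hypothetical counterexamples added for illustration.

```python
import re

anchored = r"^Mission: successful$"

exact = re.search(anchored, "Mission: successful")
extra_prefix = re.search(anchored, "Last Mission: successful")
extra_suffix = re.search(anchored, "Mission: successful upon capture")

# Without anchors the same words are found anywhere in the text.
unanchored = re.search(r"Mission: successful", "Last Mission: successful")

print(exact is not None)       # True
print(extra_prefix)            # None
print(unanchored is not None)  # True
```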
Lesson 11: Match groups
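Capturing group:
(...)
matches the enclosed characters and creates a capture group
(a(bc))
capture sub-group
(.*)
capture all
Non-capturing group:
(?:...)
matches the enclosed characters without creating a capture group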
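The difference between capturing and non-capturing groups shows up in what the match object reports. The following is a minimal sketch assuming Python's `re` module; the input strings are invented for illustration.

```python
import re

# Nested capturing groups are numbered by their opening parenthesis.
nested = re.fullmatch(r"(a(bc))", "abc")
print(nested.groups())  # ('abc', 'bc')

# (?:...) groups the repetition but is not reported as a capture.
non_capturing = re.fullmatch(r"(?:ab)+(c)", "ababc")
print(non_capturing.groups())  # ('c',)

# (.*) captures everything it spans.
capture_all = re.fullmatch(r"(.*)", "anything at all")
print(capture_all.group(1))  # anything at all
```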
Lesson 14: It’s all conditional
(abc|def)
matches abc or def
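A brief sketch of alternation, assuming Python's `re` module. The second pattern and its sentence are hypothetical examples showing how the group limits the scope of `|`.

```python
import re

# Either alternative can match at each position.
both = re.findall(r"abc|def", "abcdefghi")
print(both)  # ['abc', 'def']

# Inside a group, | only chooses between the grouped alternatives.
choice = re.fullmatch(r"I love (cats|dogs)", "I love dogs")
print(choice.group(1))  # dogs

no_choice = re.fullmatch(r"I love (cats|dogs)", "I love birds")
print(no_choice)  # None
```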
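Notes, Part 2
The notes below are based on a regular expression tutorial on GitHub.
4. Zero-width assertions
Positive lookahead: (?=...)
the match position is followed by the given pattern
Negative lookahead: (?!...)
the match position is not followed by the given pattern
Positive lookbehind: (?<=...)
the match position is preceded by the given pattern
Negative lookbehind: (?<!...)
the match position is not preceded by the given pattern
For example, given the string foobarbarfoo: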
bar(?=bar) finds the 1st bar ("bar" which has "bar" after it)
bar(?!bar) finds the 2nd bar ("bar" which does not have "bar" after it)
(?<=foo)bar finds the 1st bar ("bar" which has "foo" before it)
(?<!foo)bar finds the 2nd bar ("bar" which does not have "foo" before it)
(?<=foo)bar(?=bar) finds the 1st bar ("bar" with "foo" before it and "bar" after it)
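The five lookaround examples can be verified by looking at where each match starts. This sketch assumes Python's `re` module, which supports fixed-width lookbehind. In foobarbarfoo the first bar starts at index 3 and the second at index 6.

```python
import re

text = "foobarbarfoo"

def starts(pattern: str) -> list[int]:
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

print(starts(r"bar(?=bar)"))         # [3]  first bar: followed by bar
print(starts(r"bar(?!bar)"))         # [6]  second bar: not followed by bar
print(starts(r"(?<=foo)bar"))        # [3]  first bar: preceded by foo
print(starts(r"(?<!foo)bar"))        # [6]  second bar: not preceded by foo
print(starts(r"(?<=foo)bar(?=bar)")) # [3]  first bar: both conditions
```

Because the assertions are zero-width, each match still consists of just the three characters bar.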
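5. Flags
/pattern/flags
where flags can be:
- i ignore case.
- g global search: return all matches rather than only the first.
- m multiline: the anchors ^ and $ match at the start and end of each line, not only of the whole text.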
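The /pattern/flags syntax belongs to languages with regex literals. As a rough equivalent, and only as an assumption about how one might try these flags in Python, `re.IGNORECASE` corresponds to i and `re.MULTILINE` to m; Python has no g flag, so `re.findall` stands in for a global search and `re.search` for a first-match search.

```python
import re

text = "The cat\nthe hat"  # hypothetical two-line sample

first_only = re.search(r"the", text).group(0)         # like no g flag: first match only
ignore_case = re.findall(r"the", text, re.IGNORECASE) # like /the/gi

line_starts_default = re.findall(r"^\w+", text)                  # ^ only at start of text
line_starts_multi = re.findall(r"^\w+", text, re.MULTILINE)      # ^ at start of each line
line_ends_multi = re.findall(r"\w+$", text, re.MULTILINE)        # $ at end of each line

print(ignore_case)          # ['The', 'the']
print(line_starts_default)  # ['The']
print(line_starts_multi)    # ['The', 'the']
print(line_ends_multi)      # ['cat', 'hat']
```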
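6. Greedy and lazy matching
Matching is greedy by default; appending ? to a quantifier makes it lazy.
"/(.*at)/"
=> The fat cat sat on the mat. (matches "The fat cat sat on the mat", up to the last "at")
"/(.*?at)/"
=> The fat cat sat on the mat. (matches "The fat", up to the first "at")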
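The two patterns can be compared directly on the same sentence. This is a minimal sketch assuming Python's `re` module, with the pattern delimiters dropped because Python takes the pattern as a plain string.

```python
import re

sentence = "The fat cat sat on the mat."

greedy = re.search(r"(.*at)", sentence).group(1)  # .* takes as much as possible
lazy = re.search(r"(.*?at)", sentence).group(1)   # .*? takes as little as possible

print(greedy)  # The fat cat sat on the mat
print(lazy)    # The fat
```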