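Getting Started with Python Regular Expressions: Using Them Through the re Module

Regular expressions are a powerful text-processing tool: they let you describe patterns of text using special characters and syntax. In Python, regular expressions are available through the re module, which provides functions for searching, matching, and modifying text.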

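Matching Characters

Ordinary characters match themselves in a regular expression; for example, a matches the character 'a'. Special characters such as . and * have special meanings.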

import re

pattern = r"a"
string = "apple"
match = re.search(pattern, string)
print(match.group())  # prints 'a'
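
Because characters such as . and * carry special meaning, matching them literally requires escaping. The following is a minimal sketch of that idea; the sample text is made up for illustration and is not from the original article.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "version 3x10 and version 3.10"

# An unescaped dot matches any character, so the first hit is "3x10".
loose = re.search(r"3.10", text)

# Escaping the dot with a backslash makes it match a literal "." only.
strict = re.search(r"3\.10", text)

# re.escape builds an escaped pattern from a literal string.
escaped = re.escape("3.10")

print(loose.group())   # prints '3x10'
print(strict.group())  # prints '3.10'
print(escaped)         # prints '3\.10'
```

re.escape is handy when the literal text comes from user input and may contain metacharacters.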

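Repetition

*: matches the preceding character (or group) 0 or more times.
+: matches the preceding character (or group) 1 or more times.
?: matches the preceding character (or group) 0 or 1 times.
{m,n}: matches the preceding character (or group) at least m times and at most n times.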

pattern = r"ab*"
string = "abbb"
match = re.search(pattern, string)
print(match.group())  # prints 'abbb'
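
The example above covers only *. A short sketch of the other quantifiers in the list might look like this; the sample strings are assumptions picked to make each behavior visible.

```python
import re

# "colou?r": the "u" may appear 0 or 1 times.
optional = re.findall(r"colou?r", "color colour")

# "ab+": one "a" followed by at least one "b"; a bare "a" does not match.
plus_match = re.search(r"ab+", "a ab abb")

# "\d{2,4}": between 2 and 4 digits; matching is greedy, so it takes 4.
bounded = re.search(r"\d{2,4}", "id 123456")

print(optional)            # prints ['color', 'colour']
print(plus_match.group())  # prints 'ab'
print(bounded.group())     # prints '1234'
```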

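Using Metacharacters

Metacharacters are special characters in a regular expression that express particular matching rules. For example:

.: matches any character except a newline.
^: matches the start of the string.
$: matches the end of the string.
[]: matches any single character listed inside the brackets.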

pattern = r"."
string = "apple"
match = re.search(pattern, string)
print(match.group())  # prints 'a'
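
The anchors and the character class from the list can be tried the same way. This is an illustrative sketch with an assumed sample string; note that re.search returns None when nothing matches, so the result should be checked before calling group().

```python
import re

# Hypothetical sample string for illustration.
text = "apple pie"

# ^ anchors the pattern at the start of the string.
starts = re.search(r"^apple", text)

# $ anchors the pattern at the end of the string.
ends = re.search(r"pie$", text)

# "pie" is not at the start, so this search finds nothing.
no_match = re.search(r"^pie", text)

# [aeiou] matches any single character listed inside the brackets.
vowels = re.findall(r"[aeiou]", text)

print(starts.group())  # prints 'apple'
print(ends.group())    # prints 'pie'
print(no_match)        # prints None
print(vowels)          # prints ['a', 'e', 'i', 'e']
```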

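Modifying Strings

Regular expressions can be used not only to match text but also to modify strings, for example by replacing or splitting them.

re.sub(pattern, repl, string): replaces every substring of string that matches pattern with repl.
re.split(pattern, string): splits string at each substring that matches pattern.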

pattern = r"\s+"
string = "hello   world"
new_string = re.sub(pattern, "-", string)
print(new_string)  # prints 'hello-world'

split_string = re.split(pattern, string)
print(split_string)  # prints ['hello', 'world']
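
Both functions also accept an optional limit: count for re.sub and maxsplit for re.split. The sketch below shows them on an assumed sample string; the limits are passed as keyword arguments.

```python
import re

# Hypothetical sample string with uneven spacing.
string = "one  two   three"

# count=1 replaces only the first run of whitespace.
first_only = re.sub(r"\s+", "-", string, count=1)

# maxsplit=1 splits once and leaves the rest of the string intact.
head_tail = re.split(r"\s+", string, maxsplit=1)

print(first_only)  # prints 'one-two   three'
print(head_tail)   # prints ['one', 'two   three']
```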

Some Common Examples

import re

# Sample string
text = "Contact us at support@tellmethecode.com or call 123-456-7890."

# Match an email address
email_pattern = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
email_match = re.search(email_pattern, text)
if email_match:
    print(f"Email found: {email_match.group()}")  # prints 'Email found: support@tellmethecode.com'

# Match a phone number
phone_pattern = r"\d{3}-\d{3}-\d{4}"
phone_match = re.search(phone_pattern, text)
if phone_match:
    print(f"Phone number found: {phone_match.group()}")  # prints 'Phone number found: 123-456-7890'

# Replace the phone number
new_text = re.sub(phone_pattern, "XXX-XXX-XXXX", text)
print(new_text)  # prints 'Contact us at support@tellmethecode.com or call XXX-XXX-XXXX.'

# Split the string on whitespace
split_text = re.split(r"\s+", text)
print(split_text)  # prints ['Contact', 'us', 'at', 'support@tellmethecode.com', 'or', 'call', '123-456-7890.']
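
re.search returns only the first match. When a text may contain several matches, re.findall and re.finditer are possible alternatives. The sketch below reuses the phone pattern from above on a made-up string with two numbers.

```python
import re

# Hypothetical text containing two phone numbers.
text = "Call 123-456-7890 or 987-654-3210."
phone_pattern = r"\d{3}-\d{3}-\d{4}"

# findall returns every non-overlapping match as a list of strings.
all_phones = re.findall(phone_pattern, text)

# finditer yields match objects, which also carry positions.
spans = [m.span() for m in re.finditer(phone_pattern, text)]

print(all_phones)  # prints ['123-456-7890', '987-654-3210']
print(spans)       # prints [(5, 17), (21, 33)]
```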

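Summary

Regular expressions are a very powerful tool for string processing, but they have limits. Some complex string-processing tasks can be done with regular expressions, yet the resulting patterns may become complicated and hard to maintain. In those cases, straightforward Python code may be clearer and easier to understand, even if it may run slower than a carefully designed regular expression.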
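
As a small illustration of that trade-off, some simple tasks can be written with or without a regular expression. The sketch below compares the two on an assumed sample string; it makes no claim about which is faster.

```python
import re

# Hypothetical sample string.
string = "hello   world"

# Splitting on runs of whitespace with a regular expression.
regex_parts = re.split(r"\s+", string)

# The built-in str.split() with no arguments does the same job here.
plain_parts = string.split()

# Checking a prefix: re.match versus str.startswith.
regex_prefix = re.match(r"hello", string) is not None
plain_prefix = string.startswith("hello")

print(regex_parts, plain_parts)    # prints ['hello', 'world'] ['hello', 'world']
print(regex_prefix, plain_prefix)  # prints True True
```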
