Python Regular Expressions (Regex)
Regular expressions (regex) are sequences of characters that define a search pattern. In Python they are used mainly for matching and manipulating strings. The standard library includes a built-in module called re for working with regular expressions.
What is a Regular Expression?
A regular expression is a string written in a small pattern language that describes which pieces of text should match. It is used for tasks such as searching, replacing, and validating strings. Regex is widely used in text processing, web scraping, and data validation.
Syntax of Regular Expressions
Regular expressions follow a set of syntactical rules. Some common regex metacharacters are:
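- . : Matches any single character except a newline (the re.DOTALL flag makes it match newlines too)
- ^ : Matches the start of a string (and the start of each line when re.MULTILINE is set)
- $ : Matches the end of a string, or just before a trailing newline (and the end of each line when re.MULTILINE is set)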
- * : Matches 0 or more repetitions of the preceding element (a character, character class, or group)
- + : Matches 1 or more repetitions of the preceding element
- ? : Matches 0 or 1 repetition of the preceding element
- [] : Matches any one of the characters inside the brackets
- | : Alternation, meaning "or"
- \ : Escapes a special character so it is matched literally, or introduces a special sequence such as \d
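The metacharacters listed above can be tried out with a few short checks. This is a minimal sketch: the sample strings and variable names are made up for illustration, and the patterns are written as raw strings (r"...") so that backslashes reach the re module unchanged.

```python
import re

# . matches any single character except a newline
dot = re.findall(r"c.t", "cat cot c\nt cut")

# ^ and $ anchor the pattern to the start and end of the string
anchored = bool(re.search(r"^Hello, World!$", "Hello, World!"))

# *, + and ? control how often the preceding element may repeat
star = re.findall(r"ab*", "a ab abb")
plus = re.findall(r"ab+", "a ab abb")
optional = re.findall(r"colou?r", "color colour")

# [] matches one character from a set; | chooses between alternatives
vowels = re.findall(r"[aeiou]", "regex")
pets = re.findall(r"cat|dog", "cat, dog, bird")

# \ escapes a metacharacter so that it is matched literally
literal_dot = re.findall(r"\.", "a.b.c")

print(dot)          # ['cat', 'cot', 'cut']
print(anchored)     # True
print(star)         # ['a', 'ab', 'abb']
print(plus)         # ['ab', 'abb']
print(optional)     # ['color', 'colour']
print(vowels)       # ['e', 'e']
print(pets)         # ['cat', 'dog']
print(literal_dot)  # ['.', '.']
```

Note that "c\nt" is skipped in the first result because the dot does not match the newline between "c" and "t".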
Using Regular Expressions in Python
Python provides the re module to work with regular expressions. Some common functions provided by the re module include:
- re.match() : Checks if the regex pattern matches the beginning of the string
- re.search() : Searches the string for the first location where the regex pattern matches
- re.findall() : Returns a list of all non-overlapping matches of the pattern in the string
- re.sub() : Replaces every non-overlapping occurrence of the pattern with a new string and returns the result
- re.split() : Splits the string wherever the regex pattern matches
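The difference between re.match() and re.search() is easy to miss, so here is a short sketch of it. The sample text and variable names are assumptions chosen for illustration.

```python
import re

text = "Say Hello, World!"

# re.match() only succeeds if the pattern matches at position 0
at_start = re.match(r"Hello", text)

# re.search() scans forward and returns the first match anywhere in the string
anywhere = re.search(r"Hello", text)

print(at_start)          # None, because the text starts with "Say"
print(anywhere.group())  # Hello
print(anywhere.span())   # (4, 9)
```

Both functions return a match object on success and None on failure, which is why their result is usually tested with an if statement before it is used.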
Examples
1. Matching a Pattern
import re
# Check if the pattern matches the start of the string
pattern = "^Hello"
string = "Hello, World!"
match = re.match(pattern, string)
if match:
    print("Match found!")
else:
    print("Match not found!")
Output
Match found!
2. Finding All Occurrences of a Pattern
import re
# Find all occurrences of the pattern 'o' in the string
pattern = "o"
string = "Hello, World!"
matches = re.findall(pattern, string)
print(matches)
Output
['o', 'o']
3. Replacing a Pattern in a String
import re
# Replace 'World' with 'Python' in the string
pattern = "World"
string = "Hello, World!"
new_string = re.sub(pattern, "Python", string)
print(new_string)
Output
Hello, Python!
4. Splitting a String Using a Pattern
import re
# Split the string at each occurrence of the pattern 'o'
pattern = "o"
string = "Hello, World!"
result = re.split(pattern, string)
print(result)
Output
['Hell', ', W', 'rld!']
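Both re.sub() and re.split() also accept an optional limit, and the pattern passed to them can be more than a single literal character. The sketch below reuses the "Hello, World!" string from the examples. The count and maxsplit keyword arguments are part of the re module, while the variable names are only illustrative.

```python
import re

string = "Hello, World!"

# count=1 replaces only the first occurrence of the pattern
first_only = re.sub(r"o", "0", string, count=1)

# maxsplit=1 splits at the first occurrence and leaves the rest intact
split_once = re.split(r"o", string, maxsplit=1)

# A character class with + treats any run of commas, spaces,
# or exclamation marks as a single separator
words = re.split(r"[,!\s]+", string)

print(first_only)  # Hell0, World!
print(split_once)  # ['Hell', ', World!']
print(words)       # ['Hello', 'World', '']
```

The empty string at the end of the last result appears because the text ends with a separator ("!"), so there is nothing after the final split point.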
Python’s regular expressions allow you to perform a wide range of string matching and manipulation tasks efficiently. Understanding the various functions and how to apply regex patterns will make it easier to work with complex strings in your Python projects.