Regular Expressions (Regex) in Python

Regular expressions (Regex) are sequences of characters that define search patterns. They are widely used for pattern matching and string manipulation tasks such as validation, searching, and substitution.


Importing the re Module

To use regex in Python, you need to import the re module:

import re

Common Regex Functions

re.match()

Checks for a match only at the beginning of the string. It returns a match object on success and None otherwise.

import re
pattern = r"^Hello"  # re.match() already anchors at the start, so "^" is optional here
result = re.match(pattern, "Hello, world!")
if result:
    print("Match found!")

re.search()

Scans the string and returns the first match found at any position, or None if there is no match.

result = re.search(r"world", "Hello, world!")
if result:
    print("Match found!")
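
The difference between re.match() and re.search() is easiest to see when the pattern does not sit at the start of the string. The following is a small sketch using the same sample text; the variable names at_start and anywhere are only illustrative.

```python
import re

text = "Hello, world!"

# re.match() only tries the pattern at position 0, so "world" is not found.
at_start = re.match(r"world", text)

# re.search() scans forward and returns the first match anywhere in the string.
anywhere = re.search(r"world", text)

print(at_start)          # None
print(anywhere.group())  # world
print(anywhere.span())   # (7, 12)
```

Both functions return a match object on success, so group() and span() are available in either case.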

re.findall()

Returns all non-overlapping matches of a pattern in a string as a list.

matches = re.findall(r"\d+", "The price is 45 dollars and 30 cents.")
print(matches)  # Output: ['45', '30']
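
One detail worth knowing is that the shape of the result depends on capturing groups in the pattern. The sketch below reuses the sentence from the example above; the pattern with "dollars|cents" is an illustrative addition.

```python
import re

text = "The price is 45 dollars and 30 cents."

# No capturing groups: each item is the full matched text.
whole = re.findall(r"\d+", text)

# Two capturing groups: each item is a tuple with one entry per group.
pairs = re.findall(r"(\d+) (dollars|cents)", text)

print(whole)  # ['45', '30']
print(pairs)  # [('45', 'dollars'), ('30', 'cents')]
```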

re.finditer()

Returns an iterator yielding match objects for all matches.

for match in re.finditer(r"\d+", "The price is 45 dollars and 30 cents."):
    print(match.group())
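
Because finditer() yields match objects rather than strings, it is a reasonable choice when the position of each match matters. A minimal sketch, using the same sentence (the name found is illustrative):

```python
import re

text = "The price is 45 dollars and 30 cents."

# Collect the matched text together with its start and end offsets.
found = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\d+", text)]

print(found)  # [('45', 13, 15), ('30', 28, 30)]
```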

re.sub()

Replaces occurrences of a pattern with a specified string.

result = re.sub(r"\d+", "[NUMBER]", "Item1: 123, Item2: 456")
print(result)  # Output: "Item1: [NUMBER], Item2: [NUMBER]"
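
The replacement does not have to be a fixed string. As a sketch of two other options, the example below uses a backreference to a capturing group and a function that computes each replacement; the names swapped and doubled are illustrative.

```python
import re

text = "Item1: 123, Item2: 456"

# \1 and \2 in the replacement refer to the first and second capturing groups.
swapped = re.sub(r"(\w+): (\d+)", r"\2 -> \1", text)

# A function receives each match object and returns the replacement text.
# \b keeps the digit inside "Item1" and "Item2" from being matched.
doubled = re.sub(r"\b\d+\b", lambda m: str(int(m.group()) * 2), text)

print(swapped)  # 123 -> Item1, 456 -> Item2
print(doubled)  # Item1: 246, Item2: 912
```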
Info

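Use re.compile() for a pattern you apply many times. The compiled pattern object offers the same methods (match, search, findall, sub, and so on), keeps the pattern in one named place, and skips the lookup in the re module's internal pattern cache on every call.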
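
A minimal sketch of a compiled pattern reused across several strings. The sample lines and the names number and counts are made up for illustration.

```python
import re

# Compile once, then call methods on the pattern object.
number = re.compile(r"\d+")

lines = ["order 123", "no digits here", "sizes 456 and 789"]

# Count how many numbers appear in each line.
counts = [len(number.findall(line)) for line in lines]

print(counts)  # [1, 0, 2]
```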


Special Characters and Meta-characters

Regex uses special characters to define patterns:

Character   Description
.           Matches any single character except a newline.
^           Matches the beginning of the string.
$           Matches the end of the string.
*           Matches 0 or more repetitions of the preceding element.
+           Matches 1 or more repetitions of the preceding element.
?           Matches 0 or 1 repetition of the preceding element.
{n}         Matches exactly n repetitions.
{n,}        Matches n or more repetitions.
{n,m}       Matches between n and m repetitions.
[]          Matches any one character listed inside the brackets.
|           Acts as an OR operator between alternatives.
()          Groups patterns and captures the matched text.
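
To make the table concrete, here is a small sketch that checks a few sample patterns against sample strings. It uses re.fullmatch() so that the whole string has to match; the patterns, strings, and the name samples are illustrative choices, not part of any standard set.

```python
import re

# (pattern, string, whether the whole string is expected to match)
samples = [
    (r"c.t", "cat", True),        # . matches any single character
    (r"colou?r", "color", True),  # ? makes the "u" optional
    (r"ab*c", "ac", True),        # * allows zero "b" characters
    (r"ab+c", "ac", False),       # + requires at least one "b"
    (r"\d{3}", "1234", False),    # {3} means exactly three digits
    (r"\d{2,}", "1234", True),    # {2,} means two or more digits
    (r"[aeiou]", "e", True),      # [] matches one listed character
    (r"cat|dog", "dog", True),    # | chooses between alternatives
    (r"(ab){2}", "abab", True),   # () groups "ab" so {2} repeats it
]

results = [bool(re.fullmatch(pattern, s)) for pattern, s, _ in samples]
print(results == [expected for _, _, expected in samples])  # True
```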

Examples of Regex Usage

Email Validation

pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
email = "example@example.com"
if re.match(pattern, email):
    print("Valid email!")
else:
    print("Invalid email!")
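
One subtlety when validating with re.match() and "$" is that "$" also matches just before a trailing newline. A possible alternative, sketched below, is re.fullmatch(), which requires the entire string to match. EMAIL and is_valid_email are hypothetical names, and the pattern is the one above with the anchors removed and the hyphen moved to the end of the last character class.

```python
import re

# Illustrative constant: the email pattern without ^ and $.
EMAIL = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+"


def is_valid_email(candidate):
    """Return True only if the entire string matches the pattern."""
    return re.fullmatch(EMAIL, candidate) is not None


print(is_valid_email("example@example.com"))    # True
print(is_valid_email("example@example.com\n"))  # False
print(is_valid_email("not-an-email"))           # False

# With re.match(), "$" accepts the trailing newline.
print(bool(re.match("^" + EMAIL + "$", "example@example.com\n")))  # True
```

This pattern is a simple syntactic check and still accepts some addresses that are not deliverable.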

Extracting URLs

text = "Visit https://example.com or http://example.org."
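# Require a word character after each dot so the sentence-ending period is not captured.
urls = re.findall(r"https?://[\w-]+(?:\.[\w-]+)+", text)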
print(urls)  # Output: ['https://example.com', 'http://example.org']

Splitting a String

result = re.split(r",\s*", "apple, banana, cherry")
print(result)  # Output: ['apple', 'banana', 'cherry']
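
re.split() has two options that are easy to miss: maxsplit limits the number of splits, and a capturing group in the pattern keeps the delimiters in the result. A short sketch, with illustrative variable names:

```python
import re

# maxsplit=1 splits only at the first delimiter.
first_only = re.split(r",\s*", "apple, banana, cherry", maxsplit=1)

# A capturing group around the delimiter keeps it in the output list.
with_delims = re.split(r"\s*([,;])\s*", "apple, banana; cherry")

print(first_only)   # ['apple', 'banana, cherry']
print(with_delims)  # ['apple', ',', 'banana', ';', 'cherry']
```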
Deep Dive

Regex is powerful but can become complex. Use tools like regex101.com to test and debug your patterns interactively.
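
Inside Python itself, one way to keep a complex pattern readable is the re.VERBOSE flag, which ignores whitespace in the pattern and allows comments. The sketch below assumes a phone number format of +country-area-prefix-line; the name PHONE and the group names are illustrative.

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and "#" starts a comment.
PHONE = re.compile(
    r"""
    \+(?P<country>\d{1,3})   # plus sign and 1 to 3 digit country code
    -(?P<area>\d{3})         # three digit area code
    -(?P<prefix>\d{3})       # three digit prefix
    -(?P<line>\d{4})         # four digit line number
    """,
    re.VERBOSE,
)

parts = PHONE.fullmatch("+1-800-555-1234")
print(parts.groupdict())
# {'country': '1', 'area': '800', 'prefix': '555', 'line': '1234'}
```

Because "#" begins a comment in verbose mode, a literal "#" in such a pattern has to be escaped or placed inside a character class.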

Task

Practice: Regex Exercises

  1. Find All Words Starting with a Capital Letter:

    • Input: "The quick Brown Fox jumps Over the lazy Dog."
    • Pattern: \b[A-Z][a-z]*\b

    Example:

    text = "The quick Brown Fox jumps Over the lazy Dog."
    words = re.findall(r"\b[A-Z][a-z]*\b", text)
    print(words)  # Output: ['The', 'Brown', 'Fox', 'Over', 'Dog']
    
  2. Validate Phone Numbers:

    • Input: "+1-800-555-1234"
    • Pattern: ^\+\d{1,3}-\d{3}-\d{3}-\d{4}$

    Example:

    phone = "+1-800-555-1234"
    if re.match(r"^\+\d{1,3}-\d{3}-\d{3}-\d{4}$", phone):
        print("Valid phone number!")
    else:
        print("Invalid phone number!")
    
  3. Extract Hashtags from a Tweet:

    • Input: "#Python is amazing! #coding #regex"
    • Pattern: #\w+

    Example:

    tweet = "#Python is amazing! #coding #regex"
    hashtags = re.findall(r"#\w+", tweet)
    print(hashtags)  # Output: ['#Python', '#coding', '#regex']
    

Conclusion

Regex is an essential tool for Python developers who process text. The re module covers the common tasks shown here (matching, searching, extracting, substituting, and splitting), and regular practice with real inputs is the quickest way to become fluent with it.

Copyright © 2025 Devship. All rights reserved.

Made by imParth