Class notes

Best Python Notes arranged according to topics

Pages
29
Uploaded on
13-09-2021
Written in
2021/2022


9/11/21, 11:20 PM 14. Regular Expression [RegEx] - Part 1 - Jupyter Notebook




What are regular expressions?
A regular expression (regex) defines a pattern for searching or manipulating strings. We can use a regular expression to match,
search, and replace text inside textual data.


Regular expressions are also instrumental in extracting information from text such as log files, spreadsheets, or other textual documents.
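
As a rough illustration of searching, matching, and replacing, here is a minimal sketch using the standard re module. The log line is made up for this example and does not come from the notes.

```python
import re

# Hypothetical log line, invented for illustration.
log_line = "2021-09-11 23:20:05 ERROR disk usage at 91 percent"

# Search: find the first run of four digits (the year in this line).
first = re.search(r"\d{4}", log_line)
year = first.group()

# Match all: collect every single digit in the line.
digits = re.findall(r"\d", log_line)

# Replace: mask every digit with "#".
masked = re.sub(r"\d", "#", log_line)

print(year)
print(len(digits))
print(masked)
```

Here re.search() returns only the first match, re.findall() returns every match as a list of strings, and re.sub() returns a new string with the matches replaced.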



Example 1: Write a regular expression to search for digits inside a string

In [1]: import re
        string = "My roll no. is 25"
        a = re.findall(r"\d", string)
        a

Out[1]: ['2', '5']


Understand this example

We imported the re module into our program.
Next, we created the regex pattern \d, which matches any single digit from 0 to 9.
After that, we used the re.findall() method to find every match of our pattern.
In the end, we got the two digits 2 and 5 as separate strings.
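
To see why the result is two separate digits rather than 25, the following sketch may help. It reuses the string from Example 1 and borrows the curly-brace repetition count that these notes explain in the re.compile() section; the variable names are only for illustration.

```python
import re

string = "My roll no. is 25"

# \d matches exactly one digit, so "25" comes back as two separate strings.
single_digits = re.findall(r"\d", string)

# Asking for two digits in a row returns the number as one string.
two_digits = re.findall(r"\d{2}", string)

# Matches are always strings; convert explicitly if a number is needed.
roll_no = int(two_digits[0])

print(single_digits)
print(two_digits)
print(roll_no)
```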


Use raw string to define a regex
Note: I have used a raw string to define the pattern, like this: r"\d". Always write your regex as a raw string.

As you may already know, the backslash has a special meaning in ordinary Python strings because it starts an escape sequence such as \n.
Regex patterns also rely on backslashes, so use a raw string to keep Python from interpreting them before the regex engine sees them.
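
A small sketch of what can go wrong without a raw string. It uses \b, which is not covered in these notes: in an ordinary Python string \b is the backspace character, while in a regex it marks a word boundary. The sample text is made up.

```python
import re

# In a normal string, "\b" is the backspace character (one character each).
normal = "\bcat\b"

# In a raw string, each backslash is kept, so the regex engine receives \b
# and treats it as a word boundary.
raw = r"\bcat\b"

text = "the cat sat"

normal_result = re.findall(normal, text)
raw_result = re.findall(raw, text)

print(len(normal), len(raw))
print(normal_result)
print(raw_result)
```

The normal string is 5 characters long and finds nothing, because the text contains no backspace characters. The raw string is 7 characters long and finds the word cat.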




Python regex methods
localhost:8888/notebooks/innomatics all notes/all python notes/14. Regular Expression %5BRegEx%5D - Part 1.ipynb 1/29





1. Compile Regex Pattern using re.compile()

We can compile a regular expression into a regex object to look for occurrences of the same pattern inside
various target strings without rewriting it.


How to compile regex pattern

1. Write the regex pattern in string format
   Write the pattern as a raw string. For example, a pattern to match any digit:


   str_pattern = r'\d'


2. Pass the pattern to the compile() method


   pattern = re.compile(r'\d{3}')


   It compiles the regular expression pattern provided as a string into a regex pattern object.

3. Use the Pattern object to match a regex pattern
   Call a method of the Pattern object returned by compile() on the target string:


   res = pattern.findall(target_string)


Example to compile a regular expression

Now, let’s see how to use re.compile() with the help of a simple example.

Pattern to compile: r'\d{3}'


What does this pattern mean?

First of all, I used a raw string to specify the regular expression pattern.
Next, \d is a special sequence and it will match any digit from 0 to 9 in a target string.
Then the 3 inside curly braces means the digit has to occur exactly three times in a row inside the target string.
In simple words, it means to match any three consecutive digits inside the target string, such as 236, 452, or 782.
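
A short sketch of how this pattern behaves on runs of different lengths. The target strings are invented for illustration.

```python
import re

pattern = r"\d{3}"

# Exactly three digits in a row: one match.
exact = re.findall(pattern, "code 236")

# Fewer than three digits in a row: no match.
too_short = re.findall(pattern, "code 23")

# Seven digits in a row: matches are taken left to right without
# overlapping, so the leftover "7" is not returned.
longer = re.findall(pattern, "code 1234567")

print(exact)
print(too_short)
print(longer)
```

Note that the pattern does not require the three digits to stand alone, so a longer run of digits is split into chunks of three.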


In [2]: import re

        # target string 1
        str1 = " Deepali lucky numbers are 894 234 456 829"

        # pattern to find 3 consecutive digits
        str_pattern = r"\d{3}"

        # compile str_pattern into a re.Pattern object
        regex_pattern = re.compile(str_pattern)

        # type of the compiled object
        print(type(regex_pattern))

        # find all the matches in string 1
        result = regex_pattern.findall(str1)
        print(result)

        # target string 2
        str2 = " Harsh lucky numbers are 678 645 234 097"

        # find all the matches in the second string by reusing the same pattern
        res2 = regex_pattern.findall(str2)
        print(res2)

<class 're.Pattern'>
['894', '234', '456', '829']
['678', '645', '234', '097']


As you can see, we found four matches of “three consecutive” digits inside the first string.

Note:

The re.compile() method turned the string pattern into a re.Pattern object that we can work with.
Next, we called the findall() method of the re.Pattern object to obtain all the matches of any three consecutive digits inside the
target string.
The same regex_pattern object can then be reused to search for three consecutive digits in other target strings as well.
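
The reuse described in the note can be sketched as a loop over several target strings. The list name targets is an assumption made for this sketch; the strings are the two from the example above.

```python
import re

regex_pattern = re.compile(r"\d{3}")

# The two target strings from the example above.
targets = [
    " Deepali lucky numbers are 894 234 456 829",
    " Harsh lucky numbers are 678 645 234 097",
]

# Reuse the one compiled object for every target string.
results = [regex_pattern.findall(t) for t in targets]

# The module-level function gives the same matches from the pattern string.
same = re.findall(r"\d{3}", targets[0]) == results[0]

print(results)
print(same)
```

Compiling once keeps the pattern in a single place, which is convenient when the same search is applied to many strings.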