FAQ: Introduction to Regular Expressions - Review

This community-built FAQ covers the “Review” exercise from the lesson “Introduction to Regular Expressions”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Practical Data Cleaning

FAQs on the exercise Review

You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.


Although I found the “official” possible code solution quite complex, I admit that from an educational point of view it is fine, because it makes us review all the regular expressions we’ve learned so far.
However, to understand that solution better, I tried to split it into parts, which I wrote in vertical order. Each part corresponds to a separate character (or sequence of characters):

1?
\s?
\(?\d{3}\)?
[-.\s]?
\d{3}
[-.\s]?
\d{4}

If I have split the solution regex in the correct order, the part I still find difficult to explain is the third one, i.e. \(?\d{3}\)?
My questions are:

  • How many characters does this ‘subregex’ refer to? It seems to refer to five, but I am not sure.
  • What is the use of the first backslash \ (before the opening parenthesis) and the last one? If they are not part of shorthand character classes, what are they?
  • What is the use of the first question mark ? Written as \(? it looks like an optional quantifier, i.e. it indicates that the opening parenthesis, as a single character, is optional. However, the parenthesis here does not seem to be used as a single character but only, together with the closing parenthesis, to mark a group. So what does the ? stand for?

Sorry for the long question; I am just trying to figure out the regex solution.
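
For readers with the same questions, the split above can be reassembled and tried out in Python. This is only a sketch under a few assumptions: the third part is taken to be \(?\d{3}\)? (with the two backslashes the question mentions), the whole string is required to match, and the sample numbers are made up rather than taken from the exercise. In this reading, each backslash escapes a parenthesis so that it is a literal character instead of a group delimiter, and each ? makes that literal parenthesis optional.

```python
import re

# The parts from the post, one per line, joined with re.VERBOSE
# (whitespace and "#" comments outside character classes are ignored).
PHONE = re.compile(r"""
    1?          # optional leading 1
    \s?         # optional whitespace
    \(?         # optional literal "(" (the backslash escapes it)
    \d{3}       # three digits
    \)?         # optional literal ")"
    [-.\s]?     # optional separator: hyphen, dot or whitespace
    \d{3}       # three digits
    [-.\s]?     # optional separator
    \d{4}       # four digits
""", re.VERBOSE)

# The third part on its own (assumed form, see the lead-in).
AREA = re.compile(r"\(?\d{3}\)?")

# Hypothetical sample strings, not the ones used in the exercise.
samples = ["(718) 555-3810", "718-555-3810", "1 718 555 3810", "wildebeest"]
matches = {s: PHONE.fullmatch(s) is not None for s in samples}

# Length of text the third part can cover: 3, 4 or 5 characters.
area_lengths = [len(AREA.fullmatch(s).group()) for s in ["718", "(718", "(718)"]]

for s, ok in matches.items():
    print(f"{s!r}: {ok}")
print(area_lengths)
```

Under these assumptions the third part covers between three and five characters, depending on whether each optional parenthesis is present.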

1 Like

For what it is worth, my solution is way simpler than the official one!

(\d|\W){10,14}
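
A quick way to see what this shorter pattern accepts is to try it in Python. The sketch below assumes that the whole string has to match and uses made-up sample strings, not the ones from the exercise. It suggests the pattern is simpler because it is looser: any run of 10 to 14 digits or non-word characters passes, including a run of hyphens.

```python
import re

# Pattern from the post: 10 to 14 characters, each a digit or a non-word character.
loose = re.compile(r"(\d|\W){10,14}")

# Hypothetical sample strings, not the ones used in the exercise.
samples = ["718-555-3810", "(718) 555-3810", "----------", "wildebeest", "555-3810"]
results = {s: loose.fullmatch(s) is not None for s in samples}

for s, ok in results.items():
    print(f"{s!r}: {ok}")
```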

17 Likes

Here is my solution:
(\d|\W){1,}

5 Likes

Nice way to simplify this exercise

This is my solution; it seemed to work with 100% accuracy: (\d|[(]|[)]|.|-|\s)*
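
One thing worth checking in Python is the dot in this pattern. As the pattern appears here, the dot is not escaped, so it acts as a wildcard and lets the group match any character. The sketch below compares it with a variant where the dot is escaped; that variant is an assumption about what may have been intended, not something the post states, and the sample strings are made up.

```python
import re

# Pattern as it appears in the post (unescaped dot = any character except newline).
as_posted = re.compile(r"(\d|[(]|[)]|.|-|\s)*")

# Assumed variant with the dot escaped, so it only matches a literal ".".
escaped_dot = re.compile(r"(\d|[(]|[)]|\.|-|\s)*")

# Hypothetical sample strings, not the ones used in the exercise.
samples = ["(718) 555-3810", "718.555.3810", "wildebeest"]
results = {
    s: (as_posted.fullmatch(s) is not None, escaped_dot.fullmatch(s) is not None)
    for s in samples
}

for s, (posted, escaped) in results.items():
    print(f"{s!r}: as posted={posted}, escaped dot={escaped}")
```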

2 Likes

I used the below solution to complete the exercise.

.+\d

But I’m still not 100% sure why it works. The wildcard can be followed by a Kleene star or plus, then any digit.

At first I thought that the ‘\d’ after the wildcard allowed any character as long as the next character was a digit, so all of the correct strings matched. However, when I replaced the digit class (\d) with the word-character class (\w), I expected a similar result but with word characters (e.g. all of the strings on the right made up only of letters would match, and the digit strings would not). Instead, the result matches all strings.

Can anyone explain why this works, as well as why ‘\w’ does not?

4 Likes

(\W|\d)+

Most simple solution. Try to beat me+
woo+

2 Likes

\w: matches a single uppercase character, lowercase character, digit or underscore
\d: matches a single digit character

Since these strings don’t have a number at the end of them, they won’t be matched by .+\d

  • :no_entry_sign:wildebeest
  • :no_entry_sign:hippopotamus
  • :no_entry_sign:woolly mammoth

Yet they’re matched by .+\w since they have at least a letter at the end of them.

  • :white_check_mark:wildebeest
  • :white_check_mark:hippopotamus
  • :white_check_mark:woolly mammoth

When the whole string has to match, .+\w will match any string of 2 or more characters that ends in a word character (letter, digit or underscore).
.+\d will only match strings of 2 or more characters that end in a digit.

.+\d :

  • :no_entry_sign:wildebeest
  • :no_entry_sign:hippopotamus
  • :no_entry_sign:woolly mammoth
  • :white_check_mark:wildebeest1
  • :white_check_mark:hippopotamus1
  • :white_check_mark:woolly mammoth1
8 Likes
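
The comparison above can be reproduced in Python. This is a minimal sketch that assumes the whole string has to match (re.fullmatch); the helper name whole_match and the two extra samples "--" and "7" are added here for illustration. Because \w also covers digits, .+\w accepts the strings ending in 1 as well, which is why swapping \d for \w ends up matching both columns.

```python
import re

def whole_match(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

samples = ["wildebeest", "wildebeest1", "woolly mammoth", "woolly mammoth1", "--", "7"]

# For each sample: (matched by .+\d, matched by .+\w)
table = {s: (whole_match(r".+\d", s), whole_match(r".+\w", s)) for s in samples}

for s, (digit_end, word_end) in table.items():
    print(f"{s!r}: .+\\d={digit_end}, .+\\w={word_end}")
```

The last two samples show the limits of both patterns: "--" has no word character to end on, and "7" is too short because .+ needs at least one character before the final one.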

but it is matching all inputs … :astonished:

2 Likes

Nice explanation, keep it up.

2 Likes

Here is my convoluted solution:
\(?1?[-.\s]*\d{3}\)?[-.\s]*\d{3}[-.\s]*\d{4}
which basically says:

  • \(? The opening parenthesis for the area code is optional.

  • 1? The 1 before certain phone numbers is optional.

  • [-.\s]* Zero or more occurrences of whitespace, - or .

  • \d{3} Three digits.

  • \)? The closing parenthesis for the area code is optional.
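
The pattern and the breakdown above can be checked in Python. The sketch assumes the whole string has to match and uses made-up phone numbers, not the ones from the exercise. One consequence of the ordering is visible in the last sample: since \(? comes before 1?, a number written with the 1 ahead of the opening parenthesis is not matched as a whole.

```python
import re

# Pattern from the post, unchanged.
phone = re.compile(r"\(?1?[-.\s]*\d{3}\)?[-.\s]*\d{3}[-.\s]*\d{4}")

# Hypothetical sample strings, not the ones used in the exercise.
samples = [
    "(718) 555-3810",
    "1-718-555-3810",
    "718.555.3810",
    "1 (718) 555-3810",  # the 1 precedes "(", but the pattern expects "(" first
]
results = {s: phone.fullmatch(s) is not None for s in samples}

for s, ok in results.items():
    print(f"{s!r}: {ok}")
```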

3 Likes

Here’s my solution which seems a lot more complicated than most of the other solutions here :frowning:

(\d)\d(\s|-|.)?-?\d*(\s|-|.)?\d*\s?\d*

Cheers :slight_smile:Andy

here’s what i came up with

[^a-z]*

Hope it helps
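
To see what this negated character class accepts, it can be tried in Python. The sketch below assumes the whole string has to match and uses made-up samples. The pattern rejects anything containing a lowercase letter, but it also accepts the empty string and uppercase text, so it describes "no lowercase letters" rather than a phone number format.

```python
import re

# Pattern from the post: zero or more characters that are not lowercase a-z.
no_lowercase = re.compile(r"[^a-z]*")

# Hypothetical sample strings, not the ones used in the exercise.
samples = ["718-555-3810", "wildebeest", "HELLO", ""]
results = {s: no_lowercase.fullmatch(s) is not None for s in samples}

for s, ok in results.items():
    print(f"{s!r}: {ok}")
```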

5 Likes

Did anyone click on the URL? It was worth unlocking:
onyourexcitingjourneylearningtocodeyouwillfindthis
@sonnynomnom Knowledge :open_book::blue_book:

2 Likes

How about this

[^a-z]*

1 Like

That was hard for me. I could not do it :frowning:

Here is mine:
.?\d.*

2 Likes

.*\d

I think this is the simplest way (only 4 characters)

2 Likes

This doesn’t seem to work. Check again.