You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.
If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.
Although I found the “official” possible code solution quite complex, I admit that from an educational point of view it is fine, because it makes us review all the regular expressions we’ve learned so far.
However, in order to understand that solution better, I tried to split it into parts, which I wrote one per line. Each part corresponds to a separate character (or sequence of characters):
1?
\s?
\(?\d{3}\)?
[-.\s]?
\d{3}
[-.\s]?
\d{4}
If I have split the solution regex in the correct order, what I still find difficult to explain is the third part, i.e. \(?\d{3}\)?
My questions are:
How many characters does this ‘subregex’ refer to? It seems that it refers to five, but I am not too sure.
What is the use of the first backslash (before the opening parenthesis) and the last backslash? If they are not part of shorthand character classes, what are they?
What is the use of the first question mark? The way it is written, \(?, it looks like an optional quantifier, i.e. it indicates that the opening parenthesis, as a single character, is optional. However, I thought the parentheses here were not used as single characters but only, together with the closing parenthesis, to mark a group. So, what does the ? stand for?
Sorry for the long question, I am just trying to figure out the regex solution.
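One way to explore these questions is to rebuild the pattern from the same seven parts and test the third part by itself. The following is only a sketch in Python, assuming the third part of the official solution is \(?\d{3}\)? with both parentheses escaped; the names PARTS, PHONE, THIRD and the sample phone numbers are made up for illustration and are not taken from the exercise.

```python
import re

# The seven parts from the post, in order (assumed reconstruction).
PARTS = [
    r"1?",           # optional leading "1"
    r"\s?",          # optional whitespace character
    r"\(?\d{3}\)?",  # optional literal "(", three digits, optional literal ")"
    r"[-.\s]?",      # optional separator: hyphen, dot, or whitespace
    r"\d{3}",        # three digits
    r"[-.\s]?",      # optional separator
    r"\d{4}",        # four digits
]
PHONE = re.compile("".join(PARTS))
THIRD = re.compile(PARTS[2])

# "\(" is an escaped, literal parenthesis, not the start of a group,
# so "?" makes that single character optional.
area_codes = ["718", "(718", "718)", "(718)"]
third_lengths = [len(s) for s in area_codes if THIRD.fullmatch(s)]

# Hypothetical sample numbers, checked against the whole pattern.
samples = ["(718) 555-1212", "718-555-1212", "1 718.555.1212", "7185551212"]
sample_results = {s: PHONE.fullmatch(s) is not None for s in samples}

print(third_lengths)
print(sample_results)
```

Under that assumption, the third part matches between three and five characters: the three digits are required, and each of the two escaped parentheses is optional on its own. The backslashes are there because an unescaped ( would open a group instead of matching a literal parenthesis.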
I used the solution below to complete the exercise.
.+\d
But I’m still not 100% sure why it works. The wildcard can be followed by a Kleene star or plus, then any digit.
At first I thought that the ‘\d’ after the wildcard allowed any characters as long as the next character was a digit, so all of the correct strings matched. I expected that if I swapped the digit class (\d) for the word character class (\w), something similar would happen with word characters (e.g. the strings on the right that contain only letters would match, and those with digits would not). Instead, the result matches all of the strings.
Can anyone explain why this works, and why ‘\w’ does not?
\w: matches a single uppercase character, lowercase character, digit or underscore
\d: matches a single digit character
Since these strings contain no digits at all, they won’t be matched by .+\d
wildebeest
hippopotamus
woolly mammoth
Yet they’re matched by .+\w, since each of them ends in a letter, and a letter is a word character.
wildebeest
hippopotamus
woolly mammoth
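.+\w matches whenever at least one character is followed by a word character, so nearly any string of two or more characters that ends in a letter, digit or underscore qualifies. Because digits are word characters too, it also matches the strings that contain numbers. .+\d matches only when at least one character is followed by a digit.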
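To check this explanation, here is a small Python sketch that runs both patterns over the three strings above. The helper name matches and the extra test strings ("camel 7", "7" and "!!") are made up for illustration and are not taken from the exercise; re.search is used on the assumption that the exercise looks for a match anywhere in the string.

```python
import re

words = ["wildebeest", "hippopotamus", "woolly mammoth"]

def matches(pattern, text):
    """Return True if the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None

# No digits in these strings, so ".+\d" fails on all of them,
# while ".+\w" succeeds because each ends in a letter.
digit_results = {w: matches(r".+\d", w) for w in words}
word_results = {w: matches(r".+\w", w) for w in words}

# Hypothetical extra strings to probe the edge cases.
extra = {
    "camel 7": (matches(r".+\d", "camel 7"), matches(r".+\w", "camel 7")),
    "7": (matches(r".+\d", "7"), matches(r".+\w", "7")),
    "!!": (matches(r".+\d", "!!"), matches(r".+\w", "!!")),
}

print(digit_results)
print(word_results)
print(extra)
```

The single character "7" matches neither pattern, because .+ has to consume at least one character before the \d or \w. "!!" also matches neither, which shows that length alone is not enough for .+\w.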