How to use special characters or accented letters in regex?

Can you tell me the pattern for both “escape” and “@,!”
Thanks

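In a regex, ‘special characters’ are those that mean something other than themselves. Some are written with a backslash and stand for a character class, such as \s for whitespace characters or \w for word characters. Others, such as ? and * (among others), are special on their own, so they must be escaped to be matched literally: \? for an actual question mark, \* for an actual asterisk.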

The @ character can be used directly in the pattern without escaping. It is not special at all, since it has no meaning apart from being a printable character. The same goes for ! when it appears on its own.
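
A minimal sketch of the point above, written in Python with the standard `re` module (the sample strings are made up for illustration, and the thread's exercise itself may use a different engine): `@` and `!` match themselves without a backslash, while `?` and `*` have to be escaped to be matched literally.

```python
import re

# "@" and "!" have no special meaning on their own, so they match literally.
at_match = re.search(r"@", "user@example.com")
bang_match = re.search(r"!", "Hello!")

# "?" and "*" are quantifiers, so they need a backslash to match the real character.
question_match = re.search(r"\?", "Really?")
star_match = re.search(r"\*", "2 * 3")

# Left unescaped, "?" makes the preceding character optional instead.
optional_u = re.fullmatch(r"colou?r", "color")

# re.escape() adds the backslashes for you when the text comes from elsewhere.
escaped = re.escape("what?*")
escaped_match = re.fullmatch(escaped, "what?*")

print(escaped)
```

If a pattern containing `@` raises an error, the cause is usually something else in the pattern, not the `@` itself.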

The lesson should have a link in the narrative to a table of regex special characters and their meanings.

In this exercise, why do we have to add the “+” sign to “[a-zA-Z0-9]”?

I want to know the answer too!

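The + special character is known as a quantifier. It means “one or more of the preceding item”, so [a-zA-Z0-9]+ matches a run of one or more letters or digits, whereas [a-zA-Z0-9] on its own matches exactly one character.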

Quantifiers

| Quantifier | Legend | Example | Sample Match |
|---|---|---|---|
| `+` | One or more | `Version \w-\w+` | Version A-b1_1 |
| `{3}` | Exactly three times | `\D{3}` | ABC |
| `{2,4}` | Two to four times | `\d{2,4}` | 156 |
| `{3,}` | Three or more times | `\w{3,}` | regex_tutorial |
| `*` | Zero or more times | `A*B*C*` | AAACC |
| `?` | Once or none | `plurals?` | plural |

https://www.rexegg.com/regex-quickstart.html#quantifiers
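
As a quick check of the quantifier examples, here is a small Python sketch using the standard `re` module. The pattern and sample pairs come from the quantifier reference linked above; the `abc123` string and the variable names are assumptions added for illustration. `re.fullmatch` is used so that the whole sample has to fit the pattern.

```python
import re

# (pattern, sample) pairs from the quantifier reference.
rows = [
    (r"Version \w-\w+", "Version A-b1_1"),
    (r"\D{3}", "ABC"),
    (r"\d{2,4}", "156"),
    (r"\w{3,}", "regex_tutorial"),
    (r"A*B*C*", "AAACC"),  # each letter zero or more times, in order
    (r"plurals?", "plural"),
]

# True for every row whose sample fits its pattern from start to end.
results = {sample: re.fullmatch(pattern, sample) is not None
           for pattern, sample in rows}

# Without "+", the character class matches exactly one character.
one_char = re.fullmatch(r"[a-zA-Z0-9]", "abc123")
# With "+", it matches a run of one or more such characters.
one_or_more = re.fullmatch(r"[a-zA-Z0-9]+", "abc123")

for sample, ok in results.items():
    print(sample, ok)
```

The last two lines of the sketch are the answer to the exercise question: the class alone accepts a single character, and the `+` is what lets it accept a whole word.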

Hi there,

In Sweden we have 3 more letters after Z: Å, Ä + Ö

How do I manage that writing code? :slight_smile:

I was wondering the other day about str.isalpha() which if memory serves was a question regarding Chinese characters being recognized. Perhaps the same applies to Swedish?

@ionatan will be better informed than me as I believe he is from Sweden, as well. In Python 3 the default string object is unicode so one can well imagine string methods will work on all characters alphabetic (letter) in nature. Something to look into.

Unicode HOWTO
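
To look into this, here is a minimal Python 3 sketch (the sample word and variable names are made up for illustration). It suggests that `str.isalpha()` does accept Å, Ä and Ö, that an ASCII-only class such as `[a-zA-Z]` does not, and shows two ways of widening the pattern. Other engines, including the one behind the HTML `pattern` attribute, may behave differently.

```python
import re

word = "Smörgåsbord"

# str.isalpha() is Unicode-aware in Python 3.
is_alpha = word.isalpha()

# An ASCII-only class rejects the word because of "ö" and "å".
ascii_only = re.fullmatch(r"[a-zA-Z]+", word)

# Option 1: list the extra Swedish letters explicitly in the class.
swedish = re.fullmatch(r"[a-zA-ZåäöÅÄÖ]+", word)

# Option 2: for str patterns, \w covers Unicode letters (plus digits and "_").
unicode_word = re.fullmatch(r"\w+", word)

# Option 3: "word characters except digits and underscore", i.e. letters only.
letters_only = re.fullmatch(r"[^\W\d_]+", word)

print(is_alpha, ascii_only, swedish is not None)
```

Option 1 is the most portable, since an explicit list of letters works in practically every regex flavor.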

Thanks for providing this information, but I couldn’t associate the examples with their sample matches. If you could explain this to me, it would be very helpful.

  1. Version A-b1_1
    The pattern expects the word ‘Version’ and a space at the beginning, then a single word character, a dash (-), and one or more word characters. Note that were we to use the * quantifier in place of +, the characters after the dash would no longer be mandatory; the dash itself and ‘Version’ still would be, since they are written literally into the pattern. The word character class \w is equivalent to [A-Za-z0-9_], if I’m not mistaken, so the dash had to be written into the pattern manually.
  2. ABC
    The pattern \D{3} expects exactly three characters, none of which may be a digit.
  3. 156
    The pattern expects 2 to 4 digit characters in the sample.
  4. regex_tutorial
    The pattern expects at minimum 3 characters, which must all be word characters.
  5. AAACC
    The pattern A*B*C* expects zero or more of each of the characters ‘A’, ‘B’ and ‘C’, in that order. Each letter is optional, which is why the sample can skip B entirely.
  6. plural
    The pattern expects the word ‘plural’ followed by an ‘s’ character which does not repeat, but may be absent.
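
To see what each quantifier rules out as well as what it allows, here is a small Python sketch with the standard `re` module. The `matches` helper and the test strings are assumptions made up for illustration; they are not part of the exercise.

```python
import re

def matches(pattern, text):
    """Return True when the whole text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

checks = {
    # "+" needs at least one word character after the dash; "*" accepts none.
    "plus_needs_one": matches(r"Version \w-\w+", "Version A-"),
    "star_allows_none": matches(r"Version \w-\w*", "Version A-"),
    # The dash is literal, so it stays mandatory either way.
    "dash_required": matches(r"Version \w-\w*", "Version Ab"),
    # {3} means exactly three non-digits: four is too many, a digit is not allowed.
    "four_non_digits": matches(r"\D{3}", "ABCD"),
    "digit_in_non_digits": matches(r"\D{3}", "AB1"),
    # {2,4} rejects one digit and five digits.
    "one_digit": matches(r"\d{2,4}", "1"),
    "five_digits": matches(r"\d{2,4}", "12345"),
    # {3,} rejects fewer than three word characters.
    "two_word_chars": matches(r"\w{3,}", "ab"),
    # "?" allows one "s" or none, but not two.
    "one_s": matches(r"plurals?", "plurals"),
    "two_s": matches(r"plurals?", "pluralss"),
}

for name, result in checks.items():
    print(name, result)
```

Only `star_allows_none` and `one_s` come out as True here; every other string breaks the rule its quantifier sets.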

That’s a kind of off-the-cuff rundown. Be sure to read up on regex patterns if this is something that interests you. It can take years to master the engine, so the more exposure you give yourself, the more it becomes second nature. I’m not there, to be sure. One still uses basic patterns in .replace(), .test() and .match() when there is a measure of confidence.

Be sure to include JS’s RegExp() special object in your reading, study and practice.

Although I’m not an expert and can’t answer the questions, I found this very useful website for playing around with regular expressions, and I thought I would share the link! It’s for JavaScript, but as discussed in this thread, it seems like HTML and JavaScript use the same rules for regex.

In fact, most languages share the same core rules, since regular expressions don’t belong to any one particular language (the syntax most engines use today was popularized by Perl, though the idea itself is older). It is a language unto itself. Most programming languages have a regex engine built in or in their standard library, so they handle the basic patterns in the same way, though each flavor has its own extensions and its own syntax for expressing the pattern.

Thank you, mtf, for that clarification and background information!

I tried to include the @ character, but it gives me an error (if I omit it, then it works). Do I need to add something more to include the @ character?

What does your pattern look like?

Hello, I just want to share this amazing video that helped me a lot to better understand regex and how to read it :slight_smile:

Tutorial Youtube Regex