What is RegEx? Its usage, and how to use it?

1. Definition

RegEx, or regular expression, is defined by Mozilla as follows:

Regular expressions are patterns used to match character combinations in strings.

From me:

To put it more simply, a regex is a string that describes a whole set of other strings; or, if you still remember your math, it's kind of like a formula that stands for many calculations of the same shape.

A simple example of a regex:

The string 'AA' matches this regex, and the string 'Bb CC' contains two matches ('Bb' and 'CC'): [A-Z]\w+

You might be asking: ‘Wait, why would they match? Is this real life or is this just fantasy?’

Don’t worry, I will explain some of the basics of regex syntax.

2. Explanations

Let’s use the above example:

[A-Z]\w+

[A-Z] means 'match any single character from A to Z' (note that this is case-sensitive, so it only matches UPPERCASE letters).

\w means 'match any word character', that is, a letter, a digit, or an underscore.

+ means 'match the preceding item one or more times'.

So the regex above means: match any sequence of characters that starts with an uppercase letter from 'A' to 'Z' and is followed by at least one word character.

Some strings that match that regex: 'Abc', 'ZBC', 'F123', 'C___'.

Strings that don't match that regex: 'test', '123', '&'.
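
To try this yourself, here is a minimal sketch in Python (picked only for illustration; the pattern behaves the same way in most regex engines). It uses the standard re module, and the helper name matches is made up for this example. re.fullmatch succeeds only when the whole string fits the pattern, while findall lists every match inside a longer string.

```python
import re

# The pattern from the example: one uppercase letter,
# followed by one or more word characters (letters, digits, underscore).
PATTERN = re.compile(r"[A-Z]\w+")

def matches(text: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return PATTERN.fullmatch(text) is not None

for s in ["Abc", "ZBC", "F123", "C___", "test", "123", "&"]:
    print(s, matches(s))  # True for the first four, False for the last three

# findall returns every non-overlapping match inside a longer string.
print(PATTERN.findall("Bb CC"))  # ['Bb', 'CC']
```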

Regex syntax contains a lot more rules:

n* means 'match zero or more occurrences of n'
. means 'match any single character, except a newline or other line terminator'
[AbC] means 'match any single character that is 'A', 'b' or 'C''
\s means 'match a whitespace character'
\n means 'match a newline character'
\. means 'match a literal dot . character'
\* means 'match a literal asterisk * character'
\\ means 'match a literal backslash \ character'
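
The rules above can be checked one by one with a few short calls. This is only a sketch using Python's standard re module, and the sample strings are invented for the demonstration.

```python
import re

# n* : zero or more occurrences of the preceding item
print(re.fullmatch(r"ab*", "a") is not None)      # True (zero b's)
print(re.fullmatch(r"ab*", "abbb") is not None)   # True (three b's)

# .  : any single character except a newline
print(re.fullmatch(r"a.c", "a-c") is not None)    # True
print(re.fullmatch(r"a.c", "a\nc") is not None)   # False

# [AbC] : exactly one of the listed characters (case-sensitive)
print(re.findall(r"[AbC]", "AbcC"))               # ['A', 'b', 'C']

# \s : a whitespace character (space, tab, newline, ...)
print(re.split(r"\s", "a b\tc"))                  # ['a', 'b', 'c']

# \n : a newline character
print(len(re.findall(r"\n", "line1\nline2")))     # 1

# \. \* \\ : escaped, so they match the literal character
print(re.findall(r"\.", "v1.2.3"))                # ['.', '.']
print(re.findall(r"\*", "2*3"))                   # ['*']
print(len(re.findall(r"\\", "C:\\temp")))         # 1
```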

I hope you now understand regex a little better. If not, that's fine: you can read this section again, or play around on this website to get used to it faster: https://regexr.com/

3. So what is regex used for?

RegEx has a lot of applications in software development.

Imagine you have a login page that requires the user to enter an email address and a password to authenticate.

If you don't check what the user types into the 'Email' field, they might accidentally enter their phone number, a nickname, or who knows, some ridiculous string that is not an email address. When they hit the 'Login' button, the request is sent to the server, where authentication will obviously fail.

By checking the input against a regex, you can detect whether the 'Email' field is filled in correctly. If the string is not an email address, you can pop up a notification asking the user to correct it. Even if the user clicks the 'Login' button, nothing has to be sent to your server, which cuts down on requests that are bound to fail.
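
As a rough illustration of that check, the sketch below uses Python's re module with a deliberately simple pattern (some characters, an '@', then non-whitespace characters). On a real login page the same idea would usually run in the browser; the function name looks_like_email is hypothetical, and the pattern is a quick sanity check rather than full email validation.

```python
import re

# Deliberately simple: at least one character, an "@",
# then one or more non-whitespace characters.
SIMPLE_EMAIL = re.compile(r"^(.+)@(\S+)$")

def looks_like_email(value: str) -> bool:
    """Return True if the input has the rough shape of an email address."""
    return SIMPLE_EMAIL.match(value) is not None

for value in ["user@example.com", "0912345678", "nickname"]:
    if looks_like_email(value):
        print(value, "-> ok, send the login request")
    else:
        print(value, "-> ask the user to fix the Email field")
```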

Another application: The 'Find and Replace' features in many programs do support RegEx.

For example, in Visual Studio Code, turn on the 'Use Regular Expression' option (the .* button) in the Find box.

Let's say, you have a Rails model like below:

You want to change those 4 associations to a different name. Well, of course you can manually highlight each name, then change it. But you can use 'Find and Replace' with RegEx to do it faster.

It will highlight any strings that match the RegEx:

Boom, hit 'Replace all' and you've just saved yourself some precious time!
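
The same kind of regex find-and-replace can be scripted. The sketch below uses Python's re.sub on a made-up model; the class and association names are invented for illustration and are not the ones from the screenshots. Capture groups (the parts in parentheses) are reused in the replacement as \1 and \2.

```python
import re

# Hypothetical stand-in for a Rails model (all names are invented).
source = """\
class User < ApplicationRecord
  has_many :old_posts
  has_many :old_comments
  has_one :old_profile
  belongs_to :old_team
end
"""

# Group 1 captures the association keyword, group 2 the name after "old_".
pattern = re.compile(r"(has_many|has_one|belongs_to) :old_(\w+)")

# Rebuild each match with a "new_" prefix, keeping keyword and name.
renamed = pattern.sub(r"\1 :new_\2", source)
print(renamed)
```

Most editors use $1 and $2 instead of \1 and \2 in the replacement box, so check which syntax your tool expects.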

4. Some RegEx patterns

Email (simple): ^(.+)@(\S+)$
Email (RFC-5322): ^[a-zA-Z0-9_!#$%&'*+/=?`{|}~^.-]+@[a-zA-Z0-9.-]+$
Phone (Vietnam): (03[2-9]|05[689]|07[06-9]|08[1-9]|09[1-46-9])[0-9]{7}\b
Visa card: ^4[0-9]{12}(?:[0-9]{3})?$
Mastercard: ^(5[1-5][0-9]{14}|2(22[1-9][0-9]{12}|2[3-9][0-9]{13}|[3-6][0-9]{14}|7[0-1][0-9]{13}|720[0-9]{12}))$
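
To show how such patterns might be used, here is a small sketch that applies the Visa and Mastercard regexes above with Python's re module. The function name card_brand is hypothetical, and the sample numbers are only digits that fit the patterns. Keep in mind that these regexes check the format only; they cannot tell whether a card actually exists.

```python
import re

VISA = re.compile(r"^4[0-9]{12}(?:[0-9]{3})?$")
MASTERCARD = re.compile(
    r"^(5[1-5][0-9]{14}"
    r"|2(22[1-9][0-9]{12}|2[3-9][0-9]{13}|[3-6][0-9]{14}"
    r"|7[0-1][0-9]{13}|720[0-9]{12}))$"
)

def card_brand(number: str) -> str:
    """Guess the card brand from the number format alone."""
    digits = re.sub(r"[ -]", "", number)  # drop spaces and dashes
    if VISA.match(digits):
        return "visa"
    if MASTERCARD.match(digits):
        return "mastercard"
    return "unknown"

print(card_brand("4111 1111 1111 1111"))  # visa
print(card_brand("5555-5555-5555-4444"))  # mastercard
print(card_brand("2221000000000000"))     # mastercard
print(card_brand("1234"))                 # unknown
```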

Thank you all for reading :)

Source:

  1. https://www.w3schools.com/jsref/jsref_obj_regexp.asp
  2. https://regexr.com/
  3. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
  4. https://gist.github.com/tungvn/2460c5ba947e5cbe6606c5e85249cf04
  5. https://www.baeldung.com/java-email-validation-regex
  6. https://stackoverflow.com/questions/9315647/regex-credit-card-number-tests