
Javascript Regex

Regex (a.k.a. Regular Expressions) is a tool for finding words or patterns of text within strings. It is also used to validate text, for example to check that you have entered a valid email address. It’s a Swiss Army knife that you will use again and again in both web and back-end programming with Javascript.

Before we look at the Javascript code, we’ll learn a bit about Regex itself.

Regex is a mini programming language in its own right, built for the single purpose of finding text. With Regex you have a piece of text called an expression, which is used to match text in a string. The expression contains normal characters, e.g. letters that you want to match, and special characters that control how things are matched.

This is best explained with a few examples.

First, let’s start with the simplest thing. Let’s say the expression is just this:

a

The string we want to match on looks like this:

Hello Mate

Then the expression will match the a in the word Mate.

In Javascript, you can use this regex to check if a string contains an a, and you can use it to replace any a’s with something else. I’ll cover the functions to do that later in this tutorial.

Now for something a bit more advanced

We are now going to use a special character, the asterisk *, which in a regex means match zero or more of whatever comes just before it. This means if you have this regex:

ab*

It will match strings like this

abba
bab
aaa

These strings all have an a followed by zero or more b’s. When doing replacements, Javascript will replace as many b’s as it can find up until the next character that isn’t a b. For example, if you replace the regex ab* with x then abba becomes xa.

There are other special characters. The basic ones are:

Let’s make an example of a simple email validator regex using just the above special characters. Our validation will assume that a valid email just needs an @ symbol somewhere inside it, and a period somewhere after the @.

E.g. this would be a valid email: justin@domain.com

This actually allows a lot of emails that would be invalid, but it gives you an idea of how to get started, and it can be modified to make it more precise. We’ll start with the obvious thing, the @ symbol, so you could have a regex like this:

@

We want to match anything before that symbol. To do this we will combine two special characters: the period and the plus:

.+@

The .+ means match any character one or more times. In other words, there needs to be something (anything) before the @ symbol.

In a similar way, let’s match anything after the @ symbol:

.+@.+

It feels like we are getting close. The only thing is we want that anything to contain a period, so we can add that:

.+@.+\.

Notice how we need to use the backslash to state that the period is just a period, not the special character.

The problem with this is it won’t match anything after that period, e.g. the com in .com, so now we need to add another match anything regex to the end:

.+@.+\..+

And that’s it, that is our basic email matching regex.

Let’s test it on a few strings:

justin@domain.com
justindomain.com
justin@domain

It will match the first one but not the last two.

Javascript Regex Exercise

I have put the example above on a website that lets you experiment with regular expressions and what they match.

Open the following link in a new tab: Regex Exercise on Regex101.com

You will see the regular expression for our email, and a test string with the 3 lines from above. Notice that on the right there is a single full match as you would expect:

[Screenshot: Javascript Regex 101]

Tasks:

  1. Add your own email to the test list. Once typed out it should appear in the matches section on the right.
  2. Add your name to the test list. Assuming there is no @ in your name, it should not appear on the right.
  3. Read the explanation section and make sure it matches your understanding of the regex. This section uses a bit of jargon:
    • Line terminators - these are characters that cause text to go on to the next line.
    • Greedy - this describes how + and * will consume as many characters as they can. For example, if you replace the regex ab* with x then abbbbbbbbbbbbba becomes xa. All the b’s are consumed.
  4. Change the regex to match only .com instead of matching anything after the period.
  5. Following on from 4, add the email address justin@domain.co.uk and make sure that one doesn’t match, but the .com version does.

Learn more

The book

To learn about the regex syntax in depth, see Mastering Regular Expressions by Jeffrey E.F. Friedl

A quick cheat

If you are strapped for time, in a corner and need a quick regex to get you out, you might want to find a good cheatsheet online. Here is one I found which gets straight to the point: https://www.debuggex.com/cheatsheet/regex/javascript.

Javascript Regex

So far there has been a distinct lack of the braces and semicolons you’d expect from a Javascript tutorial. It’s time to fix that by seeing some regex code examples. The regex language takes some time to get used to, but once you have a regex, using it in Javascript is easy.

Create a regex

To create a regex you can either put the regex text between two slashes, which is called a regular expression literal:

var regex = /abc*/;

Or you can construct it like this:

var regex = new RegExp('abc*');

The literal has the advantage of being pre-compiled, which can make it run faster. The constructed version should be used if you might need to change the pattern, or the pattern isn’t known at the time of writing the program (e.g. from an input field).
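Here is a minimal sketch of the two approaches side by side. The variable userPattern is a made-up stand-in for text that might come from an input field; it is not part of any API. The sketch also shows one thing to watch for with the constructor: because the pattern is written as a string, a backslash has to be doubled.

```javascript
// Hypothetical value; in a real page this might come from an input field.
var userPattern = 'abc*';

var fromLiteral = /abc*/;
var fromConstructor = new RegExp(userPattern);

// Both objects describe the same pattern.
var samePattern = fromLiteral.source === fromConstructor.source; // true
var matchesText = fromConstructor.test('xxabccc'); // true

// Inside a string, a backslash must be doubled to reach the regex.
var dotCom = new RegExp('\\.com'); // same as /\.com/
var withDot = dotCom.test('domain.com'); // true
var withoutDot = dotCom.test('domainxcom'); // false
```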

Once you have made your regex, there are 6 methods you can use with the regular expression to do useful things:

exec and test are methods on the regex object, and the rest are methods on the string you want to test or change.

Let’s go through these and see what they do.

exec

The exec method will search a string based on the regex, and will return some information about the first match if there is one, and will return null if there is no match.

Here is an example:

var regex = /ab/;
regex.exec('ababab'); // Returns ["ab", index: 0, input: "ababab", groups: undefined]
regex.exec('acdc'); // Returns null.

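The information returned is an array whose first value is the first match, with some additional properties: index (the position of the match within the string), input (the string that was searched) and groups (any named capture groups, or undefined if there are none).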
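As a rough sketch of how you might read that information, the snippet below pulls the individual values out of the result. The variable names are only for illustration. Because exec returns null when nothing matches, it is worth checking for null before reading any properties.

```javascript
var regex = /ab/;
var result = regex.exec('xxabab');

var matchedText = result[0];    // "ab"
var matchPosition = result.index; // 2
var searchedText = result.input;  // "xxabab"

// exec returns null when nothing matches, so check before reading properties.
var noResult = regex.exec('acdc');
var safePosition = noResult === null ? -1 : noResult.index; // -1
```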

test

The test method is similar to exec but it returns less information. It will return true if there is a match and false if there isn’t, and that’s it.

var regex = /ab/;
regex.test('ababab'); // Returns true.
regex.test('acdc'); // Returns false.
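The test method is a natural fit for the basic email regex we built earlier. Below is a small sketch that runs it against the same three strings. The function name looksLikeEmail is my own invention for this example, and remember that this regex is deliberately loose, so treat it as a starting point rather than a real validator.

```javascript
var emailRegex = /.+@.+\..+/;

// Hypothetical helper name, used here only for illustration.
function looksLikeEmail(text) {
    return emailRegex.test(text);
}

var first = looksLikeEmail('justin@domain.com'); // true
var second = looksLikeEmail('justindomain.com'); // false (no @)
var third = looksLikeEmail('justin@domain');     // false (no period after the @)
```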

match

The match method is called on a string. Given a regex it will return the match or matches found in the string. To find more than one match you need to add a g modifier to the regex.

For example:

'ababab'.match(/ab/); // Returns ["ab", index: 0, input: "ababab", groups: undefined]
'ababab'.match(/ab/g); // Returns ["ab", "ab", "ab"], with length set to 3.
'ababab'.match(/ac/g); // Returns null
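Going back to the ab* regex from earlier, here is a short sketch of what match finds in the three example strings when the g modifier is used. The last two lines show one possible way to deal with the null result, by falling back to an empty array; that is just a convenience I am assuming you might want, not something match does for you.

```javascript
var inAbba = 'abba'.match(/ab*/g); // ["abb", "a"]
var inBab = 'bab'.match(/ab*/g);   // ["ab"]
var inAaa = 'aaa'.match(/ab*/g);   // ["a", "a", "a"]

// match returns null when nothing is found, so fall back to an empty array.
var found = 'xyz'.match(/ab*/g) || [];
var howMany = found.length; // 0
```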

search

The search method on a string returns the index (i.e. position) of the first match within a string. It will return -1 if nothing is found. Either way, it will always return a number and not null/undefined.

"hello world".search(/w./); // Returns 6
"hello world".search(/s/); // Returns -1 (not found)

replace

The replace method replaces the first match it finds with a substitution. You need to specify the global flag g after the regex if you want to replace all matches in the string.

"hello world".replace(/w...d/, "people"); // Returns "hello people"
"hello hello".replace(/h...o/, "hi"); // Returns "hi hello"
"hello hello".replace(/h...o/g, "hi"); // Returns "hi hi"

You can also use normal strings instead of regex if you wish:

"hello world".replace("world", "people"); // Returns "hello people"

And if you are up for something a bit more advanced, you can provide a function that does the replacement in a special way. For example:

"aa ab ac ad".replace(/a./g, function(match){
    if (match === 'aa') {
        return '';
    }
    return '(' + match + ')';
});
// Returns " (ab) (ac) (ad)"
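To tie this back to the ab* regex from the start of the tutorial, here is a quick sketch of the greedy behaviour in a replacement, with and without the g flag. The variable names are only for illustration.

```javascript
// Without g, only the first match ("abb") is replaced.
var firstOnly = 'abba'.replace(/ab*/, 'x'); // "xa"

// With g, the trailing "a" is also a match (an a followed by zero b's).
var allMatches = 'abba'.replace(/ab*/g, 'x'); // "xx"

// Greedy: all the b's after the first a are consumed in one match.
var greedy = 'abbbbbbbbbbbbba'.replace(/ab*/, 'x'); // "xa"
```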

Extendclass.com has a good Regex Tester.
