Regular Expressions In JavaScript
The JavaScript programming language handles regexes in a special way: with the RegExp object. Before going further, I expect you to be familiar with JavaScript, so that the syntax is not in your way while you work on figuring out how to write a pattern. If you accidentally fell into this web page and have no idea what JavaScript is (but are curious), click here. If you know what JavaScript is (and can code fairly well in it), but don't know what a regular expression is, click here.
Matching a String Against a Pattern
This is actually pretty easy. JavaScript is big on the whole OOP thing, so you match patterns by taking a string and invoking its search() method. The parameter is the pattern between two forward slashes (i.e. /pattern/). Let's say I want to search the string "Hello World" for the word "World". Here's an example:
alert("Hello World".search(/World/));
The number 6 will appear in an alert box, because the word "World" begins at index 6, that is, after the sixth character of the string (positions are counted from zero). Now let's say that we want to find the letter 'W' regardless of case. For that, we use the 'i' flag. Compare these two calls:
alert("Hello World".search(/w/));
alert("Hello World".search(/w/i));
The first will return -1, which is what search() returns when the pattern is not found (it is a number, not null). The second will return 6, just like in the first example.
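Because search() reports a miss as -1 rather than as true or false, it helps to compare the result against -1 explicitly. Here is a minimal sketch of that check. The helper name describeSearch is made up for illustration, and console.log stands in for alert() so the code also runs outside a browser:

```javascript
// Hypothetical helper: turns the result of search() into a readable message.
function describeSearch(text, pattern) {
  var index = text.search(pattern);
  // search() returns -1 when nothing matches; any other value is a 0-based index.
  if (index === -1) {
    return 'not found';
  }
  return 'found at index ' + index;
}

console.log(describeSearch('Hello World', /w/));   // case-sensitive: there is no lowercase "w"
console.log(describeSearch('Hello World', /w/i));  // case-insensitive: matches the "W"
```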
Replacing Text in a String
What do we do if we want to replace something in a string though? Well, we use the appropriately named replace() method. It takes two parameters: the pattern to be replaced and the text to replace it with.
var text = "abababab"; var altered = text.replace(/b/, 'a'); alert(altered);
This code says that in the text 'abababab', we will replace one 'b' with the letter 'a' (without any flags, only the first match is replaced) and display the result using an alert box. What we get back is the altered text, 'aaababab'. If we want to replace all of the matches, we use the 'g' flag.
var text = "abababab"; var altered = text.replace(/b/g, 'a'); alert(altered);
Now we should have received 'aaaaaaaa' as our answer. Let's try a slightly more complicated pattern this time to practice our regex skills:
var text = "abababab"; var altered = text.replace(/^(\w+)$/, '\"$1\"'); alert(altered);
In case you can't tell by looking at it (I hope you get the gist of it), this simply encloses the text in quotation marks. I slipped a trick for storing text into that code. See those parentheses in the pattern? They capture everything matched inside them, here one or more word characters ([a-zA-Z0-9_]), into $1, which is available only in the replacement string of this method call. We can then use $1 to put that text back into the new string. So the pattern matches the whole text 'abababab', and the replacement is a pair of quotation marks with the captured text inserted between them. Here's another quick example that reverses the words "Hello World" so that it reads "World Hello".
var text = "Hello World"; var altered = text.replace(/^(\w+)\s(\w+)$/, '$2 $1'); alert(altered);
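The same capture-and-reorder idea works with more than two groups. The sketch below is only an illustration: the helper name reorderDate and the date formats are assumptions of mine, not something from the examples above, and console.log is used instead of alert() so it runs outside a browser too.

```javascript
// Hypothetical helper: rewrites a YYYY-MM-DD date as DD/MM/YYYY.
function reorderDate(isoDate) {
  // Three capture groups: $1 = year, $2 = month, $3 = day.
  return isoDate.replace(/^(\d{4})-(\d{2})-(\d{2})$/, '$3/$2/$1');
}

console.log(reorderDate('2010-03-15')); // 15/03/2010
console.log(reorderDate('not a date')); // returned unchanged, because the pattern does not match
```

Note that replace() hands back the original string untouched when the pattern does not match, so no extra check is needed here.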
Extracting Information from a String
Let's say we wanted to actually pull a specific pattern from a string, rather than find out whether or not it exists. We are going to use the match() method, where the parameter is a pattern that you'd like to find. As an example, let's say we want to pull out the last four digits in a U.S. telephone number:
var phone = "123-456-7891";
var lastfour = phone.match(/\d{4}$/);
alert(lastfour);
This will display "7891". The repetition quantifier {4} says that we want exactly four of the preceding item, and '\d' matches any single digit (0 through 9). The dollar sign ($) anchors the match to the end of the string, so only the last four digits are returned.
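One thing to watch for: when the pattern does not match, match() returns null, so reading the result without checking can fail. A small sketch of a guarded version follows. The helper name lastFourDigits is hypothetical, and console.log replaces alert() so the code runs outside a browser:

```javascript
// Hypothetical helper: returns the last four digits of a string, or null.
function lastFourDigits(value) {
  var result = value.match(/\d{4}$/);
  // Without the 'g' flag, match() returns an array whose first element
  // is the matched text, or null when the pattern does not match.
  return result === null ? null : result[0];
}

console.log(lastFourDigits('123-456-7891')); // 7891
console.log(lastFourDigits('123-456-78'));   // null
```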
Now let's say you want to get an array of matches from a string. We have a list of names that need to be extracted:
var text = "Broch, Kelly, Scott";
var names = text.match(/\w+/g);
alert('Name 1: ' + names[0] + '\nName 2: ' + names[1] + '\nName 3: ' + names[2]);
This will display:
Name 1: Broch
Name 2: Kelly
Name 3: Scott
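Indexing names[0] through names[2] by hand only works when you know there are exactly three names. A sketch of a version that loops over however many matches come back is shown here. The helper name listNames is made up for illustration, and console.log stands in for alert():

```javascript
// Hypothetical helper: builds a numbered list from every word in the text.
function listNames(text) {
  // With the 'g' flag, match() returns all matches, or null if there are none.
  var names = text.match(/\w+/g) || [];
  var lines = [];
  for (var i = 0; i < names.length; i++) {
    lines.push('Name ' + (i + 1) + ': ' + names[i]);
  }
  return lines.join('\n');
}

console.log(listNames('Broch, Kelly, Scott'));
```

The "|| []" fallback is a design choice: it turns the null that match() returns for "no matches" into an empty array, so the loop simply does nothing.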
Easier Ways to Split Strings
The last example displayed one way to separate a string into an array. The fourth and final pattern-accepting method covered here is split(). Like search(), replace(), and match(), it belongs to the String object rather than to RegExp, and it takes a pattern as its parameter. As an example, let's use the same list of names and put them into an array for later use.
var text = "Broch/Kelly/Scott";
var names = text.split(/\//);
alert('Name 1: ' + names[0] + '\nName 2: ' + names[1] + '\nName 3: ' + names[2]);
The only difference here is that we use the forward slash (/) to delimit the names; because the slash also marks the start and end of the pattern, it has to be escaped as \/. Why don't we make a more useful example, where we store key and value strings such as the query string of a URL?
var text = "name=Broch&age=20&hair=blonde";
var keys = text.split(/&/);
for (var i = 0; i < keys.length; i++) {
  // Split each "key=value" pair at the equal sign.
  var data = keys[i].split(/=/);
  document.writeln('My ' + data[0] + ' is ' + data[1] + '<br>');
}
This will take the query string and extract all the key=value pairs by telling split() to separate the string at each ampersand (&) and place the pieces in the array 'keys'. We then loop over that array, split each piece further at the equal sign (=), and place the two resulting strings into the data array. Finally, we display each key/value pair on the web page.
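If you would rather keep the pairs around than write them to the page, the same two split() calls can fill an object. This is only a sketch under a few assumptions: the helper name parseQuery is hypothetical, values are not URL-decoded, and a value that itself contains "=" would be cut short.

```javascript
// Hypothetical helper: turns "a=1&b=2" into { a: '1', b: '2' }.
function parseQuery(query) {
  var result = {};
  var pairs = query.split(/&/);
  for (var i = 0; i < pairs.length; i++) {
    var data = pairs[i].split(/=/);
    // Skip empty segments such as the one produced by a trailing "&".
    if (data[0] !== '') {
      result[data[0]] = data[1];
    }
  }
  return result;
}

var info = parseQuery('name=Broch&age=20&hair=blonde');
console.log(info.name); // Broch
console.log(info.age);  // 20 (still a string, not a number)
```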
Further Reading
If you want a follow-up on regular expressions in JavaScript, I think you should try the Web Reference website. Also, if you own "JavaScript: The Definitive Guide" published by O'Reilly, then check out chapter 10.
Anonymous
new String("123")???? new String(<literal>) is a newbie mistake... just use the literal directly.

Anonymous
the comment above is incorrect. It is true that declaring variables unnecessarily is a frequent newbie mistake. However, in this case, it is necessary to have a String object to operate on. For example, if you don't put the string in var text here, you can't use the replace method on it: var text = "abababab"; var altered = text.replace(/b/, 'a'); Would you write that as var altered = "abababab".replace(/b/, 'a'); I think not. However, I do think that there is a typo in the telephone number example. The string is placed into a variable named "phone", so the second line should probably be: var lastfour = phone.match(/\d{4}$/); instead of var lastfour = text.match(/\d{4}$/); which (probably inadvertently) references a variable named "text" which is used in adjacent examples.
