JavaScript regular expressions: examples and testing

Before the advent of hypertext languages, or rather, until it became clear that one needed not just to search, but to search under certain conditions, in a specific place, with changed data and in the right quantities, the ordinary search and replace functions satisfied even a sophisticated programmer. Masterpieces of the art of searching were created in programming languages, and databases were refined with selection conditions, stored procedures, triggers and other means of selecting from cumbersome relational structures. The arrival of regular expressions did not cause a revolution, but they proved to be a useful and convenient tool for searching and replacing information. For example, JavaScript regular expressions for email addresses considerably simplify visitor registration and keep the site from being loaded with messages sent to obviously malformed addresses.

One cannot say that a JavaScript regular expression is much better than well-thought-out sequences of indexOf() calls framed by conditionals and loops, but one can say without hesitation that it makes script code compact, yet hard to understand for the uninitiated.

RegExp object = template + engine

Regular expressions are a template plus an engine. The first is the regular expression itself (in JavaScript, a RegExp object); the second is the executor that applies the template to a string. The engines that implement regular expressions differ from one programming language to another. Not all of the differences are significant, but keep them in mind and always test a regular expression carefully before using it.

The special notation for writing regular expressions is convenient and effective, but it demands care, accuracy and patience from the developer. The notation of regular expression patterns takes some getting used to. This is not a tribute to fashion; it follows from the logic of how JavaScript regular expressions are implemented.

Regular expression pattern

Two forms of notation are allowed:

var expOne = /abc*/i;
var expTwo = new RegExp("abc*", "i");

Usually the first form is used. In the second, the pattern is a quoted string, so a '\' character in it must itself be escaped according to the usual string rules.

'i' is the flag meaning "case does not matter". You can also use the flags 'g' (global search) and 'm' (multi-line search).

The '/' character marks the beginning and the end of a pattern literal.
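The difference between the two forms is easiest to see on a pattern that contains a backslash. The following is a minimal sketch; the variable names are illustrative and not from the article.

```javascript
// The same pattern written as a literal and as a string for the constructor.
const fromLiteral = /\d+\.\d+/;
const fromString = new RegExp("\\d+\\.\\d+"); // each '\' is doubled inside the string

console.log(fromLiteral.source === fromString.source); // true
console.log(fromString.test("3.14"));                  // true

// Forgetting to double the backslash: "\d" in a string literal is just "d".
const broken = new RegExp("\d+");
console.log(broken.source);      // d+
console.log(broken.test("123")); // false

// Flags: 'i' ignores case, 'g' finds every match, 'm' lets ^ and $ work per line.
const text = "abc\nABC";
console.log(text.match(/^abc$/gim)); // [ 'abc', 'ABC' ]
```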

The beginning and the end of the regular expression

The character '^' anchors the pattern to the beginning of the string: what follows it must come first. The character '$' anchors it to the end: what precedes it must come last. Do not experiment with them elsewhere in the expression; inside square brackets, for example, '^' has a different meaning.

For example,

var cRegExp = '^AbcZ$'; // the pattern as a string
var eRegExp = new RegExp(cRegExp, 'i');
var cRegRes = '';
var sTest = 'abcz';

if (eRegExp.test(sTest)) {
    cRegRes += ' - Yes';
} else {
    cRegRes += ' - No';
}

var dTestLine = document.getElementById('scTestLine');
dTestLine.innerHTML = 'The expression /' + cRegExp + '/ for the string "' + sTest + '"' + cRegRes;

The element 'scTestLine' will show the result (the variable cRegExp holds the pattern as a string):

The expression /^AbcZ$/ for the string "abcz" - Yes

If you remove the flag 'i', the result will be:

The expression /^AbcZ$/ for the string "abcz" - No
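The same check can be reproduced without a page element. This is a small sketch that prints to the console; the helper name verdict is hypothetical.

```javascript
const pattern = "^AbcZ$";
const caseInsensitive = new RegExp(pattern, "i");
const caseSensitive = new RegExp(pattern);

// Hypothetical helper: turns the boolean result of test() into Yes/No.
function verdict(re, s) {
  return re.test(s) ? "Yes" : "No";
}

console.log(verdict(caseInsensitive, "abcz"));  // Yes
console.log(verdict(caseSensitive, "abcz"));    // No
console.log(verdict(caseSensitive, "AbcZ"));    // Yes
console.log(verdict(caseInsensitive, "xAbcZ")); // No: '^' rejects the leading 'x'
```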

Regular expression content

A regular expression is a sequence of characters that is the subject of a search. The expression /qwerty/ looks for an occurrence of exactly this sequence:

The expression /qwerty/ for the string "qwerty" - Yes

The expression /qwerty/ for the string "123qwerty456" - Yes

The character '^' changes the essence of the expression:

The expression /^qwerty/ for the string "123qwerty456" - No

The expression /^qwerty/ for the string "qwerty456" - Yes

The same holds for the end-of-string character '$'. Regular expressions allow character ranges: for example, [a-z], [A-Z] and [0-9] stand for all letters of the Latin alphabet in the given case, or for the digits. Russian letters may also be used, but pay attention to the encoding of the strings (both the pattern and the text being searched) and of the page. Russian letters, like special characters, are often better given by their codes.
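As an illustration of ranges and of giving letters by code, here is a hedged sketch. The Cyrillic code points are standard Unicode values; the variable names are made up for the example.

```javascript
const lower = /^[a-z]+$/;
const digits = /^[0-9]+$/;

// Cyrillic lowercase letters written by code (U+0430..U+044F).
// The letter U+0451 (yo) lies outside that range, so it is added explicitly.
const cyrillicLower = /^[\u0430-\u044F\u0451]+$/;
const word = "\u0451\u043B\u043A\u0430"; // a four-letter Russian word starting with yo

console.log(lower.test("qwerty"));             // true
console.log(lower.test("Qwerty"));             // false: capital 'Q' is not in [a-z]
console.log(digits.test("2018"));              // true
console.log(cyrillicLower.test(word));         // true
console.log(/^[\u0430-\u044F]+$/.test(word));  // false: the range alone misses U+0451
```

Writing the letters as \uHHHH escapes keeps the pattern independent of the encoding in which the script file itself is saved.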

When building a regular expression, you can specify how many times a character may occur at a given place: '*' = repeat 0 or more times; '+' = repeat 1 or more times; {1,} is the same as '+'; {n} = exactly n repetitions; {n,} = n or more repetitions; {n,m} = from n to m repetitions.
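Each of the repetition forms listed above can be tried directly. The patterns below are illustrative examples, anchored with '^' and '$' so that the whole string has to fit.

```javascript
const results = [
  /^ab*$/.test("a"),            // true: '*' allows zero 'b'
  /^ab+$/.test("a"),            // false: '+' needs at least one 'b'
  /^ab{1,}$/.test("abbb"),      // true: {1,} behaves like '+'
  /^[0-9]{4}$/.test("2018"),    // true: exactly four digits
  /^[0-9]{4}$/.test("20180"),   // false: five digits
  /^[0-9]{2,}$/.test("7"),      // false: fewer than two digits
  /^[0-9]{2,4}$/.test("123"),   // true: between two and four digits
  /^[0-9]{2,4}$/.test("12345"), // false: more than four digits
];
console.log(results.join(" "));
```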

Square brackets specify a set of alternative characters. For example, [abcd] = [a-d] = any one of the four characters 'a', 'b', 'c' or 'd'. The opposite can also be specified, that is, any character other than those in the set: [^abcd] = any character except 'a', 'b', 'c' or 'd'. '?' indicates that the preceding character may be absent at this place. '.' matches any character except a line break, that is, '\n', '\r', '\u2028' or '\u2029'. The expression '[\s\S]*' matches any sequence of characters, including line breaks. Note that '\s*|\S*' is not equivalent to it: that expression matches a run consisting only of whitespace or only of non-whitespace. Inside square brackets '|' is an ordinary character, so it is not needed there.
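A short sketch of sets, negated sets, the optional character and the behaviour of '.' at a line break. The sample strings are invented for the illustration.

```javascript
console.log(/^[abcd]$/.test("c"));       // true
console.log(/^[a-d]$/.test("e"));        // false
console.log(/^[^abcd]$/.test("e"));      // true: 'e' is not in the set
console.log(/^colou?r$/.test("color"));  // true: 'u' may be absent
console.log(/^colou?r$/.test("colour")); // true

const twoLines = "first\nsecond";
const dotAll = /^.*$/.test(twoLines);           // false: '.' stops at '\n'
const anyChar = /^[\s\S]*$/.test(twoLines);     // true: the set covers line breaks
const alternation = /^(\s*|\S*)$/.test("a b");  // false: only whitespace OR only non-whitespace
console.log(dotAll, anyChar, alternation);
```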

Simplified versions of the regular expression

The expression '[\s\S]*' matches a whitespace character or a non-whitespace character any number of times, that is, everything in the string. Here '\s' stands for a whitespace character (space, tab, line break) and '\S' for any character that is not whitespace.

Similarly, '\d' matches a decimal digit and '\D' matches a non-digit character. The notations '\f', '\r' and '\n' correspond to form feed, carriage return and line feed.

The tab character is '\t' and the vertical tab is '\v'. The notation '\w' matches any word character (Latin letters, digits, underscore) = [A-Za-z0-9_].

The notation '\W' is equivalent to [^A-Za-z0-9_], that is, any character that is not a Latin letter, a digit or '_'.

Searching for '\0' means searching for the NUL character. Searching for '\xHH' or '\uHHHH' means searching for the character with the code HH or HHHH respectively, where H is a hexadecimal digit.
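The shorthand classes can be checked in a few lines. This is only a sketch with made-up sample strings.

```javascript
const numbers = "a1 b22 c333".match(/\d+/g);     // runs of digits
const onlyDigits = "a1 b22".replace(/\D/g, "");  // drop every non-digit
const words = "snake_case-42".match(/\w+/g);     // '-' is not a word character
const parts = "a\tb\nc".split(/\s/);             // tab and line feed are whitespace

console.log(numbers);    // [ '1', '22', '333' ]
console.log(onlyDigits); // 122
console.log(words);      // [ 'snake_case', '42' ]
console.log(parts);      // [ 'a', 'b', 'c' ]

console.log(/\x41/.test("A"));        // true: code 0x41 is 'A'
console.log(/\u0416/.test("\u0416")); // true: the character with code U+0416
console.log(/^\w+$/.test("\u0416"));  // false: \w covers only [A-Za-z0-9_]
```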

Recommended language and encoding of the regular expression

It is important to test any regular expression carefully on different variants of the input string.

With experience there will be fewer errors, but always keep in mind that your own knowledge of the rules for writing a regular expression may not match reality, especially when a "regular" is moved from one language to another.

When choosing between the classic notation (exact specification) and the simplified form of a regular expression, prefer the first. The classic notation always states clearly what is being sought. If there are Russian letters in the regular expression or in the string being searched, bring all the strings, and the page on which the JavaScript code that executes the regular expression runs, to a single encoding.

When processing characters outside the Latin alphabet, it makes sense to consider specifying the character codes rather than the characters themselves.

When implementing search algorithms in JavaScript, check the regular expression carefully. It is especially important to keep the character encoding under control.

Parentheses in regular expressions

Square brackets specify the alternative characters that must or must not be present at a particular place, while parentheses specify alternative sequences. This is only the general rule: there are no exceptions to it, but there are many different ways of applying it.

var cRegExp = "[a-z]*.(png|jpg|gif)"; // note: the unescaped '.' matches any character
var eRegExp = new RegExp(cRegExp, 'i');
var cRegRes = '';
var sTest = 'picture.jpg';

if (eRegExp.test(sTest)) {
    cRegRes += ' - Yes';
} else {
    cRegRes += ' - No';
}

Results:

The expression /[a-z]*.(png|jpg|gif)/ for the string "picture.jpg" - Yes

The expression /^[a-d][a-z]*.(png|jpg|gif)/ for the string "picture.jpg" - No

The expression /^[a-d][a-z]*.(png|jpg|gif)/ for the string "apicture.jpg" - Yes

The expression /^[a-d][a-z]*.(png|jpg|gif)/ for the string "apicture.jg" - No

Note in particular that anything followed by an asterisk may be present zero times. This means that a "regular" can behave in quite unexpected ways.
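The sketch below shows how the asterisk and an unescaped '.' let unexpected strings through, and one possible stricter variant. The stricter pattern is an assumption made for this illustration, not the article's own.

```javascript
const loose = /[a-z]*.(png|jpg|gif)/i;
const looseResults = [
  loose.test("picture.jpg"), // true
  loose.test("picturejpg"),  // true: '.' matched the 'e' before "jpg"
  loose.test("!gif"),        // true: [a-z]* matched nothing, '.' matched '!'
];

// Assumed stricter variant: anchored, at least one letter, literal dot.
const strict = /^[a-z]+\.(png|jpg|gif)$/i;
const strictResults = [
  strict.test("picture.jpg"), // true
  strict.test("picturejpg"),  // false: no literal dot
  strict.test("!gif"),        // false
  strict.test(".jpg"),        // false: '+' demands at least one letter
];

console.log(looseResults, strictResults);
```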

Checking RegExp - testing email

In JavaScript, a RegExp object has two methods, test and exec, and regular expressions can also be passed to the String methods search, split, replace and match.

The test method has already been demonstrated; it checks whether a string matches a regular expression and returns true or false.

Consider the following JavaScript regular expression, an email check of the "complicated but accurate" kind:

var eRegExp = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

For the string var sTest = 'Slava.Chip@sci.by' it gives true, that is, the string is a well-formed email address. The check was performed with eRegExp.test(sTest).
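To see what this pattern accepts and rejects, it can be run over a handful of strings. The sample addresses below are invented for the illustration.

```javascript
const emailPattern = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

const samples = [
  "Slava.Chip@sci.by",   // ordinary address
  "user@[192.168.0.1]",  // address with an IP literal in brackets
  "no-at-sign.example",  // no '@' at all
  "user@localhost",      // domain without a dot
  "two@@example.com",    // doubled '@'
];

const emailResults = samples.map((s) => emailPattern.test(s));
samples.forEach((s, i) => console.log(s, "-", emailResults[i] ? "Yes" : "No"));
```

A pattern like this checks only the form of the address; whether the mailbox exists can be learned only by sending a message to it.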

Practical use: processing e-mail

The exec method returns an array (or null when there is no match). The call:

var aResult = eRegExp.exec(sTest);

cRegRes = '<br>' + aResult.length + '<br>';
for (var i = 0; i < aResult.length; i++) {
    cRegRes += aResult[i] + '<br>';
}

This gives the following result:

9
Slava.Chip@sci.by
Slava.Chip
Slava.Chip
.Chip
undefined
sci.by
undefined
sci.by
sci.
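The array can also be inspected in the console, without building an HTML string. A sketch, using the same pattern under a different, illustrative variable name:

```javascript
const emailRe = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

const found = emailRe.exec("Slava.Chip@sci.by");
console.log(found.length); // 9: the whole match plus eight pairs of parentheses
for (let i = 0; i < found.length; i++) {
  // Element 0 is the whole match; element n belongs to the n-th opening parenthesis.
  console.log(i, found[i]);
}

// When nothing matches, exec returns null rather than an empty array.
const missed = emailRe.exec("not an email");
console.log(missed); // null
```

Groups that did not take part in the match (here the quoted local part and the bracketed IP address) come back as undefined.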

The other methods work similarly, and it is worth trying them yourself. Developing and using regular expressions is best learned through practice; copying code is not always appropriate here.
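As a starting point for such experiments, here is a brief sketch of the four String methods mentioned earlier. The sample string and names are invented.

```javascript
const line = "2018-03-15, 2019-11-02";
const dateRe = /(\d{4})-(\d{2})-(\d{2})/;

const position = "on 2018-03-15".search(dateRe); // index of the first match, or -1
const pieces = line.split(/,\s*/);               // split on a comma and optional spaces
const years = line.match(/\d{4}/g);              // with 'g', all matches as an array
const slashed = line.replace(/-/g, "/");         // with 'g', every '-' is replaced

console.log(position); // 3
console.log(pieces);   // [ '2018-03-15', '2019-11-02' ]
console.log(years);    // [ '2018', '2019' ]
console.log(slashed);  // 2018/03/15, 2019/11/02
```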

Popular "regulars"

The JavaScript regular expression for email shown above is not the only one; there are many simpler variants, for example /^[\w-\.]+@[\w-]+\.[a-z]{2,3}$/i. However, this variant does not cover every valid form of email address.
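What the simpler variant misses can be shown on a few invented addresses. In this sketch the first character set is written as [\w.-], which is assumed to describe the same characters (word characters, dot, hyphen) in a form that does not depend on how the engine treats '-' after '\w'.

```javascript
const simpleEmail = /^[\w.-]+@[\w-]+\.[a-z]{2,3}$/i;

const simpleResults = {
  plain: simpleEmail.test("Slava.Chip@sci.by"),         // true
  subdomain: simpleEmail.test("user@mail.example.com"), // false: only one dot is allowed after '@'
  longTld: simpleEmail.test("user@example.info"),       // false: the last part is limited to 2-3 letters
};
console.log(simpleResults);
```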

Of course, before designing your own JavaScript regular expression you should study the experience of colleagues and analyze the approaches they offer. But there are certain difficulties. Do not forget that when examples of JavaScript regular expressions are copied, essential characters such as '\', '/' or quotation marks may be doubled or lost. This leads to an error that can take a long time to find.

It is important to take the familiar "human factor" into account. A visitor (a person) can write a phone number in various ways: 123-45-67, (29) 1234567, 80291234567 or +375291234567, and all of these may be the same number, so a formal JavaScript regular expression for a phone has to allow for that. Writing several patterns is not always acceptable, and rigidly fixing the format of a number can create unnecessary inconvenience or restrictions. The variant /^\d[\d\(\)\ -]{4,14}\d$/i suits many phone checks, but note that it requires the first and last characters to be digits, so a leading '+' or '(' has to be removed before the check.
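One way to reconcile a single pattern with the different ways of writing a number is to normalize the input first. The sketch below is an assumption for illustration: the helper normalizePhone is hypothetical, and the character set is written as [\d() -], which should describe the same characters (digits, parentheses, space, hyphen).

```javascript
const phoneRe = /^\d[\d() -]{4,14}\d$/;

// Hypothetical helper: keeps only the digits of whatever the visitor typed.
function normalizePhone(raw) {
  return raw.replace(/\D/g, "");
}

const inputs = ["123-45-67", "(29) 1234567", "80291234567", "+375291234567"];

const rawResults = inputs.map((s) => phoneRe.test(s));
const normalizedResults = inputs.map((s) => phoneRe.test(normalizePhone(s)));

console.log(rawResults);        // [ true, false, true, false ]
console.log(normalizedResults); // [ true, true, true, true ]
```

Checked as typed, the numbers that begin with '(' or '+' are rejected; after normalization all four pass.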

If you want to compose a JavaScript regular expression that checks only for digits, even such a simple case needs clarification. The expression has to take into account whether the number is an integer or a fraction, in exponential or ordinary notation, positive or negative. You can also allow for a currency symbol, the number of digits after the decimal point and the division of the integer part into groups of three digits.

The expression /^\d+$/i checks that the string consists only of digits, and the expression /^\d+\.\d+$/i requires a period separating the integer part from the fractional part of a number.
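The two expressions, plus one possible extension for an optional sign and an optional fractional part, can be compared side by side. The third pattern is an assumption added for illustration, not part of the article.

```javascript
const integerRe = /^\d+$/;
const fractionRe = /^\d+\.\d+$/;
// Assumed extension: optional leading minus, optional fractional part.
const signedDecimalRe = /^-?\d+(\.\d+)?$/;

const numberChecks = [
  integerRe.test("2018"),         // true
  integerRe.test("20.18"),        // false: the period is not a digit
  fractionRe.test("20.18"),       // true
  fractionRe.test("2018"),        // false: the fractional part is required
  fractionRe.test("20."),         // false: digits must follow the period
  signedDecimalRe.test("-20.18"), // true
  signedDecimalRe.test("42"),     // true
  signedDecimalRe.test("1e5"),    // false: exponential notation is not covered
];
console.log(numberChecks.join(" "));
```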

In JavaScript, regular expression checks can be used to enforce the format of input data, which matters in particular for questionnaires, passport data, legal addresses and the like.

Checking a date: the complicated made simple

Consider JavaScript regular expressions for a date. As with a number or a phone, the examples come down to a choice between strictness and flexibility. The date of an event is one of the essential pieces of data that often has to be entered. But fixing the input to a particular format, 'dd-mm-yyyy' or 'd.m.yy', often leads to customer dissatisfaction. In a classic HTML form, the jump from the day field to the month field may not happen when only one digit has been entered, and entering the second digit can cause difficulties. For example, 3 has already been entered in the day field, and the next digit 2 does not replace it but is appended to it, giving 32, which naturally causes inconvenience.

The efficiency and convenience of regular expressions depend substantially on the overall design of the dialogue with the visitor. In one case it makes sense to use a single form field for the date; in another, separate fields are needed for the day, month and year. But then there are additional "code costs" for checking the leap year, the month number and the number of days in each month.
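A regular expression can check the shape of a date, while the calendar rules are easier to delegate to the built-in Date object. The function below is a sketch under assumptions: the name isValidDate and the accepted format (day, month, four-digit year, separated by '-' or '.') are chosen for the example.

```javascript
// Hypothetical helper: accepts d-m-yyyy or d.m.yyyy with one or two digits for day and month.
function isValidDate(text) {
  const m = /^(\d{1,2})[-.](\d{1,2})[-.](\d{4})$/.exec(text);
  if (!m) return false; // wrong shape

  const day = Number(m[1]);
  const month = Number(m[2]);
  const year = Number(m[3]);

  // Date rolls impossible dates over (29 February 2023 becomes 1 March),
  // so the date is real only if all three parts survive the round trip.
  const d = new Date(year, month - 1, day);
  return d.getFullYear() === year && d.getMonth() === month - 1 && d.getDate() === day;
}

console.log(isValidDate("29-02-2024")); // true: 2024 is a leap year
console.log(isValidDate("29-02-2023")); // false
console.log(isValidDate("3.2.2018"));   // true
console.log(isValidDate("32-01-2018")); // false
console.log(isValidDate("2018-01-03")); // false: wrong order of parts
```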

Search with replacement, memory of the regular expression

The replace method of the String object accepts a regular expression and lets you find a value and change it in one step. This is useful for correcting input errors, editing the contents of form fields and converting data from one presentation format to another.

var cRegExp = /([a-z]+)\s([a-z]+)\s([a-z]+)/i; // the search creates three 'variables'; for Russian text the set would be [а-яё]

var sTest = 'this article is good!';
var cRegRes = sTest.replace(cRegExp, "$2, $3, $1");

var dTestLine = document.getElementById('scTestLine');
dTestLine.innerHTML = 'The expression ' + cRegExp + ' for the string "' + sTest + '" gives: ' + cRegRes;

Result:

The expression /([a-z]+)\s([a-z]+)\s([a-z]+)/i for the string "this article is good!" gives: article, is, this good!

During execution, each pair of parentheses stores what it matched in a 'variable' $n, where n is the number of the pair ($1, $2, ...). Unlike the usual convention, the numbering here starts from 1, not from 0.
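The numbered 'variables' are convenient for converting data between formats. A short sketch with invented sample strings:

```javascript
// Reorder the parts of a date: $1 = day, $2 = month, $3 = year.
const isoDate = "25-12-2018".replace(/^(\d{2})-(\d{2})-(\d{4})$/, "$3-$2-$1");

// Swap two words.
const swapped = "Chip Slava".replace(/(\w+)\s(\w+)/, "$2 $1");

// A function can be used instead of a replacement string;
// it receives the whole match first and then the groups in order.
const shouted = "one two".replace(/(\w+)/g, (whole, word) => word.toUpperCase());

console.log(isoDate); // 2018-12-25
console.log(swapped); // Slava Chip
console.log(shouted); // ONE TWO
```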

General recommendations

A regular expression simplifies the code, but the time it takes to develop one often matters. You can start with simple constructs and then combine them into more complex expressions. Regular expressions can be tested with various online services or with dedicated local tools.

The best option is to build your own library of regular expressions and your own tool for testing new ones. This is the best way to consolidate experience and learn to create reliable and convenient constructs quickly.
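Such a tool does not have to be large. The following sketch keeps a couple of patterns in an object and runs each against a table of expected results; the names library and checkPattern are hypothetical.

```javascript
// Hypothetical mini-tool: returns the cases where the pattern disagrees with the expectation.
function checkPattern(re, cases) {
  const failures = [];
  for (const [input, expected] of cases) {
    re.lastIndex = 0; // test() is stateful for patterns with the 'g' flag
    const actual = re.test(input);
    if (actual !== expected) failures.push({ input, expected, actual });
  }
  return failures;
}

// A tiny personal "library" of patterns.
const library = {
  integer: /^\d+$/,
  imageFile: /^[a-z]+\.(png|jpg|gif)$/i,
};

const integerFailures = checkPattern(library.integer, [
  ["2018", true],
  ["20.18", false],
  ["", false],
]);
const imageFailures = checkPattern(library.imageFile, [
  ["picture.jpg", true],
  ["picture.jg", false],
  ["PICTURE.PNG", true],
]);
// A deliberately wrong expectation shows what a reported failure looks like.
const demoFailures = checkPattern(library.integer, [["-5", true]]);

console.log(integerFailures.length, imageFailures.length); // 0 0
console.log(demoFailures); // [ { input: '-5', expected: true, actual: false } ]
```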

When using repetition of characters and strings, that is, the special characters '*', '+' and braces giving the number of repetitions, be guided by the principles of simplicity and expediency. It is important to understand that from the moment it starts working until the result is obtained, a regular expression is entirely in the power of the engine of the browser in use. Not all JavaScript engines are equivalent. Each browser can bring its own preferences to the interpretation of regular expressions.

Compatibility concerns not only pages and style sheets; it also applies to regular expressions. A page that uses JavaScript can be considered debugged only once it has worked successfully in different browsers.

JavaScript, String, and RegExp

Working properly at the client level, that is, in the visitor's browser in JavaScript, requires high qualification from the developer. It has long been possible to debug JavaScript code with the browser's own tools or with third-party extensions, code editors and standalone programs.

In many cases the debugger copes well and gives the developer good support: rapid error detection and identification of bottlenecks. The times when the computer was focused on computation are in the distant past. Now special attention is paid to information, and string objects have come to play an essential role. Numbers have become strings, and they show their true nature only at the right time and in the right place.

Regular expressions extend the capabilities of strings, but they demand due respect. Debugging a RegExp while it is running, even if that could be modeled, is not a very attractive idea.

Understanding the structure and logic of the RegExp object, the meaning of the String object and the syntax and semantics of JavaScript is a sure guarantee of safe and reliable code and of the stable operation of each page and of the site as a whole.


Copyright © 2018 en.delachieve.com. Theme powered by WordPress.