Tuesday, April 09, 2013

Basic Regular Expression Patterns V

1. Problem : Check whether a number is between 1 and 999
   answer  :
   var patt = /^[1-9][0-9]{0,2}$/;
   patt.test("1001"); // false
   patt.test("100");  // true


  Explanation :
   a. ^[1-9] means the string must start with a digit from 1 to 9. A leading zero is rejected, so "0" and "012" do not match.
   b. [0-9]{0,2} means up to two more digits may follow; they are optional. The closing $ anchors the match at the end of the string, so a fourth digit makes the test fail.
      For a number like 100, the leading "1" is matched by point a and the two zeros by point b.
      For a number like 34, the leading "3" is matched by point a and the "4" by point b.
      For a number like 7, the "7" is matched by point a, and point b matches the empty string, since its length can be zero.
  
   c. To change the requirement to "Check whether a number is between 1 and 9999", simply change the quantifier from {0,2} to {0,3}, as in the example below.
   var patt = /^[1-9][0-9]{0,3}$/;
   patt.test("10010"); // false
   patt.test("100");   // true
   patt.test("9990");  // true
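   Points a to c follow one rule: a number with at most N digits needs the quantifier {0,N-1}. A minimal sketch of that rule is below. The helper name makeRangePattern is hypothetical, not part of the original answer, and it only covers upper bounds of the form 9, 99, 999 and so on.

```javascript
// Hypothetical helper (an assumption for illustration): builds a pattern for
// whole numbers from 1 up to the largest number with maxDigits digits.
function makeRangePattern(maxDigits) {
  // One leading digit 1-9, then between 0 and maxDigits - 1 further digits.
  return new RegExp("^[1-9][0-9]{0," + (maxDigits - 1) + "}$");
}

var upTo999 = makeRangePattern(3);   // same as /^[1-9][0-9]{0,2}$/
var upTo9999 = makeRangePattern(4);  // same as /^[1-9][0-9]{0,3}$/

console.log(upTo999.test("7"));      // true
console.log(upTo999.test("999"));    // true
console.log(upTo999.test("0"));      // false, zero is below the range
console.log(upTo999.test("007"));    // false, leading zeros are rejected
console.log(upTo999.test("1000"));   // false, too many digits
console.log(upTo9999.test("9990"));  // true
```

   A bound that is not all nines (say 1 to 500) does not fit this shape; converting the string to a number and comparing it is usually simpler in that case.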

 

2. Problem : Find the number of HTML tags used in the given HTML text [closing tags excluded]
   answer  :
   var patt = /<([a-z][a-z0-9]*)>/g;
   var text = "<html><b>Bold</b><i>Italics</i><u>underlined</u></html>";
   var c = text.match( patt );
   console.log( c.length );  // 4
   console.log( c );         // "<html>","<b>","<i>","<u>"

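 Explanation :
   a. '<' matches the character that opens every HTML tag.
   b. ([a-z][a-z0-9]*) is a capturing group for the tag name: one lowercase letter followed by zero or more lowercase letters or digits. A closing tag such as </b> fails here, because "/" is not a letter.
   c. '>' must come immediately after the tag name, so tags with attributes, such as <a href="#">, are not matched. Uppercase tags like <B> are not matched either, since the pattern has no /i flag.
   d. /g makes the search global: instead of stopping at the first match, it finds every match in the string.

   The above REGEXP only finds simple HTML opening tags; it does not look for closing tags. Our next REGEXP finds those too.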
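   With the /g flag, match() returns the whole matches ("<b>") rather than the contents of the capturing group ("b"), and it returns null when nothing matches. The sketch below shows one way to handle both points. The function names countOpeningTags and openingTagNames are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: counts simple opening tags, guarding against the
// null that match() returns when there is no match at all.
function countOpeningTags(text) {
  var patt = /<([a-z][a-z0-9]*)>/g;
  var matches = text.match(patt);
  return matches ? matches.length : 0;
}

// Hypothetical helper: collects the tag names from the capturing group
// by calling exec() repeatedly on the same global pattern.
function openingTagNames(text) {
  var patt = /<([a-z][a-z0-9]*)>/g;
  var names = [];
  var m;
  while ((m = patt.exec(text)) !== null) {
    names.push(m[1]); // m[0] is the whole tag, m[1] is the group
  }
  return names;
}

var text = "<html><b>Bold</b><i>Italics</i><u>underlined</u></html>";
console.log(countOpeningTags(text));         // 4
console.log(openingTagNames(text));          // "html","b","i","u"
console.log(countOpeningTags("plain text")); // 0
```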


3. Problem : Find the number of HTML tags used in the given HTML text [closing tags included]
  answer  :
 var patt = /<([^>]*)>/g;
 var text = "<html><b>Bold</b><i>Italics</i><u>underlined</u></html>";

 var c = text.match( patt );
 console.log( c.length );  // 8
 console.log( c );         // "<html>","<b>","</b>","<i>","</i>","<u>","</u>","</html>"
  

  Explanation :
   a. '<' matches the character that opens every HTML tag.
   b. ([^>]*) is a capturing group that matches any run of characters other than ">". This also covers the contents of closing tags, such as "/b", "/i", "/html" etc.
   c. '>' matches the character that ends the tag.
   d. /g makes the search global, so every tag in the string is found.
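
   Since the group in point b also holds the leading "/" of a closing tag, the same pattern can tell the two kinds of tags apart. The sketch below is one possible way to do that; tallyTags is a hypothetical name, and the approach assumes simple markup with no ">" inside attribute values.

```javascript
// Hypothetical helper: counts opening and closing tags separately by
// checking whether the captured text starts with "/".
function tallyTags(text) {
  var patt = /<([^>]*)>/g;
  var tally = { opening: 0, closing: 0 };
  var m;
  while ((m = patt.exec(text)) !== null) {
    if (m[1].charAt(0) === "/") {
      tally.closing++;
    } else {
      tally.opening++;
    }
  }
  return tally;
}

var text = "<html><b>Bold</b><i>Italics</i><u>underlined</u></html>";
var result = tallyTags(text);
console.log(result.opening);                  // 4
console.log(result.closing);                  // 4
console.log(result.opening + result.closing); // 8
```

   Note that [^>]* accepts anything between "<" and ">", so comments and an empty "<>" would be counted as opening tags too.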


Check the next part of this article here.
