Thursday, June 13, 2013

String manipulation in JavaScript through RegExp I

Problem 1 :: Find all 3 letter words in a given string
Solution ::

<script>
// String
var str = "We are the men in red. He is the best boy.";

// Define RegExp
var patt = /\b([\w]{3})\b/g;

// Apply RegExp
var res = str.match( patt );

// Print in Console
console.log( res );
</script>


Output ::
["are", "the", "men", "red", "the", "boy"]

Explanation ::
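i. The 'g' modifier makes the search global, so match() returns every match instead of only the first one.
ii. \b denotes a word boundary.
iii. ([\w]{3}) matches exactly 3 word characters (a letter, a digit or an underscore) and captures them as group 1. The two \b anchors around it ensure that the 3 characters form a whole word.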
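To make point iii concrete, here is a small sketch showing that \w also accepts digits and underscores. The sample string and the letters-only variant are my own assumptions for illustration, not part of the problem above.

```javascript
// Hypothetical sample string, chosen to include a digit run and an underscore
var sample = "cat a_b 123 it's door";

// Same idea as the pattern above; the square brackets around \w are optional
var threeChars = sample.match( /\b\w{3}\b/g );
console.log( threeChars ); // ["cat", "a_b", "123"]

// Assumed variant: restrict the class to letters if digits and underscores are unwanted
var lettersOnly = sample.match( /\b[a-zA-Z]{3}\b/g );
console.log( lettersOnly ); // ["cat"]
```

Which variant is appropriate depends on what should count as a "word" in your input.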

Problem 2 :: Wrap all 3 letter words with a pair of braces within the string given in problem 1 above.
Solution :: We can reuse the pattern from problem 1 and pass a replacement string to the replace() function.

// Apply RegExp
var repl_ace = str.replace( patt, "{$1}" );

console.log( repl_ace );
 

Output ::
We {are} {the} {men} in {red}. He is {the} best {boy}.

Explanation ::
The 2nd argument of the replace() function is the replacement string, and it may contain special patterns such as $1. Here $1 refers to the text captured by the 1st pair of parentheses in the RegExp, i.e. if a 3 letter word "are" is found, $1 refers to the captured text "are". Hence, "{$1}" generates {actual_word} for each 3 letter word found in the string.
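As a rough sketch of how the numbered patterns behave, the snippet below compares $1 with $& (the whole match) and then uses two groups. The two-group pattern and the variable names are assumptions added for illustration only.

```javascript
var str = "We are the men in red. He is the best boy.";
var patt = /\b(\w{3})\b/g;

// $1 is capture group 1, $& is the whole match; here both cover the same text
var withGroup = str.replace( patt, "{$1}" );
var withWholeMatch = str.replace( patt, "{$&}" );
console.log( withGroup === withWholeMatch ); // true

// Assumed variant with two groups: $1 is the first letter, $2 the remaining two
var rotated = str.replace( /\b(\w)(\w{2})\b/g, "$2$1" );
console.log( rotated ); // We rea het enm in edr. He is het best oyb.
```

With more than one pair of parentheses, the groups are numbered from left to right by their opening parenthesis.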

Problem 3 :: Convert all 3 letter words to Upper Case within the string.
Solution :: We pass a function, instead of a string, as the second argument to replace().

// Apply RegExp
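// "match" is the whole matched word (same text as $1 here, since group 1 spans the entire match)
var repl_ace = str.replace( patt, function( match ) {
    return match.toUpperCase();
} );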
// Print
console.log( repl_ace );

Output ::
We ARE THE MEN in RED. He is THE best BOY.

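Explanation :: The second argument to replace() can be either a string or a function. When it is a function, replace() calls it once for every match, passes the matched text as the first argument, and uses the returned string as the replacement. Here the matched text is the same as $1, because group 1 spans the whole match, so returning it in upper case converts each 3 letter word.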
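For reference, the callback receives more than the matched text: the captured groups follow, then the position of the match. A minimal sketch, where the parameter names and the "seen" array are my own choices for illustration:

```javascript
var str = "We are the men in red. He is the best boy.";
var patt = /\b(\w{3})\b/g;

// Collects "word@position" for every match, just to show the arguments
var seen = [];

var result = str.replace( patt, function( match, group1, offset ) {
    // match  : the whole matched text
    // group1 : text captured by the 1st pair of parentheses
    // offset : index in str where the match starts
    seen.push( group1 + "@" + offset );
    return match.toUpperCase();
} );

console.log( result ); // We ARE THE MEN in RED. He is THE best BOY.
console.log( seen );   // ["are@3", "the@7", "men@11", "red@18", "the@29", "boy@38"]
```

A function is the better choice whenever the replacement has to be computed, since a replacement string can only rearrange the captured text.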

Problem 4 :: Find every pair of adjacent words in a string and swap them. That is, match word pairs like "We are" and "the men" and swap them to "are We" and "men the".
 

Solution ::
<script>
// String
var str = "We are the men in red. He is the best boy.";

// Define RegExp
var patt = /(\S+)\s+(\S+)/g;

// Apply RegExp
var res = str.replace( patt, "$2 $1" );

// Print in Console
console.log( res );
</script>


Output ::
are We men the red. in is He best the boy.

Explanation ::
i) (\S+) matches a run of non-whitespace characters, e.g. "We". Punctuation is non-whitespace too, so "red." counts as one word.
ii) \s+ matches a run of whitespace characters.
iii) The second (\S+) matches the non-whitespace run after the whitespace, e.g. "are". So we are matching adjacent words like "(We) (are)" or "(the) (men)". For every such match, $1 holds the 1st word and $2 holds the 2nd word, i.e. "We" and "are" for the matched pair "(We) (are)".
iv) The replace() function replaces each pair with "$2 $1", resulting in "are We". The last word "boy." has no partner, so it is left unchanged.
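One side effect of "$2 $1" is that whatever whitespace separated the two words becomes a single space. If that matters, a possible variant captures the whitespace as well; the 3-group pattern and the sample strings below are assumptions for illustration, not part of the solution above.

```javascript
// Hypothetical sample: two spaces between "We" and "are", a tab before "the"
var sample = "We  are\tthe men";

// Pattern from the solution: the separator inside each pair becomes one space
var singleSpace = sample.replace( /(\S+)\s+(\S+)/g, "$2 $1" );

// Assumed variant: $2 now holds the whitespace, $3 the second word
var keepSpacing = sample.replace( /(\S+)(\s+)(\S+)/g, "$3$2$1" );

console.log( JSON.stringify( singleSpace ) ); // "are We\tmen the"
console.log( JSON.stringify( keepSpacing ) ); // "are  We\tmen the"

// With an odd number of words, the last word has no partner and stays put
var oddCount = "one two three".replace( /(\S+)(\s+)(\S+)/g, "$3$2$1" );
console.log( oddCount ); // two one three
```

Note that adding a group shifts the numbering, so the second word moves from $2 to $3.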

More such examples can be found in my next article.
