Wednesday, October 29, 2014

Find longest substring inside a string

Very recently I have seen a small programming problem given on a site, which is shown below.

Find the length of the longest substring without repeating characters within a given string.

Take the string 'immediate' for example. The longest non-repeating letters would be "mediate". We can't accept "mmediate" because 'm' is repeating. 

In the string "borough", there are longest substring (non-repeating characters) will be "rough".

In the string "xxxxxx", the longest substring without non-repeating characters is "x".

So, we are going to write a program in PHP which would find the longest substring. And we are following this process ..

1. We would run a loop to examine the given string, reading character from 0th position, 1st position and so on.
2. We would pick up non-matching characters and store them in a character array. That array of characters will be converted to sub-strings.
3. Such sub-strings will be stored in a final array.
4. If we have a list of substrings, easily we can make out which is the longest substring among them using the strlen() function and a for loop.

Let's check out the final code :: 

<?php

// The given string is here
$str = "aksjdhiquywenqmneqkjosiqioueyqwemndfgppoiuytrewqasdfghyujikolrvcxzvbybnuimklopoiui";

// This is the final array where substrings will be stored
$subs = array();

// The main LOOP which reads the string
// So when $i is 0, the $str is read from 0th position
// When $i is 1, the $str is read from 1st position
for($i=0;$i<strlen($str);$i++)
{
  
  // A small array where picked-up characters will be stored
  $k = array();
  
  // Another loop which picks up characters from given string
  // and adds to the array $k  
  for($j=$i; $j<strlen($str); $j++)
  {
    // The array $k will be built with characters which are
    // not repeated earlier. So, we used in_array() to check
    // if this picked-up character is already existing in array $k
    if(!in_array( $str[$j], $k))
    {
       $k[] = $str[$j];
    }  
    else
    {
      // So we found a repeating character
      // So, BREAK to build the Substring
      break; 
    }
  }
  
  // When $k holds the series of characters
  // now build the substring and put it in the
  // array $subs
  $subs[] = implode("",$k);
}

// So, now $subs array holds all the possible substrings
// existing in the given string. Now is the time, we 
// find the longest among them
$max = 0; 
$final_str = "";

// Loop and scan
for($i=0; $i< count($subs); $i++)
{
  if(strlen($subs[$i]) > $max) 
  { 
      $max = strlen($subs[$i]); 
      $final_str = $subs[$i]; 
  }
}

// Finally print the result
echo "Max Length : $max, String : [$final_str]";
?>

The above code is quite self-explanatory. Each simply tries to create/extract substrings from the given string. For that we need to run a loop scanning from 0th position in the string $str. 

So, in the first pass, characters are picked up from 0th position in $str and kept in array $k till a non-matching character is found. Next the array $k is converted to a string (read substring) and put in the final array $subs.

In the 2nd pass, characters are picked up from 1st position in $str and kept in array $k and a substring is built eventually and kept in array $subs

This way, we build an array of all the possible substrings (made with non-repeating charaters) exist inside the string $str. 

And finally we simply decide which is the longest among them and print the result.

The above program displayed the output to be ::
Max Length : 21, String : [ewqasdfghyujikolrvcxz]

No comments: