Wednesday, August 27, 2014

Serialization / Unserialization in PHP - II

We have already seen serialization/unserialization of common types in the earlier article: Serialization / Unserialization in PHP - I

Here we look at object serialization. When an object is serialized, PHP looks for a member function __sleep() within the object before it actually does the serialization. So, we can put various clean-up jobs (like closing file pointers, DB connections, open sockets and streams, freeing used memory and unused variables, destroying other related objects etc.) inside this function __sleep(). We need to remember that, unless __sleep() says otherwise, all the properties inside the object (private, protected and public) are serialized.

Similarly, when the serialized format is restored to an object through the function unserialize(), PHP calls the __wakeup() member function (if it exists) after it has finished re-constructing the object. We can put various jobs (restoring DB connections, re-opening files which the object works with etc.) inside this function __wakeup().
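To make the clean-up/restore idea concrete, here is a minimal sketch. The class name LogFile, its methods and the temporary file are assumptions made up for this illustration; they are not part of the example discussed in this article. The object holds a file handle (a resource, which cannot be serialized), so __sleep() closes it and keeps only the path, and __wakeup() opens the file again.

```php
<?php
// Hypothetical class, used only to illustrate __sleep()/__wakeup().
class LogFile
{
    private $path;
    private $handle; // a resource; resources cannot be serialized

    public function __construct($path)
    {
        $this->path = $path;
        $this->open();
    }

    private function open()
    {
        $this->handle = fopen($this->path, 'a');
    }

    public function __sleep()
    {
        // Clean-up job: close the file if it is still open
        if (is_resource($this->handle)) {
            fclose($this->handle);
        }
        // Only the path is stored in the serialized form
        return array('path');
    }

    public function __wakeup()
    {
        // Restore job: re-open the file for the new object
        $this->open();
    }

    public function isOpen()
    {
        return is_resource($this->handle);
    }
}

$path = tempnam(sys_get_temp_dir(), 'log'); // temporary file for the demo
$log  = new LogFile($path);
$copy = unserialize(serialize($log));

$copyOpen = $copy->isOpen(); // true: __wakeup() re-opened the file
$origOpen = $log->isOpen();  // false: __sleep() closed the original handle
var_dump($copyOpen, $origOpen);

// Destroy the copy (which releases its handle) and remove the temporary file
unset($copy);
unlink($path);
```

Note that in this sketch __sleep() closes the handle of the original object too, so the original can no longer write to the file after it has been serialized. Whether that is acceptable depends on the application.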

Let's check some code in which an object is serialized.

<?php
// Define a Class
class MyClass  
{
    // private properties
    private $data1;
    private $data2;

    // protected properties
    protected $data3;
    protected $data4;

    // public properties
    public $data5;
    public $data6;

    public function __construct()
    {
      // Initialize private properties
      $this->data1 = "\r\n";  
      $this->data2 = NULL;  
      
      // Initialize protected properties
      $this->data3 = NULL;  
      $this->data4 = "400";  
 
      // Initialize public properties
      $this->data5 = "500";  
      $this->data6 = "600";  
    }

    // __sleep() 
    public function __sleep()
    {
        echo "__Sleep called";

        // within __sleep(), all the clean-up jobs
        // can be included. Along with that, it also
        // needs to return the names of the properties
        // to be serialized, as an array
        return array("data1", "data2", "data3", "data4", "data5", "data6");
    }

    // __wakeup
    public function __wakeup()
    {
       echo "__wakeup called";
    }

    public function display()
    {
        // Display all the properties
        // We used a simple Loop
        for ($i = 1; $i <= 6; $i++)
        {
            echo "<br>" . $this->{"data$i"};
        }
    }
}

// Create Object
$obj = new MyClass;

// Serialize it
$ser = serialize($obj);

// See what happened after serialization
echo "$ser";

// Unserialize it to restore the 
// data/properties in a new object
$p   = unserialize($ser);

// $p is the new Object, hence
// call a member function
$p->display();
?>

The above code defines a class called "MyClass", which includes 2 private, 2 protected and 2 public properties along with the __sleep() and __wakeup() methods. It also includes a method called display() which shows the values of all the properties. The __sleep() method should return, in an array, the names of all the properties which need to be serialized. The __wakeup() function does not have any such restriction.

The above program can also run without the __sleep() and __wakeup() methods. In that case we can't define the clean-up jobs or select which properties need to be serialized, and PHP serializes all the properties within that object.
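Returning only some property names from __sleep() is how we select what gets serialized. The sketch below uses a made-up class UserSession (an assumption for illustration only) whose __sleep() lists just one of its two properties.

```php
<?php
// Hypothetical class, used only for this illustration.
class UserSession
{
    public $user;
    public $token;

    public function __construct($user, $token)
    {
        $this->user  = $user;
        $this->token = $token;
    }

    public function __sleep()
    {
        // Only "user" is written to the serialized string
        return array("user");
    }
}

$ser = serialize(new UserSession("john", "secret"));
echo $ser; // O:11:"UserSession":1:{s:4:"user";s:4:"john";}

// unserialize() does not call the constructor, so the
// property that was left out comes back as NULL
$copy = unserialize($ser);
var_dump($copy->token); // NULL
```

The token never reaches the serialized string, which is handy for values that should not be stored.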

When the private properties of an object are serialized, the class name is prepended to the property name. Protected properties get an asterisk '*' prepended to their names. Such prepended values are padded with a NULL byte on each side.
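As the NULL bytes are invisible on screen, a small sketch can make them visible. The class Point below is an assumption made up for this illustration; the script replaces every NULL byte with the two visible characters \0 before printing.

```php
<?php
// Hypothetical class, used only for this illustration.
class Point
{
    private $x = 1;
    protected $y = 2;
    public $z = 3;
}

$ser = serialize(new Point);

// Replace each NULL byte with a visible backslash-zero marker
$visible = str_replace("\0", '\0', $ser);
echo $visible;
// O:5:"Point":3:{s:8:"\0Point\0x";i:1;s:4:"\0*\0y";i:2;s:1:"z";i:3;}
```

The lengths 8 and 4 count the NULL bytes as well, which is why they look too large when the raw string is printed in a browser.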

Besides the short messages echoed by __sleep() and __wakeup(), the code above produces 2 outputs. The first is the serialized text of the object. The second is a list of the property values in object $p, which is created when the stored representation generated from $obj is unserialized. Let's check the first output as rendered in the browser.

O:7:"MyClass":6:{s:14:"MyClassdata1";s:2:" ";s:14:"MyClassdata2";N;s:8:"*data3";N;s:8:"*data4";s:3:"400";s:5:"data5";s:3:"500";s:5:"data6";s:3:"600";}

O:7:"MyClass":6: means Object having name "MyClass" (length:7) with 6 properties

s:14:"MyClassdata1";s:2:" "; means private property "MyClassdata1" (string, length:14) holds a string value of length 2. The class name "MyClass" is prepended to the names of private properties. The value " " is only the text representation rendered by the browser; it actually contains "\r\n". As the class name padded by a NULL byte on each side ("\0MyClass\0") was prepended to "data1", making it "\0MyClass\0data1", the length of the new string is 14 (12 characters of "MyClassdata1" + 2 NULL bytes).

s:14:"MyClassdata2";N; means private property "MyClassdata2" (string length 14) holds a NULL value
s:8:"*data3";N; means protected property "*data3" (string length 8, including the NULL bytes on both sides of '*') holds a NULL value
s:8:"*data4";s:3:"400"; means protected property "*data4" holds string (length:3) value "400"
s:5:"data5";s:3:"500"; means public property "data5" holds string (length:3) value "500"
s:5:"data6";s:3:"600"; means public property "data6" holds string (length:3) value "600"

The next part of the program is the unserialization part, where we call the unserialize() function to restore an object from the stored representation $ser. As a result, we create a new object $p of type "MyClass" (the serialized data starts with O:7:"MyClass"). Hence the call $p->display() calls the member function display(), which iterates through all the properties inside the $p object and prints them on screen.

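Objects that do not belong to a user-defined class can be serialized too. When an array is cast to an object, PHP creates an object of its built-in class "stdClass", and unserialization restores it as an object of "stdClass" again. Check the example below :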

<?php
// an Array is being converted to Object
$o = (object) array("name" => "chandan");

// Serialize 
$ser = serialize($o);

// Unserialize would instantiate an
// object of PHP's built-in class 'stdClass'
$po = unserialize( $ser );

// Print the new object details
var_dump($po);

// Access member properties
echo "Hello {$po->name}";
?>

The above code is quite self-explanatory. It produces the following output :: 

object(stdClass)[2]
  public 'name' => string 'chandan' (length=7)

Hello chandan

$po is an object of the PHP built-in class 'stdClass' with a public member "name", and this property holds the string value "chandan". So, when $po->name is referred to, it prints the correct value, as unserialize() correctly re-constructed the object from the stored representation.
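The case is different when the serialized data names a class which is not defined in the running script. There, unserialize() cannot build a real instance and returns an object of the special class __PHP_Incomplete_Class instead. A minimal sketch, where "Ghost" is a made-up class name assumed not to exist anywhere (and no autoloader is registered):

```php
<?php
// "Ghost" is a hypothetical class name which is NOT defined in this script
$ser = 'O:5:"Ghost":1:{s:4:"name";s:7:"chandan";}';

$obj   = unserialize($ser);
$class = get_class($obj);

echo $class; // __PHP_Incomplete_Class
```

The data is kept inside the placeholder object, but its properties and methods can't be used normally until the class definition is loaded and the string is unserialized again.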

Serialization / Unserialization in PHP - I

Through serialization, we can produce a storable representation of a value. PHP offers the serialize() function, which accepts a value of any type except the resource type. The serialize() function can accept arrays also. Arrays/objects with circular references (meaning some values refer to others within the same array or object) can also be serialized. The serialize() function returns a byte-stream representation of the value, and this may include NULL bytes. For that reason it is better to store the serialized representation in a BLOB field within the database. If we try to save it in CHAR, VARCHAR or TEXT type fields, those NULL bytes may be lost or altered.

The reverse process, i.e. converting the serialized format back to the respective PHP value (including arrays/objects), is called unserialization. PHP offers the unserialize() function for unserialization.
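A quick way to convince ourselves that the two functions are the reverse of each other is a round trip. The sketch below (the variable names are only for illustration) serializes a nested array, restores it and compares the result with the original using the strict === operator.

```php
<?php
$original = array("name" => "John", "age" => 23, "tags" => array("a", "b"));

// Convert to a storable string and back again
$stored   = serialize($original);
$restored = unserialize($stored);

// === is true only if keys, values, order and types all match
$same = ($restored === $original);
var_dump($same); // bool(true)
```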

Now check an example ::

<?php
// Define some variables
$i = 100;
$j = true;
$k = "sample text";
$l = array("name" => "John", "age" => 23, "salary" => 103.25, "is_adult" => true );

// Serialize and store in other variables
$ii = serialize($i); 
$jj = serialize($j);
$kk = serialize($k);
$ll = serialize($l);

// Print Serialized data
echo "<b>Serialization returns storable strings ::</b><br>";
echo "$ii $jj $kk $ll <br>";

// Unserialize
echo "<b>Unserialization restores :: </b><br>";

// Print what we got after unserialization
echo unserialize($ii) . "<br>";
echo unserialize($jj) . "<br>";
echo unserialize($kk) . "<br>";

// print_r used to print the array
print_r( unserialize($ll) );  
?>

In the above program, we defined some variables of type integer, boolean, string and array and we called the serialize() function on each of them and stored the results in other variables. Next, we called unserialize() function to restore original data from the serialized representation.

Check the output :: 

Serialization returns storable strings ::
i:100; 
b:1; 
s:11:"sample text"; 
a:4:{s:4:"name";s:4:"John";s:3:"age";i:23;s:6:"salary";d:103.25;s:8:"is_adult";b:1;} 

Unserialization restores :: 
100
1
sample text
Array ( [name] => John [age] => 23 [salary] => 103.25 [is_adult] => 1 ) 

See that serialize($i) generates a string like "i:100" which means integer value 100. 
Similarly, "b:1" means Boolean value 1
s:11:"sample text" means String with length: 11 and value: 'sample text'

Let's understand the serialized text a:4:{s:4:"name";s:4:"John";s:3:"age";i:23;s:6:"salary";d:103.25;s:8:"is_adult";b:1;} created for array $l. The array's keys and values are both stored, in sequence.

a:4 means array of 4 values. All the values are wrapped in curly braces.
s:4:"name"; => Key, String of length:4, key value:"name"
s:4:"John"; => Value, String of length:4, value:"John"
s:3:"age"; => Key, String of length:3, key value:"age"
i:23; => Value, Integer 23
s:6:"salary"; => Key, String of length:6, key value:"salary"
d:103.25; => Value, Double value 103.25
s:8:"is_adult"; => Key, String of length:8, key value:"is_adult"
b:1; => Value, Boolean value 1

Let's check some more serialized arrays.

<?php
// Array 1
$arr1 = array("John",20,103.5,true);

// Array 2
// See that the 3rd item would have index '11'
$arr2 = array(10=> "John", 9=> 20, 103.5, 1=>true);

// Array 3
$arr3 = array('name'=> "John", 'age'=> 20, 'is_adult'=>true);

// Print the serialized form of each array
echo serialize($arr1) . "<br>";
echo serialize($arr2) . "<br>";
echo serialize($arr3);
?>

The first output of the above program is ::
a:4:{i:0;s:4:"John";i:1;i:20;i:2;d:103.5;i:3;b:1;}

As said earlier, array's key and value are stored sequentially. 

a:4: means array consists of 4 items.
i:0;s:4:"John" means integer key 0 holds string (length:4) value "John".
i:1;i:20 means integer key 1 holds integer value 20.
i:2;d:103.5 means integer key 2 holds double value 103.5.
i:3;b:1 means integer key 3 holds boolean value true/1.

The 2nd output of the above program is ::
a:4:{i:10;s:4:"John";i:9;i:20;i:11;d:103.5;i:1;b:1;}

i:10;s:4:"John"; means integer Key/index '10' holds a string "John"
i:9;i:20; means integer Key/index '9' holds an integer value 20
i:11;d:103.5; means integer Key/index '11' holds a double value 103.5
i:1;b:1; means integer Key/index '1' holds a boolean value true/1

The 3rd output of the above program is ::
a:3:{s:4:"name";s:4:"John";s:3:"age";i:20;s:8:"is_adult";b:1;}

s:4:"name";s:4:"John"; means string (length:4) key "name" holds string value "John" (length:4)
s:3:"age";i:20; means string (length:3) key "age" holds integer value 20
s:8:"is_adult";b:1; means string (length:8) key "is_adult" holds boolean value true/1

The unserialize() function converts the stored representation to appropriate PHP value. This function returns FALSE if the passed string is not unserializable. 
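One catch with the FALSE return value is that the boolean false itself serializes to "b:0;", so a plain check on the result can't tell "bad input" from "the stored value was false". A minimal sketch of one way to handle this is given below; the helper name safe_unserialize() and its $ok flag are assumptions made up for this illustration, not something PHP provides.

```php
<?php
// Hypothetical helper; not part of PHP itself.
// Returns the restored value and sets $ok to tell whether it worked.
function safe_unserialize($data, &$ok)
{
    // A serialized boolean false is the one valid input
    // which legitimately unserializes to FALSE
    if ($data === serialize(false)) {
        $ok = true;
        return false;
    }

    // @ hides the notice/warning PHP raises for bad input
    $value = @unserialize($data);
    $ok = ($value !== false);
    return $value;
}

$a = safe_unserialize('i:100;', $okA);         // 100,   $okA is true
$b = safe_unserialize('b:0;', $okB);           // false, $okB is true
$c = safe_unserialize('not serialized', $okC); // false, $okC is false

var_dump($a, $okA, $b, $okB, $c, $okC);
```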

Check out the next part of this article : Serialization / Unserialization in PHP - II