Using array_reduce to Transform Data

First, why might we transform data? Suppose we have raw data from a database and need to send it to an external system or export it. In either case we most likely don’t want to expose database column names, or the structure we return must be different from how the data is stored.

For our example we’ll be sending data from our database to a CRM. In the database we probably store first and last name separately, but the CRM expects the full name. Likewise, the CRM expects a full address and a company name. Here’s the example class:

class TransformUserForCrm {
    private $columns = [
        'fullName',
        'company',
        'address',
    ];

    public function prepareData($rows)
    {
        $data = [];
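        foreach ($rows as $row) {
            // Build one output row by reducing over the output column names.
            $data[] = array_reduce($this->columns, function ($result, $column) use ($row) {
                // Use a getter such as getFullName() when one exists, otherwise the raw value.
                $methodName = 'get' . ucfirst($column);
                $result[$column] = (method_exists($this, $methodName)) ? $this->$methodName($row) : $row->$column;
                return $result;
            }, []);
        }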
        return $data;
    }

    private function getFullName ($row) {
        return $row->firstName . ' ' . $row->lastName;
    }

    private function getAddress ($row) {
        return $row->street . ' ' . $row->city . ', ' . $row->state . ' ' . $row->postalCode;
    }
}

The $columns array defines the output column names.

In prepareData we loop over $rows with foreach and call array_reduce once per row. Simply put, array_reduce reduces an array to a single value by way of a callback function. That single value can itself be an array, which means we can call array_reduce for each row of data to transform it into another array with the proper structure and formatting.

Here is a skeleton array_reduce call with no logic yet and an empty array passed in as the initial value:

$data[] = array_reduce($arrayToReduce, function ($result, $valueFromArrayToReduce) {
    // Logic for each iteration goes here
    return $result; // whatever is returned becomes $result on the next iteration
}, []);

array_reduce takes three parameters:

  • The first parameter ($arrayToReduce) is the array to reduce to a single value.
  • The second parameter is the callback function, which is called once for each element of $arrayToReduce. The callback itself has two parameters: the value returned by the previous call to the callback ($result) and the current element of $arrayToReduce, which we have named $valueFromArrayToReduce.
  • The third parameter is the initial value passed in as $result on the first iteration, because there is no previous value yet.
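
To make the three parameters concrete, here is a minimal sketch that is separate from the CRM class: it sums a list of numbers. The names $numbers and $sum are made up for illustration only.

```php
<?php
// Hypothetical input, not part of the CRM example.
$numbers = [1, 2, 3, 4];

$sum = array_reduce(
    $numbers,                      // 1st parameter: the array to reduce
    function ($result, $number) {  // 2nd parameter: callback(previous result, current value)
        return $result + $number;  // the return value becomes $result on the next call
    },
    0                              // 3rd parameter: initial value for $result
);

echo $sum, PHP_EOL; // prints 10
```

The same shape applies when the initial value is [] instead of 0: the callback then keeps returning an array rather than a number.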

Here is how the class above uses array_reduce:

$data[] = array_reduce($this->columns, function ($result, $column) use ($row) {
    $methodName = 'get' . ucfirst($column);
    $result[$column] = (method_exists($this, $methodName)) ? $this->$methodName($row) : $row->$column;
    return $result;
}, []);

The keys of the final output structure come from $this->columns, so that array goes into the first parameter.

The callback’s first parameter, $result, carries the result of the previous callback iteration. On the first iteration there is no previous result, so it receives [] because the final parameter of array_reduce is []. The callback’s second parameter, $column, is the current column we are working on.
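
As a rough illustration of how $result is carried from one iteration to the next, the sketch below records how many keys $result holds at the start of each callback call. The $trace and $final variables and the placeholder value are assumptions added for this demonstration; they are not part of the class.

```php
<?php
// Hypothetical column list, used only to trace the callback.
$columns = ['fullName', 'company', 'address'];
$trace = [];

$final = array_reduce($columns, function ($result, $column) use (&$trace) {
    // Record how many keys $result holds when this iteration starts.
    $trace[] = count($result);
    // Placeholder value; the real class stores the transformed data here.
    $result[$column] = true;
    return $result;
}, []);

echo implode(', ', $trace), PHP_EOL;             // prints 0, 1, 2
echo implode(', ', array_keys($final)), PHP_EOL; // prints fullName, company, address
```

The first call sees the empty initial array, and every later call sees whatever the previous call returned.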

In the callback we check whether a getter method for the column exists on the class. This defines a standard for retrieving data: for example, the fullName column calls the getFullName method. If no such method is defined, the callback assumes the raw value is fine as it is and copies it to the output.
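
The getter lookup can be tried in isolation. The sketch below uses a hypothetical ColumnResolver class with a resolve method, both invented for this illustration, to show one column going through a getter and another falling back to the raw property.

```php
<?php
// Hypothetical class used only to illustrate the lookup; it is not the CRM transformer.
class ColumnResolver
{
    public function resolve($row, $column)
    {
        // 'fullName' becomes 'getFullName'
        $methodName = 'get' . ucfirst($column);

        // Use the getter when one is defined, otherwise fall back to the raw property.
        return method_exists($this, $methodName)
            ? $this->$methodName($row)
            : $row->$column;
    }

    private function getFullName($row)
    {
        return $row->firstName . ' ' . $row->lastName;
    }
}

$row = new stdClass();
$row->firstName = 'Nick';
$row->lastName = 'Escobedo';
$row->company = 'Fake Company';

$resolver = new ColumnResolver();
echo $resolver->resolve($row, 'fullName'), PHP_EOL; // prints Nick Escobedo (via getFullName)
echo $resolver->resolve($row, 'company'), PHP_EOL;  // prints Fake Company (raw property)
```

With this convention, supporting a new computed column only requires adding its name to the column list and defining a matching get method.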

Each iteration of the callback adds the column to the result by assigning to $result[$column]. The result is an array that we keep adding columns to until array_reduce is done. When array_reduce is done, the result is returned and appended to the $data array. Eventually $data contains all of the transformed rows.

This method of holding all data in memory doesn’t work well for massive datasets, but it works well for small to medium-sized datasets.
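
For larger datasets, one possible variation (not part of the class above) is to yield each transformed row from a generator instead of collecting everything in $data. This is only a sketch: the transformRows function is hypothetical, it assumes the rows can be iterated lazily, it needs PHP 7.1 or later for the iterable type, and the getter lookup is left out for brevity.

```php
<?php
// Hypothetical helper: yields one transformed row at a time instead of building $data.
function transformRows(iterable $rows, array $columns): Generator
{
    foreach ($rows as $row) {
        yield array_reduce($columns, function ($result, $column) use ($row) {
            // Raw value only; the getter lookup from the class is omitted here.
            $result[$column] = $row->$column;
            return $result;
        }, []);
    }
}

$user = new stdClass();
$user->company = 'Fake Company';

foreach (transformRows([$user], ['company']) as $transformed) {
    // Each transformed row could be sent or written out here before the next one is built.
    echo $transformed['company'], PHP_EOL; // prints Fake Company
}
```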

Example

Given the following data, we can use the class above like this:

$user1 = new stdClass();
$user1->firstName = 'Nick';
$user1->lastName = 'Escobedo';
$user1->street = '123 Fake St.';
$user1->city = 'Chicago';
$user1->state = 'IL';
$user1->postalCode = 12345;
$user1->company = 'Fake Company';

$user2 = new stdClass();
$user2->firstName = 'Will';
$user2->lastName = 'Smith';
$user2->street = '456 Fake St.';
$user2->city = 'Chicago';
$user2->state = 'IL';
$user2->postalCode = 56789;
$user2->company = 'Fake Company';

$data = [$user1, $user2];

$transformer = new TransformUserForCrm();

print_r($transformer->prepareData($data));

Output:

Array
(
    [0] => Array
        (
            [fullName] => Nick Escobedo
            [company] => Fake Company
            [address] => 123 Fake St. Chicago, IL 12345
        )

    [1] => Array
        (
            [fullName] => Will Smith
            [company] => Fake Company
            [address] => 456 Fake St. Chicago, IL 56789
        )

)