Techniques for comparing multi-row data in CRM Analytics Data Prep / Data Recipes using pivoting approaches
Business Use case:
In order to better serve customers and increase sales, the client wants to segment accounts based on the sales performance of the previous three years and write the transformed fields back to the Salesforce Account object.
Problem Statement:
The challenge is that each account may contain sales records from three separate years: the year before last, the previous year, and the current year. To segment the accounts, we must compare the current year's sales to the previous year's sales within each account subgroup, as well as the previous year's sales to the sales from the year before that. Because we cannot compare values across multiple rows directly, this is difficult to accomplish.
Solution:
In data processing tasks, it’s often necessary to pivot data from a row-wise format to a column-wise format. This transformation, known as transposing rows into columns, is a common operation in data analysis and reporting.
Step 1: Load the Data into Recipe
Let’s first load the data into a Data Recipe.
E.g., in our use case, the last two years’ sales data comes from SAP, and refreshed data is received every year in the second week of January.
E.g., in our use case, the current year’s sales data also comes from SAP, and refreshed data is received every month at the end of the first week.
Step 2: Append the last two years’ & current year sales data
Since the sales data for the previous two years and the current year comes from separate datasets, we must compile all three years of sales data into a single data stream by appending each stream to the top, with the relevant fields mapped.
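Conceptually, the Append node performs a union of the streams after their fields are mapped to a common schema. A minimal Python sketch of that idea, with illustrative field names (not the actual SAP column names):

```python
# Sketch of what the Append node does: union the streams after
# mapping source-specific field names to a common schema.
# Field names here are illustrative assumptions, not the real SAP columns.

def map_fields(row, mapping):
    """Rename a row's keys according to mapping {source_name: target_name}."""
    return {mapping.get(k, k): v for k, v in row.items()}

last_two_years = [
    {"CUSTOMER": "Acme", "YEAR_ID": 2022, "SALES_AMT": 100},
    {"CUSTOMER": "Acme", "YEAR_ID": 2023, "SALES_AMT": 120},
]
current_year = [
    {"CUST_NAME": "Acme", "YEAR_ID": 2024, "AMOUNT": 150},
]

# Map the current-year stream's field names onto the historical schema,
# then append both streams into a single data stream.
mapping = {"CUST_NAME": "CUSTOMER", "AMOUNT": "SALES_AMT"}
combined = last_two_years + [map_fields(r, mapping) for r in current_year]
```

After this union, every row carries the same field names, which is exactly the state the recipe needs before grouping and pivoting.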
Step 3: Creating a dynamic year category column with a Transform node
We now have all of the sales data from the previous three years in one data stream. All we need to do is use the Aggregate node to pivot the row-level data into column-level data. However, the sample data’s year field is a numeric data type, which we cannot use in the Aggregate node’s group-by column; we must convert it into a dimension data type. The next step is to add the Transform node after the Append node.
Oh wait, why are we doing this transformation rather than just converting the data type?
Dynamic year category Transformation
As we previously stated, the data is refreshed every year. If you simply transform the year field into a dimension and use it in the Aggregate node’s group-by column, you will need to redo the work each year: after the aggregation, we compare the resulting transposed columns to determine the segmentation field, so when new data is refreshed, the column references in the formula would no longer be valid.
To account for the development life cycle, we plan to create a dynamic year category column, so that the recipe doesn’t need to be modified annually.
E.g., all we have to do is take the current year and subtract it from the dataset’s year field, then label the rows according to the difference: ‘previous to previous year’ when the difference is -2, ‘previous year’ when the difference is -1, and ‘current year’ when the difference is 0.
Below is the formula for this dynamic year category field:
case
  when (YEAR_ID - year(now())) = 0 then 'Current year'
  when (YEAR_ID - year(now())) = -1 then 'Previous year'
  when (YEAR_ID - year(now())) = -2 then 'previous to previous year'
  else ''
end
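The recipe itself uses the formula above; for clarity, the same logic sketched in Python:

```python
# Python illustration of the dynamic year category formula:
# label each year by its offset from the current year.
from datetime import date

def year_category(year_id, today=None):
    """Mirror the recipe's case expression on (YEAR_ID - year(now()))."""
    current = (today or date.today()).year
    diff = year_id - current
    if diff == 0:
        return "Current year"
    if diff == -1:
        return "Previous year"
    if diff == -2:
        return "previous to previous year"
    return ""

# Pinning 'today' makes the example deterministic.
labels = [year_category(y, date(2024, 6, 1)) for y in (2024, 2023, 2022, 2020)]
```

Because the labels are computed relative to the run date, the same formula keeps working after every annual data refresh, which is the whole point of the dynamic column.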
Step 4: Adding aggregate Node for pivoting
All of the needed fields are now included in the data stream. In order to convert the rows into columns, we may now add the aggregate node right after the transformation node.
Let’s add the sum of the sales in the Aggregate node’s "Aggregates" section, since we are focused on the sales value.
Following the aggregate selection, we must group the rows. Let’s choose the Account/Customer Name option under "Group Rows," since our requirement is to compare each account’s sales.
We now have the overall sales for each account, but we want the sales data split across the past three years. To achieve that, we must choose the dynamic year calculated column that we created in the previous steps in the "Group Columns" section.
Let’s choose each of the three dynamic category values from the drop-down menu and select "Done."
After selecting Apply on the Aggregate node, the sales information will be converted from rows to columns. Aggregate nodes have some limitations, so let’s review the documentation before moving further with your use case.
Aggregate Node: Roll Up Data to a Higher Level (salesforce.com)
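To make the pivot concrete, here is a small Python sketch of what the Aggregate node effectively does: group rows by account, spread the year-category values into columns, and sum the sales. The field and column names are illustrative assumptions:

```python
# Sketch of the Aggregate node's pivot: one output row per account,
# one column per year category, values summed.
# Field names are illustrative, not the node's actual output names.

def pivot_sales(rows):
    out = {}
    for r in rows:
        acct = out.setdefault(r["CUSTOMER"], {})
        # Derive a column name from the year category, e.g.
        # "previous to previous year" -> "Sales_previoustopreviousyear".
        col = "Sales_" + r["YEAR_CATEGORY"].replace(" ", "")
        acct[col] = acct.get(col, 0) + r["SALES_AMT"]
    return out

rows = [
    {"CUSTOMER": "Acme", "YEAR_CATEGORY": "Current year", "SALES_AMT": 150},
    {"CUSTOMER": "Acme", "YEAR_CATEGORY": "Previous year", "SALES_AMT": 120},
    {"CUSTOMER": "Acme", "YEAR_CATEGORY": "previous to previous year", "SALES_AMT": 100},
]
pivoted = pivot_sales(rows)
```

After this step each account occupies a single row with one sales column per year, so the multi-row comparison becomes an ordinary same-row column comparison.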
Step 5: Creating the Segmentation column Transform
Now that the sales data is in columns, we need to compare the accounts’ sales over the last three years and classify them using the segmentation matrix below.
We are going to use a nested case statement for the segmentation, because we need to check multiple criteria.
Please refer to the nested case statement formula below for the segmentation:
case
  when Sales_previoustopreviousyear <= Sales_Previousyear then
    case
      when Sales_Previousyear <= Sales_Currentyear then 'Growth'
      when Sales_Previousyear > Sales_Currentyear then 'Decline'
    end
  else
    case
      when Sales_Previousyear < Sales_Currentyear then
        case
          when Sales_Currentyear < Sales_previoustopreviousyear then 'Progressive'
          else 'Growth'
        end
      else 'Decline'
    end
end
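The nested case statement can be hard to trace by eye, so here is the same decision logic sketched as a Python function, with a few worked examples:

```python
def segment(prev2, prev, curr):
    """Python equivalent of the nested case statement above.
    prev2 = sales two years ago, prev = last year, curr = current year."""
    if prev2 <= prev:
        # Sales held steady or rose into last year.
        return "Growth" if prev <= curr else "Decline"
    # Sales fell from prev2 to prev.
    if prev < curr:
        # Recovering: 'Progressive' if still below the prev2 peak.
        return "Progressive" if curr < prev2 else "Growth"
    return "Decline"

examples = [
    segment(100, 120, 150),  # rising every year -> Growth
    segment(100, 120, 110),  # dipped this year -> Decline
    segment(150, 100, 120),  # recovering, still below peak -> Progressive
    segment(150, 100, 160),  # recovered above the old peak -> Growth
    segment(150, 100, 90),   # still declining -> Decline
]
```

Walking a few value triples through the function like this is a quick way to sanity-check the segmentation matrix before running the recipe.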
Finally, we have created a segmentation column for every account whose sales are recorded over the previous three years.
Step 6: Joining the Segmentation column at the account level
We can now add the segmentation information at the account level. Let’s add the Join node, with the suitable right-field selection, after the segmentation transformation.
E.g., in the sample dataset only the customer name column is present, so I have used customer name as the join key.
Summary:
We can perform this type of multi-row comparison by using the Aggregate node in the recipe to pivot the row data into columns.
Writeback to salesforce use cases
We also discuss use cases for Salesforce output connections in this article.
Per the business request, this information should live on the Account object. So how can we actually do this?
We can write the transformed fields back to the Salesforce object by using the Salesforce output connection functionality included in the Output node.
Step 7: Create the Segmentation field in the respective object
Let’s check whether the field needed to store the segmentation data is present in the appropriate object; in our use case, that’s the Account object. If not, create a text area field and include it in the layout.
Step 8: Filter accounts that have a segmentation
Since not all of the accounts in the organization have sales or transactions, we must write back only the accounts that do. Therefore, let’s add the Filter node right after the Join node to remove account records whose segmentation value is null.
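In plain terms, this filter keeps only rows with a non-null segmentation before the writeback. A tiny Python sketch (field names are illustrative):

```python
# Sketch of the Filter node: keep only accounts that carry a
# non-null segmentation value before writing back to Salesforce.
accounts = [
    {"CUSTOMER": "Acme", "Segmentation": "Growth"},
    {"CUSTOMER": "NoSales Co", "Segmentation": None},
]
to_write_back = [a for a in accounts if a["Segmentation"]]
```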
Step 9: Dropping unnecessary columns
Our goal in this use case is to push only the segmentation data to the Account object. Adding the drop transform allows us to remove the remaining unnecessary columns from the data stream.
Step 10: Setting up the Salesforce output connection
It’s time to configure the Salesforce output connection in the "Connections" section of Data Manager. Please refer to the documentation below for further details on setting up the connection and its limitations.
Step 11: Register the Salesforce output connector in the recipe
We can write back to the Salesforce object once the output connection has been established, so let’s add an Output node after the "drop transform" node.
In the Output node’s “Write To” section, select the output connection option from the dropdown.
In the “Connection Name” section, let’s select the connector you configured.
After selecting the connection, you must choose the object to which you wish to write back the data. Choose the Account object from the dropdown menu, since that is what we need in this case.
After the object selection, let’s choose the operation you wish to perform: INSERT, UPDATE, or UPSERT. We must use "UPSERT" because our data is refreshed monthly and the segmentation may change.
Once the operation has been chosen, let’s map the object’s external ID for the writeback and map the recipe fields to the object fields.
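The UPSERT semantics are worth spelling out: keyed on the external ID, an existing record is updated and a new one is inserted. A minimal Python sketch of that behavior (names are illustrative, not the connector’s API):

```python
# Sketch of UPSERT keyed on an external ID: update the matching
# record if the key exists, insert a new record otherwise.
# Field names are illustrative assumptions.
def upsert(target, rows, key):
    index = {r[key]: r for r in target}
    for r in rows:
        if r[key] in index:
            index[r[key]].update(r)   # UPDATE the existing record
        else:
            target.append(dict(r))    # INSERT a new record
    return target

accounts = [{"ExtId": "A1", "Segmentation": "Decline"}]
incoming = [{"ExtId": "A1", "Segmentation": "Growth"},
            {"ExtId": "A2", "Segmentation": "Progressive"}]
upsert(accounts, incoming, "ExtId")
```

This is why UPSERT suits a monthly refresh: re-running the recipe updates segmentations that changed and adds accounts that became segmented, without creating duplicates.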
After setting up the output node, save the recipe and run it.
Note: Recipe scheduling
To align with the data refresh timeline, let’s schedule the recipe. Our use case specifies that the current year data is updated at the end of the first week of each month, so we’ve scheduled the recipe to run on the first working day of the following week. That way, the segmentation is added for newly segmented accounts and updated on existing segmented accounts accordingly.
Output:
Navigate to the Account record page and verify the segmentation field once the recipe has completed successfully.
Performing the data transformation in a data recipe and writing it back to the Salesforce object is an effective approach.