Demystifying Custom key use cases in CRM Analytics Data Prep / Data Recipe

Ramdosk
4 min readSep 9, 2024

--

Custom key :

Business Use case:

In order to improve user experience and sales, the Client wants to monitor the most recent three user behaviors on their external websites and apply this data into their marketing strategy. Additionally, they want to include this stats on the relevant users’ record pages.

Google analytics sample dataset

Problem Statement:

The client requests to view the last three events for each category. However, each user in the sample dataset above has a different type of event, and each event contains multiple kinds of records. so it is not possible for us to order each event-based record by window function so that the last three are displayed.

kindly look at the below sample records below

sample dataset

solution:

It is not as simple as using window functions to sort and order the data according to their most recent dates when trying to solve a problem like retrieving/displaying the most recent records for different categories across subgroups. We will look at approaches in the Data Recipe to accomplish this transformation in this post.

The approach I will take for this use case is we will use the concat function to build a “custom key” that contains the user ID and the relevant event.

Explanation

We created this type of custom key because we needed to obtain the most recent records from each event type for every user. Using group by user id in window functions will take the most recent records across all events, but our business requirement is to see the most recent records from each event type for every users, so we must group by event type only in window functions. However, doing so will take the most recent records from each event over all user, which also does not meet requirements. If we combine the event type and user ID into a single field and apply them to the group in window functions , will take the most recent records from each event for every users because this custom key will be unique for each event type over User subgroup .

Then Use row_number() windowing function to rank the records, grouping by the “custom key” field and ordering by “date” as descending.

Explanation

Business needs to view the most recent three records for every event, hence we must rank them in order to select the most recent three records so we are using row_number() window function.

The steps would be:

Let’s First load the object/external dataset into Data recipe.

Loading data into Data Recipe

Once the data has been loaded into the data recipe, let’s establish a join to this input node if necessary to bring in related information from other objects.

Now that everything is in the stream, our goal is to sort the most recent three records for each type of event across users. To do this, we will use the concat function to build a custom key that contains the user ID and the relevant event.

Custom key creation transformation

To rank the most recent entries ,let’s use the row_number() window function, grouping by the newly created “custom key” field and ordering by “date” as descending. The output for this field would be “Number” Data Type.

Ranking the different event types across subgroups

In order to get the latest three records we have to filter records orders rank field , however it is a numeric value, therefore performance will be affected if we use it in a filter node to filter the first three records. Therefore, in order to identify the first three records, we will create a new dimensions flag , using this dimensional field in filter wont have an impact on performance.

Last Three Records Flag

After this transformation, we are good to register as a dataset by using the output node.

Output:

we have successfully created dataset. the final output will be look likes below.

Dataset Output

Let’s compare the generated dataset records above with the original dataset records included below.

Original records of Doc downloads event type
Original records of transactions event type

--

--

No responses yet