Remove Special Characters from String Column and formatting using Nested String functions in Data Prep/Data Recipe.
Business Use case:
What if I have a funky Phone Number format?
When you enter phone numbers in standard Phone fields, Salesforce preserves whichever phone number format you enter. so, we might have the phone number in various formats isn't weird? when you show un uniformity phone numbers on the dashboard. let's assume that the business wants to see the phone number in the following format xxx-xxx-xxxx .
Problem Statement:
we could use a string function to remove the special characters and re-format it. But we cannot be able to declare the Special characters in a list to remove all of the special characters present in the given string in Data Recipe. in that case, you are forced to iterate multiple calculated columns to get the cleaned strings and it would be pretty painful because we will end up having some additional fields based on the length of special characters present in the string and then need to use drop functions to drop those unnecessary fields.
Solution:
The above problem can be solved by using the Nested Replace function. let's get started with the detailed solution
Before we format the String (in our use case it's a Phone number,) we would need to clean the string which means removing all the special characters associated with it. then it's easier to format the phone number as per business needs.
Ex: if you look at the above sample data it has special characters like (,), —, +, and space.
the below string function will be helpful for removing the special characters in the string.
Formula: replace (field, searchString, replacement string)
Nested Replace formula syntax:
when we use the nested replace function then all of the declared special characters will get replaced by a defined replacement string(in our case it’s empty), in a single formula column.
replace (replace (replace (field, searchString3, replacement string), searchString2, replacement string), searchString1, replacement string)
Nested Replace function formula: -
replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')
pls, see the below video for how the nested function can be created and how its works in Data Recipe.
After removing the Special characters from the string column it's time to format the string as per business requirements. If the phone number field has a country code, then we have to first find the length of the cleaned strings to split out the country code from the phone number field and format the rest of the digits as per the required format (place the hyphen in between the phone number digits.)
To find the length of the Phone Number: -
we have to pass the nested replace function to the inside of the char_length function.
char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''))
Phone Number Without Country Code: -
eg : 904–254–8587
let's pass the above char_length formula to the nested case statements to remove the country code and format the phone number like the above.
Formula explanation:
The actual phone number digit is 10. and country code digits can vary based on the record's country. so, we are checking the total length of the cleaned phone number.
If the phone number length is more than 10 then we are executing another case statement with lengths between 11 to 13.
If the length of the phone number is 10 then there is no country code present in the record. so, we have to substr the phone number from the 1st position and then place the hyphen after 3 digits and 3 digits using concat function.
If the length of the phone number is 11 then the country code of the record is 1. so, we have to substr the phone number from the 2nd position and then place the hyphen after 3 digits and 3 digits using concat function.
If the length of the phone number is 12 then the country code of the record is 2. so, we have to substr the phone number from the 3rd position and then place the hyphen after 3 digits and 3 digits using concat function.
If the length of the phone number is 13 then the country code of the record is 3. so, we have to substr the phone number from the 4th position and then place the hyphen after 3 digits and 3 digits using concat function.
case
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) > 10
then
case
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 11
then concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),2,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),5,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),6,4))
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 12
then concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),3,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),6,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,4))
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 13
then concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),4,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),8,4))
end
else concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),4,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,4))
end
Phone Number With formatted Country Code: -
eg : (91) 904–254–8587
Formula explanation:
Just we are adding additional open and closed brackets and spaces after the country code from the above formulas.
case
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) > 10
then
case
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 11
then concat('(',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,1),')','',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),2,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),5,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),6,4))
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 12
then concat('(',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,2),')',' ',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),3,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),6,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,4))
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 13
then concat('(',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,3),')',' ',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),4,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),8,4))
end
else concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),4,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,4))
end
Phone Number Without formatted Country Code: -
eg : 91 904–254–8587
Formula explanation:
Just we are adding additional spaces after the country code from the Phone Number Without Country Code formulas.
case
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) > 10
then
case
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 11
then concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,1),' ',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),2,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),5,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),6,4))
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 12
then concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,2),' ',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),3,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),6,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,4))
when char_length(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(','')) = 13
then concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,3),' ',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),4,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),8,4))
end
else concat(substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),1,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),4,3),'-',substr(replace(replace(replace(replace(replace(Phone,' ',''),'-',''),'+',''),')',''),'(',''),7,4))
end
Output:
Summary: -
we can use nested functions to perform string operations to avoid additional custom columns created in Data Recipe.