Picking rows with partitions

I constantly run into situations in Salesforce Marketing Cloud where I need to select or identify rows based on some ordered criteria. The row_number() over (partition by x order by y) as ranking pattern is my go-to method.

This particular query finds a random row for each bounce category in the _bounce data view:
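A minimal sketch of what such a query can look like, not necessarily the exact original. It assumes you only need SubscriberKey, BounceCategory and EventDate from _Bounce; the aliases b and x are just illustrative.

```sql
/* One random row per bounce category.
   The inner query numbers the rows within each BounceCategory in random
   order (newid()); the outer query keeps only row number 1. */
select
    x.SubscriberKey,
    x.BounceCategory,
    x.EventDate
from (
    select
        b.SubscriberKey,
        b.BounceCategory,
        b.EventDate,
        row_number() over (
            partition by b.BounceCategory
            order by newid()
        ) as ranking
    from _Bounce b
) x
where x.ranking = 1
```

Swap newid() for a real column (EventDate desc, for example) when you want the newest or oldest row per category instead of a random one.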

Recently, I found this shorter method using top 1 with ties:
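Here is a sketch of the same one-row-per-bounce-category selection written with top 1 with ties. The column list is an assumption, as before. The trick is that the row_number() expression moves into the order by, so every row numbered 1 ties for first place and is returned.

```sql
/* One random row per bounce category, without a subquery.
   Every partition has exactly one row with row_number() = 1, and
   "with ties" returns all rows that tie on the order by value. */
select top 1 with ties
    b.SubscriberKey,
    b.BounceCategory,
    b.EventDate
from _Bounce b
order by row_number() over (
    partition by b.BounceCategory
    order by newid()
)
```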

Credit @TheGameiswar


Parsing Delimited Field Values with SQL Cross Apply

Say you wanted to parse some field values:

SubscriberListMembership

UUID   Lists
AAAA   1|2|3|4|5|6|7
BBBB   2|4|6|8|10|12|14
CCCC   1|2|3

…and output something like this in Marketing Cloud:

SubscriberListMembership_Parsed

UUID   Lists              List1  List2  List3  List4  List5  List6  List7
AAAA   1|2|3|4|5|6|7      1      2      3      4      5      6      7
BBBB   2|4|6|8|10|12|14   2      4      6      8      10     12     14
CCCC   1|2|3              1      2      3

Since you can’t use any of the newer T-SQL functions for parsing in Marketing Cloud, you can use Cross Apply to determine the positions of the delimiters and then simple sub-strings to extract each one.
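A sketch of that approach, assuming the source data extension is named SubscriberListMembership with text columns UUID and Lists, and at most seven pipe-delimited values per row. The aliases d and p1 through p7 are illustrative. Padding the value with seven extra delimiters is my own shortcut so every charindex() call finds a match, even on short rows.

```sql
/* Each cross apply finds the position of the next '|' delimiter,
   starting just after the previous one. substring() then cuts out the
   text between two neighbouring delimiters. nullif() turns the empty
   strings produced by short rows into nulls. */
select
    s.UUID,
    s.Lists,
    nullif(substring(d.padded, 1,          p1.pos - 1), '')          as List1,
    nullif(substring(d.padded, p1.pos + 1, p2.pos - p1.pos - 1), '') as List2,
    nullif(substring(d.padded, p2.pos + 1, p3.pos - p2.pos - 1), '') as List3,
    nullif(substring(d.padded, p3.pos + 1, p4.pos - p3.pos - 1), '') as List4,
    nullif(substring(d.padded, p4.pos + 1, p5.pos - p4.pos - 1), '') as List5,
    nullif(substring(d.padded, p5.pos + 1, p6.pos - p5.pos - 1), '') as List6,
    nullif(substring(d.padded, p6.pos + 1, p7.pos - p6.pos - 1), '') as List7
from SubscriberListMembership s
cross apply (select s.Lists + '|||||||' as padded) d
cross apply (select charindex('|', d.padded) as pos) p1
cross apply (select charindex('|', d.padded, p1.pos + 1) as pos) p2
cross apply (select charindex('|', d.padded, p2.pos + 1) as pos) p3
cross apply (select charindex('|', d.padded, p3.pos + 1) as pos) p4
cross apply (select charindex('|', d.padded, p4.pos + 1) as pos) p5
cross apply (select charindex('|', d.padded, p5.pos + 1) as pos) p6
cross apply (select charindex('|', d.padded, p6.pos + 1) as pos) p7
```

Rows with more than seven values would simply have the extras ignored. Add more cross apply steps and output columns if you need them.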

Troubleshooting Queries in SFMC

So your Query Activity failed. What next?

SFMC Support can certainly tell you exactly what the error was, and now you can (sometimes) see some error details in the Activity tab of Automation Studio.

Here are a few common things that cause Query Activities to fail.

Primary key violation

If your query results in duplicate rows not allowed by the primary key, then you’ll need to change your primary key or de-duplicate the rows that result from your query. Here’s my go-to method for deduplication. The inner query partitions the rows by subscriber key, sorts each partition by insertDate, and assigns a ranking to each row. The outer query keeps only the first-ranked (oldest) row per subscriber key.
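A sketch of that deduplication pattern. The data extension name SourceDataExtension and the emailAddress column are hypothetical; subscriberKey and insertDate follow the description above.

```sql
/* Keep one row per subscriberKey: the one with the earliest insertDate.
   Change "asc" to "desc" to keep the newest row instead. */
select
    x.subscriberKey,
    x.emailAddress,
    x.insertDate
from (
    select
        s.subscriberKey,
        s.emailAddress,
        s.insertDate,
        row_number() over (
            partition by s.subscriberKey
            order by s.insertDate asc
        ) as ranking
    from SourceDataExtension s
) x
where x.ranking = 1
```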

Inserting a null value into a non-nullable field

If you have fields in your target Data Extension that are non-nullable, you need to ensure none of the columns in your query return a null value. You can utilize the isnull() SQL function to handle that situation:
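For example, something along these lines. The data extension and column names are hypothetical, and the fallback values are placeholders you would pick to suit your data.

```sql
/* isnull(value, fallback) returns the fallback whenever value is null,
   so the non-nullable target columns always receive something. */
select
    s.SubscriberKey,
    isnull(s.FirstName, 'Unknown') as FirstName,
    isnull(s.OrderCount, 0)        as OrderCount
from SourceDataExtension s
```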

Inserting a value too long for the field (truncation)

If any of your source data extension columns are longer than the matching target column, then you can increase the length of the target column or use the left() SQL function to truncate the value:
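A quick sketch, assuming a hypothetical Comments column that has to fit into a target column of length 50:

```sql
/* left(value, n) keeps only the first n characters of the value. */
select
    s.SubscriberKey,
    left(s.Comments, 50) as Comments
from SourceDataExtension s
```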

Timeout

Probably the most frustrating error is the 30 minute timeout. If your query won’t run in that amount of time, then it’ll error out. There are a bunch of different ways to address query timeouts. Here are some tips:

  1. First and foremost, you need to reduce the number of rows you’re querying. This can mean narrowing your date range or adding additional criteria to target rows more specifically.
  2. Leverage the primary keys for speed.  While we don’t have any insight into the indexes that SFMC uses behind the scenes, it’s a good bet the primary keys are indexed.  So the more you use their values in the where-clause, the better.
  3. Don’t join multiple System Data Views in your query.  It’s a pain, but it’s a good practice to have a separate query for each activity (sends, opens, clicks, etc.).  As a last resort, make your own copies of the System Data Views with primary keys and refresh them on an interval.  The data extension versions will perform better.
  4. Make sure your where-clause conditionals are sargable (the column is compared as-is rather than wrapped in a function) to reduce the number of type-conversions and to optimize the selection.
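To illustrate the sargability tip, here is a sketch against the _Open data view that pulls roughly the last seven days of opens. The seven-day window is just an example. Both filters select the same rows, but the second leaves EventDate untouched so an index on it (if one exists) can be used.

```sql
/* Not sargable: the column is wrapped in a function, so the function
   has to be evaluated for every row before the comparison.

     where datediff(day, o.EventDate, getdate()) <= 7

   Sargable: the column stands alone on one side of the comparison and
   all the date math happens on the other side. */
select
    o.SubscriberKey,
    o.JobID,
    o.EventDate
from _Open o
where o.EventDate >= dateadd(day, -7, convert(date, getdate()))
```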

If it’s still timing out, leave a comment below with some details. I have some other tricks to share.

Target Data Extension has been deleted

Easy fix for this one. Just re-select the target data extension in your Query Definition.

Data-type mismatch

You can’t, for example, insert a string value of $12.34 into a Decimal field. What I typically do in this scenario is to check the data types of the source data extension(s) and ensure that they match the target data extension column data types. If they don’t, then I’ll either recreate the target column with the correct data type or cast/convert/fix the source column value in the query.
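As a sketch of fixing the value in the query, assuming a hypothetical text column AmountText that holds values like $12.34 and a Decimal target column named Amount:

```sql
/* Strip the currency symbol and thousands separators, then cast the
   remaining text to a decimal. A value that still isn't numeric after
   the replaces will make the cast fail, so check your data first. */
select
    s.SubscriberKey,
    cast(
        replace(replace(s.AmountText, '$', ''), ',', '')
        as decimal(18, 2)
    ) as Amount
from SourceDataExtension s
```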

T-SQL Split Queries

Testing content is a good thing in marketing. It’s especially appropriate for email.

Find what works best and go with it.

One way to go about that is to segment your sending audience into groups and send varying content to each group.

In the Salesforce Marketing Cloud platform, you can create those groups with queries. Here’s an example of a 20-20-60 split: two groups of 20% and the rest in the last group.

(Select the first 20% randomly)
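A sketch of this first query, assuming the full audience lives in a hypothetical data extension named MasterAudience and the Query Activity targets one named Split_Group1:

```sql
/* Group 1: a random 20% of the audience. */
select top 20 percent
    s.SubscriberKey,
    s.EmailAddress
from MasterAudience s
order by newid()
```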

(select 25% more of those not already selected in the first group)
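A sketch of the second query, with the same hypothetical names. It assumes the first group has already been written to Split_Group1 and that this Query Activity targets Split_Group2:

```sql
/* Group 2: a random 25% of whoever is not already in group 1.
   The left join plus "is null" keeps only the unmatched rows. */
select top 25 percent
    s.SubscriberKey,
    s.EmailAddress
from MasterAudience s
left join Split_Group1 g1
    on g1.SubscriberKey = s.SubscriberKey
where g1.SubscriberKey is null
order by newid()
```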

(select the rest of the rows that are not in either of the first two groups)
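And a sketch of the last query, again with hypothetical data extension names. It assumes Split_Group1 and Split_Group2 are already populated:

```sql
/* Group 3: everyone who landed in neither of the first two groups.
   No top or order by is needed because we want all remaining rows. */
select
    s.SubscriberKey,
    s.EmailAddress
from MasterAudience s
left join Split_Group1 g1
    on g1.SubscriberKey = s.SubscriberKey
left join Split_Group2 g2
    on g2.SubscriberKey = s.SubscriberKey
where g1.SubscriberKey is null
  and g2.SubscriberKey is null
```

The three queries need to run in this order, since each one depends on the results of the ones before it.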

You may be wondering if the second query is correct — 25% is not 20%. Here’s the math:

If you have 100 subscribers and you subtract 20% (that is, 20 subscribers), you have 80 left.

If we want two equal groups of 20, then 20% of 80 is not 20, it’s 16. To get the correct count by percentage we’ll need to use 25% to get 20 from 80, since 80 * .25 = 20.

Also, ordering by NewID() will randomize the selection on each run of the query.
