I have been seeing
this error time and time again when processing cubes in Analysis Services and
thought it would be a good idea to blog about it. When processing Analysis Services
cubes, you might see the following error:
Errors
in the OLAP storage engine: A duplicate attribute key has been found when
processing: Table: 'tblCustomers', Column: ‘Customer_Name’, Value: ''. The
attribute is ‘Customer Name'.
As you can see, Analysis
Services is complaining that there is a duplicate “blank” value in the column
Customer_Name. How can there be a duplicate blank value as Analysis Services
runs a SELECT DISTINCT query on the
relational table to retrieve column values when processing? Well, a few other
things happen “under the hood” when Analysis Services is processing. Let’s
consider a scenario where we have the following values in our Customer table:
Customer_Id
|
Customer_Name
|
1
|
ABC
|
2
|
XYZ
|
3
|
NULL
|
4
|
When Analysis
Services processes a dimension, it will execute a SELECT DISTINCT query to retrieve
a list of all the distinct values for each attribute; in our example, it will retrieve
4 DISTINCT values – ABC, XYZ, NULL, ‘’
(the last one being empty string). Now the tricky thing is that Analysis
Services will then convert the NULL value to an empty string and so the list of
values after conversion would be - ABC,
XYZ, ‘’, ‘’ (the last two being empty strings). Obviously, the distinct
list of values for Customer Name aren’t distinct anymore as there are two empty
strings. In other words, the NULL value is being converted to an empty string
and we already have an empty string in the distinct list. This is when Analysis
Services fails to process the dimension complaining that there is a duplicate
value “”(empty string). This behavior is actually controlled by an attribute property
called “NullProcessing” which is set to Automatic by default. Setting it to automatic
means that Analysis Services will convert NULL values to “empty string” if the
attribute is of type text and to 0 if the attribute is of type numeric.
Let me give you
another example as shown in the table below, and this time, let’s consider a
numeric attribute.
Customer_Id
|
Number_of_Bikes
|
1
|
4
|
2
|
10
|
3
|
NULL
|
4
|
0
|
In this case, the
error would be - Errors in the OLAP storage engine:
A duplicate attribute key has been found when processing: Table:
'tblCustomers', Column: ‘Number_of_Bikes’, Value: '0'. The attribute is ‘Number
of Bikes’.
Notice that now, Analysis
Services is complaining that there is a duplicate zero value. When Analysis
Services processes this dimension, it will come up with the following DISTINCT
list of values - 4, 10, NULL, 0. It
will then convert NULL to 0(zero) since the attribute is numeric. So the list
of DISTINCT values after converting NULL to 0 would be - 4, 10, 0, 0. As you can see, the list is not distinct anymore and
so Analysis Services would fail to process the dimension.
One of the
easiest ways to resolve this is to convert NULL values in your data source to blank
or zero if the attribute is text or numeric respectively. For example, you can
use ISNULL(Customer_Name, ‘’) in your data source view to convert NULL
value to an empty string(if the attribute is text) and use ISNULL(Number_of_Bikes, 0) to convert NULL value to 0( if the attribute
is numeric).
Just a side note
that you’ll see some other errors when this issue occurs and in most of the
cases, these errors would be generic and less than helpful. I have seen the
following errors when this issue occurs:
“Errors
in the OLAP storage engine: An error occurred while the dimension, with the ID
of 'Customer', Name of 'Customer' was being processed”
"OLE
DB error: OLE DB or ODBC error: Operation canceled; HY008."
While
the first error is generic, the second one could be due to various reasons
including the one we discussed. Please see this post for the other possible
reasons for the second error – Analysis Services Cube processing fails with error "OLE DB error: OLE DB or ODBC error: Operation canceled; HY008.
Hope
this helps!
Thank you for this explanation of this error. I have been getting this same problem and I "figured out", by trial and error, that the columns with empty string values were causing this error. My solution was to convert the empty strings to null, since I have never felt that an empty string had any real meaning anyway. However, I did not know exactly why until now. Nice explanation.
ReplyDeleteThank you for the explanation, I was struggling for a couple of days and now I understand why.
ReplyDeleteGreat writing. Thank you for clearing this up!
ReplyDeleteThanks for writing about this issue. It's very helpful.
ReplyDelete