Archive

Posts Tagged ‘Analysis Services’

How to use ITEM and when ITEM(0).ITEM(0) is redundant

August 30th, 2010

In MDX we have the Item function which can be used in a number of ways. It is important to understand how it works and how it can be used to our advantage.

As a start, we can call Item over a set or over a tuple:

{set}.Item(0) or (tuple).Item(0)

If we use Item with a tuple, we must specify as an argument the integer position of the member within the tuple which we want. However, when we works with sets, we can either do the same, or specify a number of strings, which identify specific tuples. Some examples:

Item with a tuple:

(a,b).Item(0) = a

SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 ([Product].[Category].&[4], [Date].[Calendar].[Calendar Year].&[2008]).Item(0)
} ON 1
FROM [Adventure Works]

(a,b).Item(1) = b

SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 ([Product].[Category].&[4], [Date].[Calendar].[Calendar Year].&[2008]).Item(1)
} ON 1
FROM [Adventure Works]

Item with a set:

{a,b,c}.Item(0) = a

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0)
} ON 1
FROM [Adventure Works]

{a,b,c}.Item(“a”) = a

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item("([Product].[Category].&[4],
             [Date].[Calendar].[Calendar Year].&[2008])")
} ON 1
FROM [Adventure Works]

{(a1,b1),(a2,b2),(a3,b3)}.Item(“a1″,”b1″) = (a1,b1)

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item("[Product].[Category].&[4]",
           "[Date].[Calendar].[Calendar Year].&[2008]")
} ON 1
FROM [Adventure Works]

When we specify a number of strings as arguments, we get the tuple which is defined by these strings/coordinates.

Now, let’s see what happens when we have a set of tuples and we use Item on it with a single argument:

{(a1,b1),(a2,b2),(a3,b3)}.Item(0) = (a1,b1)

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0)
} ON 1
FROM [Adventure Works]

We get a tuple back. Therefore, if we use a second Item function over the first one, we will get the member on that position from the tuple:

{(a1,b1),(a2,b2),(a3,b3)}.Item(0).Item(0) = (a1,b1).Item(0) = a1

To illustrate the concept:

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0).Item(0)
} ON 1
FROM [Adventure Works]

This gives us the whole amount for Accessories, while:

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0).Item(1)
} ON 1
FROM [Adventure Works]

gives us the total amount for 2008.

Even if we do:

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0).Item(0).Item(0).Item(0).Item(0).Item(0).Item(0).Item(0)
} ON 1
FROM [Adventure Works]

we still get the amount for accessories. But if we do:

WITH
SET SET1 AS
 { ([Product].[Category].&[4],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[1],[Date].[Calendar].[Calendar Year].&[2008]),
  ([Product].[Category].&[3],[Date].[Calendar].[Calendar Year].&[2008])
 }
SELECT
{
 [Measures].[Internet Sales Amount]
} ON 0,
{
 SET1.Item(0).Item(1).Item(1).Item(0).Item(0).Item(0).Item(0).Item(0)
} ON 1
FROM [Adventure Works]

we get nothing back. This is because there is no element on the second position/coordinate of the tuple ([Date].[Calendar].[Calendar Year].&[2008]).

No matter how many times we use Item(0), it always returns the first member of the result set, which then is itself a set and so on. Therefore calling Item(0) after another Item(0) is rarely necessary and should be done only if we need it.

SSAS , ,

 

Set Operations in MDX – UNION and EXCEPT

August 1st, 2010

I just read an article MDX: Except written by Vincent Rainardi.

It shows set subtraction by usng the EXCEPT function (as you could derive from the title, no doubt). I have always been a fan of using the “-” and “+” operators instead of EXCEPT and UNION where possible because in my opinion they give us better visiblity of the intention we are putting behind our MDX expressions. However, EXCEPT and UNION have an advantage over “-” and “+” – the third parameter ALL.

In BOL we can see that both of these functions can be used like this: UNION/EXCEPT(set1, set2, ALL). If we skip the ALL keyword, we would get exactly what we would get with +/-. Some examples:

SET1: {a, b, c}
SET2: {c}

SET3 = UNION(SET1, SET2) = SET1+SET2 = {a, b, c}+{c} = {a, b, c}

But if we use ALL, we would get duplicates in our result set:

SET4 = UNION(SET1, SET2, ALL) = UNION({a, b, c}, {c}, ALL} = {a, b, c, c}

The difference here is the duplicates, which get preserved in SET4 because of ALL. And this is also where EXCEPT is different to “-”:

SET5 = EXCEPT(SET4, SET2) = SET4-SET2 = {a, b, c, c} – {a} = {b, c}

while

SET6 = EXCEPT(SET4, SET2, ALL) = {a, b, c, c} – {a} = {b, c, c}

As you can see, unlike in set math where a set cannot have dupicates, in MDX we can. Therefore, if we are in a situation where we need to preserve these, we have the option of using the UNION and EXCEPT functions with the ALL parameter.

I am using these concepts on every-day basis and I have found that mastering them gives me a very powerful way of solving many MDX problems. I hope that the examples are suitable and easy to understand – when I read the article after I have just written it I sound a bit like my math lecturer from uni (I wish he knew and taught MDX), who was a good guy, so I guess there is nothing wrong with that :)

SSAS , ,

 

Using the DSV to its Full Potential

July 22nd, 2010

The Data Source View in Analysis Services is a very powerful abstraction of the data source and it can help us overcome some scenarios in an easy and clean way. Many times we look for MDX or programmatic solutions to problems, which can be tackled best in our data. While for complex tasks we would be better off extending the ETL process, some simple ones can and should be implemented in the DSV.

 

As an introduction to the topic I would like to explain briefly what the DSV actually is. It can be conceptualised as a database view on top of the data source. By default all tables which we need for building the Analysis Services database (typically dimensions and facts) are appearing in the DSV as table bindings (exactly as if we do a SELECT * FROM Table). If we have no foreign keys defined in our database, SSAS will not show us the relationships in the DSV. However, we can define logical relationships in the DSV, thus connecting the tables on related columns, which are then used for automatically determining dimension relationships to the measure groups.

 

There are two important ways to modify the DSV, which allow us to add more columns to the existing tables and to modify the way the existing columns are shown:

Named Queries

 

 

 

If we right-click on a table in the DSV, we can select to replace the table with a Named Query. A Named Query is essentially a T-SQL statement, which is equivalent to a database view definition. By utilising Named Queries we can alter the way we see the tables and their column in SSAS. In example, we could concatenate columns, implement CASE logic, etc. Named Queries can be thought of as equivalent to database views.

 

Named Calculations

 

A named calculation is a SQL statement which adds a column to a table without modifying the table binding. It gives us an easy way to define a new column without changing the whole query. The statement defining the column is in T-SQL and it behaves the same way as a new column in a Named Query (or a SELECT statement). If we just want to add one more column (e.g. Display Order, Code+Description concatenation, etc.), we can simply define a Named Calculation. Also, as the name suggests, Named Calculations can be commonly used for defining a leaf-level calculation without modifying a large fact table’s SELECT statement in a Named Query.

 

The column we define here appears in both the DSV table and in the Dimension Designer window:

 

These two DSV functions can be used in many scenarios. Most importantly, there are a few when they yield better performance, faster development and easier maintenance:

Leaf-level calculations

If we have the common requirement to perform leaf-level calculations and then aggregate this up the hierarchy, as opposed to aggregating and then calculating, the best way to do this is in a SQL statement on the fact table. Alternatively, we can do this in and MDX statement:

 SUM(DESCENDANTS(Dim.CurrentMember,,LEAVES), MeasureCalc)

However, it comes at a price. Since SSAS would have to do the calculation for each leaf and then sum this up the hierarchy, this could take a long time to perform. Also, SSAS would not be able to use pre-processed aggregations and the calculations will be done at execution time. To avoid this we could add a new column to the fact table and do the calculation there (in SQL), using the column as a new measure in the cube, which can then be aggregated by SSAS as any other measure. The performance gain is usually substantial and using a Named Query or a Named Calculation should always be the preferred option.

Description Attributes

Often we need to perform a concatenation between different dimension attributes, which we can use as a Description attribute while slicing the cube, or when providing reports from the SSAS database. A very easy way to achieve such a requirement is to use our DSV and concatenate the column we need in a new column in the dimension table, which we can expose as a new attribute in the dimension. A task such as concatenating an Account Code and Account Description into an Account Long Description (i.e. [Account Code] + ‘-‘ + [Account Description]) becomes very easy to implement within the DSV without modifying the ETL or any tables.

Composite Keys

Sometimes we need to build unique keys for attribute column in a dimension. A good example is a Date dimension, which does not have unique keys for non-leaf levels such as Month. Often developers have Month Key of 1,2,3-12. This does not make a good Month key in SSAS as it is not unique for higher levels such as Year, Quarter, etc. There are a number of ways to tackle this common scenario. While the recommended approach would be to build a concatenation between Year-Quarter-Month as a Month Key in the dimension table, we can also achieve this by either selecting all of the columns as key columns for the attribute in the dimension attribute properties. However, this would give us a concatenated key in MDX and this could sometimes be undesirable. A yet simpler and cleaner solution is to concatenate the relevant columns in the DSV by using a Named Query. Instead of the typical

SELECT col1, col2,.., MonthKey, colx, coly, coly FROM DimDate

we can write

SELECT col1, col2,…,YearKey+QuarterKey+MonthKey AS MonthKey, colx, coly, coz FROM DimDate

This way we can use the MonthKey column directly as a key for our Month attribute.

While this is useful for a Date dimension, it can also be useful for any other composite key definition in our dimensions.

Other possible applications of DSV Named Queries and Named Calculations are the implementation of

  • Sort Order attribute, in cases when we need custom sort of the dimension attributes
  • Restricting the data which comes into the cube dynamically based on a certain condition (think of a Date dimension, which includes only relevant periods)
  • Combining tables – by a SQL join
  • Replacing 0s with NULLs (the opposite can be done automatically in SSAS) for our measures

Basically, in a DSV we can “correct” our data to make it suitable for our cube without changing the ETL.

Last but not least, we can also transform tables to conform to a star-schema-like design. If we want to show a proof of concept on top of a normalized OLTP database, we could avoid the ETL complexities, as well as building a datamart, and use SQL to join/split tables in dimension and fact tables, which are suitable for cube development. While this could work in post-POC scenarios, it would be better to take a cautious approach to it as there are many scenarios when it would either not work, or will be too slow.

And a word of warning – your DSV could become slow because of over-use of complex Named Queries. This could be painful when minimising cube processing time is crucial, or when the DSV starts timing out and queries take hours to execute. Luckily, in most cases we can simply move these large queries forward – to the ETL where we have more time and better tools (e.g. SSIS).

SSAS , , , , ,

 

Passing unCONSTRAINED Set and Member parameters between reports in Reporting Services

April 15th, 2010

By default SSRS MDX queries get a StrToMember or StrToSet functions with a CONSTRAINED flag. However, many developers do not quite know why it is there or what it actually does. Books-On-Line contains this statements:

StrToMember

  • When the CONSTRAINED flag is used, the member name must be directly resolvable to a qualified or unqualified member name. This flag is used to reduce the risk of injection attacks via the specified string. If a string is provided that is not directly resolvable to a qualified or unqualified member name, the following error appears: “The restrictions imposed by the CONSTRAINED flag in the STRTOMEMBER function were violated.”
  • When the CONSTRAINED flag is not used, the specified member can resolve either directly to a member name or can resolve to an MDX expression that resolves to a name.
  • StrToSet

  • When the CONSTRAINED flag is used, the set specification must contain qualified or unqualified member names or a set of tuples containing qualified or unqualified member names enclosed by braces {}. This flag is used to reduce the risk of injection attacks via the specified string. If a string is provided that is not directly resolvable to qualified or unqualified member names, the following error appears: “The restrictions imposed by the CONSTRAINED flag in the STRTOSET function were violated.”
  • When the CONSTRAINED flag is not used, the specified set specification can resolve to a valid Multidimensional Expressions (MDX) expression that returns a set.
  • Therefore, if you have a CONSTRAINED flag you have to specify exact members or sets (e.g. [Date].[Year].[2009], or {[Date].[Year].[2009],[Date].[Year].[2010]}). If you omit the flag, you can pass to the StrToMember an expression, which evaluates to a member (e.g. [Date].[Year].[Year].Members.Item(0)), and to StrToSet an expression, which evaluates to a set (e.g. NONEMPTY([Date].[Year].[Year].Members, [Measures].[Amount]).

    The flexibility which removing CONSTRAINED offers can be quite powerful when passing parameters between reports. In example, we may want to pass a parameter to a drill-through report from two different summary reports, where each of those uses a different subset of dimension members, which in turn can be derived by different set expressions.

    Major drawbacks of using this approach is the severe performance hit it leads to, as well as a possible “MDX injection” vulnerability. Since in most cases we would  be using the passed parameters in a subcube expression or on the slicer axis (WHERE clause), this should not lead to as bad performance as we would get if we would use it inside a calculation. However, when we need to use a parameter directly in a calculated measure, we would be better off avoiding an unCONSTRAINED function. 

     Therefore, we may instead use SetToStr on the summary reports and pass a String parameter to a CONSTRAINED StrToSet function in the drill-through report. This way we are first resolving the set expression once and passing it on to the underlying report as a string. We could do that in a calculated measure returning a String, then passed on as a Field to the drill-through parameter. However, in the rare case where we have many rows travelling from the SSAS server to the SSRS server, this could be slow.

    So, whether we use a StrToSet without CONSTRAINED flag, or a String parameter constructed by a SetToStr function, is dependent on the actual scenario, but it is good to have both options in our arsenal of tools and techniques when we need to implement some non-quite-standard piece of functionality.

    SSAS, SSRS , , ,

     

    An OLAP Client – Implemented, Up and Running

    March 29th, 2010

    Davide Mauri just wrote about the Ranet.UILibrary.OLAP client by Galaktika Corporation (aka Galantis, Inc). Well, I have it installed and open to the world at:

    http://www.max-data.com/lab

    There is also a permanent link to it in the sidebar to the right of this blog.

    The Silverlight client allows us to write MDX queries against Adventure Works, drag-drop dimensions and measures just like in a pivot table and get the MDX which it generates.

    There are a few known bugs with it. You may get an timeout exception, which usually does not appear the second time you run the same query (a bit strange).

    Hope it helps to test scripts in the case when Adventure Works or SQL Server Analysis Services is not readily available on your environment.

    I will be updating the site with new releases of the Silverlight control, and I hope that sooner or later it will get more stable.

    Currently, the SQL Server box is running SQL Server 2008 R2 November CTP, and this will get upgraded to the RTM as soon as it gets available.

    SSAS , , ,