Archive

Posts Tagged ‘MDX’

Two New Books Definitely Worth Buying

September 27th, 2011

I will try to provide a concise and useful insight into two books from the world of SQL Server. The first one is available now, while the second one is coming out very soon (in the next couple of weeks).

MDX with Microsoft SQL Server 2008 R2 Analysis Services Cookbook (link)
by Tomislav Piasevoli

 

I received an invite to review Tomislav’s book a few weeks ago and despite the fact that I am quite busy these days, I decided to have a look mainly because I expected it to be very good as I had heard about it months ago by the author and the reviewers (Darren Gosbell, Chris Webb, Greg Galloway, Marco Russo and Deepak Puri); and because I was promised a free copy, so I wouldn’t have to buy the book myself (which I would have done anyway)J. Tomislav has been one of the most prominent SSAS MVPs, quite active on the MSDN Forums and writing interesting posts on his blog. I was not surprised that he has been able to deliver excellent practical advice for his readers in this cookbook and I expect to be using it often during SSAS implementations. Let’s see what you could expect from this book if you buy it.

Firstly, it is a cookbook – Tomislav has written recipes for a large amount of real-world scenarios. I must admit that I did not read absolutely all of them. I went through the ones I think I know well and compared Tomislav’s versions to the ones I would typically use. The topics are very well covered and follow a pattern of: Getting Ready -> How to do it -> How it works -> There’s more -> See also. First, we get an introduction, and then we prepare to get the work done. After that we get a step-by-step walkthrough for the actual solution. I liked the next section “How it works”. Here we get an explanation of why we get the result letting us get some more insight rather than blindly typing code. I find it a very nice touch and I applaud the author for spending the time to include this invaluable bit of information. The “There’s more” section after that expands a bit the topic, trying different options and showing what happens when we apply them. In the end of each chapter we have a section showing which other topics are related to the one we are currently reading. All in all, I think that Tomislav’s cookbook is very, very, well written.

In general, I would say that if you want to learn MDX and how it works, you can start with a textbook-style book, which shows the basics, explains them and then builds up on them. However, if you need to work with MDX now, or if you do have some knowledge but you lack practical skills, then this book is a real gem. I would definitely recommend buying Tomislav’s cookbook, and if you are interested in getting more theoretical detail on why things work the way they do, either do a lot of blog reading, or buying another MDX book to serve as a companion to this one.

In brief, this is the best MDX cookbook out there and offers excellent practical advice over a variety of topics.

MVP Deep Dives vol.2 (link)

 

The second book I would like to introduce is another “must buy”. Imagine the effort required to assemble a book from 53 different authors who live in different parts of the world and are not all native English speakers (like me). Well, Kalen Delaney has made the effort…twice…and this is one of the reasons for having this book published in the next couple of weeks. Another reason is the motivation which the authors found in donating their work to Operation Smile. We are not profiting from it and we hope that you will also buy the book not only because of the outstanding content, but also because the money you spend will be used to help children with facial deformities all around the world.

The list of authors speaks for itself and I am very flattered to be able to participate with a chapter entitled “Managing Context in MDX”. The book will be out by the beginning of PASS Summit 2011 and there will be a book signing session, where you can get your copies signed by some of the authors. Come, drop by if you are there to say hello and have a chat!

Other , , ,

 

MDX Subselects – Some Insight

July 5th, 2011

Recently I answered a question on StackOverflow, which may be an interesting, common case and understanding it correctly helps understanding subselects better.

Let’s examine the following MDX query:

Here we effective place the All member of the Customer.Education hierarchy on columns, and an ordered set of the countries in the Customer dimension on rows. We have a subselect and a slicer. The slicer contains the Internet Order Count measure (which places is it in context) and the slicer restricts the query to look at customers with Partial High School education only.

When we execute the query we get exactly what we expect.

Now, let’s change this to:

The difference here is the All member in the tuple used as a second argument of the Order function. What we get is a different order (note that Germany and France are on different positions on rows). It seems like the set is ordered by the total amount for all customers without taking their education in consideration – the subselect does not make a difference. However, the measure value is the same as in the first query. So, what causes this?

To answer, we need first to understand what a subselect does. As Mosha posted a while ago, a subselect (which is really the same as a subcube) does implicit exists with the sets or set expressions on each axis and also applies visual totals “even within expressions if there are no coordinate overwrites”.

Instead of explaining the same as what Mosha has already spent the effort to explain, I will advise you to read this post thoroughly. Then, you would understand what the “implicit exists” does. In our case, it is not as important as the visual totals part, so I will concentrate on it.

The reason why the set gets ordered by orders made by customers with partial high school education is the visual totals. It takes place within the Order expression. The visual totals also restrict what we see in the query results, as it applies to the slicer axis measure, as well. However, in this case the catch is in the “if there are no coordinate overwrites” part of the story. The visual totals in Order does not get applied because we explicitly overwrite the Customer.Education hierarchy member within the tuple in the Order function:

Therefore, the set of countries gets ordered by the Internet Order Count for All Customers without taking into account the subselect. However, the measure on the slicer axis still gets the visual total logic applied to it, and this causes the result set to be for those customers who have Partial High School education.
If we re-write the second statement to:

We can see that the NON VISUAL keyword (SSMS can’t recognise this syntax and underlines with a red squiggly line) changes the way the measure is displayed. We see the total amount for the slicer, as well.

Similarly, if we re-write the first query with NON VISUAL we get:

Here both visual totals are excluded and both the Order and the result set are done for customers with any education.

Hopefully this will clarify the logic applied when we use subselects to some extent.

SSAS ,

 

Dynamic Groups in SSRS Reports with MDX

May 7th, 2011

Providing a dynamic group in SSRS with SQL is not all that hard. It is also a common business requirement when combining reports. In example, I see very often an old system providing multiple similar reports to its users – that is reports with exactly the same layout but with different items on rows or columns. I also see quite often SSRS requirements asking me to mock the old functionality and build multiple almost identical reports. My personal preference is to combine them in one if possible, thus reducing the maintenance and even development effort and cost. In SQL we have the convenience to choose how to name the output columns. As long as we have a stable dataset – that is a constant number of columns with the same data types called the same way, SSRS does not differentiate between them and treats them the same way. This allows us to flexibly change their actual contents. The standard approach in MDX would be to create all possible “groups” in a large dataset and then use one or another in the report as required through expressions. This, however, leads to slower execution times, larger chunks of data going around the network and the need to aggregate within SSRS at almost every data cell.

Instead, we can build the same functionality by constructing MDX dynamically through the use of parameters. Let’s examine a practical case to illustrate the concept a bit more clearly. If we assume that the Adventure Works executives asked me to create two simple reports showing years across the columns, their Internet Sales Amount in the data area and on one of them product categories on rows, while on the other one – the countries from which the sales originate from. The layout would be the same – a tablix, which has one column group – Year (calendar), Internet Sales Amount in the data cell and one of the two dimension hierarchies (Product Category or Country) on rows. The MDX for the datasets would also be fairly similar:

Year

Country

As we can see, the only difference here is the set on rows. Instead of providing two reports we can combine them in one. The standard approach I see all developers implementing is crossjoining all sets together and then aggregating in SSRS. To switch between the two grouping fields there would be a parameter “Group By”. The sample MDX for this report would be:

And the layout in SSRS:

The row group (and the expression within the row cells) here is based on a GroupBy parameter with two values – “product” or “country”. In SSRS the expression for them looks like this:


When the user selects Product as a value for the Group By parameter, our report renders grouped with the Product categories on rows. In this particular case the dataset is relatively small even when we do a full crossjoin between years, product categories and countries. However, in many cases the number of items on rows can be prohibitive. Therefore, it is better if we can avoid doing this by dynamically constructing the MDX expression. We can add another (Internal) parameter – GroupMDX to our report with the following expression as its default value:

Then, we can replace the report dataset with the following MDX:

To make it execute, I added as a sample query parameter GroupMDX with a value of [Product].[Product Categories].[Category]*[Customer].[Customer Geography].[Australia]. Note that you should use the same hierarchies as sample parameters as what you are actually using in your MDX parameter – otherwise you will get empty results (e.g. if I were to use [Customer].[Country].[Australia] as a sample in this case the actual dataset would not be able to figure out that the hierarchy has been changed and will return an empty set on rows for country).

Unlike with SQL, though, we have a problem – every time the query executes with a different set as a parameter value it returns different result set and its fields collection does not link well to them:

Luckily, there is a workaround. We can add a bit of custom code to our report (Report Properties -> Code). For more details you can read the following MSDN article. In a nutshell, we can check in the custom code for missing fields, thus guarding against the error we are receiving. The slightly modified code snippet from the MSDN page is:

Now, we can wrap every possibly disappearing fields reference within our report with a call to the function. In our case, we can replace the row group expression, as well as the row cell expression with the following expression:

Now we can run our report and get the results we expect. Every time we change the Group By parameter and run the report, SSRS constructs a different MDX query and sends it to SSAS getting only the results it needs. It is a fiddly technique, but in many cases can be a lifesaver – in particular when the large crossjoin method is just too slow.

And the report rdl can be downloaded here: 

SSRS , ,

 

SSAS SCDs: Showing Active Dimension Members With No Data

January 8th, 2011

Type 2 dimensions play well with SSAS because when we use them SSAS determines which members are relevant to which periods based on the data in the measure group we have. This is always the case – if we slice by a particular date, then all members from a SCD which do not link up to the data in the slice will yield empty values, which we can easily hide in most clients (or by using an MDX function like NON EMPTY). This works well in most cases. However, there is a small problem in a small subset of all usage scenarios – that is when we have no data against a member relevant for the same period. Because it still yields nothing when we go through the data in the cube, it gets hidden, too.

Let’s explore a simple case. If we have a SCD called Employee Type, which has two values: Internal and External for 2010, but three members Internal, Non-Internal and Contractor for 2008 and 2009; and then we have a measure called Employee Count, if we slice by Employee Type we get something like:

Internal          150
External           20
Non-Internal       15
Contractor         10

When we slice by a year we may get something like:

                 2008    2009    2010
Internal           50      50      50
External                           20
Non-Internal               15
Contractor          5       5

Note that if we slice by 2008 and we hide empty cells we would get:

                 2008
Internal           50
Contractor          5

However, what we may want to see on our report may be:

                 2008
Internal           50
Non-Internal
Contractor          5

Un-Hiding empties would actually show:

                 2008
Internal           50
External
Non-Internal
Contractor          5

Well, in SQL we would just use the EffectiveFrom and EffectiveTo dates in the dimension table to determine the correct results. In SSAS we are not that lucky. I would like to show a possible solution.

Firstly, for an SCD we would typically have the dates I just mentioned – EffectiveFrom and EffectiveTo for each row. I would typically exclude them from the dimension in SSAS, but in our specific case we need them. Therefore, we can add them and just hide them instead of excluding them completely. Once we have them set up in this way we could write a bit of MDX, using LinkMember:

WITH
MEMBER [Measures].[ActiveType]
AS
  IIF({{LinkMember([Type].[EffectiveFrom].CurrentMember,
                   [Date].[Date]):
        LinkMember([Type].[EffectiveTo].CurrentMember,
                   [Date].[Date])}*
       [Date].[Year].CurrentMember}.Count = 0,
      NULL,
      1)
SELECT
{
  [Date].[Year].[Year].&[2008]*
  [Measures].[Employee Count]} ON 0,
{
  NONEMPTY([Type].[Type].[Type],
           [Measures].[ActiveType])
} ON 1
FROM [MyCube]

Now we get exactly what we want (Non-Internal shown in 2008 but with no data):

                 2008
Internal           50
Non-Internal
Contractor          5

Please note that for large dimensions this is a very bad approach from performance point of view and should be avoided. Surprisingly enough, users rarely consider performance in their top 10 priorities, while functionality somehow always makes it there, so someone might find this technique useful in extreme user cases.

It would be also interesting if there is another approach to this scenario, which I may be unaware of – I am sure developers have hit (or have been hit) by this problem in the past and there must be other solutions, as well.

Note: If you try to replicate this in Adventure Works as I did, you will find that the keys in the Date dimension(s) and the keys for the Start and End Date in the Type 2 SCDs are not the same. The Date dimension uses integer keys, while the Start and End Date attributes use datetime. Therefore, LinkMember will not be able to match these cross-dimensional attributes and the above approach will not work. As a lesson from this exercise – Effective From and Effective To columns should be of the same data type as your date attribute key in the Date dimension table.

SSAS , , , ,

 

Default Measure in SSAS Cubes

January 7th, 2011

When writing MDX it is always a good idea to know what the context which the query executes in is. If we do not explicitly specify a hierarchy member in a tuple SSAS replaces it with the default one, which way more often than not is the All member for dimensions. Because the All member is the default default member, it is easy to think that the default is always the All. This is untrue – the default can be easily changed through BIDS. A sinister (not really, but the word is cool) consequence of this is very apparent when we consider the Measure dimension.

Measures are in a dimension of sorts. It does not have an All member, so we can say it is non-aggregatable; however it has a default member, which is used whenever we do not explicitly specify a [Measures].[<member>] to be used. The most confusing part is when we omit the Measures member in a function call, and a prime example is the NonEmpty() function call which goes like this:

NonEmpty(<set>,<set>) -> NonEmpty([Customer].[Customer].[Customer], [Date].[Calendar].[CY 2007])

What we get here is a slight problem. NonEmpty still uses a measure - that is the default measure. And if we are not careful, we end up getting incorrect, or even invalid results (e.g. if the default measure is not related to one of the dimensions we may get the set of all members from the first set). However, if we do:

NonEmpty([Customer].[Customer].[Customer], ([Date].[Calendar].[CY 2007], [Measures].[Internet Sales Amount]))

We will get exactly what we want – the function uses the correct, or at least a known measure.

Often we do not know what is the default Measure member. To find it out we can just write this little query:

WITH
MEMBER [Measures].[dMemberName] AS
[Measures].DefaultMember.Member_Name
SELECT
{
[Measures].[dMemberName]
} ON 0
FROM [Adventure Works]

The result is the default Measure member name.

SSAS , ,

 

Switch to our mobile site