A Closer Look at PALO and GPGPU
Last week I had the pleasure to meet a friend of mine, who formed a company I wrote about a year or two ago. His business has grown nicely since then and they have become the number one PALO partner in Australia. For those who are not aware of Jedox and PALO, I would recommend visiting their website at www.jedox.com – it is an open source BI suite very similar to SQL Server, minus the relational part. Since I was given a private show (no, nothing immoral here) in their corporate setup, I thought it may be interesting to discuss what I saw here in this post.
There are a few interesting and vastly different aspects of PALO when compared to the SQL Server BI stack:
For me the best feature they have is the General-Purpose GPU support in the OLAP server. While the OLAP components can be queried through MDX much like SSAS, they solve query bottlenecks with raw power. As far as I am aware, PALO supports CUDA, or the NVIDIA implementation (ATI have their own) of the GPGPU vision. If this all sounds a bit foreign, have a look at Tom’s Hardware article “The Advent of GPGPU“, where the concept of using the GPU for computational purposes is explained in a fair bit of detail. In short, by harnessing the power of NVIDIA GPUs, the processing power of a PC jumps from a few GFLOPs (50-60 GFLOPs on my i7 2600K OC-ed to 4.5Ghz) to 1500-1800 GFLOPs on my NVIDIA GTX 570 GPU. This means that for GPU optimised calculations, a PC gets a boost of a factor of 30. Both NVIDIA and ATI can see the potential and have been working hard in the last few years to get better drivers and better support for such applications. PALO in particular prefers the NVIDIA Tesla GPU. Note that a Tesla does not even have video output – it is used only for calculations, supports ECC memory (thus making itself ready for enterprise environments), and has been designed from the ground up for CUDA.
In terms of PALO, I got told that when they have an optimised query performing badly, adding a new Tesla unit in the server solves the problem. Their experience shows that the servers scale up linearly with every new GPU, and since NVIDIA’s SLI allows multiple GPUs running in parallel, adding 2-4 such units is all it takes to create a very, very fast computational workhorse.
Another area where I was impressed was the way PALO does mobile. They have free apps for the iPad (which I saw in action), as well as the iPhone and Android. Their vision is that information dashboards are best seen, and mostly required on the go when BI users have limited ability to browse around and get a deeper insight. I tend to agree to some extent. In my experience, the information dashboard is a slightly overrated concept. Having it on your phone or tablet where you can easily connect to you corporate environment and check some numbers quickly is a nice idea and I hope we see it becoming a part of the Microsoft stack sooner rather than later. The application which PALO have is quite nice minus the pies, allows any form of touch experience (multi-touch included) and allows easy slicing and dicing of data – just how it should be.
Open Source Software Compatibility
The last bit I would offer as an impressive and different to other not-open source vendors is the openness and compatibility of PALO with other open-source tools. Their stack components are easily replaceable. The ETL component can be changed to Pentaho’s Kettle, or JasperSoft’s ETL software which can load data directly in PALO’s cubes. A bit like loading a SQL Server data mart with Informatica, but seemingly better and tighter as the interfaces between the components are, apparently, completely open.
Apart from these areas, I think that the Microsoft stack has a nicer UI, allows easier development, and is richer (with MDS, QDS coming up, Data Mining, etc.). PALO has its own ETL tool, which is not graphical and relies on drop-downs and various windows to get the work done, the OLAP server seems to support many features out of the box, allows querying through MDX and supports write-back, but in general seems quite barren from SSAS point of view. The front-end is either Excel through a plug-in allowing the creation of reports through formulas, Open Office, Libre Office, and PALO’s own web-based spreadsheet environment. Once a report is created in either of those it can be published to a web portal for sharing with other users.
All in all, PALO is a neat, free BI suite, which comes for very cheap initially. There is an enterprise version, which is not free and, of course, any new customers will have to pay for someone to install it, configure it, and implement their requirements which will add to the total cost but these expenses are there for any other set of tools (although, a decent argument can be lead on which suite allows faster and cheaper development). The features listed in this article definitely appeal to some and I am very impressed by the innovative GPGPU capability, which has a lot of potential and I can easily think of a few areas where a 30-fold improvement in computational power will benefit SQL Server BI.