
25 Years of OLAP – A personal view

In the beginning there were spread-sheets.

VisiCalc was the first spread-sheet program available for personal computers. It was probably the application that turned the microcomputer from a hobby for computer enthusiasts into a serious business tool. VisiCalc sold over 700,000 copies in six years.

Soon more powerful clones of VisiCalc were released. Eventually, Microsoft Excel (introduced for the Macintosh in 1985 and for Windows 2.0 in 1987) became the dominant spread-sheet tool, which in recent years has become the most important end-user tool (via Pivot Tables) for Microsoft’s OLAP server software.

I have used five OLAP products since the early 1980s. Here is a very brief précis of each of them:-

Wizard –> System W –> One-Up/Prism

In the early 1980s, Wizard was initially developed for use on Comshare’s time-sharing mainframe computers. Wizard extended 2-d spread-sheets into multi-dimensional databases, which were mainly used for management reporting, budgeting and financial modelling. There were 2 PC versions of it: One-Up and then Prism, but both were sold, primarily, alongside the mainframe versions.

So, what is meant by “multi-dimensionality”?

Consider: You want to monitor several Sales measures (volume, price, sales) over Time (say, 12 months) and you also want to store and analyse this data by Customer, by Product and by Location. In short, 5 “dimensions” in all. How do you visualize this amount of data?

My analogy is based on a “Filing Cabinet” …

Start by taking a single spread-sheet (Measures (rows) by Time Periods (columns)) – You could treat this as one piece of paper (for 1 Customer, 1 Product, 1 Location).

Next, create and place the above data for all your Customers in one drawer of the Filing Cabinet.

Next, place all your Customer x Product data in one physical Filing Cabinet.

Next, place all your Location x Customer x Product data in separate physical Filing Cabinets (in one room).

Aggregations (summations) of the above data would require additional Filing Cabinets.

Now, if you add another dimension – you will need lots of Rooms. Right?!

And so … with lots more Dimensions (which is not uncommon these days) … Rooms become Floors become Buildings become Towns become Countries become Planets etc.

Hopefully, the above analogy highlights (at least) 2 problems associated with multi-dimensionality:-

There is a high degree of Sparsity, that is, not every Customer buys (or sells) every Product in every Location; and

You would have to travel to far-flung places to actually get hold of your required spread-sheet!
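To put some illustrative (and entirely made-up) numbers on the Sparsity point: 1,000 Customers x 100 Products x 50 Locations x 3 Measures x 12 Months is 180 million “cells”. Buttt … if, say, only 2% of the Customer x Product x Location combinations ever actually trade, then only 3.6 million of those cells hold real data – the rest is empty space (empty Filing Cabinets, empty Rooms). Storing every cell regardless would be hugely wasteful, which is why OLAP engines go to such lengths to handle Sparsity.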

Moving on, Wizard’s success grew and it needed to be marketed world-wide. The name Wizard could not be used in North America and so it was re-branded as “System W” (because of its W> prompt on the time-sharing computers).

By the early 1990s, System W was in use at the world’s top 3,000 companies and could therefore claim to be (at least in my opinion) the “Mother” of all DSS / OLAP application software.

Essbase

Essbase means “Extended Spreadsheet Database”. It was first released by Arbor Software in 1992. Why was Essbase developed? To fill a gap in the PC client/server (i.e. not mainframe) market. No OLAP database existed in 1992 which offered true client/server processing with concurrent read/write access as well as the capacity to handle large volumes of data (100 MB, not GB, in those days!).

Essbase grew to replace System W for traditional management reporting and budgeting applications.

It was a robust technology with strong modelling rules, eventually capable of handling gigabytes of data, with a tried-and-tested Excel Add-In, the latter being a real differentiator.

Buttt … Essbase was primarily a numerical database (it did not handle textual Measures initially), had no data integration (aka “ETL”) component and no GUI as such. It was also expensive. Taken together, this was a gap waiting to be filled by others.

That said, Essbase could, arguably, be considered to be one of the 10 most influential technology innovations of the 1990s (along with Netscape, Google etc.).

Gentia

Gentia means “For Everyone” – the first version was released in 1993 by Planning Sciences. The functionality concentrated on large scale EIS/OLAP applications and contained some features which were a vast improvement on the competition. In particular, it offered a superb (for the time) development (Book/Page) environment and seamless integration between the screens (“Pages”) and the OLAP database (“GADB” and later “GentiaDB”).

Buttt … Gentia could not do it all! For example, it had weak modelling rules, was not suitable for budgeting applications, lacked a data integration tool, had a complex programming language (“GDL”) and, worst of all, was very expensive!

Arguably, Gentia should have been acquired by a larger organisation which would have taken it to the next level. It was not and, sadly, its demise was as swift as its rise.

Analysis Services

In October 1996, Microsoft purchased OLAP technology from Panorama and also hired some of the key relevant personnel from the same company. Initially codenamed “Plato”, it was shipped with SQL Server 7 and eventually “integrated” with SQL Server 2000. Arguably, it was not until SQL Server 2005 (SP1) that Analysis Services became serious production-worthy software.

The coming of Analysis Services sounded the “death-knell” for the independent EIS/OLAP vendors, such as Gentia, because (a) it was (and still is) relatively cheap, (b) it was not proprietary (i.e. it worked with SQL databases), and (c) it came from Microsoft, which was a large, stable and growing organisation at the time.

From my perspective (as a Developer), the Analysis Services databases (“cubes”) were much smaller than, for example, the Gentia ones, which meant that much larger applications (more Dimensions, more Measures) could be developed. With the 2005 version came the Partitioning functionality, which is an effective way to overcome the aforementioned Sparsity problem.
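As a hedged illustration of how Partitioning helps (the fact table and column names below are hypothetical, not from any real application): a cube’s measure group can be split into partitions, each bound to its own source query which carves off one slice of the fact table – typically one slice per time period:-

-- Illustrative only: dbo.FactSales and OrderDateKey are made-up names
-- Source query for a "Sales 2004" partition
SELECT * FROM dbo.FactSales WHERE OrderDateKey BETWEEN 20040101 AND 20041231

-- Source query for a "Sales 2005" partition
SELECT * FROM dbo.FactSales WHERE OrderDateKey BETWEEN 20050101 AND 20051231

Queries then touch only the partitions that hold relevant data, which is one practical way of keeping all that empty “Filing Cabinet” space out of the database.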

Analysis Services continues to evolve, with the recent release of the “PowerPivot” functionality and the up-coming “Tabular” option, both of which use column-based (“VertiPaq”) technology.

WhiteLight

WhiteLight Systems started in 1992 with a product on NeXTSTEP but did not get going until Michael Brill of Sybase heard about it in 1994. Sybase then funded its development until 1997, after which it was funded by private equity firms. Its sales peaked around the turn of the century, but it was very expensive software which was sold mainly to banks, and after 9/11 direct sales of it plummeted.

From my perspective (as a Developer), the key feature of the software was that it did not store the fact data. Just imagine that for a moment! When you initiated a query, WhiteLight converted the query into SQL, fired the SQL at the relational database, retrieved the results, performed any aggregations / re-calculations in-flight, cached the results and presented them to you. Quickly. It was “like running Excel against a Server”.
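To make that concrete, here is a hedged sketch (the star-schema tables and columns are hypothetical, not WhiteLight’s actual output) of the kind of SQL such an engine might generate for a “total sales by region for Q1” request. Note that the aggregation is pushed down to the relational database, so only the summarised rows travel back:-

-- Illustrative only: the tables and columns are made up
SELECT
    r.region_name                       AS RegionName
   ,SUM(f.sales_value)                  AS TotalSales
FROM
    dbo.fact_sales AS f
JOIN
    dbo.dim_region AS r
    ON f.region_key = r.region_key
WHERE
    f.period_key BETWEEN 200101 AND 200103    -- Q1
GROUP BY
    r.region_name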

Finally …

I am also a Developer.

I spend most of my time delivering information systems – MIS / DSS / EIS / DW / BI.

Over the past 25 years (and over 50 projects), I have had to wrestle with integrating completely different technologies from different (or even the same) vendors on the same project, all of it invisible to the end-user.

Arguably, not since Comshare’s “Commander” offering of the late 1980s has a single vendor provided an “integrated” suite of components for an information system: relational database, data integrator, OLAP database and reporting functionality.

Arguably, that offering today is SQL Server 2008 R2 … with Windows Server 2008 R2 … they are best together. 


The Confession

Aim to leave things in a better place.

I have been asked to conduct numerous, varied but primarily technical assessments. They range in scale from conducting wide-ranging system audits to improving a single fragment of badly-performing T-SQL code.

The wider-ranging ones are, arguably, akin to being in a Confessional, albeit on the other side. A good and technically knowledgeable listener is required. But enquiring is also permitted, and is essential in certain situations. And, hopefully, the end result is that everyone is left in a better place.

Here are 10 lessons that I have learned over the years from the more wide-ranging assessments:-

Lesson #1 – Establish the Problem Definition

If it is not provided then write it yourself and get confirmation. Seriously.

Lesson #2 – Determine why someone is concerned

Ask – and they will tell you – you might even be given a whole slide-deck for your perusal!

It will tell you why – it will help you to “zero in” on the areas of concern.

Lesson #3 – Establish what is IN-SCOPE and what is OUT-OF-SCOPE

Be bold – what is out-of-scope may be important

Lesson #4 – Understand The Firm

Its background – it may be in the throes of an “integration”. You will need to know this for the “casual” conversations during the coffee and lunch breaks.

Its locations – the “far-flung” places that you will need to get to.

Lesson #5 – Understand the “Divisions”

Architects, Analysts, Developers, End-Users, DBAs, Testers, Support; as well as

the Others e.g. the software vendor

Lesson #6 – Time Management

Assessments are usually Time-Boxed, so map the days out.

From the outset – Prepare a “Daily” Plan with contingency slots at regular intervals.

For example, for a budgeted 10 man-day assessment:-

Days 1, 3, 4 – Interviews, in different locations

Day 2 – Prep – Areas of Investigation

Days 5 & 6 – Initial Draft

Days 7, 8, 9 – Endless Reviews & Refinement (fill in the gaps)

Day 10 – Showcase the findings

Lesson #7 – The Daily Shout

Consider having daily catch-up conference calls at, say, 4pm to identify show-stoppers; and

discuss findings to-date during these calls – so that, at the end, nothing comes as a surprise.

Lesson #8 – The Confession

Face-to-Face Discussions work best for me (for the important people)

(even if they are in far-flung places)

Otherwise, Conference Calls are the next best thing

(Use e-mail for Intro & Follow-up & Thank You)

Prep! Prep! Prep! – Always prep for all meetings – don’t lose sight of the wood for the trees!

Not forgetting: Style – We are all different – Some humour may help!

Lesson #9 – Take a break!

I once read a whole book in one day, on the train journeys between long and intensive meetings.

Lesson #10 – Don’t be surprised (or even disappointed) if not one of your recommendations is implemented (for now). (That’s the twist.)

Finally, such wide-ranging assessments should aim to add value.

You have had the luxury to look at an application with a fresh pair of eyes – try to uncover just one important potential problem which, if rectified, will leave the application in a better place.

A SQL Layout Standard

A reminder for me which, hopefully, should also help you when entering SQL code.

Why?! Because SQL which is EASY TO READ is much better than SQL which is not – Think about the person who has to decipher it later!

If it is easier (and it usually is) to simply “crash-touch-type-in” the code INITIALLY then that is OK … Buttt … Don’t forget to tidy it up!

Either use a good formatting tool or follow the guidelines below.

If you “forget” to tidy-up then, hopefully, the code will be rejected at the “Code Review”.

Some general guidelines for before you get started:-

  1. if a standard already exists then follow it, even if it is different from this one
  2. if different standards exist (within the same application) then ask the Team Leader
  3. when updating existing code follow the existing standard (if one exists)

Indentation – I prefer using SPACES instead of the TAB character.

     SSMS -> Tools -> Options -> Text Editor -> Transact-SQL -> Tabs -> Tab Size = 4 -> check: Insert Spaces -> TEST IT

IntelliSense – ensure it’s enabled

Then as you input the code:-

  1. All keywords e.g. SELECT should be input in uppercase
  2. Commas separating the columns should be at the START of each line, in column 4
  3. Use separate lines for each column
  4. Aliases for table column names should be spaced well apart e.g. starting in column 41
  5. If possible, keep to within 90 columns – doing so should mean not having to scroll right too often (see the small example after this list)
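To illustrate the above rules with a tiny fragment (the table and columns are made up for this sketch) – first as “crash-touch-typed”, then tidied up:-

-- Crash-typed:
-- select o.id,o.name,sum(o.amount) total from dbo.orders o where o.status=1 group by o.id,o.name

-- Tidied up (keywords uppercased, one column per line, leading commas, aliases aligned):
SELECT
    o.id                                AS OrderID
   ,o.name                              AS OrderName
   ,SUM(o.amount)                       AS TotalAmount
FROM
    dbo.orders AS o
WHERE
    o.status = 1
GROUP BY
    o.id
   ,o.name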

COMMENTS

  1. Insert as many as you like, provided each one is relevant
  2. Insert some at the beginning (see example below)
  3. Optionally, insert the Author’s name & date last modified – this helps Support staff

Finally, ask someone to do a “Code Review”.

An example:-

-- This SQL analyses 3 MSDB tables to determine the longest-running JOB STEPS
-- Check that the WHERE clauses are not removing rows that you may actually require!
-- Other tables which could be useful:-
--    msdb.dbo.sysjobactivity (stores current execution activity)
--    msdb.dbo.sysjobstepslogs
-- Author: Andrew Baker, Last Modified on: 2-Feb-2012
SELECT
    h.[server]                          AS HistoryServer
   ,s.step_name                         AS StepName
   ,h.run_date                          AS HistoryRunDate
   ,j.name                              AS JobName
   --
   -- run_duration is an INT encoded as HHMMSS, so unpack it into seconds
   ,(  (h.run_duration/10000 *3600)
     + ((h.run_duration%10000)/100 *60)
     + ((h.run_duration%10000)%100)
    )                                   AS HistoryRunDurationInSeconds
   --
   -- run_date (YYYYMMDD) converts to midnight; adding the run_time (HHMMSS)
   -- seconds as a fraction of a day gives the start DATETIME
   ,CONVERT(DATETIME, RTRIM(h.run_date)) +
    (  (h.run_time/10000 *3600)
     + ((h.run_time%10000)/100 *60)
     + ((h.run_time%10000)%100)
    ) / (23.999999*3600)                -- seconds in a day
                                        AS HistoryStartDateTime
   --
   ,h.run_status                        AS HistoryRunStatus
   ,s.step_id                           AS StepID
   ,s.subsystem                         AS StepSubsystem
   ,j.version_number                    AS JobVersionNumber
   ,j.date_modified                     AS JobLastModifiedDate
   ,s.command                           AS StepCommand
FROM
    msdb.dbo.sysjobhistory AS h         -- contains the detail for each Job Step execution
LEFT JOIN
    msdb.dbo.sysjobsteps AS s           -- contains the STEPS for each Job
    ON  h.job_id = s.job_id
    AND h.step_id = s.step_id
LEFT JOIN
    msdb.dbo.sysjobs AS j               -- contains the list of Jobs
    ON  j.job_id = s.job_id
WHERE
    j.[enabled] = 1
AND h.run_status = 1                    -- succeeded
AND h.run_duration > 30                 -- i.e. over 30 seconds (safe, given the HHMMSS encoding)
ORDER BY
    s.step_name ASC
   ,h.run_date DESC
   ,h.run_duration DESC

Parallels with Da Vinci

Is it possible to draw a parallel between an information system and the artistic works of Leonardo da Vinci?!

The Lady with an Ermine. La Belle Ferroniere. The Virgin Mary with St. Anne. To name just a few of Da Vinci’s great artistic works.

You have to actually see these works to appreciate their true majesty. And by “see” I mean actually study the real thing for a good few minutes. Books, the Internet and even the Prints cannot do them justice. The lustre of oil on canvas, shown in the correct lighting, unveils the fine facial expressions as well as the richness of the varying shades of colour. Having actually done so, I can appreciate Da Vinci’s:-

  • powers of imagination and truly awesome expression thereof
  • devotion to the effect of lighting on colour (e.g. a bright blue fading away into the dark)
  • fine attention to detail – not just the shadows but also the embroidery on the subject’s vestments

Where might one perceive such virtues in every-day life today?

Take, for example, the BMW 3-Series (E91 model) Tourer – the graceful contours in the body styling, the Montego Blue colour – rich in both daylight and twilight – the numerous useful sensors, and the attention to detail that, for example, makes the task of replacing a chipped windscreen an almost effortless exercise.

Such virtues are less obvious in the development of information systems but they can and should exist:-

  • Imagination, Expression and Desire are needed (in large quantities) to Conceive, Design and Deliver such systems
  • Documenting the system (both inside and outside the code) provides the Developer with the opportunity to display some artistic licence
  • (Examples: a well-laid out Data Model, Control Flow and Data Flow as well as vivid and revealing diagrams which enable Support and future staff to quickly and correctly determine the true functionality of a sub-system)
  • Attention to detail at all stages in the development life-cycle e.g. Cost-Benefit Justification, Capacity Planning, Table Partitioning and Compression, Processing and “User Response” times

Yes – Parallels can exist with Da Vinci’s artistic works, never mind the rest of his output.

Project Closure

I have been involved in lots of project closures in my time.

Projects can only begin to arrive at closure if the actual users of the deliverable (e.g. an information system) are happy with it. User confidence needs to be measured (and recorded). This is done through sign-offs – be they verbal, e-mail or a physically signed piece of paper – at a number of checkpoints (not just at the end). 

Just to emphasise – Obtaining Sign-Offs is Key – they should not (in my opinion) be just a CYA exercise (although some see it as such) – they are primarily a measure of user confidence. If someone is asked for sign-off and does not provide it then there is a good reason for it – the project manager needs to find out why and resolve it!

The first request for sign-off is at the end of Unit Test / System Test – Usually a Business Analyst is involved at this stage and should sign off on all Test Cases etc. (This assumes that all the preceding stages have been signed-off, that is, Terms of Reference, Requirements, Architecture etc.)

If you cannot get an e-mail sign-off at this stage from anyone on the client side then something is not quite right!

The second request for sign-off is at the end of UAT – this must come from the project sponsor, not just from the Business Analysts, End-Users and Systems staff.

The third request for sign-off is for the prep list for implementation in Production. This is signed-off by the IT Director (after approval from the Operations Manager). This is the final check to make sure that the basics have actually been done e.g. Capacity Planning, so that when the changes are implemented the system will not fall apart due to, for example, insufficient Memory. A basic prep list is a must! An essential part of the prep list is the roll-back strategy!

The fourth request for sign-off is immediately after the changes have been implemented in Production. The Business Analyst(s) and key users must verify that (a) the changes have been implemented successfully; and (b) there has been no adverse knock-on effect on any other part of the system. If the roll-out has not been successful (for whatever reason) then it must be possible to roll-back to the state prior to the roll-out.

The last request for sign-off comes 3 months after the release – the success (or otherwise) of the changes, as well as the lessons learned, should be reviewed and signed-off by the Project Sponsor. If this does not happen then there should be something in the contract which states that the system is deemed to be operational 3 months after it has first been used in Production.

People! People! People!

What makes for a successful project?!

In the same way that the most important aspect about a residential property is its location, arguably, the most important contributor to a successful IT systems development project is the project team. People! People! People!

Quality, not Quantity – More people just get in each other’s way. You need the right people. And, people who are “Up-For-It” as well as “Up-To-It”.

This post (and the whole blog, for that matter) only covers IT projects and, primarily, “Business Intelligence” systems development ones at that. I don’t have any experience of any other kinds of projects, so I have no idea whether these ideas could be applied elsewhere.

Good project management would split up a big project into lots of little ones. Each of the small projects could then be undertaken by a small team. A team that consists of all the right people. A team that delivers quickly. In my experience, such a BI team would consist of:-

1. A Senior Business Analyst – whose “Day Job” is assisting in running the business

2. A Senior Developer – with a sound understanding of the Microsoft BI software (esp. its limitations)

3. A Senior IT Support person – to provide information on those parts of the IT systems of which the Developer has no knowledge

4. One or more Experts who can perform Design Reviews and Code Reviews 

5. A Project Management team which supports (in every way) the above, when necessary.

Extra pairs of hands are useful only when requested by the Developer. Additional quality staff can be useful: the time spent explaining requirements to them is time well spent, provided they too are “up-for-it” and “up-to-it”.

Get the right people, let them get on with it and they will deliver.

Happy New Year!

Greetings!

This could be the start of my personal blog!

A blog on all things Microsoft – BI related.

Have a Happy New Year.

Andrew