
Enterprise vs. Open Source vs. SaaS BI: Some Thoughts

Introduction

There is a temptation to assume that BI solutions are the same as other applications, with the same cost considerations, the same functionality, and the same tradeoffs. As it turns out, BI is not your average CRM, ERP, or SCM application. For example, SaaS BI simply is not appropriate to handle all of a large enterprise’s BI needs. The reason is not that it isn’t functional enough (the differentiator in some other apps), but that in large-enterprise BI, customization and the continual movement of data into one large data store are more difficult to achieve in a SaaS solution. Likewise, using open-source programming for solution development and customization to the needs of the enterprise can be quite effective with other applications; with BI, however, the database and data-mining expertise needed is relatively lacking in the open-source community.

Nevertheless, we can make some tentative statements about the pros, cons, and best fits for each type of BI. What follows are some brief, broad-brush generalizations about enterprise-app vs. open source vs. SaaS comprehensive BI solutions.

Analysis

Let’s consider the architecture, TCO, and best usage of each solution type in turn.

Enterprise BI

An enterprise BI solution such as IBM Cognos sets up data feeders from operational databases such as order entry into a common central data store, known as a data warehouse (or multiple data marts). Data comes into the data store in hourly, daily, or weekly bursts, and the periods of time not spent on “mass loads” are spent on running queries against the data store using “BI software”. Today, such a store may hold terabytes or more, and it is typically composed of relatively small numeric or text data records.

BI solutions cost a lot, but a large part of the cost goes to database administration. The reason is that the data warehouse has to maximize query performance, day after day, year after year, as the data-store size increases by 50% a year. Only a fine-tuned, powerful database can handle the job, and every customer believes that his or her fine-tuning is not “one size fits all” – it’s very hard to outsource the tuning that the administrator performs on the database.
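To see why a 50% annual growth rate puts such pressure on tuning, it helps to compound it over a few years. The following is a minimal sketch, not a sizing tool: the 1 TB starting size and the function name `projected_size_tb` are assumptions chosen for illustration, and only the 50% rate comes from the text above.

```python
def projected_size_tb(initial_tb: float, annual_growth: float, years: int) -> float:
    """Return the data-store size after compounding growth for `years` years."""
    return initial_tb * (1 + annual_growth) ** years


if __name__ == "__main__":
    # Hypothetical 1 TB warehouse growing 50% a year (rate taken from the article).
    for year in range(6):
        size = projected_size_tb(1.0, 0.5, year)
        print(f"year {year}: {size:.2f} TB")
```

Under these assumptions the store is more than seven times its original size after five years, which is why query tuning that was adequate at deployment rarely stays adequate.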

However, the same is not true for SMBs. Up to a certain point (in my opinion, somewhere around 500-1000 employees in the company) these companies need raw power rather than customization. Moreover, it’s possible to find a cheap enterprise database that will handle all the load that an SMB can throw at it, and deliver “near-lights-out administration” as well. The result is that, according to my studies, an SMB can save more than 50% in 3-year TCO by using one of these instead of Oracle.
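A comparison of this kind can be sketched as a simple cost model. Everything below is an assumption made for illustration: the `CostProfile` fields, the dollar figures, and the three-part breakdown into license, support, and administration are hypothetical placeholders, not the figures from the studies mentioned above.

```python
from dataclasses import dataclass


@dataclass
class CostProfile:
    """Hypothetical cost inputs for one database option (all values in dollars)."""
    upfront_license: float
    annual_support: float
    annual_admin: float  # database-administration staff cost per year

    def tco(self, years: int) -> float:
        """Total cost of ownership over `years` years."""
        return self.upfront_license + years * (self.annual_support + self.annual_admin)


def savings_fraction(baseline: CostProfile, alternative: CostProfile, years: int) -> float:
    """Fraction of the baseline TCO saved by choosing the alternative."""
    return 1 - alternative.tco(years) / baseline.tco(years)


if __name__ == "__main__":
    # Placeholder numbers only; substitute real quotes and salary data.
    high_end = CostProfile(upfront_license=100_000, annual_support=20_000, annual_admin=80_000)
    low_admin = CostProfile(upfront_license=30_000, annual_support=6_000, annual_admin=20_000)
    print(f"3-year savings: {savings_fraction(high_end, low_admin, 3):.0%}")
```

The point of the model is that the administration term is multiplied by the number of years, so a database that needs little tuning can dominate the comparison even when license prices are close.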

Open Source BI

An open source BI solution such as Jaspersoft or Pentaho replaces the license cost of a full BI solution with an open-source “free” distribution of software, plus either a fee for services or an “enterprise edition” at moderate cost. The architecture of the open-source solution is pretty much the same as that of an enterprise BI solution, although the prevalence of open source communities on the Web has led to a significant presence of open-source BI software in public clouds.

The main attraction of open-source BI is the reduction in license costs. Note, however, that the open-source BI solution either uses an enterprise database, in which case overall costs are not reduced by much, or its own open-source database (typically MySQL), in which case the open-source solution won’t scale as well and may be more appropriate for an SMB. The main possible problem with open-source BI is not the possible security vulnerability of company data (since users can always take advantage of sophisticated Web security schemes and keep the physical architecture in the company itself), but rather the relative inexperience of today’s open-source community with scaling databases. It is only very recently that open-source databases like MySQL have implemented some of the basic mechanisms of enterprise databases to ensure data integrity and consistency, and Java programmers frequently betray a poor understanding of database schemas. Finally, for some SMBs, databases that offer "administration for dummies" are vital, because good database-administration personnel are just not out there to be hired, even in today's economy. All in all, open-source BI right now occupies a “middle tier” in the BI market: good for medium-to-large-scale implementations where Web knowledge is plentiful.

SaaS BI

Birst is a good example of the new breed of SaaS BI provider. The architecture is hosted and multi-tenant (multiple users can share one BI “veneer” and physical data store). Instead of flowing operational data to an in-house data store, Birst redirects the data to a Birst data center “in the cloud.” To implement Birst, one simply inserts new generic ETL software that feeds the hosted hardware, and the Birst solution auto-discovers the structure of the existing data. Thus, deployment is quick, and administration is cross-customer, cutting the costs (included in the price) of database administration. Moreover, the solution itself is necessarily quite agile, being able to adapt more readily to, or be customized more quickly for, new data types and new kinds of transaction streams (with cost-saving load balancing).

However, many large-enterprise implementations do not just store new data in the data warehouse; they also store historical data. Moving massive amounts of new data to a geographically far-flung SaaS data center is much slower than moving multiple smaller streams of that data to a local data center or one with dedicated communications. Things are even worse when historical data is involved, because it can increase the amount being loaded by one or two orders of magnitude. The proof of this is in the new cloud concept of "data locality": although theory says that applications can be moved quickly between geographies in a public cloud, in fact implementers keep the data where it is, and "pretend" that the data has been moved along with the code, because moving large amounts of data dynamically cripples performance.
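A back-of-the-envelope calculation shows the scale of the problem. This is a rough sketch under stated assumptions: the 1 TB load, the 100 Mbit/s wide-area link, the 10 Gbit/s local link, and the function name `transfer_hours` are all hypothetical, and real transfers will be slower because of protocol overhead and contention.

```python
def transfer_hours(data_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `data_tb` terabytes over a link of `link_mbps` megabits/s.

    `efficiency` is the fraction of nominal bandwidth actually achieved (1.0 = ideal).
    """
    bits = data_tb * 1e12 * 8                      # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    new_data_tb = 1.0                  # hypothetical size of one load of new data
    historical_tb = new_data_tb * 100  # two orders of magnitude more, per the text
    for label, mbps in (("WAN link to SaaS data center", 100.0),
                        ("local data-center link", 10_000.0)):
        print(f"{label}: new data {transfer_hours(new_data_tb, mbps):.1f} h, "
              f"historical {transfer_hours(historical_tb, mbps):.1f} h")
```

Even at ideal efficiency, the assumed wide-area link needs most of a day for a single terabyte and months for the historical load, while the local link handles the same volumes in minutes and hours respectively.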

The result is that SaaS BI is especially good for one of two situations: handling a new SMB’s BI, or serving as a complement to a larger organization’s BI to do quick ad-hoc deeper data mining for particular, smaller data marts or tables.

Conclusion

Users should try to pierce the veil of vendor claims and counter-claims about open-source and SaaS/cloud BI by focusing on the ability of the solution (the whole solution, not just the BI software) to scale and cut database-administration costs. There is much that is attractive about open-source and SaaS BI solutions right now (fast deployment and data-store modification for SaaS, fast custom-program modification for open-source), but that doesn’t mean they are suitable for all of an enterprise’s needs. On the other hand, enterprise information management no longer requires one solution for all needs; the days when Oracle could credibly recommend a one-stop data shop are pretty much over. The new BI solutions typically don’t provide all of the answers; but, separately or in combination with enterprise BI solutions, they cover more user needs than ever before.

Disclosures and References

Birst web site, Birst publicly available TCO white paper.

Blewis

Thanks for mentioning Birst as a leading example of SaaS BI.

To underscore your TCO point, an evaluation of the TCO of SaaS vs. Enterprise has shown that the 5-year TCO of SaaS BI can be 33% of that of on-premise software solutions. That's a significant savings, in both upfront and ongoing costs.

I also wanted to point out that beyond BI for SMBs, Birst also has many enterprise customers such as RBC Wealth Management, Citrix, and Securian, who are using it to analyze key parts of their business, such as supply chain management, so it's not just for smaller projects. This is why RBC won the TDWI Best Practices award for its Birst implementation.

- Barbara Lewis
Birst
www.birst.com

Kirsty Lee
We Are Cloud
Posted on Nov. 8, 2010

Anyone interested in SaaS BI solutions should also take a look at Bime. Setting itself apart from other SaaS BI tools, it can connect directly to Google Analytics data, something increasingly desired by many organizations nowadays.
http://businessintelligence.me

Alex Wied
Partner, Project Partners
Posted on Nov. 23, 2010

Wayne,

Thanks for the post. This is a great summary.

There’s one specific point in your post I’d like to comment on. I don't exactly agree with your observation on open source databases. MySQL scales extremely well (horizontally), and if you go with the commercial open source edition it comes with many critical enterprise features. Ingres, as another example, is an open source database recommended by Gartner for mission critical applications. And to make the list complete, EnterpriseDB is a mature enterprise product as well. All three products, specifically the commercial open source editions by the respective vendors, are serious contenders. Sure, depending on your needs you may need a BI flagship and a complex Oracle setup. But in other instances, an open source BI solution including an open source database might fully meet your needs.

Talking about Jaspersoft and Pentaho, they have impressive products and work with the database vendors above. That means MySQL is a good option, but your choice is not limited to that product only.

I would also point out that the value of open source BI goes beyond cost. No doubt eliminating license cost is a compelling factor. However, equally important in my view is the relevance of innovation. Ingres VectorWise is an excellent example. Or consider coupling Red Hat’s Metamatrix with Pentaho on a mix of proprietary and open source data pools. Or have Infobright (which uses MySQL) compete with Oracle to feed thousands of records per second into a data warehouse, and you’ll be amazed by the results of the open source product (I was when we did the comparison for a client and finally recommended Infobright).

In summary, here’s what I typically recommend to my clients:

• Evaluate and assess open source BI the same way you evaluate any other proprietary software in your organization. Open source is not a silver bullet, nor is any proprietary product. Have a transparent business case and know what you need.

• Consider open source databases: look at MySQL (and its ecosystem of add-ons), Ingres, and EnterpriseDB.

• In enterprise IT, consider commercial open source only. Secure support, SLAs and expert know-how.

• If you’re hesitant, start deploying open source BI on a small scale (e.g. a single department or a defined subset of your organization) and go from there.

--Alex

Arun Khan
Director, Silver Arc Solutions (India) Pvt. Ltd.
Posted on Dec. 21, 2010

Vis-a-vis open source databases, I agree with Alex's viewpoint. PostgreSQL open source and commercial (EnterpriseDB) is what I recommend to clients.

Surya
Posted on Jan. 25, 2011

Hi,

I have been in the SAP BI space for a long time. We are planning to develop some BI solutions in verticals such as healthcare using open-source BI tools such as Pentaho. We are trying to target the SMB market.

Could you let us know some of the techie forums where we can do some research on choosing the right tool and solutions? We are still on the discovery path.
