Friday, November 5, 2010

Cloud Computing Meets Data Warehousing

 

Such is the dynamic landscape of the emerging cloud computing environment. What will the effect be of the encounter between cloud computing and data warehousing? First, data warehousing will do to the cloud what it did to web services: raise the bar. Second, it will push the pendulum back in the direction of data marts. Third, it will deflate the inevitable hype being generated in the press.
First, data warehousing raises the bar on cloud computing. Capabilities such as data aggregation, roll-up and related query-intensive operations may usefully be exposed at the interface, whether as Excel-like functions or actual API calls. Cloud computing is the opposite of traditional data warehousing. Cloud computing wants data to be location-independent, transparent and function-shippable, whereas the data warehouse is a centralized, persistent data store. Run-time metadata will be needed so that data sources can be registered, get on the wire and be accessible as a service. In the race between computing power and the explosion of data, large volumes of data continue to be stuffed behind I/O subsystems with limited bandwidth. Growing data volumes are winning. Still, with cloud computing (as with web services), the service, not the database, is the primary data integration method.
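To make the idea of exposing roll-up as a function or API call concrete, here is a minimal Python sketch. The function name roll_up, the column names and the sample rows are all hypothetical and chosen only for illustration; a real service would sit in front of a database rather than an in-memory list.

```python
from collections import defaultdict


def roll_up(rows, dimensions, measure):
    """Sum `measure` over `rows`, grouped by the given dimension columns.

    Hypothetical helper: it stands in for the kind of aggregation a
    warehouse service might expose to callers.
    """
    totals = defaultdict(float)
    for row in rows:
        # The group key is the tuple of dimension values for this row.
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Illustrative data only.
sales = [
    {"region": "East", "product": "widget", "amount": 100.0},
    {"region": "East", "product": "gadget", "amount": 50.0},
    {"region": "West", "product": "widget", "amount": 75.0},
]

# Rolling up from (region, product) to region alone drops one level of detail.
by_region_product = roll_up(sales, ["region", "product"], "amount")
by_region = roll_up(sales, ["region"], "amount")
```

The caller names the dimensions and the measure and never sees where or how the rows are stored, which is the sense in which the service, rather than the database, becomes the integration point.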
Second, data warehousing in the cloud will push the pendulum back in the direction of data marts and analytic applications. Why? Because it is hard to imagine anyone moving an existing multiterabyte data warehouse to the cloud. Such databases will be exposed to intra-enterprise corporate clouds, so the database will need to be web service friendly. In any case, it is easy to imagine setting up a new ad hoc analytic app based on an existing infrastructure and a data pull of modest size. This will address the problem of data mart proliferation, since it will make the cost clear and provide incentives for the business to throw the data mart away when it is no longer needed.
Third, the inevitable hype around cloud computing will get a good dose of reality when it confronts the realities of data warehousing. Questions that a client surely needs to ask are: If I want to host the data myself, is there a tool to move it? Since this might be a special project, how much does it cost? What are the constraints on tariffs (costs)? The phone company requires regulatory approval to raise your rates, but that is not the case with Amazon or Google or Layered Technology. Granted, strong incentives exist to exploit network effects (economies of scale and Moore’s Law-like pricing). It is a familiar and proven revenue model to give away the razor and charge a little bit extra for the razor blade. Technology lock-in! It is an easy prediction to make that something like that will occur once the computing model has been demonstrated to be scalable, reliable and popular.
Under a best-case scenario, economies of scale – large data warehousing applications – will enable a win-win scenario where large clients benefit from inexpensive options. However, in an economic downturn, the temptation will be overwhelming to raise prices once technology lock-in has occurred. Since this is a new infrastructure play, it is too soon for anything like that to occur. Indeed, this is precisely the kind of innovation that will enable the economy to dig itself out of the hole into which the mortgage mess has landed us. Unfortunately, it will not make houses more affordable. It will, however, enable business executives and information technology departments to do more with less, to work around organizational latency in any department and to compete with agility in the digital economy. It is simply not credible to assert that any arbitrary cloud computing provider will be able to accommodate a new client who starts out requiring an extra ten terabytes of storage. Granted, the pipeline to the hardware vendors is likely to be a high-priority one. The sweet spot for fast provisioning of data warehousing in the cloud is still small- and medium-sized businesses and applications.
