Tag Archives: data virtualization

Presenting “Accelerating DevOps Using Data Virtualization” at #C16LV

Borrowing from Wikipedia, the term DevOps is defined as…

DevOps (a clipped compound of "development" and "operations") is a culture, movement or practice that emphasizes the collaboration and communication of both software developers and other information-technology (IT) professionals while automating the process of software delivery and infrastructure changes. It aims at establishing a culture and environment where building, testing, and releasing software, can happen rapidly, frequently, and more reliably.

Now, I hate buzzwords as much as the next BTOM (a.k.a. bitter twisted old man)…

…but the idea behind DevOps, of building, testing, and releasing software more rapidly and reliably is simply amazing and utterly necessary.

As system complexity has increased, as application functionality has ballooned, and as the cost of production downtime has skyrocketed, writing and testing code leaves one a long way from the promised land of published and deployed production code.

As explained by DevOps visionaries like Gene Kim, the biggest barrier to that promised land is data, in the form of databases cloned from production for development and testing, and in the form of entire application stacks cloned from production systems for the same purposes.

The amount of time wasted waiting for data on which to develop or test dwarfs the amount of time spent developing or testing.  Consequently, IT has learned to be satisfied with only occasional refreshes of dev/test systems from production, resulting in humorously inadequate dev/test systems, and that has been the norm.

There is a new norm in town.

Data virtualization, like server virtualization, breaks through the constraint.  Over the past 10 years, IT has learned to wallow in the freedom of server virtualization, using tools like VMware and OpenStack to provision virtual machines for any purpose.

Unfortunately, data and storage have not benefited from virtualization in the same way.  This has resulted in a white-hot nova in the storage industry, and while that is good news for the storage industry, it still means that IT has cloned from production to non-production the same way it has for the past 40 years: slowly, expensively, and painfully.

And we have continued to do it the old way, slowly, expensively, and painfully, because we didn’t know any better.

The IT industry couldn’t see any better way to clone from production to non-production.  Slow and painful was the norm.

But once one realizes the nature of making copies, and how modern file-system technology can share data at the block level, compress it, and de-duplicate it, suddenly making copies of databases and file-system directories becomes easy and inexpensive.
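To make the block-sharing idea concrete, here is a minimal Python sketch of the technique (my own illustration, with made-up names and a hypothetical block size, not Delphix’s implementation): identical blocks are stored only once, keyed by a content hash, a “clone” is nothing more than a new list of block references, and only blocks that are actually changed consume new storage.

```python
import hashlib

BLOCK_SIZE = 8192  # hypothetical block size, chosen only for illustration


class BlockStore:
    """Content-addressed block store: identical blocks are stored only once."""

    def __init__(self):
        self.blocks = {}  # sha256 digest -> block bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)  # de-duplication happens here
        return digest


class VirtualCopy:
    """A 'copy' is just an ordered list of block references, not new storage."""

    def __init__(self, store: BlockStore, refs=None):
        self.store = store
        self.refs = list(refs or [])

    @classmethod
    def from_bytes(cls, store: BlockStore, data: bytes):
        copy = cls(store)
        for i in range(0, len(data), BLOCK_SIZE):
            copy.refs.append(store.put(data[i:i + BLOCK_SIZE]))
        return copy

    def clone(self):
        # Cloning is instantaneous: only the references are copied.
        return VirtualCopy(self.store, self.refs)

    def write_block(self, index: int, data: bytes):
        # Copy-on-write: only the changed block consumes new storage.
        self.refs[index] = self.store.put(data)


if __name__ == "__main__":
    store = BlockStore()
    # A 1000-block "production" data set in which every block is distinct.
    data = b"".join(i.to_bytes(4, "big") * (BLOCK_SIZE // 4) for i in range(1000))
    production = VirtualCopy.from_bytes(store, data)   # stores 1000 blocks
    dev = production.clone()                           # instant, zero new blocks
    dev.write_block(0, b"\xff" * BLOCK_SIZE)           # one changed block
    print(len(store.blocks))                           # 1001 blocks back both copies
```

The point of the sketch is the final count: two full logical copies of the data are backed by barely more physical blocks than one.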

Here is a thought-provoking question:  why doesn’t every individual developer and tester have their own private full systems stack?  Why can’t they have several of them, one or more for each task on which they’re working?

I can literally hear all of the other BTOMs scoffing at that question:  “Nobody has that much infrastructure, you idiot!”

And that is the point.  You certainly do.

You just don’t have the right infrastructure.

This was presented at the Collaborate 2016 conference in Las Vegas on Monday, 11 April 2016.

You can download the slidedeck here.

Will the real data virtualization please stand up?

There is a post from a good friend at Oracle entitled “Will the REAL SnapClone functionality please stand up?” and, as well-written and technically rich as the post is, I am particularly moved to comment on the very last and conclusive sentence in the post…

So with all of that, why would you look at a point solution that only covers one part of managing your Oracle infrastructure?

The post does not refer to Delphix by name, and it could in fact be referring to any number of companies, but Delphix is the market leader in this space, so it is reasonable to assume that the “Product X” mentioned throughout the post is Delphix.  The same holds true for any post commenting on relational database technology, which can reasonably be assumed to refer to Oracle.  Regardless, I was struck by the use of the phrase point solution in that final sentence of the post, and how it really is a matter of perspective, and how interesting that perspective is.

First of all, before we go any further, please let me say that, as an Oracle DBA for the past 20 years, I think that the current release of Oracle’s Enterprise Manager, EM12c, is the finest and most complete release of the product since I tested early versions of Oracle EM alongside the Oracle8i database in the late 1990s.  At that time, the product was full of promise, but it wasn’t something upon which an enterprise could truly rely.  That has certainly changed, and it has been a long time coming, starting with the advent of utilities like AWR and Active Session History (ASH).  If you have extensive Oracle technology in your organization, you should be using EM12c to manage it.  Not EM11g, or EM10g, but EM12c.  It really is that good, and it is getting better, and there are talented people behind it, and you simply need it if you want to maximize your investment in Oracle technology.

But while EM12c may be the center of the universe for Oracle technology, what about organizations for whom Oracle technology is merely a component?  Many organizations have diverse IT infrastructures comprising Microsoft, IBM, SAP, and open-source technologies, and all of those technology components share the need for the basic use-cases of quickly and economically cloning production to create non-production environments to support development, testing, reporting, archival, and training activities.

Should those diverse IT organizations employ a silo tool like EM12c just for cloning Oracle databases, and then find the same functionality separately for each of those other separate technologies?  Would doing so be a tactical or a strategic decision?

So in response to the final question in the SnapClone post, I ask another question in turn…

Why would one look at a point solution that covers only Oracle database?

Access to data for development and testing is the biggest constraint limiting development and testing, so it makes no sense not to enable data virtualization for all applications, whether they are built on Oracle technology or not.  IT agility is a strategic capability important to the entire business, not a technical challenge for a component silo.

But perhaps, in the interest of continuing the Oracle-only focus of the SnapClone post, we could stay inside the bounds of Oracle.  Fair enough, as a theoretical exercise…

So, even if we limit the discussion only to Oracle technology, it quickly becomes obvious that another important question looms…

Why would one look at a point solution that covers only the Oracle database, leaving the application software, database software, configuration files, and all the other necessary parts of an application as a further problem to be solved?

Anybody who has managed IT environments knows that the database is just one part of a complete application stack.  This is true for applications by Oracle (e.g. E-Business Suite, PeopleSoft, JD Edwards, Demantra, Retek, etc.), as well as prominent applications like SAP, and every other application vendor on the planet, and beyond.

To clone complete application stacks, one needs a solution that virtualizes file-system directories holding the software, configuration files, and everything else that comprises the application, not just an Oracle database.

To provision those complete environments for developers and testers quickly and inexpensively, one needs both server virtualization and data virtualization.

Unless you have spent the past 10 years in deep space chasing a comet, you’ve already got server virtualization on board.  Check.

Now, for data virtualization, you need to virtualize Oracle databases, check.  And you also need to virtualize SQL Server databases, check.  And PostgreSQL and Sybase databases, check and check.  In the near future, Delphix will likely be virtualizing IBM DB2 and MySQL databases, not to mention MongoDB and Hadoop, ‘cuz that’s what we do.  Check, check, … check-a-mundo dudes and dudettes.

Beyond databases, even if you’re a single-vendor organization, you also need to virtualize file-system directories and files, on UNIX/Linux platforms as well as on Windows servers.

Delphix does all of the above, which is one reason why it is the market leader in this space.

And it has been in general use for years, and so a substantial portion of the Fortune 500 already relies on data virtualization from Delphix today, across their entire technology portfolio, as the partial list online here shows.

Perhaps it is only a point solution from one perspective, but be sure that your perspective is aligned with that of your whole IT organization, and that you’re not just thinking of a strategic business capability as merely “functionality” within a silo.

Data Virtualization and Greener Data Centers

On the Saturday before the Oracle OpenWorld 2014 conference started, I had the added bonus of finding out that the Data Center Journal had published my article on how data virtualization leads to greener data centers.  Hooray!

Unfortunately, I recently learned that Data Center Journal has gone defunct, so my posted article no longer exists.  As a result, here we go…

I recall that the January 2000 issue of National Geographic magazine had a “Letters From The Editor” column that speculated, in jest, that at the rate humans were saving back issues of National Geographic magazine, by the year 2100 the total accumulation of yellow magazines would outweigh planet Earth.

Note:  The public archives of Nat’l Geographic magazine appear to only go back to 2005, so I can’t verify the exact issue in which this comment appeared.

Anyway, that statement resonated with me, because although I change residences every few years, it is only recently that I stopped packing and carrying my decades of accumulated National Geographic magazines with me.  Now that I’m free of them, I have no idea why I schlepped them with me for so long.  Worse, it cost real money to do so;  movers charge by weight.  One mover commented that he was certain that two-thirds of the weight of all my possessions was books and National Geographic magazines, as he handed me an $8,000 bill for the move.

I now collect books on Kindle.  And I dropped off my boxes of yellow Nat’l Geographic magazines at the Goodwill store, in the middle of a dark and shameful night, almost a decade ago.  I don’t know if it was a particularly “green” decision, but I know that my recent moves have been the easiest since I was an undergraduate.

Likewise in data centers.  If we keep doing business in data centers as we have for the past 30 years, quite soon the planet will tilt off its axis under the sheer weight of data storage hardware.

The advent of virtual machines has had a profound impact on provisioning environments.  Instead of unpacking, racking, wiring, powering, and cooling physical servers, data centers can now create virtual machines by the hundreds by pointing and clicking.  All of these new virtual machines share the previously under-utilized CPU and RAM resources of physical servers, making the ROI on CPU and RAM resources sky high.

So, virtual machine technology has allowed data centers to provision several million virtual servers without having to power and cool several millions of physical servers.  They use the existing physical servers far more efficiently.  That is “green”.

Not so with disk storage.

Each virtual machine still requires a full image of storage.  So, as several million virtual servers have been spun up, each has required a full complement of disk storage, thus driving the already overheated computer storage industry into supernova.
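The arithmetic behind that supernova is straightforward.  Here is a hedged back-of-the-envelope sketch, using entirely hypothetical figures, comparing full physical copies against block-shared virtual copies that pay only for changed data:

```python
def full_copy_storage_tb(prod_tb: float, copies: int) -> float:
    """Traditional cloning: every dev/test copy is a full physical image."""
    return prod_tb * (1 + copies)


def virtual_copy_storage_tb(prod_tb: float, copies: int, change_rate: float) -> float:
    """Block-shared virtual copies: one baseline plus only the changed blocks.
    change_rate is the assumed fraction of blocks that each copy modifies."""
    return prod_tb * (1 + copies * change_rate)


if __name__ == "__main__":
    prod_tb, copies, change_rate = 10.0, 8, 0.05  # hypothetical figures
    print(full_copy_storage_tb(prod_tb, copies))                   # 90.0 TB
    print(virtual_copy_storage_tb(prod_tb, copies, change_rate))   # 14.0 TB
```

Whatever the real numbers are in your shop, the shape is the same: full copies grow linearly with every new environment, while shared copies grow only with the data that actually changes.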

I’ve said it before and I’ll say it again:  if you have money to invest, do so in either energy or data storage.  We’re never going to use less of either.

So how do Delphix and data virtualization fit in?

Delphix virtualizes data, just as VMware and their competitors virtualize servers.  Delphix data virtualization makes more efficient use of existing storage, and slows the rate of growth of storage in data centers.

That is “green”.

For many, the time has arrived when server virtualization has completely taken over, even in situations where sharing CPU and RAM resources is not desired.  For high-impact production environments, it is very common to have virtual machines one-for-one with physical machines.  Having a production application encapsulated in a virtual machine makes it easier to migrate to other physical servers, whether to address resource shortfalls or to deal with physical server failure.

In these situations, data virtualization does not yield benefit, green or otherwise.

But in the scenario where a couple, or dozens, or even hundreds or thousands of virtual machines are provisioned to a cluster of physical servers, we have an environmentally unsustainable model, in every sense of the phrase.

An analogy for server virtualization without data virtualization in this latter scenario is an advance in technology that enables us to build automobiles entirely from cheap renewable resources, such as cellulose.  Hey, terrific:  instead of building cars from expensively mined resources such as metals and exhaustible resources such as plastic, let’s imagine a leap in technology where we could employ cellulose waste from food production, mainly biomass left over from farming.

Perhaps we would have found a way to get rid of all those old back issues of National Geographic?

We could then produce these cars more cheaply and with less environmental impact, using what is essentially mulch, for a fraction of the cost of currently manufactured automobiles.

It would be the golden age of personal transportation.  Everyone on the planet could afford one.

But what if these new automobiles still used internal combustion engines, consuming fossil fuels at the same level of efficiency as today, about 20-40 miles per gallon?  Even if they were more efficient, say upwards of 100 miles per gallon, would that yield a net benefit to the environment?

Of course not.  The proliferation of these inexpensive, environmentally friendly automobiles would be an utter disaster environmentally, as the consumption of fossil fuels skyrocketed.
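To put rough numbers behind that conclusion (the figures below are entirely hypothetical, chosen only to illustrate the shape of the argument), even a four-fold improvement in mileage is swamped if the number of cars grows seven-fold:

```python
def annual_fuel_gallons(cars: int, miles_per_car: int, mpg: float) -> float:
    """Total fuel burned per year by a fleet of cars."""
    return cars * miles_per_car / mpg


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate the shape of the argument.
    today = annual_fuel_gallons(cars=1_000_000_000, miles_per_car=10_000, mpg=25)
    cellulose = annual_fuel_gallons(cars=7_000_000_000, miles_per_car=10_000, mpg=100)
    print(f"{today:.1e}")      # 4.0e+11 gallons per year
    print(f"{cellulose:.1e}")  # 7.0e+11 gallons per year, despite 4x better mileage
```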

The oil companies would be quite happy, wouldn’t they?

That is server virtualization without data virtualization.

Except that it is the storage companies in the place of the oil companies in our analogy.

Server virtualization is a huge advance, but data virtualization is needed to fully deliver on the promise of the solution.