0

How do you get rid of cloud data?

OK. You've made the leap to the cloud. As time goes by, you decide to move to another cloud provider. Migration of your data was a snap - good job on planning and execution!

But how do you get rid of the data that's left behind at your previous provider? There are lots of techniques and options to erase and "crypto shred" data when it resides on your own hardware. But what about data that resides on a service provider's infrastructure? How do you ensure it's no longer accessible?

What steps should you take to avoid this problem?

Are some data strategies/architectures better than others when considering this problem?

Are there tools out there to address this problem?

What should you expect (demand) of your cloud provider with respect to "data residue"?

Thoughts?

3
Dave Roberts
Vice President, Strategy, ServiceMesh, Inc.
Posted on Oct. 19, 2011

GREAT question. There are a few strategies that we recommend to ServiceMesh customers:

(1) Some cloud service providers (CSPs) have a data destruction policy. When a customer moves on, all the disk data previously used by that customer is overwritten with random bit patterns (typically per some NIST standard somewhere). This is the best option if your CSP supports it.

(2) If your CSP doesn't support it, you can try to do this yourself (for IaaS) by overwriting your own data. There are Linux and Windows programs that will follow the NIST protocols for doing so. One problem with this is that with all the indirection that happens in a modern storage array, there isn't any guarantee that overwriting a given disk block will actually result in the physical media being overwritten. The new disk block at any given logical position in the virtual disk may be written on a completely separate drive in a totally different part of the array. If you're really worried about it, this solution should make you nervous. Sometimes, when a service provider claims to be doing data destruction, they are just doing this with a script or some such. It's important to ask them how they *know* that they are in fact overwriting the data. Sometimes, they don't even realize that what they are doing isn't having the effect they want. Note that there is really no good way to do this for PaaS or SaaS, so you'll have to do #1 in that case.
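As a rough illustration of the do-it-yourself overwrite described in (2), here is a minimal Python sketch that rewrites a file in place with random bytes. The function name `overwrite_file` and its parameters are made up for this example, and it makes no claim to follow any NIST procedure. It also inherits exactly the weakness described above: it only rewrites logical offsets, so on a virtualized storage array there is no guarantee the old physical blocks are touched.

```python
import os
import secrets
import tempfile


def overwrite_file(path: str, passes: int = 1, chunk_size: int = 1024 * 1024) -> int:
    """Overwrite a file's current contents in place with random bytes.

    Hypothetical helper for illustration only. Returns the number of
    bytes written per pass (the file's size).
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(secrets.token_bytes(n))
                remaining -= n
            # Push the data out of Python and OS buffers. What the storage
            # layer below does with it is outside our control.
            f.flush()
            os.fsync(f.fileno())
    return size


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"account=12345\n" * 100)
    overwrite_file(tmp.name, passes=3)
    os.remove(tmp.name)
```

Snapshots, backups and replicas of the file are not affected by a rewrite like this, which is one more reason to treat it as a best-effort step rather than proof of destruction.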

(3) Many of our customers use strong encryption (e.g., AES256 or better) for all disk activity in all cases. The theory here is that you're protecting your data when it's at rest while you're using the provider, and then you can try to do whatever data destruction you want yourself (overwriting disk blocks). If, for whatever reason, the overwrites don't deliver what you want, at least your data is protected with strong encryption and is unlikely to be useful to anybody before another customer plops his data down. The ServiceMesh Agility Platform makes this trivial. Just click a single checkbox when creating instances and you're good to go. You can even enforce this with a high-level policy for all external clouds.

(4) For the truly paranoid, you'll have to use an internal cloud and control all the variables yourself.

To learn more about the ServiceMesh Agility Platform, see http://www.servicemesh.com/

2
Krishnan Subramanian
Industry Analyst
Posted on Oct. 19, 2011

I second what Dave is saying:

1) Read the contract and insist on putting their data destruction policy on it.

2) Encrypt, Encrypt, Encrypt - Mantra for storing your data on the cloud. For starters, keep your encryption key locally :-)
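To make the "keep your encryption key locally" idea concrete, here is a small Python sketch of the key lifecycle: encrypt before upload, keep the key on your side, and destroy the key when you leave the provider so that any leftover ciphertext is unreadable ("crypto shredding"). Everything here is an assumption for illustration. `LocalKeyStore`, `encrypt` and `decrypt` are hypothetical names, and the cipher is a toy HMAC-SHA256 keystream used only so the example runs on the standard library. It is a stand-in for AES-256 from a vetted library, not a replacement for it, and it provides no integrity protection.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter), block by block."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest())
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce || ciphertext. Illustration only, not authenticated."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Reverse encrypt(): split off the nonce and XOR the keystream back."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class LocalKeyStore:
    """Hypothetical on-premises key store; keys never go to the provider."""

    def __init__(self) -> None:
        self._keys = {}

    def create(self, name: str) -> None:
        self._keys[name] = secrets.token_bytes(32)  # 256-bit key

    def get(self, name: str) -> bytes:
        return self._keys[name]  # raises KeyError once shredded

    def shred(self, name: str) -> None:
        # A real key manager would also destroy backups of the key.
        del self._keys[name]


store = LocalKeyStore()
store.create("provider-a")
blob = encrypt(store.get("provider-a"), b"customer record 42")  # this is what gets uploaded
store.shred("provider-a")  # on leaving the provider: the blob can no longer be decrypted
```

The design point is that destruction happens on hardware you control: you only have to get rid of 32 bytes at home, not chase every replica at the provider.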

1
JP Morgenthal
Principal, Ranger | Cloud & VDC Services, EMC Consulting
Posted on Oct. 19, 2011

It is a great question and one that is of major concern in the US Federal government.

Here's a Wikipedia page on data remanence: http://en.wikipedia.org/wiki/Data_remanence

Here's the really nasty goo of cloud storage and storage virtualization. The implementations of redundancy (e.g. RAID5/6/10) and snapshots mean that your data doesn't exist just once in one place. It exists multiple times in multiple locations across multiple arrays in pieces and as concurrent chunks.

If this is a major concern, you will need to select cloud vendors that will allocate an entire subset of a storage area network for your use, so that the drives can be pulled when you are done. Most cloud providers will use the overwriting features of the underlying storage system, but that only gets rid of the original copy. There are still the backups, snapshots and, if overwrite is not implemented properly, the redundant bits spread across the backup drives.

It's important to note, that in the cloud environment, it's not plausible that your drive space, once allocated to the next tenant will be able to recover any of your data. So, this really comes down to a trust issue between you and your cloud service provider, which in my opinion dramatically reduces the risk.

1
Andrew Baker
Director, Service Operations, SWN Communications Inc.
Posted on Oct. 25, 2011

Besides the excellent points made so far, there's the added problem of backups. Depending on the type of cloud application, the provider may be legitimately creating some sort of scheduled backup on an aggregate level, rather than a customer level.

Depending on the data retention policy, this data might hang around for quite some time.

To the extent that you can (depending on the cloud service involved), data encryption provides some of the best mitigation against residual data, but it is incumbent on each customer to work this out with a provider right from the beginning of the relationship such that proper support for this vital concern can be baked into the system.

Otherwise, all the customer will have to rely on is trust and the idea that the provider doesn't want to use up any unnecessary space if they don't have to.

-ASB: http://XeeMe.com/AndrewBaker

1
Robin Goodchild
Owner, Antarctic Technologies
Posted on Oct. 25, 2011

=======
It's important to note, that in the cloud environment, it's not plausible that your drive space, once allocated to the next tenant will be able to recover any of your data.
=======
A bold claim! That the "new tenant" has read access to this disk is enough for them to try and read whatever was there previously. Looking for bit faults over multiple reads is an interesting technique, but only one of many.

Bottom line: if you are worried about someone accessing your data afterwards, do not put it "in the cloud". Period. That way you do not need to be concerned about data leakage afterwards.

The best method for data destruction is physical destruction of the media (i.e. grinding it to a bazillion pieces, or melting it down to a liquid). This is the only recognized method for any serious data in government.

0
Andrew Baker
Andrew Baker Replied on Oct. 25, 2011

That's true, Robin, but wouldn't the same be true for all other forms of hosted service? If you're worried about people accessing your data afterwards, you really need to manage all processes locally AND encrypt all traffic as much as humanly possible.

The more prudent course is to evaluate the level of risk involved across the whole data set and mitigate accordingly.

-ASB: http://XeeMe.com/AndrewBaker

0
Robin Goodchild
Robin Goodchild Replied on Oct. 25, 2011

Yes it would. There are a few criteria that need to be looked at.

1) How sensitive is the data? If it contains the nuclear codes, then you obviously don't want these where they can be read. You equally don't want sensitive corporate data being left around (e.g. details of acquisition targets, for example, or details of a break-through idea that the business has yet to patent).

2) Confidentiality. What data protection laws apply to customer data? What are the repercussions should this data be accessed by an unauthorized 3rd party? What about the risk to business reputation of such a leak?

There are others, but these are the main points.

========
The more prudent course is to evaluate the level of risk involved across the whole data set and mitigate accordingly.
========

Absolutely. If the cost of the leak is too high, don't put it out there.

0
Mani  shankar Prasad
VP(Engg), neoAccel India
Posted on Oct. 19, 2011

This is an important issue which has been discussed in many forums. The solutions I can suggest are:

a. Every CSP must have a technique to produce IMMUTABLE logs of all activities. This would ensure that any data movement not done by the owner becomes explicitly known.

b. The data residue problem could be solved by keeping the storage metadata with the user, on their premises.

c. We need a solution that gives "proof of data destruction and residue", just as we have solutions for "proof of data possession" by a cloud service provider.
Merely encrypting it will not solve the residue problem.

d. Data residue can only be checked by a hash method. The real problem would be if the CSP has made a copy of it, inadvertently or otherwise.
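One common way to approximate the "immutable logs" in point (a) is a hash chain, where every log entry commits to the hash of the one before it. The Python sketch below is an assumption about how that could look, not something any particular CSP is known to implement; `append_entry`, `verify` and the event fields are hypothetical. Note that this gives tamper evidence, not immutability: editing an entry breaks the chain, but someone who can rewrite the whole log can rebuild a consistent chain, so the customer would still need to keep a copy of the latest hash somewhere the provider cannot touch.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash; any edited, removed or reordered entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"actor": "customer", "action": "upload", "object": "db-backup"})
append_entry(log, {"actor": "csp-admin", "action": "copy", "object": "db-backup"})
print(verify(log))
```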

0
Dave Roberts
Dave Roberts Replied on Oct. 25, 2011

a. Who do you think this would protect you from? Presumably, the CSP (or a rogue CSP employee) could also subvert those same logs.

b. This doesn't work for performance reasons, or if you're interested in backup. Or maybe I'm misunderstanding what you mean by "metadata."

c. Not sure what "merely encrypting" means, but that said, I don't think anybody was suggesting "merely" doing anything. It's a layered strategy. Encryption is one layer.

d. Not sure what "hash method" means. Inadvertent copies of the data are precisely where things like encryption play a role.

So, summary: If you're worried about data residue, work with CSPs that have strong policies around data destruction that meet your needs. And even then, encrypt everything at the disk level so that if mistakes are ever made, the data is still useless to anybody but a determined attacker with major resources. Wear both a belt and suspenders.

0
Dennis Morgan
CEO/Consultant, DK Morgan Group
Posted on Oct. 25, 2011

Robin is correct. That is the only plausible way to be confident your data is managed properly. To say otherwise is a stretch. Be afraid of new technologies to a certain point. A vendor will not insure the integrity of your company. Be aware and educate yourself up-front.

0
Dave Roberts
Vice President, Strategy, ServiceMesh, Inc.
Posted on Oct. 25, 2011

I agree with Robin Goodchild about recovery of data after being allocated to another client. Just wiping metadata or whatnot is **NOT** enough for any serious enterprise app.

Even if you can't recover files, there still might be enough information in a single disk block to make the exercise interesting (account numbers, social security numbers, credit card numbers -- think "row in a DB table"). Even things like swap space associated with a sensitive application could hold interesting information. It's trivial to write a program to sweep through all the disk space looking for credit card numbers (fixed formats with check-digits, etc.).
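To show how little code the sweep described above needs, here is a minimal Python sketch that scans a raw block of bytes for digit runs of card-number length and keeps those that pass the Luhn check digit test. The function names and the sample block are made up for this example, and the 13 to 19 digit range is an assumption about typical card number lengths. Expect false positives, since many unrelated digit strings also satisfy Luhn.

```python
import re

# Runs of 13 to 19 ASCII digits that are not part of a longer digit run.
CANDIDATE = re.compile(rb"(?<!\d)\d{13,19}(?!\d)")


def luhn_valid(digits: str) -> bool:
    """Luhn check: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_like_numbers(block: bytes) -> list:
    """Return Luhn-valid digit runs found in one block of raw bytes."""
    hits = []
    for match in CANDIDATE.finditer(block):
        digits = match.group().decode("ascii")
        if luhn_valid(digits):
            hits.append(digits)
    return hits


# Hypothetical leftover disk block: a fragment of a database row between zero bytes.
# 4111111111111111 is a widely used test number, not a real card.
leftover = b"\x00\x00name=Bob;card=4111111111111111;ref=4111111111111112\x00\x00"
print(find_card_like_numbers(leftover))
```

A real sweep would feed the volume through this in chunks and overlap the chunks slightly so that a number straddling a boundary is not missed.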

0
JP Morgenthal
JP Morgenthal Replied on Oct. 25, 2011

My experience with virtualization technologies is that it would be extremely difficult for anyone but the cloud service provider to invoke code that would have direct access to disk at the level required to perform this operation. Most likely the underlying technology implementing cloud storage will not facilitate access at this level.

0
Dave Roberts
Dave Roberts Replied on Oct. 25, 2011

You may be right, but it depends how things are implemented. Worst case, the CSP is just using a simple storage array where blocks are blocks and there is no wiping. Data leaks between customers when customers do simple Linux-level reads of the raw disk volume. Yeah, I agree that would be a VERY crappy CSP, but the point is, simple destruction of metadata is not enough if you want to be really secure. Anything that is not wiped between customers could potentially become a covert channel to leak info.

0
Dennis Morgan
CEO/Consultant, DK Morgan Group
Posted on Oct. 25, 2011

Good points Dave. Those exposures are there.
