A conversation at Vox’s today about trusting the cloud reminded me that I’ve been wanting to write about it. “The Cloud” is mainly a marketing term that’s been misused in some confusing ways, so let’s define some terms.
Up until a decade or so ago, if you wanted to have a computer at some remote location – usually to provide it with a high-speed, redundant Internet connection – you had a couple of choices. You could do a “colocated” server, where you own the computer yourself, and pay a company to provide electricity and a high-quality Internet connection to it. Or you could get a “dedicated” server, where you lease the computer from the company along with the network access. The main difference is that with colocation, you’re responsible for replacing any failed hardware yourself, but your monthly cost is much lower. But in both cases, the computer you’re using is a specific one, like machine #3 in rack #15 in room D12.
Then computers with multiple processors became cheaper and virtualization became a thing, and companies started selling virtual private servers (VPS). In this case, you’re renting a subset of the resources on one computer instead of the whole thing. A server with 8 CPUs might be divided up into four 2-CPU VPSs for four different customers, for instance. Each VPS only sees the resources that were assigned to it. So if you buy a 2-CPU VPS, you can’t see what else is on the physical machine. You just see what your VPS has been given, as if it were a physical machine with those limits.
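To make that concrete, here’s a rough sketch in Python of the 8-CPU example. The Host class and the customer names are made up for illustration; a real hypervisor tracks memory, disk, and a lot more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy model of one physical server carved up into VPSs (hypothetical)."""
    total_cpus: int
    allocations: dict = field(default_factory=dict)  # customer name -> CPUs

    def free_cpus(self):
        # CPUs on the physical machine not yet assigned to any VPS.
        return self.total_cpus - sum(self.allocations.values())

    def allocate(self, customer, cpus):
        # Refuse to hand out more CPUs than the physical machine has left.
        if cpus > self.free_cpus():
            raise ValueError("not enough free CPUs on this host")
        self.allocations[customer] = cpus

    def visible_cpus(self, customer):
        # A VPS sees only its own slice, never the host total.
        return self.allocations[customer]


host = Host(total_cpus=8)
for name in ["alice", "bob", "carol", "dave"]:
    host.allocate(name, 2)

print(host.visible_cpus("alice"))
print(host.free_cpus())
```

Each customer asks for 2 CPUs and gets exactly that view; once the four slices are handed out, the host has nothing left to sell.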
Companies that were selling a lot of VPSs gradually developed tools that made it easy to spin these virtual servers up quickly, or knock them down on one machine and spin them back up on another within seconds. They called this a “cloud” because of the idea that you didn’t see them as individual machines anymore, but as a pool of resources to draw on. Any particular VPS could be on any particular physical machine, and it wouldn’t matter. Since servers could be moved around the cloud with so little effort, it would be easy to do upgrades and handle hardware failures with little disruption.
Another advantage is that, since you can spin servers up and down so quickly and they’re normally charged by the hour, you don’t have to keep a whole room of computers running if you only need to do heavy computation once in a while. If there’s some huge monthly task that needs a hundred servers to get done in less than a day, you can spin them up, run it, and knock them down when they’re done, only paying for the day.
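Here’s the arithmetic as a quick Python sketch. The 10-cents-an-hour rate is a number I made up for illustration, not any provider’s real price, and I’m pretending for simplicity that keeping the same hundred servers running all month would cost the same hourly rate.

```python
# Roughly 8760 hours in a year / 12 months; a common approximation.
HOURS_PER_MONTH = 730

# Hypothetical price per server-hour, in cents (an assumption for illustration).
RATE_CENTS = 10


def cost_cents(servers, hours, rate_cents_per_hour):
    # Hourly billing: you pay for each server for each hour it is running.
    return servers * hours * rate_cents_per_hour


# Spin up 100 servers for one day, then knock them down.
burst = cost_cents(100, 24, RATE_CENTS)

# Keep the same 100 servers running all month.
always_on = cost_cents(100, HOURS_PER_MONTH, RATE_CENTS)

print(f"one-day burst: ${burst / 100:.2f}")
print(f"always on:     ${always_on / 100:.2f}")
```

Whatever the real rate turns out to be, the ratio is the same: one day out of the month costs about a thirtieth of leaving everything on.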
So that’s what the cloud is. The first thing to note from that is that there isn’t one “The Cloud” the way it sounds in commercials. Microsoft has a cloud, so does Amazon, so does Google, so does Apple, and so do lots of other companies. Some companies rent out servers in their cloud; others only use it to offer other services. For instance, my LG phone will do automated backups. Those probably go to a cloud, but whether LG has its own cloud or rents cloud storage from someone like Amazon, I have no idea.
Device and app makers like to put your files “in the cloud.” That allows them to make your device seem like it has more space than it does, because your files are actually somewhere else except when you’re using them. It also has the advantage that if you drop your device in a lake, your files aren’t lost.
However, that means those pictures and videos you think you have on your phone aren’t necessarily really there. They may be stored in the cloud, and only pulled down when you need them. You’re completely dependent on the cloud provider and their software not to lose your files in the meantime. And odds are, if they do, you have no recourse except complaining to someone who barely speaks English. So trusting your files to long-term cloud storage may not be the best idea.
One option is to have your own cloud server at home. The funny thing is, this isn’t a “cloud” at all, since there’s no pool of resources. It’s one computer, working only for you. But it will have cloud-type services running on it, so it will let your mobile apps treat it like a cloud server, allowing you to take advantage of those features without trusting someone else to keep your files safe. These machines have been called “personal cloud servers” for that reason.
You can even use both: keep everything backed up to your own personal server in your home or office, and also back files up to cloud storage somewhere. That way you still have remote storage in case your house burns down, but you aren’t counting on the cloud as your only storage for anything.
Storage is so cheap at this point that there’s really no reason to skimp on it. My own workstation has three hard drives mirrored, so two of the three could fail at once and I still wouldn’t lose anything. And I do remote backups to another server in case of a catastrophic loss like fire. I lost files in The Great RAID Disaster Of 2002; that’s not going to happen again.
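If you want to check the mirror claim for yourself, here’s a small Python sketch. It assumes a plain three-way mirror where every drive holds a full copy; the drive names and the mirror_survives helper are made up for illustration.

```python
from itertools import combinations


def mirror_survives(num_drives, failed):
    # In a plain mirror every drive holds a full copy, so the data
    # survives as long as at least one drive is still working.
    return len(set(failed)) < num_drives


drives = ["disk0", "disk1", "disk2"]  # hypothetical drive names

# Try every possible pair of simultaneous failures.
two_failures = list(combinations(drives, 2))
survives_any_two = all(mirror_survives(len(drives), f) for f in two_failures)

print(survives_any_two)
print(mirror_survives(len(drives), drives))
```

Any two drives can die and one full copy remains. Lose all three at once and the mirror can’t help, which is what the remote backups are for.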