The typical hard disk drive today measures its capacity in half-terabyte increments. If you’ve got a 250GB drive, or maybe a half-dozen of them, you may think they’re not good for much anymore. But what if you could use them to build a private cloud, just like those OpenStack folks in the enterprise? In Windows 8, Microsoft is bringing the power of private clouds to consumers with the inclusion of Storage Spaces.

In this 10-part series, 26-year-veteran Windows tester Scott Fulton walks you through the best features, faculties and functions of Windows 8.

No. 10: Refresh and Reset

No. 9: File History

Make Your Own Cloud

For years I’ve said that the next great version of Windows will be a deliverer of cloud service, and I’ve had folks from Microsoft tell me, “Oh sure, that’s what we’re doing, look at how we sync your photos with SkyDrive!” Indeed, syncing was pretty cool from a 2009 perspective, but Dropbox and Box.net are becoming nearly ubiquitous now. So with respect to Windows, syncing may already have become a “me, too” service.

Real cloud technology, when you get down to brass tacks, is about services and resources being provided to you in a logical fashion that’s separate and distinct from their physical locations. When we talk about “public clouds,” we’re referring to resources provided in big pools, on a metered basis, via the Internet. “Private clouds,” by comparison, are big pools of resources (like hard drives) collected together in your own office. If the next Windows is to have any hope of greatness, it needs to start delivering some private cloud technology: ways for you to collect your processing and storage power together. One glimmer of hope for Windows 8 comes from Storage Spaces, a way for you to finally build one storage volume out of many.

The story of Storage Spaces in Windows 8 is not quite as simple as you may expect it to be (which explains the story of my life), but it makes sense if you stick with it a few minutes. With “LUN Provisioning,” you create a logical storage unit (using logical unit numbers, or LUNs) by collecting storage devices together – not volumes on those devices, mind you, not by “joining volume D: with volume E:” but by surrendering devices in their entirety to the collective. A new kind of virtual volume is created, not by slicing each device into segments but by joining the devices together under a single file system. It’s this notion that distinguishes LUN provisioning from pooling, which is a simpler system for collecting volumes together, but which is more susceptible to failure.

When you provision a Storage Spaces LUN, you’re effectively formatting the drives contained within it. There’s no choice about this, because you’re replacing the file system normally used for the drive with a completely different, more resilient one. In fact, you might even consider creating a LUN around just one drive, if that’s all you have, simply because you’d be increasing the file system integrity for that drive. (You just might find me talking more about this little fact in the near future.)

Redundancy for Redundancy’s Sake

Interestingly, the logical size of the joined devices in the LUN does not necessarily have to be the sum of their respective capacities. It can actually be larger. That may sound crazy, but it’s a characteristic of many cloud technologies, and you just have to get used to it. The idea is that you provision your storage pool with the maximum size you expect ever to need. You could have three 2TB drives, yet provision a LUN encompassing all three as a 15TB volume. This technique is called thin provisioning, and it sounds a bit like a lie told by comedian Jon Lovitz. (“Why, I’ve got ni-… te-… fifteen terabytes! Yeah, that’s the ticket.”)

Bear with me. In this example, the Windows LUN would make maximal use of the 6TB it has available to it, while it pretends to be a 15TB volume. Although enrolling three devices in a single LUN would appear to triple the theoretical chances of a read or write failure, Storage Spaces will replicate data across the devices for as long as it can. It’s like RAID, except at the operating system level, implementing your choice of redundancy methods.
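If the 6TB-pretending-to-be-15TB idea still sounds like sleight of hand, here is a minimal Python sketch of how thin provisioning can work in principle. It is not how Storage Spaces is actually implemented. The `ThinPool` class, its method names, and the choice of a 256MB allocation unit are all my own assumptions for illustration. The point is only that physical space is claimed when a region is first written, not when the volume is created.

```python
from dataclasses import dataclass, field

TB = 10**12
SLAB = 256 * 10**6  # assumed allocation unit (256MB); illustrative only


@dataclass
class ThinPool:
    """Hypothetical thin-provisioned volume: big promise, small backing store."""
    physical_bytes: int   # what the drives can really hold
    logical_bytes: int    # what the volume claims to hold
    allocated_slabs: set = field(default_factory=set)

    def write(self, offset: int) -> None:
        """Touch one byte offset, allocating its slab on first use."""
        if offset < 0 or offset >= self.logical_bytes:
            raise ValueError("offset is outside the advertised volume size")
        slab = offset // SLAB
        if slab in self.allocated_slabs:
            return  # already backed by real storage
        if (len(self.allocated_slabs) + 1) * SLAB > self.physical_bytes:
            raise RuntimeError("out of physical capacity; time to add a drive")
        self.allocated_slabs.add(slab)

    @property
    def used_bytes(self) -> int:
        """Physical space actually consumed so far."""
        return len(self.allocated_slabs) * SLAB


# Three 2TB drives advertised as one 15TB volume.
pool = ThinPool(physical_bytes=6 * TB, logical_bytes=15 * TB)
pool.write(0)        # first slab
pool.write(14 * TB)  # far end of the volume, still only one more slab
print(pool.used_bytes)
```

Writing near the 14TB mark succeeds even though only 6TB of hardware exists, because just two slabs have been consumed. The sketch ignores redundancy entirely, which would multiply the physical space each write costs.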

When you do have more than one device, mirroring is like RAID 1. It creates a redundant arrangement of “slabs” (in this case, 256MB data segments) across either two or three devices. Striping is like RAID 0, in which data slabs are equally distributed across as many devices as there are in the LUN. If you’re going to use striping, you may as well choose parity. This method stripes information across the total number of devices in the LUN minus one, and then stores extra parity data in the leftover device. This way, in the case of device failure, the missing segments may be mathematically recovered.
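To see why the missing segments “may be mathematically recovered,” here is a small Python sketch of the classic XOR parity trick. I am assuming XOR here because it is the textbook technique for single-parity schemes; I am not claiming it mirrors Storage Spaces’ internals. The function names and the tiny four-byte “segments” are hypothetical, and the layout follows the description above: data on all devices but one, parity on the leftover.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity segment for the leftover device."""
    return xor_blocks(data_blocks)


def recover(surviving_blocks, parity):
    """Rebuild the one missing data segment from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three devices: two hold data segments, the third holds parity.
device_a = b"ABCD"
device_b = b"EFGH"
device_p = make_parity([device_a, device_b])

# Pretend device A has died; rebuild its segment from B and the parity.
rebuilt_a = recover([device_b], device_p)
print(rebuilt_a == device_a)
```

The trick works because XOR is its own inverse: combining the parity with every surviving segment cancels them out and leaves only the lost one. It also shows the limit of the scheme, since losing two devices at once leaves nothing to cancel against.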

It’s Bigger on the Inside

Now, all of this doesn’t quite answer one question: Why would you want to provision a LUN for a larger size than the sum of its physical capacities… especially since Storage Spaces, with its mirrored copies and parity data, can write more data to the storage system than the files themselves consume anyway?

The answer has to do with… the future! You may not own all the storage capacity you eventually need, especially if you’re running a home network in a family where you record a lot of HD video with Windows Media Center. Thin provisioning gives you a reprieve, letting you add new devices when you need them. You’ll need them soon after Windows 8 tells you so – as your consumed capacity leaves less and less room for redundancy.
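The arithmetic behind that warning can be sketched in a few lines of Python. Everything specific here is an assumption on my part: the function names are hypothetical, the two-copy default stands in for a two-way mirror, and the 70 percent warning threshold is a placeholder rather than a documented Windows 8 setting.

```python
TB = 10**12


def physical_usage(file_bytes, copies=2):
    """Physical space consumed when every byte is stored `copies` times."""
    return file_bytes * copies


def should_add_drive(file_bytes, physical_bytes, copies=2, warn_fraction=0.7):
    """True once redundant copies fill the assumed fraction of real capacity."""
    if physical_bytes <= 0:
        raise ValueError("physical_bytes must be positive")
    return physical_usage(file_bytes, copies) >= warn_fraction * physical_bytes


# Three 2TB drives, two-way mirroring.
print(should_add_drive(2 * TB, 6 * TB))          # 4TB of 6TB consumed
print(should_add_drive(25 * 10**11, 6 * TB))     # 5TB of 6TB consumed
```

Notice that 2.5TB of files is enough to trip the check on 6TB of hardware. Redundancy is what eats the headroom, so the nag arrives well before the files alone would fill the drives.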

Storage Spaces are for storage devices that are directly attached to one PC in your homegroup, either internally (SATA, SCSI or iSCSI interface) or externally (SATA, SAS or USB). It doesn’t work with network drives that are mapped to a volume because – as hopefully you understand now – Storage Spaces is not a collection of volumes. Since a network drive can be accessed by multiple PCs, it can’t be part of a LUN that’s attached to just one of those PCs.

For now, you can’t use Storage Spaces to create a system volume. The service that runs Storage Spaces is a formal service like the network stack, and Windows must be running to use it. For every PC I’ve ever built, I create at least one separate drive to store data and media files anyway. It might sound silly, but conceivably, I could use Windows 8 to create a storage space around just that one extra drive, with “striped + parity” storage (it’ll “stripe” over one disk, but at least there will be parity), and then thinly provision it for far more than its native capacity. That way I can add a new device to the mix whenever I feel like it (or can afford it), just by plugging it into the USB or external SATA port.

Resilience architecture is changing the way we compute, and it’s helping once again to distinguish the power of the PC from that of those other cutesy, newfangled devices you see floating around. I’m looking forward to being able to build my own cloud on my desk whenever I need one.