Canonical
on 18 May 2015

What is software-defined storage?



What is software-defined storage, and how do NAS and SAN appliances compare to it?

Large-scale storage presents an inherent scalability challenge: how do you connect many disk drives to the producers and consumers of data while ensuring performance and durability, and without blowing out bandwidth, capacity or budget? The most common way of addressing these requirements is to provide remote filesystems or block storage through NAS (Network Attached Storage) and SAN (Storage Area Network) appliances.

NAS and SAN appliances are typically built with proprietary hardware and powered by proprietary software that serves up the relevant storage protocols, such as iSCSI or NFS; the software internally handles replication, rebalancing and reconstruction. Their design normally allows for some scalability by adding storage elements, but the number of nodes involved is typically low. They are generally built to be fault-tolerant through high internal redundancy: RAID, standby power supplies, and multiple network and disk bus interfaces.

Software-defined storage (SDS) embodies a different philosophy: the storage service is actually built from a cluster of commodity-hardware-based server nodes, some of which act as proxies, managers or gateways, and some of which act as storage nodes. Each storage node is responsible for storing a subset of the overall data. This allows additional nodes to be added to provide greater storage capacity, higher availability or increased throughput; clusters of dozens and even hundreds of nodes are possible. The software powering the cluster manages data placement and may offer enhanced capabilities to clients, including novel interfaces such as HTTP REST APIs for object storage.
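To make the object-storage interface concrete, here is a minimal sketch of the PUT/GET-by-URL pattern that Swift-style HTTP REST APIs expose. The endpoint, container name and auth token below are illustrative placeholders, not the API of any specific product:

```python
import requests

# Hypothetical object-storage endpoint and auth token: placeholders, not a
# real deployment. Most REST object stores follow this PUT/GET-by-URL
# pattern, with auth schemes and container naming varying by product.
ENDPOINT = "https://storage.example.com/v1/ACCOUNT"
HEADERS = {"X-Auth-Token": "replace-with-a-real-auth-token"}

# Upload (PUT) an object into a container; the cluster, not the client,
# decides which storage nodes hold the replicas.
with open("report.pdf", "rb") as f:
    resp = requests.put(f"{ENDPOINT}/backups/report.pdf",
                        headers=HEADERS, data=f)
    resp.raise_for_status()

# Download (GET) the same object back by URL.
resp = requests.get(f"{ENDPOINT}/backups/report.pdf", headers=HEADERS)
resp.raise_for_status()
data = resp.content
```

Note that the client never names a disk or a server: the URL identifies the object, and the cluster's placement logic maps it to storage nodes behind the scenes.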

Fault tolerance in SDS is implemented by assuming that the failure domain is an entire node, and that no single node is essential. This dramatically reduces the hardware complexity (and cost) of the individual nodes, but it also requires a horizontally scalable software architecture. Each node hosting a component of the service is designed to share state and data with its peers, allowing the resulting cluster to deliver high throughput while surviving the failure of individual nodes.
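The placement logic that makes this work can be illustrated with rendezvous hashing, one common technique for deterministically mapping a key to several distinct nodes. This is a simplified sketch of the principle, not the algorithm any particular SDS product uses (Ceph, for instance, uses its own CRUSH algorithm); the node names and replica count are made up:

```python
import hashlib

# Illustrative sketch only. The object key deterministically maps to
# several distinct nodes, so any single node can fail without data loss.
NODES = ["node-a", "node-b", "node-c", "node-d", "node-e"]
REPLICAS = 3

def place(key: str, nodes: list[str], replicas: int) -> list[str]:
    """Pick `replicas` distinct nodes for `key` by hashing key+node and
    taking the nodes with the smallest hashes (rendezvous hashing)."""
    scored = sorted(
        nodes,
        key=lambda n: hashlib.sha256(f"{key}:{n}".encode()).hexdigest(),
    )
    return scored[:replicas]

print(place("backups/report.pdf", NODES, REPLICAS))

# If a node dies, re-running placement over the survivors still yields
# `replicas` nodes, and only the data held by the failed node moves.
survivors = [n for n in NODES if n != "node-b"]
print(place("backups/report.pdf", survivors, REPLICAS))
```

Because every node can compute placement independently from the same inputs, there is no central lookup table to become a bottleneck or a single point of failure, which is what lets these clusters scale horizontally.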

This is how giants like Google, Amazon and Baidu have successfully and economically deployed storage and compute for over a decade. Ubuntu Advantage Storage brings customers solutions with these same characteristics, built on solid open source technology and ready to deploy on your hardware today.

Learn more about Ubuntu Advantage Storage


