Yes, You Can Virtualize FreeNAS
FreeNAS is the world’s most popular open source storage OS, and one of the more popular questions I get asked is, “How do I run FreeNAS as a VM?” Due to the number of caveats required to answer that question, I would typically short-circuit the conversation by recommending against it, or only recommend it for test environments since the prerequisite knowledge required to “do it right” can’t be passed on quickly. Somehow over time, this message morphed into a general consensus that “you cannot (or shouldn’t) virtualize FreeNAS at all under any circumstances”, which wasn’t my intention. So, I’m here to set the record straight once and for all: You absolutely can virtualize FreeNAS.
Whether you are test driving the functionality of FreeNAS, testing an upgrade for compatibility in your environment, or you want to insulate your FreeNAS system from hardware faults, virtualization can provide many well understood benefits. That said, while FreeNAS can and will run as a virtual machine, it’s definitely not ideal for every use case. If you do choose to run FreeNAS under virtualization, there are some caveats and precautions that must be considered and implemented. In this post I’ll describe what they are so that you can make well-informed choices.
Before we get started though, I should probably start with a disclaimer…
If best practices and recommendations for running FreeNAS under virtualization are followed, FreeNAS and virtualization can be smooth sailing. However, failure to adhere to the recommendations and best practices below can result in catastrophic loss of your ZFS pool (and/or data) without warning. Please read through them and take heed.
Ok, phew. Now that that’s over with, let’s get started.
1. Pick a virtualization platform
When developing FreeNAS, we run it as a VM. Our virtualization platform of choice is VMware, and it’s the platform with which the FreeNAS developers have the most experience. FreeNAS includes VMware Tools as well.
Our second choice for a virtualization platform is Citrix XenServer. FreeNAS has no tools built in for XenServer, but you get a solid virtualization experience nonetheless. Other hypervisors such as bhyve, KVM, and Hyper-V also work, but the development team does not use them on a daily basis.
2. Virtualizing ZFS
ZFS combines the roles of RAID controller, volume manager, and file system, and since it’s all three in one, it wants direct access to your disks in order to work properly. The closer you can get ZFS to your storage hardware, the happier ZFS is, and the better it can do its job of keeping your data safe. Anything that insulates ZFS from the disks, such as native virtual disks or virtual disks on RAID controllers, should therefore be avoided whenever possible. In a typical hypervisor setup, a RAID controller presents a disk to the hypervisor, the hypervisor builds a datastore on that disk, and FreeNAS runs from a virtual disk on the datastore. This places two layers between ZFS and the physical disks, which warrants taking the following precautions.
- If you are not using PCI passthrough (more on that below), then you must disable the scrub tasks in ZFS. The hardware can “lie” to ZFS so a scrub can do more damage than good, possibly even permanently destroying your zpool.
- Disable any write caching that is happening on the SAN, NAS, or RAID controller itself. A write cache can easily confuse ZFS about what has or has not been written to disk, and that confusion can result in catastrophic pool failures.
- Using a single disk leaves you vulnerable to pool metadata corruption, which could cause the loss of the pool. To avoid this, use a minimum of three vdevs, either striped or in a RAIDZ configuration. Because ZFS mirrors pool metadata across three vdevs when they are available, a pool built from at least three vdevs is safer than one built from a single vdev. Vdevs that have their own redundancy are preferred.
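As a concrete check on the scrub precaution: from the FreeNAS shell you can see whether a scrub is running and cancel it. The pool name "tank" below is a placeholder; the scheduled scrub task itself is disabled through the FreeNAS web UI's scrub task settings.

```shell
# Show pool health and any scrub currently in progress
# ("tank" is a placeholder pool name).
zpool status tank

# Cancel an in-progress scrub; without PCI passthrough, a scrub
# can do more harm than good.
zpool scrub -s tank
```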
3. Consider the Use Case
Is this a production or non-production FreeNAS application? The answer to this question has significant implications to the subsequent recommended practices.
If your use case is a test lab, science experiment, pre-upgrade checks of a new version, or any other situation where real data that you care about isn’t at stake, go ahead and virtualize. Create a VM with 8GB of RAM, two vCPUs, a 16GB install disk, and a single data disk of whatever size is appropriate for your testing, and go to town.
This is where things get serious. If you’re using FreeNAS in an application that’s relied on for daily operations, you have a “Production Environment”, and the additional precautions below must be followed closely to avoid downtime or data loss.
If you use PCI passthrough (aka DirectPath I/O), you can then use FreeNAS just like it’s installed on physical hardware. The PCI device passthrough capability allows a physical PCI device from the host machine to be assigned directly to a VM. The VM’s drivers can use the device hardware directly without relying on any driver capabilities from the host OS. These VMware features are unavailable for VMs that use PCI passthrough:
- Hot adding and removing of virtual devices
- Suspend and resume
- Record and replay
- Fault tolerance
- High availability
To use PCI passthrough, you need a host bus adapter (HBA) supported by FreeNAS (we recommend LSI controllers of the 2008 chipset variety, which are 6Gb/s and well supported in FreeNAS) configured as a PCI passthrough device and connected directly to your FreeNAS VM. The FreeNAS VM then has direct access to the disks. Make sure to adhere to the guidelines on using PCI passthrough. With PCI passthrough, it is as if you aren’t virtualizing the storage at all, so FreeNAS is safe to use in a production scenario.
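Once the HBA is passed through, you can confirm from inside the FreeNAS VM that the guest sees the controller and the raw disks directly. These are standard FreeBSD commands; the exact driver name depends on your controller.

```shell
# List PCI devices visible to the VM; a passed-through LSI 2008 HBA
# should appear here, attached to its native mps(4) driver.
pciconf -lv | grep -B3 -i lsi

# List the attached disks; you should see the physical drives
# themselves (model and serial), not generic virtual disks.
camcontrol devlist
```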
4. Other Considerations
If you are still interested in virtualizing FreeNAS, pay attention to the following:
Adhere to the FreeNAS hardware recommendations when allocating resources to your FreeNAS VM. It goes without saying that virtualized FreeNAS is still subject to the same RAM and CPU requirements as a physical machine. When you virtualize FreeNAS, your VM will need:
- At least two vCPUs
- 8GB or more of vRAM, at least 12GB of vRAM if you use jails/plugins
- Two or more vDisks
- A vDisk at least 16GB in size for the OS and boot environments
- One or more vDisks, each at least 4GB in size, for data storage (at least three are recommended)
- A bridged network adapter
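If your hypervisor is ESXi, the resource list above can be scripted. This sketch uses VMware's govc CLI and assumes it is already configured (GOVC_URL, credentials, and so on); the VM name "freenas-test" and the network name "VM Network" are placeholders to replace with your own.

```shell
# Sketch: create a FreeNAS VM matching the recommendations above
# (8GB RAM, two vCPUs, 16GB OS disk). Names are placeholders.
govc vm.create -m 8192 -c 2 -g freebsd64Guest \
    -disk 16GB -net "VM Network" -on=false freenas-test

# Add three 4GB data vDisks afterwards:
for i in 1 2 3; do
    govc vm.disk.create -vm freenas-test -name "freenas-test/data$i" -size 4GB
done
```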
Striping your vDisks
In this configuration, ZFS will be unable to repair data corruption, but it will be resilient against loss of the entire pool caused by damage to critical pool metadata structures. If you are using a SAN/NAS to provision the vDisk space, three striped 1TB vDisks will require 3TB of usable space on the external LUN.
RAIDZ protection of your vDisks
In this configuration, ZFS will repair data corruption. However, you give up an additional virtual disk’s worth of space (or two if RAIDZ2 is used), since the external storage array already protects the LUN and RAIDZ creates parity on top of it to protect each vDisk. If you are using a SAN/NAS to provision the vDisk space, a three-disk RAIDZ1 pool with 3TB of usable space requires three 1.5TB vDisks, or 4.5TB of usable space on the external LUN.
Disk space needed for provisioning
With striping, you’ll be required to provision 3TB of space from the SAN/NAS storage array to get 3TB of usable space. If you use RAIDZ1 protection, one virtual disk’s worth of space goes to parity, so you will be required to provision 4.5TB of space from the SAN/NAS storage array to get 3TB of usable space. Depending on the $/GB for your SAN/NAS, this additional 1.5TB can get quite expensive.
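The overhead arithmetic above generalizes: for a given usable capacity, multiply by total disks over data disks. A small shell sketch, working in GB so the math stays integer (3072GB = 3TB):

```shell
# Provisioned space (GB) needed on the SAN/NAS for a given usable
# capacity, given how many of the pool's disks hold data vs parity.
provisioned_gb() {
    usable_gb="$1"
    data_disks="$2"
    parity_disks="${3:-0}"
    echo $(( usable_gb * (data_disks + parity_disks) / data_disks ))
}

provisioned_gb 3072 3      # three striped vDisks  -> 3072 (3TB)
provisioned_gb 3072 2 1    # three-disk RAIDZ1     -> 4608 (4.5TB)
```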
I have attempted to share some best practices that the engineering team at iXsystems has used while virtualizing, and I hope that I have not missed anything big. With so many different hypervisors, it is difficult to give you specific instructions. You need to take some precautions to utilize your setup in a production environment safely:
- PCI passthrough of an HBA: the best case, and the recommended approach
- Write cache on any RAID controller, SAN, or NAS: Disabled
- FreeNAS scrub tasks: Disabled unless PCI passthrough is used
- Disk configuration
- Single disk: Vulnerable to pool metadata corruption, which could cause the loss of the pool. Can detect — but cannot repair — user data corruption.
- Three or more virtual disks striped (even if they are from the same datastore!): Resilient against pool corruption. Can detect — but cannot repair — corrupted data in the pool. Depending on what backs the vDisks you may be able to survive a physical disk failure, but it is unlikely that the pool will survive.
- Three or more virtual disks in RAIDZ: Can detect and repair data corruption in the pool, assuming the underlying datastore and/or disks are functional enough to permit repairing by ZFS’ self-healing technology.
- Never ever run a scrub from FreeNAS when a patrol read, consistency check, or any other sort of underlying volume repair operation, such as a rebuild, is in progress.
Some other tips if you get stuck:
- Search the FreeNAS Manual for your version of FreeNAS. Most questions are already answered in the documentation.
- Before you ask for help on a virtualization issue, always search the forums first. Your specific issue may have already been resolved.
- If using a web search engine, include the term “FreeNAS” and your version number.
As an open source community, FreeNAS relies on the input and expertise of its users to help improve it. Take some time to assist the community; your contributions benefit everyone who uses FreeNAS.
To sum up: virtualizing FreeNAS is great—the engineering organization and I have used it that way for many years, and we have several VMs running in production at iXsystems. I attempted to provide accurate and helpful advice in this post and as long as you follow my guidance, your system should work fine. If not, feel free to let me know. I’d love to hear from you.
iXsystems Senior Engineer
[---- 2018/02/27: This is still as relevant as ever. As PCIe-Passthru has matured, fewer problems are reported. I've updated some specific things known to be problematic ----]
[---- 2014/12/24: Note, there is another post discussing how to deploy a small FreeNAS VM instance for basic file sharing (small office, documents, scratch space). THIS post is aimed at people wanting to use FreeNAS to manage lots of storage space. ----]
You need to read "Please do not run FreeNAS in production as a Virtual Machine!" ... and then not read the remainder of this. You will be saner and safer for having stopped.
<the rest of this is intended as a starting point to be filled in further>
But there are some of you who insist on blindly charging forward. I'm among you, and there are others. So here's how you can successfully virtualize FreeNAS, less-dangerously, with a primary emphasis on being able to recover your data when something inevitably fscks up. And remember, something will inevitably fsck up, and then you have to figure out how to recover. Best to have thought about it ahead of time.
- Pick a virtualization platform that is suitable to the task. You want a bare metal, or "Type 1," hypervisor. Things like VirtualBox, VMware Fusion, VMware Workstation, etc. are not acceptable. VMware ESXi is suitable to the task. Hyper-V is not suitable for the task, as it is incompatible with FreeBSD at this time.
I am not aware of specific issues that would prevent Xen from being suitable. There is some debate as to the suitability of KVM. You are in uncharted waters if you use these products.
- Pick a server platform with specific support for hardware virtualization with PCI-Passthrough. Most of Intel's Xeon family supports VT-d, and users have generally had good success with most recent Intel and Supermicro server-grade boards. Other boards may claim to support PCI-Passthrough, but quite frankly it is an esoteric feature, and the likelihood that a consumer or prosumer board manufacturer has spent significant time on it is questionable. Pick a manufacturer whose support people don't think "server" means the guy who brings your food at the restaurant. You will want to carefully research compatibility prior to making a decision and prior to making a purchase. Once you've purchased a marginal board, you can spend a lot of time and effort chasing gremlins, which is neither fun nor productive. Pay particular attention to the reports of success or failure that other ESXi users have had with VT-d on your board of choice; Google is your friend. Older boards using Supermicro X8* or Intel 5500/5600 CPUs and prior are expected to have significant issues, some of which are fairly intermittent and may not bite you for weeks or months. All of the boards that have been part of the forum recommended hardware series seem to work very well for virtualization.
- Do NOT use VMware Raw Device Mapping. This is the crazy train to numerous problems and issues. You will reasonably expect that this ought to be a straightforward, sensible solution, but it isn't. The forums have seen too many users crying over their shattered and irretrievable bits. And yes, I know it "works great for you," which seems to be the way it goes for everyone until a mapping goes wrong somehow and the house of cards falls. Along the way, you've probably lost the ability to monitor SMART and other drive health indicators as well, so you may not see the iceberg dead ahead.
- DO use PCI-Passthrough for a decent SATA controller or HBA. We've used PCI-Passthrough with the onboard SAS/SATA controllers on mainboards, and as another option, LSI controllers usually pass through fine. Get a nice M1015 in IT mode if need be. Note that you may need to twiddle with the hw.pci.enable_msi/msix settings to make interrupt storms stop. Some PCH AHCI's ("onboard SATA") and SCU's ("onboard SAS/SATA") work. Tylersburg does not work reliably. I've seen Patsburg and Cougar Point work fine on at least some Supermicro boards, but have had reports of trouble with the ASUS board. The Ivy Bridge CPU era is the approximate tipping point where things went from "lots of stuff doesn't work" to "likely to work."
- Try to pick a board with em-based network interfaces. While not strictly necessary, the capability to have the same interfaces for both virtual and bare metal installs makes recovery easier. Much easier.
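The hw.pci.enable_msi/msix settings mentioned above are FreeBSD loader tunables. If you hit interrupt storms on a passed-through controller, they can be set in /boot/loader.conf inside the FreeNAS VM. Note that this disables MSI/MSI-X system-wide in the guest, so treat it as a workaround rather than a default.

```shell
# /boot/loader.conf -- disable MSI/MSI-X interrupts if interrupt
# storms occur on a passed-through controller (workaround only)
hw.pci.enable_msi="0"
hw.pci.enable_msix="0"
```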
Now, here's the thing. What you want to do is to use PCI-Passthrough for your storage, and create a virtual hardware platform that is very similar to your actual physical host... just smaller. So put FreeNAS on the bare metal, create your pool, and make sure that all works ... first! Then load ESXi. ESXi will want its own datastore, and cannot be on the PCI-Passthrough'd controller, so maybe add an M1015 in IR mode and a pair of disks for the local ESXi image and datastore (you have to store the FreeNAS VM somewhere after all!). Create a FreeNAS VM and import the same configuration.
Now at this point, if ESXi were to blow up, you can still bring the FreeNAS back online with a USB key of FreeNAS, and a copy of your configuration. This is really the point I'm trying to make: this should be THE most important quality you look for in a virtualized FreeNAS, the ability to just stick in a USB key and get on with it all if there's a virtualization issue. Your data is still there, in a form that could easily be moved to another machine if need be, without any major complicating factors.
But, some warnings:
- Test, test, and then test some more. Do not assume that "it saw my disks on a PCI-Passthru'd controller" is sufficient proof that your PCI-Passthrough is sufficient and stable. We often test even stuff we expect to work fine for weeks or months prior to releasing it for production.
- As tempting as it is to under-resource FreeNAS, do try to aggressively allocate resources to FreeNAS, both memory and CPU.
- Make sure your virtualization environment has reserved resources, specifically including all memory, for FreeNAS. There is absolutely no value to allowing your virtualization environment to swap the FreeNAS VM.
- Do not try to have the virtualization host mount the FreeNAS-in-a-VM for "extra VM storage". This won't work, or at least it won't work well, because when the virtualization host is booting, it most likely wants to mount all its datastores before it begins launching VMs. You could have it serve up VMs to other virtualization hosts, though, as long as you understand the dependencies. (This disappoints me too.) --update-- ESXi 5.5 appears to support rudimentary tiered dependencies, meaning you should be able to get ESXi to boot a FreeNAS VM first. Due to lack of time I have not tried this. If you do, report back how well (or if) it works.
- Test all the same things, like drive replacement and resilvering, that you would for a bare metal FreeNAS implementation.
- Have a formalized system for storing the current configuration automatically, preferably to the pool. Several forum members have offered scripts of varying complexity for this sort of thing. This makes restoration of service substantially easier.
- Since you lack a USB drive key, strongly consider having a second VM and 4GB disk configured and ready to go for upgrades and the like. It is completely awesome to be able to shut down one VM and bring up another a few moments later and restore service at the speed of an SSD datastore.
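The "store the current configuration automatically" advice above can be sketched as a small cron-able script. This is a minimal example, not one of the forum scripts; the source path (/data/freenas-v1.db, where FreeNAS keeps its configuration database) and the destination pool path are assumptions to adjust for your layout.

```shell
#!/bin/sh
# Minimal sketch: copy the FreeNAS configuration database to a dated
# file on the pool, so a replacement VM (or a USB-booted bare-metal
# install) can restore service quickly.
backup_config() {
    src="$1"        # e.g. /data/freenas-v1.db (assumed config db path)
    dest_dir="$2"   # e.g. /mnt/tank/configs (assumed pool dataset)
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/freenas-$(date +%Y%m%d).db"
}

# On a real system, run from cron, e.g.:
# backup_config /data/freenas-v1.db /mnt/tank/configs
```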
If what is described herein doesn't suit you, please consider trying this option instead.