VixEOUTOFMEMORY: Memory allocation failed. Out of memory.
Failed VMware VM backup due to NBD limits
Besides the limits on the number of connections to vCenter (VC)/ESXi, ESXi itself is also limited by a transfer buffer shared by all NFC connections. This limit is enforced by the host and can’t be bypassed or known in advance. The sum of all NFC connection buffers to an ESXi host can’t exceed 32 MB, and by default it is configured as 16 MB.
The primary physical CFA uses the NBD protocol to back up VM disks. NBD, in turn, relies on the VMware Network File Copy (NFC) protocol and is therefore subject to the limitations described above.
The backup failed because the ESXi server couldn’t serve the request: it lacked sufficient resources (NFC connection buffer).
At the time the N jobs failed with the VixEOUTOFMEMORY error, you had N+1 or more FULL backups running simultaneously, all of them backing up VMs located on a single ESXi host. Each job uses a buffer of up to 10 MB to transfer data, so there is a chance of hitting the NFC buffer limit on the ESXi host, which is what happened here. This doesn’t mean that parallel backup jobs always fail; the outcome depends on many factors.
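The arithmetic behind this can be sketched as follows. This is a rough worst-case estimate, not a model of how ESXi actually allocates NFC buffers: it assumes every job takes the full 10 MB, while in practice a job uses "up to" 10 MB, so more jobs often fit. The function name max_concurrent_jobs is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Worst-case estimate: how many transfers fit into the host-wide NFC buffer
# if every backup job takes its full per-job buffer.
# Assumption: buffers are simply summed against the limit.
max_concurrent_jobs() {
    limit_mb=$1      # host-wide NFC buffer limit, in MB
    per_job_mb=$2    # largest buffer a single backup job may use, in MB
    echo $(( limit_mb / per_job_mb ))
}

echo "16 MB limit: $(max_concurrent_jobs 16 10) job(s)"
echo "32 MB limit: $(max_concurrent_jobs 32 10) job(s)"
```

Under these assumptions, the default 16 MB limit leaves room for one full-size transfer at a time, and the 32 MB maximum leaves room for three.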
Steps to resolve
No action is needed if subsequent backups on your CFA don’t fail with the VixEOUTOFMEMORY error.
You can also optimize ESXi network (NBD) performance by increasing the NFC buffer size from 16 MB to 32 MB and reducing the cache flush interval, as suggested in the VMware KB article.
Do this on all of your ESXi hosts. You can query the current values by running the following commands on the ESXi host:
esxcfg-advcfg -g /BufferCache/MaxCapacity
esxcfg-advcfg -g /BufferCache/FlushInterval
This won’t guarantee that VixEOUTOFMEMORY never happens again, but it will decrease its probability. It is also a good idea in general, since you perform a lot of simultaneous backups.
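If you need to check several hosts, the two queries can be wrapped in a small script. The sketch below rests on an assumption: that esxcfg-advcfg prints a line of the form "Value of MaxCapacity is 16384". Verify this on your host and adjust the parsing if the output differs. The helper name parse_advcfg_value is hypothetical, and the script only reads values; it does not change any setting.

```shell
#!/bin/sh
# Extract the trailing number from a line such as
# "Value of MaxCapacity is 16384" (assumed output format).
# Prints nothing if the line does not match.
parse_advcfg_value() {
    echo "$1" | sed -n 's/.* is \([0-9][0-9]*\)$/\1/p'
}

# Query both options, but only where esxcfg-advcfg is available
# (that is, when the script runs on an ESXi host).
if command -v esxcfg-advcfg >/dev/null 2>&1; then
    for opt in /BufferCache/MaxCapacity /BufferCache/FlushInterval; do
        value=$(parse_advcfg_value "$(esxcfg-advcfg -g "$opt")")
        echo "$opt = ${value:-unknown}"
    done
fi
```

Running it on each host before and after the change gives a quick way to confirm that every host ends up with the same values.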
Consider upgrading your network to 10GbE. The upgrade should cover every network link in the chain between the CFA and the VMware host.