When we first started with SimpleWeb E-Commerce, we needed a distributed file system on Amazon Web Services (AWS) so that all of our servers could share the same file structure. At the time, GlusterFS looked like the option that made the most sense.
GlusterFS is an open source project that lets you set up a couple of servers, assign hard drives to them, and have it keep the data distributed and balanced across them. It has fault tolerance and redundancy, and it was a great system at the time. The biggest issue we had with it was management. It had fixed drive sizes, so whenever we needed to upgrade the hard drives, it almost required building new servers. Updates sometimes weren't compatible with previous versions, so we were stuck on one version for some time and could not get to the latest version until we decided to rebuild the servers. We always felt there were some speed issues in this system as well, but we could never really prove it.
Later Amazon came out with its Elastic File System (EFS), and we were excited about it. It is a distributed file system built into AWS and based on the Network File System (NFS) protocol, and it looked like it was going to lower our file storage costs tremendously. We liked that it used standard protocols and was distributed, with a separate endpoint for each availability zone inside a region. It also had nearly unlimited storage and was easy to set up.
Initially, we were really happy with it, but we ran into performance issues almost straight away. Performance is based on how much data you store: the more storage you have, the more performance credits you earn and the more bandwidth throughput you can use. So we ended up having to fill the drives with useless data to get the performance we needed. These files introduced a pretty massive hidden cost into EFS and made it more expensive than our previous GlusterFS solution.
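To make the hidden cost concrete, here is a rough back-of-the-envelope sketch of how much filler data it takes to reach a target throughput. It assumes the baseline rate AWS documented for EFS bursting mode, 50 MiB/s per TiB stored; the price per GiB-month and the example numbers are hypothetical placeholders, not our actual figures, so check the current EFS pricing and throughput documentation before relying on it.

```python
# Rough sketch of the storage-for-throughput trade-off in EFS bursting mode.
# Assumption: baseline throughput of 50 MiB/s per TiB of data stored.
BASELINE_MIB_S_PER_TIB = 50.0
GIB_PER_TIB = 1024


def baseline_throughput_mib_s(stored_gib):
    """Baseline throughput (MiB/s) earned by storing `stored_gib` GiB."""
    return stored_gib / GIB_PER_TIB * BASELINE_MIB_S_PER_TIB


def padding_needed_gib(real_data_gib, target_mib_s):
    """Filler data (GiB) needed on top of the real data to hit the target."""
    required_gib = target_mib_s / BASELINE_MIB_S_PER_TIB * GIB_PER_TIB
    return max(0.0, required_gib - real_data_gib)


def monthly_storage_cost(stored_gib, price_per_gib_month):
    """Monthly bill for `stored_gib` GiB at a given price per GiB-month."""
    return stored_gib * price_per_gib_month


if __name__ == "__main__":
    real_gib = 100.0   # hypothetical amount of real data
    target = 25.0      # hypothetical throughput target in MiB/s
    price = 0.30       # hypothetical price in USD per GiB-month

    padding = padding_needed_gib(real_gib, target)
    print(f"Baseline with real data only: "
          f"{baseline_throughput_mib_s(real_gib):.2f} MiB/s")
    print(f"Filler needed for {target} MiB/s: {padding:.0f} GiB")
    print(f"Monthly cost of the filler alone: "
          f"${monthly_storage_cost(padding, price):.2f}")
```

With these placeholder numbers, most of the monthly bill would go to data that exists only to earn throughput, which is the effect we ran into.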
We are currently using a file system called ObjectiveFS. It uses Amazon S3 as an object store, but it acts like a hard drive. So far the performance has been outstanding in our very unscientific tests. There was one particular procedure that involved quite a bit of writing of small files. EFS took 345.6 seconds, while the same process on ObjectiveFS took 5.6 seconds, roughly 60 times faster.
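For anyone who wants to run a similarly unscientific comparison, a small-file write test might look like the sketch below. This is not the procedure we actually ran; the file count, file size, and the mount points `/mnt/efs` and `/mnt/objectivefs` in the usage note are all hypothetical.

```python
import os
import tempfile
import time


def time_small_file_writes(directory, count=1000, size_bytes=4096):
    """Write `count` small files into `directory` and return elapsed seconds.

    Each file is fsynced so the time includes pushing data to the file
    system, not just to the OS page cache. Files are removed afterwards,
    outside the timed section.
    """
    payload = b"x" * size_bytes
    paths = [os.path.join(directory, f"bench_{i:06d}.tmp") for i in range(count)]

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
    elapsed = time.perf_counter() - start

    for path in paths:
        os.remove(path)
    return elapsed


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast run was than the slow run."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    # Demo against a local temporary directory; point this at real mounts
    # to compare file systems.
    with tempfile.TemporaryDirectory() as tmp:
        seconds = time_small_file_writes(tmp, count=200)
        print(f"200 small files written in {seconds:.3f} s")

    # The timings quoted above, expressed as a ratio.
    print(f"Speedup: {speedup(345.6, 5.6):.1f}x")
```

To compare two mounts, call `time_small_file_writes("/mnt/efs")` and `time_small_file_writes("/mnt/objectivefs")` (hypothetical paths) and pass the two results to `speedup`. Running each a few times and looking at the spread would make the result a little less unscientific.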
It has unlimited storage because everything is stored on Amazon S3, and the caching seems to be fantastic. It has both a disk cache and a memory cache, so it can keep things nice and fast in memory while storing larger quantities on the hard drive. As long as it can verify that nothing has changed, it doesn't have to make a lot of calls to Amazon S3.
So far the only issue I have found with ObjectiveFS is that the monthly fees are based on the number of instances connecting to the file system. The vendor doesn't take on any per-instance costs, since none of its bandwidth or CPU is involved. It's just a file that gets loaded on the server, so per-instance pricing doesn't seem like the best pricing model for customers. Still, for the performance increase it delivers, it should be worth the cost, and I think the costs are going to be the same as or less than what we had with EFS.