Making S3 Folders In Ruby

Amazon’s S3 storage service is a really cool way to store and serve up massive quantities of data. Interestingly, it is more like a database than like a file system. Despite the convenience of referring to an object as “/path/to/thing.jpg”, S3 does not have any separate objects for “/path” or “/path/to”. In fact, neither “/path” nor “/path/to” even exists.

This is why, when you store an S3 object, the name it is given is called a “key” instead of a “filename”. It functions like a database key that returns a file. However, as “ultra hipster Web 2.0” as this may be, it is still convenient to browse a large collection of files using a familiar directory/sub-directory structure.

In fact, all of the S3 utilities that I use allow you to create a folder inside of a bucket, sub-folders inside of that, and so on. But I did not see any programmatic facility for creating a folder independent of any object inside of S3. If you have been paying attention, you remember that in S3 there is no such thing as a directory! So how do these nice GUI browsers do it?

I did some Googling, but had no luck beyond claims that “it couldn’t be done”. Since I had two different programs that already did it, I knew THAT wasn’t the case. It took a little old-fashioned investigation… I looked directly at the output from a query to a bucket that already had the little folders in it.

Lo and behold, it turns out they cheat by creating a specially named, empty object for each “directory”. For a directory named “/path”, you would create an object with the key “path_$folder$”, and for a directory named “/path/to”, you create an object with the key “path/to_$folder$”. Then, to get a directory listing for “/path”, you just query S3 for all objects whose keys start with “path/”. Ignore any objects that end with “_$folder$” and there you have it: S3 folders.
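
To make the listing trick concrete, here is a minimal sketch in plain Ruby that works on an in-memory array of keys rather than a real bucket. The helper name `list_directory` and the sample keys are my own inventions for illustration; in practice S3 itself would apply the prefix filter as part of the listing request.

```ruby
FOLDER_SUFFIX = '_$folder$' # marker suffix used by the GUI browsers

# Hypothetical helper: given every key in a bucket, return the entries
# under `dir` (e.g. "/path"), hiding the folder markers themselves.
def list_directory(all_keys, dir)
  # Normalize "/path", "path" and "path/" to the key prefix "path/".
  prefix = dir.sub(%r{\A/}, '').chomp('/') + '/'
  all_keys.select { |key| key.start_with?(prefix) }
          .reject { |key| key.end_with?(FOLDER_SUFFIX) }
end

# Made-up sample data standing in for a bucket listing.
keys = [
  'path_$folder$',
  'path/to_$folder$',
  'path/readme.txt',
  'path/to/thing.jpg',
  'other/file.txt'
]

p list_directory(keys, '/path')
# => ["path/readme.txt", "path/to/thing.jpg"]
```

Note that the marker for “/path” itself (“path_$folder$”) never matches the “path/” prefix, while the marker for the sub-folder does and is filtered out by its suffix.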

I decided that it would be nice if the aws/s3 gem would support this foldering the same way that copying a file within a file system does: if the enclosing directories do not exist, they are created before the file copy.

Thanks to the beauty of modern dynamic languages, I was easily able to put together a little monkeypatch for the aws/s3 gem’s S3Object class that handles this.

    # This is an extension to S3Object that supports the emerging 'standard' for virtual folders on S3.
    # For example:
    #   S3Object.store('/folder/to/greeting.txt', 'hello world!', 'ron', :use_virtual_directories => true)
    #
    # This will create an object in S3 that mimics a folder, as far as the S3 GUI browsers like
    # the S3 Firefox Extension or Bucket Explorer are concerned.
    require 'aws/s3'

    module AWS
      module S3
        class S3Object
          class << self
            alias :original_store :store

            # Store the object, creating the folder markers first when requested.
            def store(key, data, bucket = nil, options = {})
              store_folders(key, bucket, options) if options[:use_virtual_directories]
              original_store(key, data, bucket, options)
            end

            # Create a marker for every enclosing folder of the key.
            def store_folders(key, bucket = nil, options = {})
              # Keep a leading slash if the key has one; never drop a real folder name.
              current_folder = key.start_with?("/") ? "/" : ""
              folders = key.split("/").reject { |part| part.empty? }
              folders.pop # the last part is the object itself, not a folder
              folders.each do |folder|
                current_folder += folder
                store_folder(current_folder, bucket, options)
                current_folder += "/"
              end
            end

            def store_folder(key, bucket = nil, options = {})
              original_store(key + "_$folder$", "", bucket, options) # store the magic entry that emulates a folder
            end
          end
        end
      end
    end
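
If you want to see which marker keys a given object key implies without touching S3 at all, the same walk over the path can be sketched as a standalone function. The name `folder_marker_keys` is hypothetical and not part of aws/s3; it only mirrors the key-building logic for a key with a leading slash like the one in the example comment.

```ruby
# Hypothetical helper (not part of aws/s3): return the marker keys
# for every enclosing folder of `key`, outermost folder first.
def folder_marker_keys(key)
  prefix = key.start_with?('/') ? '/' : ''
  parts  = key.split('/').reject { |part| part.empty? }
  parts.pop # drop the object name itself
  parts.map do |part|
    prefix += part
    marker = prefix + '_$folder$'
    prefix += '/'
    marker
  end
end

p folder_marker_keys('/folder/to/greeting.txt')
# => ["/folder_$folder$", "/folder/to_$folder$"]
```

So storing one object two levels deep costs two extra (empty) objects, one per folder.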

Sure, you can have the best of both worlds: massive virtual storage, and a convenient directory-like structure. And now you can have it with your favorite Ruby S3 library.

Happy storage!