At the end of 2015, we launched an update to our web-based uploader called the 'Gigaloader'.
Simply put, it improved on our regular uploader in two main ways:
Chunks: unlike our older, regular uploader, which uploaded the source file from your browser to our server in one continuous upload, the new Gigaloader splits the video into small chunks, uploads those in sequence, then reassembles the video on our server from the chunks. This also lets you get around S3 file-size restrictions.
This approach makes the Gigaloader more resilient: if there is an error, the whole upload does not fail, only a single chunk, which can then be retried. The second advantage is...
Pause and resume: shortly after you start uploading a video with the Gigaloader, you'll see a "Pause Upload" button appear.
Clicking it puts the upload on hold. You can resume the upload by clicking the "Resume Upload" button.
You might use this feature if you're halfway through a very large, long upload and you need to disconnect your laptop and move to a different location with a different internet connection.
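The two behaviours above (per-chunk retry, and pause/resume) can be illustrated with a minimal Python sketch. It is not the Gigaloader's actual code: the `ChunkedUploader` class, the `send_chunk` callback and the retry count are all hypothetical names chosen for illustration, and the transport is left to the caller.

```python
from typing import Callable, Iterator, Tuple


def iter_chunks(data: bytes, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (index, chunk) pairs that cover `data` in order."""
    for index, start in enumerate(range(0, len(data), chunk_size)):
        yield index, data[start:start + chunk_size]


class ChunkedUploader:
    """Hypothetical uploader: sends chunks in sequence, retries a failed
    chunk, and can be paused and later resumed from the same position."""

    def __init__(self, send_chunk: Callable[[int, bytes], None],
                 chunk_size: int, max_retries: int = 3) -> None:
        self.send_chunk = send_chunk    # assumed transport callback
        self.chunk_size = chunk_size
        self.max_retries = max_retries
        self.next_index = 0             # first chunk not yet uploaded
        self.paused = False

    def pause(self) -> None:
        """Ask the uploader to stop before sending the next chunk."""
        self.paused = True

    def upload(self, data: bytes) -> bool:
        """Upload the remaining chunks. Returns True when finished and
        False if the upload was paused; call again to resume."""
        self.paused = False
        for index, chunk in iter_chunks(data, self.chunk_size):
            if index < self.next_index:
                continue                # already uploaded before a pause
            if self.paused:
                return False
            self._send_with_retry(index, chunk)
            self.next_index = index + 1
        return True

    def _send_with_retry(self, index: int, chunk: bytes) -> None:
        """Retry only the failing chunk, not the whole upload."""
        for attempt in range(1, self.max_retries + 1):
            try:
                self.send_chunk(index, chunk)
                return
            except OSError:
                if attempt == self.max_retries:
                    raise
```

Because the uploader only remembers the index of the next chunk, resuming after a pause (or after switching networks) does not need to resend anything that already arrived.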
Our API endpoints and libraries have been updated to support the underlying chunked uploading used by the Gigaloader; see API Libraries.
Depending on the nature of your application, you may need to consider changing the chunk size used by your uploader. We can switch the value in our back end (possible values are 5MB, 16MB, 32MB, 64MB, 126MB and 256MB), so if this applies, just drop us a line: email@example.com
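To get a feel for the trade-off, the short Python sketch below counts how many chunks (and therefore upload requests) a file needs at a given chunk size. The function name `chunk_count` is made up for this example, and it assumes that "MB" means 2**20 bytes.

```python
MIB = 1024 * 1024  # assumption: "MB" is read as 2**20 bytes


def chunk_count(file_size_bytes: int, chunk_size_mb: int) -> int:
    """Number of chunks needed to cover the file; the last may be shorter."""
    if file_size_bytes <= 0:
        return 0
    chunk_bytes = chunk_size_mb * MIB
    # Integer ceiling division avoids floating-point rounding.
    return (file_size_bytes + chunk_bytes - 1) // chunk_bytes


if __name__ == "__main__":
    one_gib = 1024 * MIB
    for size_mb in (5, 32, 256):
        print(f"{size_mb}MB chunks: {chunk_count(one_gib, size_mb)} requests")
```

Smaller chunks mean less data to resend when a chunk fails, at the cost of more requests; larger chunks reverse that trade-off.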
Chunked upload vs S3MultipartUpload
It’s important to make a distinction between a “chunked upload” and an S3MultipartUpload: a chunked upload refers to the general process of splitting a file into multiple parts and then uploading each part.
An S3MultipartUpload is an S3 object that is created by making an S3 API call. This object acts as a kind of container, into which file parts must be inserted in order to be able to join the parts at the end of the upload process.
Uploading is done to a separate S3 bucket specifically used for file uploads. This bucket doesn’t actually handle any S3MultipartUpload instances. This simplifies the process of uploading files because our uploader only has to upload the file in parts, without having to deal with the S3MultipartUpload instance at all. We initiate the S3MultipartUpload on our primary bucket.
The upload bucket is configured to publish every file creation event to a topic on SNS, Amazon’s Simple Notification Service.
Subscribed to this SNS topic is an AWS Lambda function. The Lambda function checks to see if there is an active S3MultipartUpload instance on our primary S3 bucket that matches the event data, and if so copies the uploaded part from the upload bucket to our primary bucket. This copy is done into the S3MultipartUpload “container”. If no matching S3MultipartUpload is found, we just copy the file (this will happen for files smaller than 32MB).
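The routing described above might look roughly like the following Python sketch. It is an illustration under stated assumptions, not our production Lambda: the bucket name, the `.partN` key-naming convention and the helper names are hypothetical. The `s3` argument is expected to behave like a boto3 S3 client; only `list_multipart_uploads`, `upload_part_copy` and `copy_object` are called on it.

```python
import json
from typing import Any, Dict, Iterator, Optional, Tuple
from urllib.parse import unquote_plus

PRIMARY_BUCKET = "primary-bucket"  # hypothetical bucket name


def s3_records(sns_event: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (bucket, key) for each S3 record wrapped in an SNS event."""
    for record in sns_event["Records"]:
        message = json.loads(record["Sns"]["Message"])
        for s3_record in message.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 event notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            yield bucket, key


def split_part_key(key: str) -> Tuple[str, Optional[int]]:
    """Assumed naming scheme: 'video.mp4.part3' -> ('video.mp4', 3)."""
    base, sep, suffix = key.rpartition(".part")
    if sep and suffix.isdigit():
        return base, int(suffix)
    return key, None


def find_upload_id(s3: Any, bucket: str, key: str) -> Optional[str]:
    """Return the id of an active multipart upload for `key`, if any.
    Pagination is ignored to keep the sketch short."""
    response = s3.list_multipart_uploads(Bucket=bucket, Prefix=key)
    for upload in response.get("Uploads", []):
        if upload["Key"] == key:
            return upload["UploadId"]
    return None


def route_event(sns_event: Dict[str, Any], s3: Any) -> None:
    """Copy each uploaded object into the matching multipart upload,
    or copy it as a plain object when no upload matches."""
    for src_bucket, src_key in s3_records(sns_event):
        source = {"Bucket": src_bucket, "Key": src_key}
        target_key, part_number = split_part_key(src_key)
        upload_id = None
        if part_number is not None:
            upload_id = find_upload_id(s3, PRIMARY_BUCKET, target_key)
        if upload_id is not None:
            s3.upload_part_copy(Bucket=PRIMARY_BUCKET, Key=target_key,
                                UploadId=upload_id, PartNumber=part_number,
                                CopySource=source)
        else:
            s3.copy_object(Bucket=PRIMARY_BUCKET, Key=src_key,
                           CopySource=source)
```

In a real Lambda function, the handler would simply call `route_event(event, boto3.client("s3"))`. Passing the client in as an argument also makes the routing easy to exercise with a stub.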
Uploading the final file part triggers a process on our servers that checks whether the active S3MultipartUpload instance contains the expected number of parts; if it does, we finalize the S3MultipartUpload.
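A minimal Python sketch of that final check is shown below. The function name and its parameters are hypothetical, and `s3` is again assumed to behave like a boto3 S3 client (`list_parts` and `complete_multipart_upload`). Pagination of `list_parts` is left out, so the sketch only covers uploads with up to 1,000 parts.

```python
from typing import Any


def finalize_if_complete(s3: Any, bucket: str, key: str,
                         upload_id: str, expected_parts: int) -> bool:
    """Complete the multipart upload only when every expected part is present.
    Returns True if the upload was finalized, False otherwise."""
    response = s3.list_parts(Bucket=bucket, Key=key, UploadId=upload_id)
    parts = response.get("Parts", [])
    if len(parts) != expected_parts:
        return False  # still waiting for parts; leave the upload open

    # S3 expects the parts listed in ascending part-number order.
    ordered = sorted(parts, key=lambda part: part["PartNumber"])
    s3.complete_multipart_upload(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        MultipartUpload={
            "Parts": [{"PartNumber": part["PartNumber"], "ETag": part["ETag"]}
                      for part in ordered]
        },
    )
    return True
```

Leaving the upload open when the count does not match means a late or retried part can still be copied in before the next check.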
On successful completion of the upload, we continue with our normal encoding process.