Moving off S3 with Drupal 7
hack

Dabitch

It seemed like a great idea to use S3, but after receiving a DMCA takedown request from them regarding a film I stored there, I knew that I needed to move. Ambulance-chasing lawyers who send out such requests strangle my choices.

Step one: download everything. This took several days (five, but who is counting), and since I had to start over a few times, I was a little concerned about it. But it was pretty straightforward: just use the AWS version of rsync, aws s3 sync.

First, install aws-cli on your Mac by opening up a terminal window and typing this:

curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip" 
unzip awscli-bundle.zip 
sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws

Now check that you have it installed by asking what version you have:

aws --version

So you're set. Now the best way to download everything is to simply do it in one go:

aws s3 sync s3://b0wie /Volumes/5terabytes/b0wie
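
Since aws s3 sync only copies what is missing or changed at the destination, re-running it after a failure carries on from what is already on disk. Here is a minimal sketch of a retry wrapper for a POSIX shell. The retry helper, the attempt count and the pause are my own choices for illustration, not part of the AWS CLI:

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. (Hypothetical helper.)
retry() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage (bucket and path from this post):
# retry 10 60 aws s3 sync s3://b0wie /Volumes/5terabytes/b0wie
```

This would have saved me from babysitting the terminal, though it will not help if the drive itself fills up or gets unplugged.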

Like I said, it took me days to do, and I downloaded it all onto an external drive that I named 5Terabytes. With all of this solved, I just needed to upload this to my new fancy server and make sure my Drupal 7 could find it.
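
Before trusting a multi-day download, it is worth comparing how many objects the bucket holds with how many files landed on the drive. This is a rough sketch: the two function names are made up for this post, and the bucket count is read from the "Total Objects" summary line of aws s3 ls. The two numbers may differ slightly, for example if macOS drops .DS_Store files on the volume or the bucket contains empty "folder" placeholder keys.

```shell
#!/bin/sh
# Number of objects in a bucket, read from the summary line that
# "aws s3 ls --recursive --summarize" prints ("Total Objects: N").
count_bucket_objects() {
  aws s3 ls "s3://$1" --recursive --summarize \
    | awk '/Total Objects:/ {print $3}'
}

# Number of regular files under a local directory.
count_local_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Usage:
# echo "bucket: $(count_bucket_objects b0wie)"
# echo "local:  $(count_local_files /Volumes/5terabytes/b0wie)"
```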

That's where it gets a tiny bit more complicated. It's simple enough to run a MySQL command that switches all of your s3:// to public://, like this:

UPDATE `file_managed`
SET uri = REPLACE(uri, 's3://', 'public://')
WHERE uri LIKE ('s3://%');

That is not the only place in your Drupal 7 database where your file storage settings live, so you should create your own database update module. I created one that I called "updatedb" and placed it in the /sites/all/modules folder.
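
One way to see where else the old storage settings are hiding is to dump the database and list which tables mention a given string. This is a rough sketch that assumes a mysqldump-format file in which each data line starts with INSERT INTO `table_name`. The refs_by_table name and the "drupal" database name are placeholders of mine:

```shell
#!/bin/sh
# refs_by_table DUMP_FILE TEXT: print "table count" for every table that has
# INSERT lines containing TEXT (matched literally, not as a regex).
# With mysqldump --skip-extended-insert each row is its own line, so the
# count is then a row count.
refs_by_table() {
  grep -F -- "$2" "$1" \
    | sed -n 's/^INSERT INTO `\([^`]*\)`.*/\1/p' \
    | sort | uniq -c | awk '{print $2, $1}'
}

# Usage:
# mysqldump --skip-extended-insert drupal > dump.sql
# refs_by_table dump.sql 's3://'       # stored file URIs
# refs_by_table dump.sql 'uri_scheme'  # field settings that carry a scheme
```

Searching for uri_scheme rather than s3:// matters for the second case, because field settings store the scheme name on its own, without the ://.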

The first file is updatedb.info. You could write whatever you like for the name and description, I suppose:

name = Update Database
description = Purpose is to run the hook_update on every release to implement one-time execution of code
core = 7.x
package = Custom

The next file, updatedb.install, contains the fun stuff:

<?php

/**
 * @file
 * Install file for the Update Database module.
 */

/**
 * Brute force update of field sources to point to public from S3.
 */
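function updatedb_update_7600(&$sandbox) {
  // Candidate fields: any field whose serialized config mentions "s3".
  $results = db_query("
    SELECT id, data
    FROM {field_config}
    WHERE data LIKE '%s3%'")
    ->fetchAll();
  foreach ($results as $result) {
    $data = unserialize($result->data);
    // The LIKE above is broad (it also matches field names containing "s3"),
    // so only touch fields whose storage scheme really is S3.
    if (!isset($data['settings']['uri_scheme']) || $data['settings']['uri_scheme'] !== 's3') {
      continue;
    }
    $data['settings']['uri_scheme'] = 'public';

    db_update('field_config')
      ->fields(array('data' => serialize($data)))
      ->condition('id', $result->id)
      ->execute();
  }
}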

You will also have to make an "updatedb.module" file, but you don't need to put anything real in it, so I just wrote:

<?php

/**
 * @file
 * Update Database module.
 */

/**
 * A module file must exist, therefore empty.
 */

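Now you can turn on this module in your admin pages, just like you would any other module, and update your database (update.php). All the usual warnings apply: this will make changes in your database, so make a backup first, all that jazz. One catch: Drupal 7 treats a freshly enabled module as already up to date, so if no pending update shows up, enable the module first and add the update function to updatedb.install afterwards.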
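
If you have Drush installed (an assumption on my part, the steps above only use the admin pages), the backup and the update can be run from the command line instead. This is a small sketch for a Drush version that supports Drupal 7, assuming the updatedb module is already enabled and its update is pending. The backup_and_update name and the DRUSH override variable are mine:

```shell
#!/bin/sh
# Command used to call Drush; override it if drush is not on your PATH.
DRUSH="${DRUSH:-drush}"

# Dump the database to the given file, then run pending update hooks.
# Stops before updating if the dump fails.
backup_and_update() {
  $DRUSH sql-dump > "$1" || return 1
  $DRUSH updatedb -y
}

# Usage, from the Drupal root:
# backup_and_update "$HOME/before-s3-move.sql"
```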

Once you've done all that, you can check that everything on your site is working. Turn off your S3 module, change your default file directory to files, and clear caches.
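
The file system settings and the cache clear can also be done with Drush, if you use it. This is a sketch under a few assumptions: a Drush version that supports Drupal 7, the standard Drupal 7 variables file_default_scheme and file_public_path, and a function name and DRUSH variable I made up. Turning off the S3 module is left out because the module's machine name depends on which one you installed.

```shell
#!/bin/sh
# Command used to call Drush; override it if drush is not on your PATH.
DRUSH="${DRUSH:-drush}"

# Make public:// the default scheme, point it at the given directory
# (relative to the Drupal root), then clear all caches.
switch_to_public_files() {
  $DRUSH variable-set --exact file_default_scheme public -y || return 1
  $DRUSH variable-set --exact file_public_path "$1" -y || return 1
  $DRUSH cache-clear all
}

# Usage, from the Drupal root (use whatever directory holds your files):
# switch_to_public_files sites/default/files
```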

Your images might complain. If they do, go to Structure > Content types > Manage fields. I had issues with my image field's "widget type", but just changing that from media browser to image (and saving), and then back again, sorted everything out. I had to do the same with a few more image fields that I had.

So that's how you leave S3 and return to local file hosting. Enjoy.


P.S. If you ever hard-coded links to your S3 files, like I did in certain paragraph fields, you'll need to fix those too.

Figure out which fields you need to change. If you're not using paragraphs, it will be the body field, stored in the field_data_body and field_revision_body tables (column body_value). You will do this:

update TABLE_NAME set FIELD_NAME =
replace(FIELD_NAME, 'Text to find', 'text to replace with');

In my case I had a paragraph field, so I changed that and its revision table.

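update `field_data_field_mg_text_html_content`
set `field_mg_text_html_content_value` =
replace(`field_mg_text_html_content_value`, '://b0wie.s3.amazonaws.com/', '://adland.tv/sites/default/files/');

update `field_revision_field_mg_text_html_content`
set `field_mg_text_html_content_value` =
replace(`field_mg_text_html_content_value`, '://b0wie.s3.amazonaws.com/', '://adland.tv/sites/default/files/');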

Pretty straightforward.