
6 Jan 2017

Secure and Scalable WordPress In the Cloud (Amazon S3 for content delivery and EC2 for authoring)

Several months ago I decided to move all of the stuff running on my server (a Droplet on DigitalOcean) to various cloud providers. My main motivation was that I no longer had time to manage my email server, which was made up of Postfix + Zarafa + MailScanner + SpamAssassin + ClamAV + Pyzor/Razor/DCC + Apache2 + MySQL. On top of that I was also dealing with monitoring + backups.
Anyway, moving the mail was easy as there are plenty of mature cloud solutions.

With my blog (which I had not posted to in a long time) I decided to try something a bit more interesting, so I moved it to Amazon S3 as a static website.
In order to achieve this I had to solve the following:

  • convert WordPress from dynamically generated pages to static ones. This was easy using the plugin "WP Static HTML Output", which does what it says
  • find a solution for the comments, since with a static page you can't accept comments. The solution was Disqus: I installed the plugin "Disqus Comment System", created a Disqus account and then used the plugin to import all of the comments stored in WordPress' database
  • find a solution for search. Again this was not hard: I moved to Google Search (plugin "WP Google Search")
  • once I had the above, I generated a static release, which was a .zip file.
  • I created an S3 bucket called aionica.computerlink.ro. The bucket must be named after your site/blog, and bucket names are unique across all of AWS S3, which means that if someone else already has a bucket with that name you're out of luck; the remaining option is to use CloudFront together with a differently named S3 bucket
  • created a DNS CNAME entry for aionica.computerlink.ro pointing at aionica.computerlink.ro.s3-website-us-east-1.amazonaws.com.
  • set up an S3 bucket policy allowing anyone to read any bucket content (a sketch of this step follows the list)
  • uploaded the contents of the .zip file to the S3 bucket root
  • With all of the above in place I could browse my blog, now hosted on S3.
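
For reference, here is a minimal sketch of that bucket setup done with the aws CLI; the index/error document names are my assumptions, not necessarily what the original setup used:

    # enable static-website hosting on the bucket (index/error documents assumed)
    aws s3 website s3://aionica.computerlink.ro/ \
        --index-document index.html --error-document 404.html
    # public-read bucket policy: anyone may GET any object
    aws s3api put-bucket-policy --bucket aionica.computerlink.ro --policy '{
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::aionica.computerlink.ro/*"
        }]
    }'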

The next challenge was to create an authoring system which is cost effective and which I can easily use to create and publish new content.
I decided to go with AWS EC2, so what I did was:

  • create an IAM policy allowing read + write access to the S3 bucket hosting aionica.computerlink.ro, and attach it to an EC2 instance role:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Stmt1483709545000",
                "Effect": "Allow",
                "Action": [
                    "s3:ListAllMyBuckets"
                ],
                "Resource": [
                    "*"
                ]
            },
            {
                "Sid": "Stmt1483709567000",
                "Effect": "Allow",
                "Action": [
                    "s3:AbortMultipartUpload",
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:ListBucketMultipartUploads",
                    "s3:ListMultipartUploadParts",
                    "s3:PutObject"
                ],
                "Resource": [
                    "arn:aws:s3:::aionica.computerlink.ro",
                    "arn:aws:s3:::aionica.computerlink.ro/*"
                ]
            }
        ]
    }
  • create an EC2 Security Group with the following rules (a CLI sketch follows this list):
    -- allow HTTP, HTTPS and SSH traffic only from my home's public IP address
    -- allow ICMP from anywhere
    -- allow all traffic from any interface having the same Security Group applied
  • spin up an Ubuntu 16.04 server on an EC2 instance of type t2.micro and ensure the instance gets a public IP address allocated at boot (so no Elastic IP permanently allocated). At creation time I also associated the above-mentioned IAM role and Security Group with the instance
  • install WordPress, Apache2 and MySQL and make them work as on the old server. The WWW root was set to /var/www/html/
  • edit /var/www/html/wp-config.php and near the top add:
    define('WP_HOME','http://1.2.3.4');
    define('WP_SITEURL','http://1.2.3.4');

    where 1.2.3.4 was the public IP address of the instance at that point in time.
    This is needed because otherwise WordPress would redirect you away to your original site URL (in my case aionica.computerlink.ro).

  • install the awscli package (apt-get install awscli), which provides the aws command used to sync content to S3. I first tried s3cmd but, after several hours of debugging, figured out that it can't correctly guess the MIME type of files: it was setting Content-Type: text/plain for CSS files and the site was rendering incorrectly.
  • create a script for publishing new releases and place it at /usr/local/bin/publish_blog_to_s3:
    #!/bin/sh
    # publish the latest static release generated by "WP Static HTML Output" to S3
    S3_BUCKET='aionica.computerlink.ro'
    BLOG_WWW_ROOT='/var/www/html'
    WORKDIR='/tmp/uncompress_archive'
    # pick the newest wp-static-html*.zip from this month's uploads directory
    ARCHIVE_FILE=$(find "${BLOG_WWW_ROOT}/wp-content/uploads/$(date +%Y/%m)" -type f -name "wp-static-html*.zip" -printf "%T+\t%p\n" | sort | tail -n1 | awk '{print $2}')
    # unpack into a clean working directory
    rm -rf "$WORKDIR"
    mkdir -p "$WORKDIR"
    unzip "$ARCHIVE_FILE" -d "$WORKDIR"
    # mirror the working directory to the bucket, deleting stale objects
    aws s3 sync "${WORKDIR}/" "s3://${S3_BUCKET}/" --delete
  • create a script which adds iptables rules containing the public and private IP of the EC2 instance. This is needed because the plugin which generates the static WordPress pages tries to connect back to the server using the server's public IP address. That causes problems: the public IP is provided by AWS using NAT (Network Address Translation), and AWS networking doesn't appear to provide NAT reflection (NAT loopback), so I needed an iptables rule which redirects traffic destined for the instance's public IP back to the instance itself (it never leaves the server).
    I also took the chance to have this script edit WP_HOME and WP_SITEURL in /var/www/html/wp-config.php, so that upon boot they are set to whatever public IP the instance has. The resulting script was placed in /usr/local/bin/configure_environment.sh and has this content:

    #!/bin/sh
    PUBLIC_IP=`wget -qO- http://169.254.169.254/latest/meta-data/public-ipv4`
    PRIVATE_IP=`wget -qO- http://169.254.169.254/latest/meta-data/local-ipv4`
    # put in NAT rule to deal with lack of AWS NAT reflection handling
    /sbin/iptables -t nat -I OUTPUT -d $PUBLIC_IP -j DNAT --to-destination $PRIVATE_IP
    # Adjust the WordPress base URL based on whatever public IP we have
    /bin/sed -i -e "s#'WP_HOME','http://.*'#'WP_HOME','http://${PUBLIC_IP}'#g" -e "s#'WP_SITEURL','http://.*'#'WP_SITEURL','http://${PUBLIC_IP}'#g" /var/www/html/wp-config.php

    And I've configured /etc/rc.local to call it upon boot (sketches follow below).
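
For completeness, here is a hedged sketch of the Security Group from earlier expressed as aws CLI calls; the group name and home IP are made-up placeholders:

    HOME_IP='198.51.100.7/32'    # placeholder for my home's public IP
    SG_ID=$(aws ec2 create-security-group --group-name blog-authoring \
        --description "WordPress authoring box" --query 'GroupId' --output text)
    # HTTP, HTTPS and SSH only from home
    for PORT in 80 443 22; do
        aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
            --protocol tcp --port "$PORT" --cidr "$HOME_IP"
    done
    # ICMP from anywhere
    aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
        --ip-permissions 'IpProtocol=icmp,FromPort=-1,ToPort=-1,IpRanges=[{CidrIp=0.0.0.0/0}]'
    # all traffic from interfaces carrying this same Security Group
    aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
        --ip-permissions "IpProtocol=-1,UserIdGroupPairs=[{GroupId=$SG_ID}]"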
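
And the boot-time hookup mentioned above: one way (an assumption on my part, matching the stock Ubuntu rc.local layout) to wire configure_environment.sh into /etc/rc.local:

    #!/bin/sh -e
    #
    # /etc/rc.local - executed at the end of each multiuser runlevel
    /usr/local/bin/configure_environment.sh
    exit 0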

Now when I want to author and publish new content the workflow is:

  • power on the VM using the EC2 console
  • adjust my IP address in the Security Group - optional, only needed if my home's public IP address has changed
  • check which public IP was allocated to the EC2 VM and point my browser at http://public-ip/wp-login.php (both of these steps can also be scripted, see the sketch after this list)
  • write content, publish, and then go to "Tools > WP Static HTML Output" and click Generate Static Site
  • log in to the server via SSH and run the script /usr/local/bin/publish_blog_to_s3 via sudo
  • shut down the EC2 VM in order to save costs
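
The power-on and check-IP steps look roughly like this from a shell; a small sketch assuming a locally configured aws CLI, with the instance ID being a made-up placeholder:

    INSTANCE_ID='i-0123456789abcdef0'    # placeholder, not the real instance
    aws ec2 start-instances --instance-ids "$INSTANCE_ID"
    aws ec2 wait instance-running --instance-ids "$INSTANCE_ID"
    # print whatever public IP was allocated this boot
    aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
        --query 'Reservations[0].Instances[0].PublicIpAddress' --output text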

The publishing part could be further automated: instead of running a script by hand, one could write a WordPress plugin which triggers a run of /usr/local/bin/publish_blog_to_s3, but given that I'm the sole user of this setup there is no point in doing so.

Advantages of the above setup:

  • low running costs - it all depends on the traffic and storage you use (see S3 pricing); for example, in US East 1 storage costs 2.3 cents per gigabyte per month
  • scalability - it's all static HTML so it's served very fast. Even so, at any time you can enable CloudFront and scale massively (also cheaply)
  • security - there is no active component on your side reachable by the general public. The only weak point could be the authoring system, but if you keep it locked down via the firewall then you're fine
  • low management costs

P.S. This blog post was produced and is delivered by exactly such a setup.