Un-%&$-ing MySQL character sets and collations across an entire server

Recently, requests to one of our data-backed web services started timing out.  It turned out the problem was that some of our data tables had been (re)created using the wrong character sets and collations.  And as everyone should know, and now I do:

Indexes are useless for joins unless the collations match

The carefully optimized queries were running un-indexed.  Ugh.  Once I’d spotted the problem, the task was “simple”: find and fix the incorrect character set and collation settings throughout the entire database server.  Here’s what worked.  Except for the RDS updates, all the work was done directly from the mysql client command line; for reference, this service included a couple hundred databases and a couple thousand tables.

First back up

First, I verified that our nightly backups were intact and complete.  While we did not have any problems with data loss, garbling, etc, through this process, YMMV.  (Note that if you back up using mysqldump with default options, it will save the bogus character sets and collations, so be careful if you have to restore.)

RDS settings

The target databases were running on an AWS RDS instance using MySQL 5.6.  Out of the box, the defaults for this version are latin1 and latin1_swedish_ci (yes, really!), so I created and applied an RDS Parameter Group with the following settings:

  • character_set_client utf8
  • character_set_connection utf8
  • character_set_database utf8
  • character_set_filesystem utf8
  • character_set_results utf8
  • character_set_server utf8
  • collation_connection utf8_general_ci
  • collation_server utf8_general_ci

Changing these parameters will not modify anything in any of your current databases, but it will set proper defaults for creating new database objects and hopefully keep things from getting messed up again.

Databases

Like the RDS settings, changing database defaults will only affect newly-created data objects.  But it’s worth setting proper defaults now to avoid future headaches.  I handled this in two steps: first I queried to see which databases needed updating, then I ran the updates.  Rinse and repeat until it’s all good.  Here’s the SQL I used to find errant databases:
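Something along these lines does the trick (assuming utf8 / utf8_general_ci as the target, as configured above):

    -- Databases whose default character set or collation is not what we want
    SELECT schema_name, default_character_set_name, default_collation_name
    FROM information_schema.SCHEMATA
    WHERE default_character_set_name <> 'utf8'
       OR default_collation_name <> 'utf8_general_ci';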

I got a ton of hits.  Instead of handling each one by hand with ALTER DATABASE `<database name>` CHARACTER SET utf8 COLLATE utf8_general_ci;, I wrote a little SQL to generate all the commands:
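A query along these lines emits one ALTER statement per errant database (again assuming the utf8 / utf8_general_ci target):

    -- Emit one ALTER DATABASE statement per mismatched database
    SELECT CONCAT('ALTER DATABASE `', schema_name,
                  '` CHARACTER SET utf8 COLLATE utf8_general_ci;') AS alter_stmt
    FROM information_schema.SCHEMATA
    WHERE default_character_set_name <> 'utf8'
       OR default_collation_name <> 'utf8_general_ci';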

I took the output from this query, cleaned it up with a text editor, and ran it from the command line.  I ran the database test query again, and got zero records.  Success!

Tables

Next, I needed to update the tables.  Finally a step that should affect real data, not just future additions.  Here’s how I found tables that needed the fix:
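A check like the following works, using the collation recorded for each table in information_schema (tables already at utf8_general_ci are assumed to be fine, and the system schemas are skipped as noted below):

    -- Tables whose collation is wrong, skipping the system schemas
    SELECT table_schema, table_name, table_collation
    FROM information_schema.TABLES
    WHERE table_collation <> 'utf8_general_ci'
      AND table_schema NOT IN ('mysql', 'sys', 'performance_schema');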

I omitted the mysql, sys, and performance_schema databases because the account I was using lacked permissions on them anyway; skipping them did not seem to cause any problems.  Again I got a boatload of results, so I wrote some SQL to generate the update SQL for each target table.
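A generator along these lines emits both statements (explained in the next paragraph) for every table flagged by the check above; the exact query isn’t critical as long as the output is runnable SQL:

    -- For each mismatched table, emit the CONVERT TO form and the plain COLLATE form
    SELECT CONCAT('ALTER TABLE `', table_schema, '`.`', table_name,
                  '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; ',
                  'ALTER TABLE `', table_schema, '`.`', table_name,
                  '` COLLATE utf8_general_ci;') AS alter_stmts
    FROM information_schema.TABLES
    WHERE table_collation <> 'utf8_general_ci'
      AND table_schema NOT IN ('mysql', 'sys', 'performance_schema');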

The first SQL “update” statement I tried was ALTER TABLE `databasename`.`tablename` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; but this didn’t change tables unless they included string/text columns.  I was able to convert the rest of the tables using simply ALTER TABLE `databasename`.`tablename` COLLATE utf8_general_ci;  The code example above generates the first version, then the shortened one, to cover all the tables.  I ran the resulting SQL, then ran the table check query again: another success.

Columns

It was not clear whether I needed to explicitly convert table columns after doing the table conversions.   So I checked:
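The column-level character sets and collations live in information_schema.COLUMNS, so a check along these lines covers it (non-string columns have a NULL collation and are ignored):

    -- String columns that still carry the wrong character set or collation
    SELECT table_schema, table_name, column_name,
           character_set_name, collation_name
    FROM information_schema.COLUMNS
    WHERE collation_name IS NOT NULL
      AND (character_set_name <> 'utf8' OR collation_name <> 'utf8_general_ci')
      AND table_schema NOT IN ('mysql', 'sys', 'performance_schema');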

And… I got zero rows.  So at least for our environment, altering the tables with CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci was sufficient to update the table columns as well.  The docs were sparse about the effects of these changes, so I recommend running the check anyway.  At this point the server-wide defaults, the defaults for all databases and tables, and the actual settings for every data object matched our preferred character set and collation.  As hoped, our application immediately perked up and the timeouts stopped.

 

Ubuntu 14.04… apt-get install “disk full” error (aaack!)

Recently we have been considering moving some MySQL database services from our cloud servers to Amazon RDS to simplify our management tasks, and we wanted to run atop on some Ubuntu 14.04 LTS machines to get an initial sizing for the RDS instances.

While trying to apt-get install the atop package, I was getting a “disk full” error, but df -h said there were several GB left.  It took a while to figure out the problem wasn’t disk space… I was out of inodes (roughly, the filesystem’s per-file slots, so running out means no new files can be created).  Running df -i showed 100% of the inodes were used up, which produces the same “disk full” error.  We monitor a lot of things, but inode usage was not one of them!
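The two checks side by side tell the story:

    df -h /    # plenty of free space...
    df -i /    # ...but the IUse% column shows 100% of inodes consumed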

While figuring this out, I did manage to brick one of our servers, making this the day that all the work building our infrastructure with Chef paid off!  Instead of panicking and having to work all night, I was able to build and configure a new machine in just a few minutes.  But I digress.

The problem

Here’s what was going on: most of these servers have been running a while, we apply all the Linux/Ubuntu security updates regularly, and those updates often include a new kernel.  What we didn’t realize was that the old kernel versions are never automatically deleted.  We ended up with a /usr/src directory that looked like this:
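Something like this, with a headers directory pair left behind for every kernel update (the version numbers here are illustrative, not the actual listing):

    $ ls /usr/src
    linux-headers-3.13.0-32
    linux-headers-3.13.0-32-generic
    linux-headers-3.13.0-35
    linux-headers-3.13.0-35-generic
    linux-headers-3.13.0-40
    linux-headers-3.13.0-40-generic
    ...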

And each version included a ton of individual inode-sucking files.  Finally some of the machines had gotten to the point where kernel updates were failing silently due to the inode shortage.

The solution

It turns out  apt  does not play well once something’s messed up, and as I said, subsequent fiddling bricked a machine before I figured out this solution.

We had to start by making some headroom using dpkg directly.  While you can shoot yourself in the foot by deleting the kernel version you are currently running (and possibly other recent ones), we had good success starting with the oldest versions and deleting only a couple, like this:
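(The version numbers below are illustrative; check uname -r first and leave the running kernel, plus a recent spare, alone.)

    # purge a couple of the oldest leftover kernel packages to free up inodes
    sudo dpkg --purge linux-headers-3.13.0-32-generic linux-headers-3.13.0-32
    sudo dpkg --purge linux-image-3.13.0-32-generic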

etc… This ran OK and freed up enough inodes that we could run apt-get without “disk full” errors.  From there, the autoremove option did all the work.
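That step is just the stock apt cleanup command:

    sudo apt-get autoremove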

That removed the rest of the unused versions all at once, though we still had to manually “ok” a configuration option while uninstalling some of the packages.

From what I read, Ubuntu 16.04 LTS will remove old kernel versions by default once they are no longer needed.  Until then, I’m writing this down in case it happens again!

basic_vsftpd cookbook for Chef

Recently I was rebuilding an old “virtual user” vsFTPd server, this time using Chef.  Of course I started by looking for a decent vsftpd cookbook.   There are several popular ones, for example:

  • The vsftpd “supermarket” cookbook might be fine for some users, but it’s not been updated since 2010, and it seems to lack a good way to override most of the default vsftpd.conf settings.
  • TheSerapher’s chef-vsftpd cookbook on github is popular but it’s opinionated with respect to defaults and seems to be aimed at setting up FTP for local users.

I was disappointed that these didn’t suit our needs, and a little bummed I wasted so much time reading the code to figure that out.  But by that time, I was so brushed up on vsftpd config that, against all advice, I started from scratch and created the basic_vsftpd cookbook.

My goal was to create a general purpose cookbook designed on three principles:

  • To be as simple as possible, only about installing vsftpd and nothing else
  • To let you create any possible vsftpd configuration
  • To set no defaults and make no assumptions about the intended use
  • OK, four.  Using code that’s easy to read and understand

In other words, the goal was to create a solid base recipe that is easy to use or extend via a wrapper cookbook.  The pleasing result was that after building/testing this cookbook, using it to deploy a real FTP server was a dream.  I’ve made this a public repo in hopes others will find it useful as well.

basic_vsftpd

“A basic and fully configurable cookbook for the vsftpd package.”  Available on GitHub and at the Chef Supermarket.

Recipes

  • default – Installs and configures the vsftpd package
  • chroot_list – Creates a chroot_list file for vsftpd
  • userlist – Creates a userlist file for vsftpd

Resources

  • user_conf – Creates a user configuration file in the vsftpd user_config_dir directory
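As a rough sketch of the wrapper-cookbook approach mentioned above (the attribute names below are placeholders, not the cookbook’s documented interface; check the README for the real ones):

    # my_ftp_server/recipes/default.rb -- hypothetical wrapper recipe
    node.default['basic_vsftpd']['conf']['anonymous_enable'] = 'NO'
    node.default['basic_vsftpd']['conf']['guest_enable']     = 'YES'

    include_recipe 'basic_vsftpd::default'
    include_recipe 'basic_vsftpd::userlist'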

Authenticating vsFTPd virtual users with pam_pwdfile.so

For years, the standard way to set up password authentication for the vsFTPd FTP server was to use PAM with the pam_userdb.so module.  It looks great on paper, but if you have tried it, you know that generating a Berkeley DB password file is a PITA, debugging is blind and brutal, and password file generation does not play well with automated deployments.  On top of that, it turns out that pam_userdb.so is (apparently) being phased out of the PAM package.

I stumbled across the pam_pwdfile.so module and it worked for us without all the confusing dead-ends we got with userdb. This module seems to be supported long-term, and uses an htpasswd-like password file. Here’s how to set it up, in four steps:

Installing pam_pwdfile.so

We’re using Ubuntu 14.04 at the moment, and you must install this module as a package:
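On Ubuntu/Debian the module lives in the libpam-pwdfile package:

    sudo apt-get install libpam-pwdfile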

or in a chef recipe, simply:
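    package 'libpam-pwdfile'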

Creating a PAM service

Create this file at /etc/pam.d/vsftpd:
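Something like the following does the job; the pwdfile path is whatever location you choose for the password file (here /etc/vsftpd/passwd is assumed), and the pam_permit.so account line is the simplest thing that works for virtual users:

    auth    required pam_pwdfile.so pwdfile=/etc/vsftpd/passwd debug
    account required pam_permit.so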

This creates a “PAM service” named vsftpd.   The debug option dumps some extra info to /var/log/auth.log and is very helpful in getting things set up the first time.  The pwdfile= option denotes the filename of the user/pw database we’ll create next.

Configuring vsFTPd

To use this new service, just add the following option to /etc/vsftpd.conf:
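    pam_service_name=vsftpd

(The value has to match the name of the file you created under /etc/pam.d.)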

Creating the user/password file

This is the payoff.  There are a couple of ways to generate the password file.  From the command line you can use the Apache htpasswd utility, and there seem to be a number of other tools that generate these files as well.

But we’re deploying with Chef, and it would be great to automate this part of the deployment too; with this file format, we can.  The key here is knowing that you can create a properly-hashed password using  openssl passwd -1 mypa$$w0rd .  Here’s an example of how to create the whole pwdfile in a Chef recipe:
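Here’s a sketch under a few assumptions: the users and passwords are hypothetical (in practice they would come from an encrypted data bag or similar), and the file path has to match the pwdfile= setting in the PAM service file:

    # Hypothetical virtual users -- pull these from a data bag in real life.
    ftp_users = {
      'alice' => 'S3kr1t-passw0rd',
      'bob'   => 'An0ther-0ne'
    }

    # Build an htpasswd-style file: one "user:hash" line per user, with the
    # hash generated by `openssl passwd -1` (MD5 crypt).
    file '/etc/vsftpd/passwd' do
      content ftp_users.map { |user, pw|
        "#{user}:#{`openssl passwd -1 '#{pw}'`.chomp}"
      }.join("\n") + "\n"
      owner 'root'
      group 'root'
      mode  '0640'
    end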

That’s it.  Virtual users should now be able to log in using the passwords hashed in the passwd file.  (I’m assuming the rest of the vsFTPd configuration supports virtual users.  Getting that set up the way you want can be a can of worms, but it’s beyond the scope of this post.)

Troubleshooting

First off, don’t forget to restart the vsftpd service after all the changes… and make sure it actually starts!  A common issue is that certain config errors send vsftpd into a restart loop and the system kills it, so the startup messages look good but then the service dies.

In my experience, the most likely problem here will be with the vsftpd setup and not the authentication.  To effectively “stub out” the authentication, temporarily replace the /etc/pam.d/vsftpd file with this:
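This stub simply accepts everything:

    # WARNING: accepts any username/password -- troubleshooting only!
    auth    required pam_permit.so
    account required pam_permit.so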

This allows any user/pw to log in.  If you cannot log in now, your problem is with vsFTPd, your firewall, etc.  (Don’t forget that this leaves the FTP server wide open!)

For PAM problems, the debug option in the pam service file is helpful, as is just watching the FTP connection/login conversation.

Good luck.  I hope this saves you some of the 8+ hours we spent screwing with promising “solutions” that did not work!

BackupBuddy and Amazon S3

BackupBuddy, the popular WordPress plugin for backing up your WP site, has an option to back up to Amazon S3 storage.  S3 storage is a great solution because it’s cheap, easy to use, and can be secure if you set it up right.  Unfortunately I couldn’t find any decent instructions on how to set up S3 for BackupBuddy.  The few references I found only showed how to set up wide-open S3 access, with a disclaimer that it was not very secure.

We do a fair amount of work with S3 storage here so I worked out the details.  These are my recommendations for setting up your S3 access credentials on the Amazon AWS side. Note that this is based on WP version 4.2.2 and BackupBuddy 6.0.1.2.  These recommendations may not apply to other versions.

Your AWS account and IAM

First you have to have an AWS account, of course.  You can sign up here.  In case you don’t notice in all of the agreements and terms of service, someone gaining access to your root AWS account could not only mess with all your AWS resources, but could spend a lot of your money.  (That’s why Amazon recommends securing the login and actually deleting the access keys for your root account!)  What you need is to set up an “IAM” account, which is sort of a proxy for your root account, except that you can limit what the account can do.  Read about IAM here.

Bottom line is that in BackupBuddy’s “Remote Destinations” setup, you should never use your AWS root account access keys.  You really should create an IAM account with limited privileges and use the IAM account’s access keys to do your backups. 

Setting up minimum privileges for your IAM user

Amazon S3 objects (for example, your backups) are saved into “buckets.”  Access to each bucket may be controlled individually, and that’s what I recommend:  each site you’re backing up should have its own S3 bucket, and should have privileges ONLY for that bucket.  I also recommend you create your S3 buckets manually using the AWS console so you don’t have to give your IAM user the ability to create buckets.  Then the minimum s3 privileges your IAM user needs are:

  • s3:ListBucket (for your bucket)
  • s3:PutObject (for items in your bucket)
  • s3:GetBucketLocation (for your bucket)

I don’t think you really need s3:GetBucketLocation to run BackupBuddy backups, but if you don’t include it, you’ll get an error when you click the “test” button on BackupBuddy’s Remote Destinations page, and the test button is your friend since it can save you tons of time troubleshooting.  (Note: the test button works by trying to create a small test file in your S3 bucket, but it won’t be able to delete it because of the tight rules I’m recommending.  Don’t worry – that’s not a problem.)

Setting up a limited-access policy can be tricky.  In the IAM Management Console, create your IAM user, then click “Create User Policy” and enter this JSON policy code (using your own bucket name, of course):
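A policy along these lines grants exactly the three permissions listed above:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
          "Resource": "arn:aws:s3:::your-bucket-name"
        },
        {
          "Effect": "Allow",
          "Action": ["s3:PutObject"],
          "Resource": "arn:aws:s3:::your-bucket-name/*"
        }
      ]
    }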

Notice that the s3:ListBucket and s3:GetBucketLocation actions are allowed for the bucket, but the s3:PutObject action must be allowed for items inside the bucket, indicated by the trailing “/*”.  For security reasons, this policy prevents reading or deleting items once they are created in the bucket.

Deleting old backups automatically

Since the policy prevents deleting existing backups, BackupBuddy won’t be able to limit the number of backups by deleting old ones.  It’s best to avoid any settings that will make it try.

For deleting (or archiving) old backups, the Amazon S3 “Lifecycle Management” rules work well and are easy to set up.  For example you can set up a rule to delete backups older than 90 days, and AWS will take care of it for you.  Setting up a rule to archive some or all of your backups to “Glacier” storage can provide additional security.
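For instance, a rule equivalent to “delete backups older than 90 days” can be created in the S3 console, or (as a sketch, using the AWS CLI) applied like this:

    aws s3api put-bucket-lifecycle-configuration --bucket your-bucket-name \
      --lifecycle-configuration '{
        "Rules": [
          {
            "ID": "expire-old-backups",
            "Filter": { "Prefix": "" },
            "Status": "Enabled",
            "Expiration": { "Days": 90 }
          }
        ]
      }'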