Deploying MediaWiki via Chef

After deploying a few custom servers to support both Ruby and PHP applications, setting up a server to run MediaWiki seemed like a breeze.  And it is, sort of, so long as you know where to draw the line.

It seemed plausible to create a Chef cookbook to deploy a ready-to-run MediaWiki installation.  And I wasted a lot of time trying.  At first I was delighted that there was so much setup documentation, but as I dug into it, I realized that it was virtually all written for a barely-technical audience.  Important details were missing, and once you got outside the “recommended” web-based install directions, there were even factual errors.

So after trying a number of things, documented and undocumented, I finally conceded that the only really supported way to do an install is the web install method that MediaWiki provides.  It’s a shame, really, since all it does is set up the database and create the LocalSettings.php file.  But the docs are simply not sufficient to support automating anything further.

Here’s what works

You can use Chef to set up:

  • the basic server niceties, like a firewall and email support for crontab jobs
  • the basic LAMP stack with PHP, using the supported apache2, php, and mysql cookbooks
  • the MySQL user and database.  While the MediaWiki installer would create the production MySQL user and database for you, there’s no harm in letting Chef do it first, and the value is that Chef will maintain the resources.
  • the MediaWiki image and initial files.  I automated pulling the MediaWiki tar file and expanding it into the public www directory, with a guard to skip the step if the directory already exists.
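The last two bullets can be sketched as a Chef recipe.  This is a minimal sketch, not my production code: the version, URL, paths, database name, and credentials are illustrative placeholders, and the MySQL bootstrap uses plain execute resources rather than any particular database cookbook’s LWRPs.

```ruby
# Sketch: fetch/extract MediaWiki with an idempotency guard, then create
# the production database and user.  All names and secrets are placeholders.
mediawiki_version = '1.23.5'
docroot = '/var/www/mediawiki'
tarball = "#{Chef::Config[:file_cache_path]}/mediawiki-#{mediawiki_version}.tar.gz"

remote_file tarball do
  source "https://releases.wikimedia.org/mediawiki/1.23/mediawiki-#{mediawiki_version}.tar.gz"
  not_if { ::File.directory?(docroot) }   # skip once the site is expanded
end

bash 'extract_mediawiki' do
  code <<-EOS
    mkdir -p #{docroot}
    tar xzf #{tarball} --strip-components=1 -C #{docroot}
  EOS
  not_if { ::File.directory?(docroot) }
end

# Create the database and user ahead of the web installer; the installer
# simply reuses them when you point it at the same names.
execute 'create_mediawiki_db' do
  command <<-EOS
    mysql -uroot -p'#{node['mysql']['server_root_password']}' -e \
      "CREATE DATABASE IF NOT EXISTS wikidb;
       GRANT ALL ON wikidb.* TO 'wikiuser'@'localhost' IDENTIFIED BY 'changeme';"
  EOS
  not_if "mysql -uroot -p'#{node['mysql']['server_root_password']}' -e 'use wikidb'"
end
```

The `node['mysql']['server_root_password']` attribute assumes the opscode mysql cookbook; substitute however your own run list stores the root credential.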

But that’s all!  Efforts to build out the database using the MediaWiki scripts failed, and the LocalSettings.php file just seems to require manual editing whatever you do.  So after the Chef build you still have to run the web-based setup, then copy (and probably edit) the generated LocalSettings.php back to the document root of the site.

That’s not how we like to roll here… it’s not terrible, but it’s a step backward toward manual setup and “servers as pets,” and for now we’re stuck with it.

Update: it gets worse

After backing off from trying to automate any actual MediaWiki installation, I thought I was out of the woods.  Since I was planning to add Semantic MediaWiki and Semantic Forms, I thought it would be OK to install these packages using Composer as part of the server build.  Nope: that results in an immediate error when trying to run the MediaWiki web-based installer.

So now my “automated” install has to work like this:

  • Use Chef to converge a new server
  • Run the manual web-based MediaWiki installer
  • Copy the LocalSettings.php file back up to the new server
  • Manually add the extensions I want via Composer (plus any other required install steps, which frequently include editing the LocalSettings.php file)

It’s really not looking much like an automated solution anymore, but that seems like the best I can do.  After using tools like Chef and Capistrano for deployment, I am finding the process for installing MediaWiki to be rather unsatisfying.

 

What’s Wrong with Chef and Other Rants

My apologies: this is a placeholder for a number of things I have meant to write about using Chef to deploy our cloud servers.  The task took longer and was a lot more convoluted than imagined, even after allowing for Hofstadter’s Law, so the writing has all been pushed back.  But here’s the proposed list of topics worth writing (and reading) about:

Last first, since it is current.

New theme :-(

Today I upgraded a raft of things on this WordPress site, one of them my base theme.  And the site looked like crap-o-la.

Apparently, creating a child theme to avoid losing custom styling when upgrading the base theme… is virtually no help at all.

So it goes.  Spending the afternoon fiddling with CSS seemed like a terrible idea, so I just picked a new theme.

It seems like most theme authors assume you really don’t have much to say, because most of the page is filled with header graphics, etc.  Well, maybe this one (WP-Forge) errs on the side of boring simplicity.  So it goes.  It works and I like it.

Announcing encoding-inspector

First there was the encoding_sampler Gem

Last November, I created the encoding_sampler Gem (RubyGems, GitHub) to help find the desired encoding for a particular text file.  From the Gem description:

EncodingSampler helps solve the problem of what to do when the character encoding is unknown, for example when a user is uploading a file but has no idea of its encoding (or typically, even what “character encoding” means.) EncodingSampler extracts a concise set of samples from the selected file for display so the user can choose wisely.

If you deal with client data files, you know the problems… the client’s eyes glaze over when you ask about character encoding.  And the encoding problems may be sparse: if there’s one line with an “®” character 2 MB down in the file, you’re just not going to find it by inspection.  Encoding_sampler does the heavy lifting.
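The sparse-error problem is easy to reproduce with plain Ruby, no gem required.  A sketch (the sample string is made up) showing why a lone Windows-1252 byte poisons an entire UTF-8 read:

```ruby
# A single 0xAE byte ("®" in Windows-1252) is invalid on its own in UTF-8,
# so one stray character anywhere in a large file breaks UTF-8 decoding.
bytes = "price \xAE 2012".b                 # raw bytes, as read from disk

utf8 = bytes.dup.force_encoding('UTF-8')
utf8.valid_encoding?                        # => false

win = bytes.dup.force_encoding('Windows-1252')
win.encode('UTF-8')                         # => "price ® 2012"
```

Scale that up to a multi-megabyte upload and you can see why showing the user a few decoded samples beats asking them what encoding their export tool used.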

But… while getting the encoding right is critical, we have found that building an encoding detection tool into our general data import applications is overkill. It complicates the user interface and confuses our clients.

So… the encoding-inspector web service

My solution was to create a separate web application to determine a text file’s encoding:

encoding-inspector, at http://encoding-inspector.triresources.com

The inspiration comes from common tools like the W3C Validation Service and this online JSON parser.  Encoding-inspector is essentially a front-end to the encoding_sampler Gem.  Here’s a screen shot of a typical result:

Typical encoding-inspector output

Results provide simple visual samples so non-programmer/non-tech users can understand the options and choose the right encoding.  In this example, you can see that UTF-8 is the correct encoding.

encoding-inspector is built and maintained by Roll No Rocks and it’s hosted by TRI Resources International. It’s our hope that people find this resource useful, and that we can continue to provide it as a free service.

Fear and loathing with Chef and nginx

I always hate having to explain why I spent 4 hours on one line of code:

It’s not really even code; it’s just a configuration setting.  I ran into the problem using Chef, the opscode nginx cookbook, and the very-new Ubuntu 14.04 LTS.  I’m not sure whether this applies to other configurations, but debugging this sort of thing takes a long time with all the vagrant up-ing and all, so in case this saves someone else a few hours…

Symptoms

Whenever I’d “vagrant up” a new instance and then run a Capistrano deploy, everything looked OK, but the server would not serve web requests.

  • Web responses were 404s.  Logs showed the missing content was something like “/var/www/nginxdefault/index.html”, the default that’s created with a new install of nginx (though the Chef run had already deleted it).
  • Cap tasks to manage nginx were ineffective, and SSH’ing to the server and running the raw commands manually against nginx (“nginx -s reload”, etc.) didn’t solve anything either.
  • Web requests failed until I rebooted; then, magically, everything worked fine.

After poring over all the code in the whole stack, from the Chef run list through the Rails application code, it turns out the problem was simply…

The pid file location!

It turns out that the pid file setting in the config file (/etc/nginx/nginx.conf) and the startup file (/etc/init.d/nginx) was

but the pid file for the running process created from the initial install was actually at

Issuing nginx command-line commands didn’t work because they referred to the config file for the pid location.  The only simple way to get rid of the bad nginx processes was kill -9, or a reboot.

Simple solution

The right thing to do is to fix the Chef recipe so the new nginx processes match what’s in the recipe configuration, but with new versions of Ubuntu, the opscode guys must be rather busy.  I guess I should figure this out and send a pull request.  But the quick solution is shown on the second line of this post.

Since it’s not obvious how the initial startup pid file is set, it’s easier just to set the pid file location in the nginx config and startup files to match what it turns out to be.  Then you can control nginx properly all the way through the Capistrano deploy, and across reboots.
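Concretely, the workaround amounts to one directive; on my Ubuntu 14.04 build the path the packaged install actually used was /run/nginx.pid, so that’s what goes in the config (and the matching init script):

```nginx
# /etc/nginx/nginx.conf: make the declared pid path match the one the
# packaged install actually created (here, /run/nginx.pid on Ubuntu 14.04)
pid /run/nginx.pid;
```

If the nginx cookbook version you’re using exposes the pid location as an attribute, setting it there instead keeps Chef managing the config and init script together; check your cookbook’s attributes file rather than assuming the name.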

Yep, it’s brittle.  If the nginx package or the Chef recipe changes that, the same old problem is back.

Update…

I ran a clean build with Ubuntu 14.04 LTS, and installed the default nginx package manually.  The default config files do not specify the pid file location explicitly.  The install leaves a running instance of nginx with the pid file at /run/nginx.pid.

It turns out this is actually not a new issue at all.  It was reported as a bug in 09/2012 (“nginx.pid location changed for Ubuntu releases 11.04 and up”), and the report references an 08/2011 AskUbuntu question, “Why has /var/run been migrated to /run?”.  Apparently starting with Ubuntu 13, /var/run is no longer symlinked to /run, and this will start breaking.

The bottom line is that now, if you do not happen to pick the actual default pid file location (which is not what’s set by the recipe defaults), the pid file location changes out from under you after the service is running.  Then you have to manually kill the original nginx processes or just reboot.  Not what anyone wants.