CMS Architecture: Before you begin

It’s important to plan any large-scale software development. Fail to plan, plan to fail, as the old saying goes. So before a single line of code is written, before even any architecture or technology decisions have been made, I’m going to define what I want my CMS to be.

Firstly, and most importantly, the CMS needs to be extensible. In the same way that CRM (customer relationship management) systems became XRM (entity relationship) systems, this CMS needs to be an XMS. This will mean that the system can be used as much more than just a system for managing pages, and instead can be used to manage any kind of data.

Content is generally thought of as pages and files, but just below the surface the vast array of types of content is easy to find:

– Property websites have pages which represent houses
– Band websites have albums, which have songs
– E-commerce systems have products, which have alternative versions (like sizes) and related products
– Magazine sites have articles, which may appear several to a page or be longer pieces spread across multiple pages

Forcing users to manage and – more crucially – conceptualise their content as just “pages” will lead to confusion and badly maintained data.

WordPress handles these different types of data through Custom Post Types, and does it very well. All post types share the same basic properties, and can be extended by adding further metadata. Relationships can be defined between posts using the taxonomy system.

This capability is crucial to the power afforded by a CMS and needs to be baked in from the very beginning.
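
To make the idea concrete, here’s a rough sketch in Python of what “shared basic properties plus extensible metadata” could look like. It is not how WordPress implements Custom Post Types, and every name in it (ContentType, Entity, set_meta, the album and song fields) is hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ContentType:
    """A registered kind of content, e.g. 'album' or 'song' (hypothetical names)."""
    name: str
    extra_fields: tuple[str, ...] = ()


@dataclass
class Entity:
    """Every entity shares the same basic properties; the rest lives in metadata."""
    content_type: ContentType
    title: str
    slug: str
    metadata: dict[str, Any] = field(default_factory=dict)
    related: list["Entity"] = field(default_factory=list)

    def set_meta(self, key: str, value: Any) -> None:
        # Only accept fields the content type declares, so typos surface early.
        if key not in self.content_type.extra_fields:
            raise KeyError(f"{self.content_type.name!r} has no field {key!r}")
        self.metadata[key] = value


# Two illustrative content types: band websites have albums, which have songs.
album_type = ContentType("album", extra_fields=("release_year",))
song_type = ContentType("song", extra_fields=("duration_seconds",))

album = Entity(album_type, title="First Album", slug="first-album")
album.set_meta("release_year", 2002)

song = Entity(song_type, title="Opening Track", slug="opening-track")
song.set_meta("duration_seconds", 215)
album.related.append(song)
```

The point of the sketch is that adding a new kind of content means registering another ContentType, not changing the Entity class or the storage behind it.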

Secondly, the CMS needs to be easy for developers to work with (ease of use from a front-end user perspective goes without saying). There are three traits of well-designed systems that make them easy for developers to use.

1) Consistent

If the system is not consistent in its approach every line of code has the potential to trip the developer up. Consistency is not just about ensuring class and method names follow a pattern, but about the ethos behind the architecture of the system. If the user has to instantiate a helper class in module X, but in module Y there is a static manager class then consistency is broken.

Also, having two – or even more – ways to achieve the same thing without clear direction on which one to choose puts doubt in the developer’s mind.

2) Discoverable

Following on from the system being consistent is making it discoverable. It may not be possible to create a system where developers never need to check the documentation, but aiming to get as close to that as possible is a worthy goal.

This can be achieved by applying standard patterns that developers will recognise, and by carefully thinking through each architecture decision to ensure it is as intuitive as possible to use the system.

3) Sensible

If just one of these traits is followed, make it this one; make the system sensible. Have sensible defaults, sensible naming conventions, sensible use of existing patterns, sensible architecture decisions. The moment the system forces developers to do something non-intuitive, a little bit crazy, then a battle is lost.

Better to spend a little time now ensuring the most sensible decision is taken than spend a lot of time later reversing a crazy decision.

Thirdly the system must make as few assumptions as possible. Setting hard limits, such as the number of properties an entity may have, is a recipe for disaster – even if you think those limits are really high. Assuming that users will always need to be authenticated by a username and password will fail the moment an organisation needs users to enter their membership number as well.

Assuming that particular HTML output will always be used is also a sure-fire way to ensure the system will get outdated quickly. And assuming that your users will never want – or need – to use a different database system will also limit future development. Designing a system this flexible is hard work and requires a lot of thought, but when the alternative is painting yourself into a corner the choice is clear.
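
As an illustration of the authentication point, here’s a minimal sketch in Python of a login flow that doesn’t assume username and password are the whole story. The Authenticator protocol, both classes, the log_in function and the field names are all assumptions made up for this example, and passwords are compared in plain text only to keep it short.

```python
from typing import Mapping, Optional, Protocol


class Authenticator(Protocol):
    """Anything that can turn submitted credentials into a user id (or None)."""

    def authenticate(self, credentials: Mapping[str, str]) -> Optional[str]: ...


class PasswordAuthenticator:
    def __init__(self, users: Mapping[str, str]) -> None:
        # username -> password; a real system would store hashes, not plain text.
        self._users = users

    def authenticate(self, credentials: Mapping[str, str]) -> Optional[str]:
        username = credentials.get("username", "")
        if username in self._users and self._users[username] == credentials.get("password"):
            return username
        return None


class MembershipAuthenticator:
    """Wraps another authenticator and additionally requires a membership number."""

    def __init__(self, inner: Authenticator, memberships: Mapping[str, str]) -> None:
        self._inner = inner
        self._memberships = memberships  # user id -> membership number

    def authenticate(self, credentials: Mapping[str, str]) -> Optional[str]:
        user_id = self._inner.authenticate(credentials)
        if user_id is None:
            return None
        expected = self._memberships.get(user_id)
        # Reject users with no membership on file as well as wrong numbers.
        if expected is None or credentials.get("membership_number") != expected:
            return None
        return user_id


def log_in(authenticator: Authenticator, credentials: Mapping[str, str]) -> str:
    # The calling code depends only on the protocol, never on a concrete scheme.
    user_id = authenticator.authenticate(credentials)
    return f"welcome {user_id}" if user_id else "access denied"
```

Because log_in only knows about the protocol, the organisation that needs a membership number wraps the existing authenticator instead of asking for the core system to be rewritten. The same shape would apply to swapping the database or the output format.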

Many of these system attributes can only be properly verified as the system is being used. And there’s no better way to verify a system can be used than to actually build something with it. So it makes sense to build the entire front-end system, including a default theme, using the API. This has proved to be a successful approach for several companies, and ensures a level of internal testing beyond the usual unit and functional tests.

Bearing in mind these principles we’re ready to start making actual technical decisions. That will be the subject of the next post in this series.

CMS architecture: Part 1

I’ve been doing a lot of thinking about the architecture of content management systems (CMS) recently. Little wonder, that’s my full-time focus at the moment. By “architecture” I mean pretty much everything to do with the planning and development of a CMS. This blog post is the first in a series that explores some of the elements to think about if you’re going to create a CMS from scratch.

This is unashamedly going to be at an advanced level – I’m not talking about a simple system just to keep a few pages updated. I’ll try to keep as technology-agnostic as possible, but I will be coding at least part of this system to ensure what I say is technically feasible.

The areas I’m going to tackle, in no particular order, and almost certainly incomplete, are:

  • Data storage

    Any serious CMS needs a database, but is a relational database (MySQL, PostgreSQL, SQL Server) a better choice than a NoSQL database? What about extensibility, making complex queries possible for reporting purposes, performance, versioning? How about scalability and data security?

  • System security

    Unless you want everyone to be able to do everything you need to be able to secure aspects of the system. So you need user accounts with authentication mechanisms. Securing individual parts of the system (particular modules, or specific related data) needs to be possible, and what about SSL? There’s also the question of authenticating 3rd party systems, for example users of APIs.

  • Extensibility

    WordPress, which I love, has a fantastic API which enables developers to write plugins for almost every conceivable use. Plugins are cool, and the hooks and filters that power them are a must. But what about cutting a little deeper than that; allowing entire subsystems and modules to be swapped out? What about an API?

  • Output

    Obviously a CMS will have some form of HTML output. But how do you architect the system to hit the sweet spot between allowing front-end developers a large degree of control over the HTML and the system producing what it needs to run? How about themes and templates? Repurposing content is going to become increasingly important, so how do you handle microformats and data schemas? What about alternative outputs: PDF, XML, JSON etc? Then there’s the tiny matter of internationalisation.

  • Assets

    Assets are a big part of any CMS. Storing files securely is just one aspect of this, but how do you handle versioning and repurposing of assets (PDFs also available in Word and ODF, for example)? And with images getting more complex with high-DPI displays, how do you handle resizing imagery?

  • Performance and scalability

    Caching is key, but what do you do when you grow from 1 server to 10, to 100, to 1000?
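
For anyone who hasn’t met the hooks and filters mentioned under Extensibility, here’s a tiny sketch of the general idea in Python. It is loosely modelled on the concept, not on WordPress’s actual implementation, and the Hooks class, its method names and the hook names are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Any, Callable


class Hooks:
    """A tiny registry: filters transform a value, actions just get notified."""

    def __init__(self) -> None:
        self._callbacks: dict[str, list[tuple[int, Callable[..., Any]]]] = defaultdict(list)

    def add(self, name: str, callback: Callable[..., Any], priority: int = 10) -> None:
        self._callbacks[name].append((priority, callback))
        # Stable sort: equal priorities keep their registration order.
        self._callbacks[name].sort(key=lambda item: item[0])

    def apply_filters(self, name: str, value: Any) -> Any:
        # Each callback receives the previous callback's return value.
        for _, callback in self._callbacks.get(name, []):
            value = callback(value)
        return value

    def do_action(self, name: str, *args: Any) -> None:
        # Return values are ignored; actions exist for their side effects.
        for _, callback in self._callbacks.get(name, []):
            callback(*args)


hooks = Hooks()

# A "plugin" tidies the title, then another appends the site name.
hooks.add("page_title", str.strip)
hooks.add("page_title", lambda title: f"{title} | My Site", priority=20)

# Another "plugin" records every save without the core knowing about it.
audit_log: list[str] = []
hooks.add("page_saved", lambda slug: audit_log.append(f"saved {slug}"))

title = hooks.apply_filters("page_title", "  About us ")
hooks.do_action("page_saved", "about-us")
```

The core system only ever calls apply_filters and do_action at well-known points, and plugins decide what happens there. Swapping out entire subsystems, as asked above, needs something deeper than this, but the registry is the usual starting point.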

I don’t pretend to have all the answers to this stuff, it’s just an area that interests me and I want to explore. If I end up with an experimental CMS at the end of this that handles a few of these thorny issues then I’ll be a slightly better developer than I am now. And even if I don’t I’ll still have done some serious thinking about these issues.

Blog highlights

I enjoyed myself with this trip through my blogging history, but I guess something you’d like to see is some highlights of what I’ve written about. Here’s the greatest hits of stillbreathing.co.uk (in my opinion, of course).

*Sniff*. Good times.

Protecting your bits

My car is poorly. Yesterday there was a “big metallic bang”, according to my wife, and then it started “clanking”. Gotta love these technical people! The guys at Kwik Fit soon diagnosed the problem: the front passenger side coil spring had snapped. Great, more expense. And all due to the state of the roads. Thanks, local council.

But one thing the Kwik Fit bloke said interested me. Looking at the broken spring he commented how it was good the car manufacturer had started putting a plate at the bottom of the spring, as in days gone past springs would snap and go through the tyre. That would, obviously, have caused a serious accident. But the spring fortunately broke in a safe way, and I’ve got a reasonably drivable car.

When writing software we have to do the same thing. We code for the possibility that bits will break, to protect other bits and the application as a whole. There are a number of ways we do this, here’s a quick list of the ones I can think of.

  • Ensure that if we’re going to use a variable, it is set
  • Check the type of variables: if a variable must be an int then make sure it is an int
  • Checking whether we need to do an operation at all, for example not looping a collection if there’s nothing to loop
  • Checking a collection’s length before trying to get an item with a non-existent index
  • Catching exceptions
  • Providing meaningful error messages
  • Persisting form information so users can try again if their submission doesn’t work
  • Checking variables are within the required range, for example validating a birth date

And there are probably loads more, including ensuring that the UI looks and functions reasonably, even if the user doesn’t have the latest, greatest browser.
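
A few of the checks in that list can be sketched in a handful of lines. This is a minimal Python example under assumptions of my own: the function names, the form field names, the YYYY-MM-DD format and the 130-year age limit are all made up for illustration.

```python
from datetime import date
from typing import Optional, Sequence


def first_item(items: Sequence[str], index: int = 0) -> Optional[str]:
    """Return items[index], or None when that index does not exist."""
    if not 0 <= index < len(items):
        return None
    return items[index]


def parse_birth_date(raw: object, today: Optional[date] = None) -> date:
    """Check type, format and range, raising errors a user can act on."""
    today = today or date.today()
    if not isinstance(raw, str):
        raise TypeError("Birth date must be text in YYYY-MM-DD format.")
    try:
        parsed = date.fromisoformat(raw.strip())
    except ValueError:
        raise ValueError(f"'{raw}' is not a valid date; please use YYYY-MM-DD.") from None
    if parsed > today:
        raise ValueError("Birth date cannot be in the future.")
    if today.year - parsed.year > 130:  # arbitrary upper bound, an assumption
        raise ValueError("Birth date is too far in the past.")
    return parsed


def handle_form(form: dict) -> dict:
    """Return the outcome plus the submitted values so the form can be re-shown."""
    try:
        born = parse_birth_date(form.get("birth_date"))
    except (TypeError, ValueError) as error:
        # Persist what the user typed so they can correct it and try again.
        return {"ok": False, "error": str(error), "values": dict(form)}
    return {"ok": True, "birth_date": born, "values": dict(form)}
```

A missing or malformed birth date comes back as a failed result with a readable message and the user’s original input intact, instead of an unhandled exception. That is the software equivalent of the plate under the spring.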

With all of these things we’re aiming to ensure that if something breaks – and it will, let me assure you – it doesn’t cause an accident. Car manufacturers have figured this out, and rightly so as they have a responsibility towards the safety of road users. I don’t want to think of how many tyres were blown before they added those safety plates.

Here’s an old, but true, saying: an ounce of prevention is better than a pound of cure.

So it appears I’ve been doing this blogging thing a while

Time’s a funny old thing. I was looking back over my blog and was shocked how much of it there is: 131 pages at the present time, stretching back to 3:58 pm on October 21, 2004. Actually I had been blogging since 2002 (11th July 2002, to be exact) but that particular blog was an emotional release for me at the time, so will remain incognito forever. But as if proof were needed here’s the very first thing I blogged:

Breath deeply and relax (11/07/2002)

Well now we’ve got that little lot sorted out, maybe I can start to write something.

As if I have anything to write.

(There’s a bit more to the post than that, but those were my very first words sent into the blogosphere). I think “that little lot” was a reference to the blogging system I’d just written.

Ah, you need some historical context. OK, bear with me while I wander down memory lane. Back in mid-2002 my very good friend Dave said he’d heard about this blogging lark and could we offer blogs to the good parishioners at the community site we ran (and still run, in a fashion). I said of course, and proceeded to write a system to do it. Why I didn’t look around at what other systems (ahem) were available at the time I have no idea.

Actually it was, I recall, an opportunity for me to get into some more serious PHP which I’d been dabbling in for a couple of years before that. My day job was building sites in classic ASP using VBScript, so perhaps PHP was a way to escape that world. At any rate, I embarked upon this quest with my usual gusto – and devotion to inventing new wheels. The fact I built a multi-user blogging system *without a database* shows I didn’t really know what I was doing. It’s true, the original Wiblog site was a flat-file system which used XML files to store blog posts. Hardly cutting edge, even for the time.

It’s pretty interesting to note this all happened about a year before WordPress came into being as a fork of b2, probably around the time Donncha was working on the b2++ project which became WordPress MU. Sorry the links for that stuff are a little squiffy, Donncha’s personal site seems to be down tonight.

So in the summer of 2002 you’ll have found me either spending time with the beautiful Katharine (who is now my wife, we just celebrated our 7th wedding anniversary) or coding the Wiblog system far too late into the night. I “officially” started blogging about tech stuff on October 21st, 2004 with a post about Firefox – which, according to Wikipedia, was just before version 1.0. That makes it all the more amazing that I wrote:

On the auspicious occasion of my company disabling web access for Internet Destroyer Explorer and instead promoting the use of Mozilla Firefox I thought I’d finally start the techy blog I’ve been threatening people for ages about.

Reading stuff like that makes me wonder if my dates are correct. I even checked the archived Wiblog site, and the date is correct. Sheesh, I must be old.

I moved the site over to WordPress, importing my geek blog posts from Wiblog.com, on 23rd January 2007. That means I’d have installed (again, according to Wikipedia) version 2.0 or possibly 2.1 of WordPress. A couple of years later, on 8th November 2008, I moved all the other Wiblogs onto WordPress 2.6, onto the site we’re still using today.

So, 653 posts later, I’m still here. Blogging has been, in roughly equal parts, self-therapy and self-flagellation. Long may it continue.