# The ultimate Weblogging system, outlined

All my humble opinion, of course.

  • Forward compatibility
    • License under the GPL (minimizing lock-in, architecture rot, and wasted development effort).
    • Work with at least one Free database (e.g. MySQL).
    • In case of emergencies, allow entries to be exported to XML.
    • Use entirely non-crufty URIs.
      • Give individual entries URIs (permalinks) of the form http://base/2003/05/02/oneMeaningfulWordFromTheTitle.
        • No irrelevant system-specific cruft (e.g. mt-static/, msgReader$, or weblog.cgi).
        • No ? characters, so all entries get indexed by search engines.
        • No irrelevant filetype-specific cruft (e.g. .html, .php, or .xml).
        • Every entry is on its own page, not just an internal anchor on a daily/weekly archive (which makes search engines and statistics tools less useful).
        • Net effect: Even with a stupidly worded inbound link (e.g. “I came across this”), a reader can tell a lot about an entry (host, date, and a hint at the subject) just by glancing at its URI.
      • Give daily archives URIs of the form http://base/2003/05/02/.
      • Give monthly archives URIs of the form http://base/2003/05/.
      • Give yearly archives URIs of the form http://base/2003/.
      • Give category archives URIs of the form http://base/name-of-category/2003/05/, etc.
      • Theory: URL as UI, Cool URIs don’t change.
      • Practice: Making clean URLs with Apache and PHP.
  • Metadata
    • Each entry has a title, a category string, contents, time posted (auto-generated), and one or more objects (e.g. images).
    • Invite (but do not require) the author to provide a summary for any item longer than n words, for use in mobile editions and RSS feeds.
    • Categories are faceted. I may categorize an entry by subject, by current location (integrating with GPS devices), by mood, and so on.
    • Each category facet can be hierarchical. (For example, an “interface design” subject category could be subdivided into “desktop application design”, “Web design”, “appliance design”, and “signage and artifact design”.)
    • Invite (but do not require) an author to subdivide a category whenever it collects more than n entries (rather than forcing them to be architecture astronauts specifying all their categories at the beginning).
    • An entry may have multiple values for each category facet. (For example, one post might be about both CSS specifications and buggy Web browsers.)
    • Why does all this need to scale so deeply? Because when you’ve been keeping a Weblog for twenty or thirty years, and you can’t remember any semi-unique words you used in a particular entry, finding it will be horribly difficult, and you’ll need all the semantic help you can get.
  • Syndication
    • Provide an RSS feed for the Weblog as a whole.
    • Provide an RSS feed for any category.
      • Because of the faceting, category feeds will need to be dynamically generated, but they should still send correct caching responses.
    • Automatically ping Weblogs.com.
    • Automatically convert Slashdotted entries to static pages, and switch back to dynamic generation once the traffic subsides.
    • Integrate support for Creative Commons licenses.
  • Management
  • Backward compatibility
    • Import entries from Blogger, Radio, Manila, Movable Type, etc.
    • Keep URIs the same for legacy entries, while still allowing control over their appearance.
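
To make the URI scheme in the outline concrete, here is a minimal sketch in Python. It is not taken from any existing Weblog system. The function names, the stop-word list, and the "pick the longest non-trivial word" rule are all assumptions made for illustration; a real system would presumably let the author choose the word.

```python
import re
from datetime import date

# Hypothetical stop words; only here so the sketch picks a "meaningful" word.
STOP_WORDS = {"a", "an", "the", "of", "and", "or", "to", "in", "on", "for"}


def slug_word(title: str) -> str:
    """Pick one word from the title (assumption: the longest non-stop word)."""
    words = re.findall(r"[A-Za-z0-9]+", title)
    candidates = [w for w in words if w.lower() not in STOP_WORDS] or words
    if not candidates:
        return "untitled"
    return max(candidates, key=len).lower()


def permalink(base: str, posted: date, title: str) -> str:
    """Build a permalink of the form http://base/2003/05/02/word."""
    return f"{base.rstrip('/')}/{posted:%Y/%m/%d}/{slug_word(title)}"


# One pattern covers yearly, monthly, daily, entry, and category URIs.
# No query strings, script names, or file extensions are accepted.
ARCHIVE_RE = re.compile(
    r"^/(?:(?P<category>[a-z][a-z0-9-]*)/)?"
    r"(?P<year>\d{4})/"
    r"(?:(?P<month>\d{2})/"
    r"(?:(?P<day>\d{2})/(?P<slug>[A-Za-z0-9-]+)?)?)?$"
)


def parse_path(path: str):
    """Return (kind, parts) for a clean URI path, or None if it is crufty."""
    m = ARCHIVE_RE.match(path)
    if m is None:
        return None
    parts = {k: v for k, v in m.groupdict().items() if v is not None}
    if "slug" in parts:
        kind = "entry"
    elif "day" in parts:
        kind = "daily"
    elif "month" in parts:
        kind = "monthly"
    else:
        kind = "yearly"
    return kind, parts
```

The sketch only checks the shape of the path; it does not reject impossible dates such as month 13, and it does not look anything up in a database.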

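The faceted, hierarchical categories in the Metadata section can be sketched the same way. This is one possible data model, not a description of any shipping system: the `Entry` class, the `matches` and `needs_subdividing` helpers, and the idea of storing each category as a root-to-leaf tuple are all assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entry:
    title: str
    # facet name -> set of category paths; each path is a root-to-leaf tuple
    facets: dict = field(default_factory=lambda: defaultdict(set))

    def tag(self, facet: str, *path: str) -> None:
        """Add one value for a facet; an entry may hold several per facet."""
        self.facets[facet].add(tuple(path))


def matches(entry: Entry, facet: str, *prefix: str) -> bool:
    """True if any of the entry's values for the facet lies at or below prefix."""
    return any(p[:len(prefix)] == prefix for p in entry.facets.get(facet, ()))


def needs_subdividing(entries, facet: str, path, n: int) -> bool:
    """True when more than n entries sit directly in this category,
    which is the point at which the author would be invited to subdivide it."""
    path = tuple(path)
    count = sum(1 for e in entries if path in e.facets.get(facet, ()))
    return count > n


# Usage: one post filed under two subjects and one mood.
post = Entry("CSS and buggy browsers")
post.tag("subject", "interface design", "Web design")
post.tag("subject", "buggy Web browsers")
post.tag("mood", "grumpy")
```

Storing whole paths means a query for “interface design” also finds entries filed under “Web design”, which is the behaviour a decades-old archive would need.
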
Update: I wrote up this outline mainly for the benefit of a couple of people who are implementing GPLed Weblog systems right now, but already it’s being commented on elsewhere. So, answers to some questions:

Why GPLed?

It needs to be Free Software, because it’s intended to be used to maintain a Weblog for a lifetime. Over several decades, the usability of any non-Free software approaches zero — unless you can still emulate the eventually-defunct application vendor’s last supported operating system in whichever future operating system you’re using, and even then you won’t have the platform integration expected of a native application. (That’s what I meant by “architecture rot” above.) And of the Free licenses, the GPL provides the largest pool of already-written code to draw on, minimizing duplicated effort.

LinuxSTEP? Huh? What about Windows, or the Linux UIs that people actually use?

I regard substantial non-infringing use of Free operating systems in the home, in the short term (to make involuntary DRM, proprietary media formats, etc. unviable), as even more important than having people use Free software to maintain their Weblogs in the long term. So providing an incentive to use a non-sucky Free operating system (a sweet Weblogging client on LinuxSTEP) matters more than making it easy for Windows (or Gnome or KDE) users to use my hypothetical Weblog software. (There would, of course, be nothing preventing other people from implementing clients on other platforms; I just wouldn’t regard that as a high priority.)

What about NucleusCMS?

That looks pretty good: it’s GPLed, it uses MySQL, it supports the metaWeblog API, and it even imports Blogger entries! But it needs a bit of work. Its URIs are undated and crufty by default, and its categorization scheme doesn’t look scalable enough to last for a lifetime.

Posted on 5/2/03; 2:42:33 PM