Tag: Programming

  • On Blogs

    This is a bit about the blog software I wrote for this site. If you’re into the technical aspects of blogs, or PHP and MySQL, you’ll be interested in this. If not, you can safely skip it and not really miss out on anything.

    I’ve taken to calling my home-grown blog software blognutt (“blog + chuggnutt,” very clever, ha-ha), and I’ve noticed recently that at least one person found this out by viewing the HTML source of the site and searching for “blognutt” to see what they could find out. They didn’t find out much. Not for any reason of secrecy or anything like that; I just haven’t talked about it, no real mystery. It’s written in PHP 4 with MySQL on the backend, and that’s about it.

    I use the Template and DB classes from the immensely helpful PHP Base Library, though I’ve modified them extensively for my own purposes. The reasons I use phplib instead of another package like PEAR, for instance, are simple: I’ve been using phplib forever so I’m very quick and comfortable with it, it’s easier to use with a much lower-overhead code base, and I’ve already hacked the code to do things I want to do. PEAR is a fine project and I hope to contribute some classes (like my Stemmer class) there someday, but for coding purposes I haven’t used it much. Yet.

    Why did I write my own blog software, rather than using one of the many blogging tools already available? I looked at several PHP blog packages, and looked at what other systems like Movable Type offer, but it boils down to the same approach I take to a lot of programming projects: I wanted to hack it out myself, because that’s the best way to learn. So I did. (Plus, I wanted to have absolute control of the software. I’m anal that way.) I started with what I determined were the core elements of a weblog, and made it work.

    When you get down to it, a weblog is misleadingly simple: it’s a data retrieval and presentation system. Retrieve the top X most recent items and display them; offer the ability to browse and search past entries, and there you are. The trick is in the execution.
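
    As a rough illustration of the "top X most recent items" idea, here is a minimal sketch in Python, with an in-memory SQLite database standing in for MySQL. The table and column names (`entries`, `posted_at`, `is_draft`) and the sample rows are made up for the example; they are not blognutt's actual schema or data.

```python
import sqlite3


def recent_entries(conn, limit=10):
    """Return (title, posted_at) for the newest published entries, newest first."""
    rows = conn.execute(
        "SELECT title, posted_at FROM entries "
        "WHERE is_draft = 0 "
        "ORDER BY posted_at DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()


def demo_connection():
    """Build a throwaway in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (title TEXT, posted_at TEXT, is_draft INTEGER)"
    )
    conn.executemany(
        "INSERT INTO entries VALUES (?, ?, ?)",
        [
            ("First post", "2003-01-05 09:00:00", 0),
            ("Second post", "2003-02-10 18:30:00", 0),
            ("Third post", "2003-03-15 21:15:00", 0),
            ("Draft post", "2003-03-16 08:00:00", 1),  # draft, never shown
        ],
    )
    return conn


if __name__ == "__main__":
    for title, posted_at in recent_entries(demo_connection(), limit=2):
        print(posted_at, title)
```

    Storing timestamps as ISO-style strings keeps the sort order correct in SQLite; a real MySQL table would more likely use a DATETIME column.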

    MySQL was the logical choice for the data store. I wanted to be able to sort and group by date, search entries, and make changes to the data structure on the fly. For me, the additional overhead introduced by adding a relational database like MySQL is worth it for the benefits I get, and since I’m doing all the programming I can make it do things that might not be available to people using packages they can’t control.

    I can do anything any of the other blog software packages can do. (I think. I may be missing something somewhere.) Here’s a list of some common weblog features, and some commentary:

    • Entries: Full HTML, since I control the format and storage. Each entry is tied back to the user (just me so far), date- and time-stamped, and can be flagged as a draft (and therefore not displayed to the public).
    • Permalinks: I just re-implemented these in a bot-friendly and more human-intuition-friendly manner. Now they look like /2003/07/27/name.html rather than the (less friendly but just as workable/legal) /blog_entry.php?content_id=27 style of links.
    • Comments: I’ve got the code in place to handle comments, but I haven’t turned it on yet.
    • Archive: I’ve got archive links, sorted by year and month. I can control the sort and display of archive links by changing a single line of code. You can view the archive by year, month, day, or entry.
    • Search: Another advantage of using MySQL: its fulltext indexing capability, which allows you to do natural language queries against text and returns results by relevance.
    • Categories: Easy. I’ve been thinking about categorizing my entries, but it’ll be a pain in the ass to go back through 90+ entries.
    • Calendar: Just recently implemented the calendar, showing which days have entries.
    • Last X entries: I haven’t implemented this because it seems redundant as I keep the last 10 entries on the front page anyway. It’d be easy to do, though.
    • Blogroll: Fancy name for a list of links to other blog sites. I just put some up last night.
    • Syndication: Using RSS for aggregators. I’ve written the code to produce the XML files for this, which turned out to be extraordinarily easy, but I haven’t turned it on yet largely because I’m nervous about bandwidth issues.
    • Trackback: A way for bloggers to link to other bloggers’ entries such that the blog they’re linking to knows they’re being linked to. Clever. I don’t know if I’ll support it or not, since I’m the only one running blognutt software :-). Plus, I can already find out who’s linking to me from the server log files.
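
    The date-style links and the year/month/day archive views in the list above can share one URL scheme. A minimal sketch of parsing such paths follows, written in Python rather than the site's PHP. The function name and the rule for what counts as a valid entry name are assumptions, and the sketch does not check that the month and day are real calendar values.

```python
import re

# Optional pieces: /YYYY/, /YYYY/MM/, /YYYY/MM/DD/, /YYYY/MM/DD/name.html
PATH_RE = re.compile(
    r"^/(?P<year>\d{4})"
    r"(?:/(?P<month>\d{2})"
    r"(?:/(?P<day>\d{2})"
    r"(?:/(?P<name>[a-z0-9_-]+)\.html)?"
    r")?)?/?$"
)


def parse_path(path):
    """Map a date-style URL path to a dict of the parts that were present.

    Returns None when the path does not look like an archive or entry URL.
    """
    match = PATH_RE.match(path)
    if match is None:
        return None
    parts = {}
    for key, value in match.groupdict().items():
        if value is None:
            continue  # this level of the path was not supplied
        parts[key] = value if key == "name" else int(value)
    return parts


if __name__ == "__main__":
    for example in ("/2003/", "/2003/07/", "/2003/07/27/name.html"):
        print(example, parse_path(example))
```

    The number of keys in the result says which view to render: a year alone means a yearly archive, while a full set of four means a single entry.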

    There are more issues. One of the selling points of the bigger blogging systems is that you can update your blog from anywhere on the web, using XML-RPC. Well, the admin interface I wrote for my blog lets me update from “anywhere on the web” too: from any computer connected to the web, no special software required, just a browser. Nothing fancy. Seems to me that XML-RPC will require some tool or client utility to use, or some interface somewhere, and I guess I don’t see the appeal in this, except possibly to save you the time of opening your admin in another browser window. It’s entirely possible I’m missing something here. Having an XML-RPC API to a system is cool, I admit, but is it necessary? Maybe someone could enlighten me here.

    (Of course, I’m not developing software for use by a general audience, so my way of doing things may not be appropriate for a large user base, and XML-RPC might make perfect sense in that situation.)

    One thing I am interested in using XML-RPC for is pinging sites like Weblogs.com to notify them when this site is updated. It would be easy to do; just include a checkbox on the entry add/edit screen that lets me decide when any changes should be pinged, and have PHP send the XML-RPC packet when the form is submitted. In fact, that will probably be the next change I make to the system.
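
    A minimal sketch of such a ping, using Python's standard `xmlrpc.client` module instead of PHP. The `weblogUpdates.ping` method takes the site's name and URL; the endpoint address is left as a parameter because it should come from the ping service's own documentation, and the helper names here are hypothetical.

```python
import xmlrpc.client

PING_METHOD = "weblogUpdates.ping"


def build_ping_request(site_name, site_url):
    """Serialize the XML-RPC call as a string without sending it anywhere."""
    return xmlrpc.client.dumps((site_name, site_url), methodname=PING_METHOD)


def send_ping(endpoint, site_name, site_url):
    """Send the ping to `endpoint` and return whatever the server answers."""
    server = xmlrpc.client.ServerProxy(endpoint)
    return server.weblogUpdates.ping(site_name, site_url)


if __name__ == "__main__":
    # Placeholder values; printing the payload makes no network request.
    print(build_ping_request("Example Blog", "http://example.com/"))
```

    Wiring this to the checkbox would mean calling `send_ping` only when the box is ticked, and catching network errors so a failed ping never blocks saving the entry.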

    I have done something cool that I haven’t seen elsewhere (on blog/personal sites, at least): if a user comes to a “top-level” page from a search engine, searching for something specific, I helpfully list up to 3 entries that might be related to what they were searching for. For example, a user searches on Google for “dealing with a strong willed child” (this was an actual search on my site) and follows the results to my site. If they don’t get to a specific entry, and instead come to the home page, or to all the listings for 2002 for instance, then that’s too vague, so I search the database and show the top 3 results for what they might be looking for. Hopefully, this leads to the user exploring the site a little more than just hitting the home page, not finding what they searched for, and leaving.
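
    The first half of that trick is pulling the search phrase out of the referring URL. Here is a minimal sketch in Python, assuming the search engine passes the phrase in a `q` query-string parameter; the engine table and function name are illustrative, not taken from the site's code.

```python
from urllib.parse import parse_qs, urlparse

# Query-string parameter carrying the search phrase, per engine (assumed).
QUERY_PARAMS = {"google": "q", "msn": "q"}


def search_terms_from_referrer(referrer):
    """Return the phrase a visitor searched for, or None for other referrers."""
    if not referrer:
        return None
    parsed = urlparse(referrer)
    host_labels = (parsed.hostname or "").lower().split(".")
    for engine, param in QUERY_PARAMS.items():
        if engine in host_labels:
            values = parse_qs(parsed.query).get(param)
            if values and values[0].strip():
                return values[0].strip()
    return None


if __name__ == "__main__":
    url = "http://www.google.com/search?q=dealing+with+a+strong+willed+child"
    print(search_terms_from_referrer(url))
```

    The returned phrase would then be handed to the site's own search with a limit of 3, and only on pages that are not a single entry.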

    I’ve considered releasing my blognutt software as open source, but that raises an issue I’m not sure I want to tackle yet: support. The other issue is competition; there’s already a lot of weblogging software out there, some of it very good. Do I really want to play keeping up with the Joneses with everyone else, or should I just keep myself happy tinkering with my own system?

    All the same, I’ve been going through and cleaning up a lot of the code and modularizing it better in anticipation of a possible release, and there’s still more to finish. If you’re interested in chatting about blogs, or seeing my code, drop me a line and let’s chat.

  • Some Items.

    Item 1. My in-laws are in town, from today until Sunday. This always causes some tension around here, as my wife doesn’t get along very well with her parents, but the kids just love them, so all’s well. Plus, they’ll watch the kids one night so we can finally go see The Matrix Reloaded.

    Item 2. The free ebooks section of my site here is definitely generating traffic; I get more hits from search engines (mostly Google and MSN) to this page than anywhere else on the site. I’ve also noticed some hits coming in from Google to the page with my PHP porter stemming algorithm.

    Item 3. Never, ever watch the movie SwimFan.

    Item 4. I’m writing up a long, geeky rant on comic books to post here sometime soon. If you’re into comics, keep an eye out. If not, no worries.

    Item 5. I actually can’t think of an “Item 5,” so that’s all, folks.

  • Site Updates

    I’ve been busy on this site the last few days with updates and revisions. To wit:

    1. Got the search feature working (finally). This is using MySQL’s built-in FULLTEXT indexing capabilities; it’s pretty slick, the first time I’ve played with it. It does natural language searches using frequency of keywords to produce relevancy scores… if you understood that, drop me a line because you’re as big a geek as I am…
    2. I’ve been adding new ebooks for the Palm Reader. I finally buckled down and loaded the Dropbook software on the new machine at home, and hacked up a PHP command-line script to do almost all of the work in converting Project Gutenberg texts to PML format. Now I can crank out several ebooks a day.
    3. Added the “Word Stemmer” item to the list of projects on the right. It’s a PHP class I wrote myself, available for download; I’ll be putting more up there as I get them prettied up, code-wise.
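
    For the curious, the idea behind item 1 (scoring entries by how often the search words appear) can be shown with a toy example. This is a deliberately naive Python sketch of frequency-based ranking, not MySQL's actual FULLTEXT algorithm, whose weighting is more involved; the function names and sample texts are invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def score(query, document):
    """Count occurrences of the distinct query words, scaled by document length."""
    words = tokenize(document)
    if not words:
        return 0.0
    counts = Counter(words)
    hits = sum(counts[term] for term in set(tokenize(query)))
    return hits / len(words)


def search(query, documents, limit=10):
    """Return (title, score) pairs for matching documents, best match first."""
    scored = [(title, score(query, body)) for title, body in documents.items()]
    matches = [pair for pair in scored if pair[1] > 0]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)[:limit]


if __name__ == "__main__":
    sample = {
        "stemmer": "a porter stemmer class for php stemmer fans",
        "ebooks": "free ebooks for the palm reader",
    }
    print(search("php stemmer", sample))
```

    Dividing by document length keeps a long entry from winning just because it has more words in total.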

    There have been several other tweaks I’ve been doing behind the scenes, too. Nothing overt, but stuff to make things (hopefully) run more smoothly.