Chuggnutt!

Tag: Programming

Amazon Links

Astute readers will notice that I now have Amazon-related links (books, actually) on some entries (spun out of my Amazon’s Web Services post). Hopefully they’re not too intrusive; I have them limited to a maximum of three results right now, and they’ll only show up on blog entries that I specifically keyword.

All done with Amazon’s web services. It’s not completely automatic, since I have to keyword the entry, but it beats looking up items by hand. Using the web service interface is extremely easy: simply build a URL, send the request to Amazon, and you’ll get XML results back. I’m using the excellent Snoopy PHP class for the communication piece, and PHP’s built-in XML parsing (using expat) to extract the information I want from the XML.

Some tips, after trial-and-error: Use a “Power” search in the Amazon request, especially if you have multiple keyword sets. An example might look like:

Power=keywords:(web services) or (xml) or (http programming)

The regular “Keyword” search turns useless after four or five words, it seems, and the “TextStream” search returned totally random results.
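As a rough illustration of the “Power” search tip, here is a minimal sketch of assembling such a search string from several keyword sets. It is written in Python rather than the PHP this site runs on, and the function name `build_power_query` is made up for the example; only the `keywords:(...) or (...)` shape comes from the request shown above.

```python
from urllib.parse import quote_plus


def build_power_query(keyword_sets):
    """Join keyword sets into one power-search string (hypothetical helper)."""
    # Wrap each keyword set in parentheses and join the groups with "or".
    groups = " or ".join(f"({keywords})" for keywords in keyword_sets)
    return f"keywords:{groups}"


query = build_power_query(["web services", "xml", "http programming"])
print(query)  # keywords:(web services) or (xml) or (http programming)

# URL-encode the string before dropping it into a request URL.
encoded = quote_plus(query)
print(encoded)
```

The rest of the request URL is deliberately left out, since it depends on how the request to Amazon is set up.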

I played around with having the results sorted by rating (“reviewrank”), but dropped this because I was finding that older editions of the same book (hardcover vs. paperback, for example) might have a higher rating, but not actually be available. By dropping the sorting entirely, Amazon returns surprisingly relevant results.

The results can include images, all hosted on Amazon’s servers. Use them! They come in three sizes.

And finally, pick your keywords carefully. Or you’ll get some weird, totally unrelated items.

February 17, 2005
PHP Suggest

Over on php.net, they announced a full implementation of a search field suggestion box:

The function list suggestions we started to test a year ago seemed to be working better as some bugs were found and fixed, so it was time to make the result available on all php.net pages.

Whenever you type something into the search field, while having the function list search option selected, you will get a list of suggested functions starting with the letters you typed in. You can browse the list with the up/down keys, and you will be able to autocomplete the function name with the spacebar.

Couple things I find interesting about this. First, it predates Google Suggest by a year (prior art that everyone heralding Google Suggest seemed not to notice); did Google get the idea from the PHP site, or is this more common?

The second point is a bit more trivial, but I noticed when I was trying it out by typing in “date” that there are two additional PHP date functions that appeared in the list: date_sunrise() and date_sunset(). These are new to PHP 5. They take a timestamp, latitude, and longitude and return the respective time of day for sunrise or sunset. What’s interesting is that they are remarkably similar to two functions I had written well before PHP 5 came out. (“Written” is subjective, more like “adapted,” probably from a Java function somewhere.) However, from the looks of the manual, these new built-in functions only take a Unix timestamp, which limits the results to dates between 1970 and 2038, while my functions take any combination of month, day and year. The point? I just like to toot my own horn sometimes. :)

December 28, 2004
PHP code rant

This is a mini-rant on PHP that can be safely skipped by non-geek types.

This post over on PHP Everywhere caught my attention, vis-a-vis programming semantics and practice. Basically, inside a switch statement, someone placed the default block before the case blocks and was surprised when that default condition executed, and the “expected” case did not.

Some are calling this a bug; I do not. This is the exact behavior I expect switch and default to display, and I always place any default blocks last in the statement, because that makes the most sense semantically and logically. I expect this because that’s how I learned it when learning C years ago; it’s the way the switch construct works and why it’s so fast.

Relevant snippage from the PHP manual:

The switch statement executes line by line (actually, statement by statement). In the beginning, no code is executed. Only when a case statement is found with a value that matches the value of the switch expression does PHP begin to execute the statements. PHP continues to execute the statements until the end of the switch block, or the first time it sees a break statement. If you don’t write a break statement at the end of a case’s statement list, PHP will go on executing the statements of the following case….

A special case is the default case. This case matches anything that wasn’t matched by the other cases, and should be the last case statement.

Seems pretty clear to me. I would expect PHP to immediately execute the default block as soon as it encounters it, even if this “cuts off” remaining case blocks below it. So quit complaining and write cleaner code.

Okay, done ranting.

October 8, 2004
COBOL

From Tim Bray tonight comes this amazing fact:

There are five billion new lines of COBOL getting created every year, and there are (wait for it) 220 billion lines of COBOL in production. (Holy cow, now that I think about it, I bet I wrote ten or twenty thousand of them).

September 22, 2004
overLIB

Pointer to a totally excellent JavaScript library for creating popups: overLIB. I’ve been using it the last few days to put together a dynamic drop-down menu for a Web project at work. And I’ve used it before to create popup context menus and tooltips. It’s simply one of the best JavaScript tools out there that I’ve come across—it’s clever, simple to use, and it just works, period.

April 8, 2004
Search Patch

While waiting to find out if my hosting provider will change the minimum fulltext word length for MySQL, here’s what I’ve done in the meantime to deal with viable three-character search terms.

First, I split the search string into its component words (an array). I subtract any stopwords (I’ve got a big list), and any remaining words that are under four characters long get added to the SQL query I’m running.
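The splitting step might look something like the following. This is a minimal sketch in Python (the site itself is PHP), not the actual code: the stopword list is a tiny stand-in, the function name `split_search_terms` is hypothetical, and recording each short word's position in the search string is an assumption made so that the position can feed a relevance score.

```python
# Tiny stand-in stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "for", "of", "the"}

# Words shorter than this are assumed to be skipped by the fulltext index.
MIN_FULLTEXT_LEN = 4


def split_search_terms(search, stopwords=STOPWORDS, min_len=MIN_FULLTEXT_LEN):
    """Split a search string into fulltext words and (position, short word) pairs."""
    fulltext_words = []
    short_words = []
    # Positions are 1-based and count every word, stopwords included.
    for position, word in enumerate(search.lower().split(), start=1):
        if word in stopwords:
            continue
        if len(word) < min_len:
            short_words.append((position, word))
        else:
            fulltext_words.append(word)
    return fulltext_words, short_words


print(split_search_terms("porter php"))  # (['porter'], [(2, 'php')])
```

The first list would go to the fulltext part of the query; each pair in the second list would get its own extra condition.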

Here’s the basic form of the query that I’m running, say searching for “porter”:
```
SELECT *,
MATCH(body) AGAINST('porter') AS relevance
FROM content
WHERE MATCH(body) AGAINST('porter')
AND [additional conditions]
ORDER BY relevance DESC
LIMIT 10
```
This uses fulltext indexing to search for “porter” with weighted relevance, and returns the appropriate content and its relevance score. Pretty straightforward, and it works really well.

Here’s what the modified query looks like, if there are short words present, for the search “porter php”:
```
SELECT *,
MATCH(body) AGAINST('porter') +
  (1 / INSTR(body, 'php') + 1 / 2)  -- 2 = position of 'php' in the search string
AS relevance
FROM content
WHERE ( MATCH(body) AGAINST('porter')
  OR body REGEXP '[^a-zA-Z]php[^a-zA-Z]'
  )
AND [additional conditions]
ORDER BY relevance DESC
LIMIT 10
```
Two new things are happening. First, in the WHERE clause, I’m using both the fulltext system to find “porter” and using a regular expression search for “php.” Why REGEXP and not LIKE? Because if I write LIKE '%cow%' for instance, I’ll not only get “cow” but also “coworker” and other wrong matches. A regular expression lets me filter those scenarios out.
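To see the difference between the two kinds of match, here is a small sketch that uses Python's `re` module as a stand-in for MySQL's REGEXP and a plain substring test as a stand-in for LIKE. The helper names are hypothetical and the sample sentences are made up; the pattern is the same shape as the one in the query above.

```python
import re


def like_match(body, word):
    # Rough equivalent of: body LIKE '%word%' (a plain substring test).
    return word.lower() in body.lower()


def regexp_match(body, word):
    # Rough equivalent of: body REGEXP '[^a-zA-Z]word[^a-zA-Z]'
    pattern = "[^a-zA-Z]" + re.escape(word) + "[^a-zA-Z]"
    return re.search(pattern, body, re.IGNORECASE) is not None


for text in ["I saw a cow today.", "My coworker likes stout.", "cow"]:
    print(repr(text), like_match(text, "cow"), regexp_match(text, "cow"))
```

The substring test accepts all three samples, while the pattern rejects “coworker.” It also rejects the bare “cow,” because the pattern requires a non-letter character on both sides, so a word sitting at the very start or very end of the text does not match. The same should hold for the SQL pattern.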

That takes care of finding the words, but I also wanted to tie them into relevance, somehow. The solution I hit upon in the above SQL is relatively simple, and does the trick well enough for my tastes. Basically, the sooner the word appears in the content, the higher its relevance: the score adds the inverse of how many characters “deep” into the content the word first appears. And I wanted to fudge the number a bit more by weighting the position of the keyword in the search string; the sooner the keyword appears, the higher the relative score it gets.
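A worked example of that arithmetic, as a minimal Python sketch. The function name `short_word_bonus` is hypothetical, the sample text is made up, and returning zero when the word is absent is a choice made for this sketch, not something the query above spells out.

```python
def short_word_bonus(body, word, position):
    """Extra relevance for a short word: 1 / (depth in body) + 1 / (position in search)."""
    # Mimic INSTR: 1-based index of the first occurrence, 0 if the word is absent.
    index = body.lower().find(word.lower()) + 1
    if index == 0:
        return 0.0  # sketch-only choice: no bonus when the word does not appear
    return 1 / index + 1 / position


# "php" starts at character 8 of the body and is the 2nd word of "porter php".
print(short_word_bonus("I love PHP and porter", "php", 2))  # 0.625
```

Here the bonus is 1/8 + 1/2 = 0.625, which gets added to the fulltext score for “porter.” A word that appears deeper in the content, or later in the search string, contributes less.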

It’s not perfect, and I definitely wouldn’t recommend using this method on a sufficiently large dataset, but for my short-term needs it works just fine. The only thing really missing in the relevance factoring is how many times the keyword appeared in the content, but I can live without that for now.

March 19, 2004
PHP Development Hint

Here’s a general hint for PHP development: A quick and easy way to check for syntax or compile errors without uploading the PHP script to the Web server and testing online through a browser is via the command line. It’s obvious, and I don’t know why I didn’t think of this sooner, but I’ve been doing more and more of it lately.

I develop primarily under Windows (with PHP installed) and upload to a Unix-variant server, and this is what I’ve been doing to run a PHP script on the command line on my Windows system:

php-cli -l filename.php

You could omit the -l option (it’s a syntax check option only) to parse and run the code, if you like. Either way, it’s an easy way to check your code without uploading it and potentially breaking your site.

March 16, 2004
Computer Languages History Timeline

From the Computer Languages History site comes an impressive computer languages timeline chart. It’s as much a language family tree as it is a timeline. Very nice, though a little hard to read.

March 11, 2004
Rasmus is the Man

… Rasmus Lerdorf, that is, the creator and godfather of PHP. He’s got an article on the Oracle Technology Network titled “Do You PHP?” that’s definitely worth a read. Here’s a sample:

What it all boils down to is that PHP was never meant to win any beauty contests. It wasn’t designed to introduce any new revolutionary programming paradigms. It was designed to solve a single problem: the Web problem. That problem can get quite ugly, and sometimes you need an ugly tool to solve your ugly problem. Although a pretty tool may, in fact, be able to solve the problem as well, chances are that an ugly PHP solution can be implemented much quicker and with many fewer resources. That generally sums up PHP’s stubborn function-over-form approach throughout the years….

Despite what the future may hold for PHP, one thing will remain constant. We will continue to fight the complexity to which so many people seem to be addicted. The most complex solution is rarely the right one. Our single-minded direct approach to solving the Web problem is what has set PHP apart from the start, and while other solutions around us seem to get bigger and more complex, we are striving to simplify and streamline PHP and its approach to solving the Web problem.

The guy just oozes common sense. Here’s another bit about PHP that he wrote on the PHP-DEV mailing list about two years ago, one of my favorites that just sums up beautifully the philosophy of PHP:

The golden rules of PHP are to keep the WTF(*) factor low and the POTFP(**) factor high.

(*) What The Fuck
(**) Piss Off The Fewest People

No two ways about it: he’s one of my heroes.

March 4, 2004
Formatting changes

I love templates. I was able to make some changes to the site formatting in mere minutes thanks to templates. Change two files, and it all propagates throughout the site. Lovely.

I use a modified version of the Template class from the PHP Base Library for just about any PHP programming project I work on anymore. I’ve looked into other, similar classes for PHP but haven’t really found anything that comes close to the PHP Base Library Template.

I’ve never gotten into using Smarty largely because from what I know of it, it doesn’t fit my needs—it’s overkill for a templating system. (Caveat emptor. I could very well be wrong here.) Here’s a hint: not everything you use a template for needs to be/should be/can be compiled into PHP, which is what Smarty does. I can use my hacked Template class to build any kind of files, like my RSS file—not just PHP and HTML. Plus it’s very easy to use and it’s not burdened down with all the additional template scripting code (yeah, code) that Smarty allows.

For my money, if you’re working with Smarty, you might as well just forgo it entirely and code in native PHP. But that’s just me.

February 7, 2004