Updated Search

I’ve been substantially updating the search functionality on my site. I’m still using MySQL’s built-in FULLTEXT indexing to perform searches, but I’ve made the results page look a lot more like (okay, almost exactly like) Google’s. The main differences are that I’m not paginating search results yet (every search is limited to 10 results) and that I’m showing a relevance percentage, with the first result arbitrarily set to 100% relevant.
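For the curious, here is a rough sketch of that percentage scaling in Python. It is not the code from my site: the function name is made up for illustration, and it assumes each result already comes with a raw, non-negative relevance score from the database. Every score is simply divided by the top score, which is why the first result always reads 100%.

```python
def relevance_percentages(scores):
    """Scale raw relevance scores so the highest-scoring result reads as 100%."""
    if not scores:
        return []
    top = max(scores)
    if top <= 0:
        # No positive score to scale against, so report 0% across the board.
        return [0 for _ in scores]
    return [round(100 * score / top) for score in scores]


# Hypothetical raw scores, sorted best-first.
print(relevance_percentages([2.5, 1.25, 0.5]))  # prints [100, 50, 20]
```

The percentages only say how results compare to each other within one search; a "100%" result for one query is not comparable to a "100%" result for another.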

To determine relevance, I’m relying on MySQL: when a fulltext MATCH(field) AGAINST('search string') expression is used in the SELECT part of a query, it returns the relevance number that MySQL computes for each row. (See MySQL Full-text Search in the online manual for detailed info on this.)
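As an illustration, a query along those lines could be assembled like this. This is a sketch in Python rather than my actual PHP, and the table name "entries" and the columns "title" and "body" are hypothetical; the column list has to match an existing FULLTEXT index. The search string goes in through %s placeholders (the style most Python MySQL drivers use), so it appears twice in the parameters: once for the SELECT and once for the WHERE.

```python
def build_search_query(table, columns, limit=10):
    """Return SQL that selects rows along with their FULLTEXT relevance score.

    `table` and `columns` are hypothetical names supplied by the caller.
    """
    # Identifiers cannot be bound as parameters, so check them before use.
    for name in [table, *columns]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    match = f"MATCH({', '.join(columns)}) AGAINST(%s)"
    return (
        f"SELECT *, {match} AS relevance "
        f"FROM {table} "
        f"WHERE {match} "
        f"ORDER BY relevance DESC "
        f"LIMIT {int(limit)}"
    )


sql = build_search_query("entries", ["title", "body"])
# One parameter per %s placeholder: the same search string, twice.
params = ("fulltext search", "fulltext search")
print(sql)
```

With a driver connection, this would be run as something like cursor.execute(sql, params), and the relevance column of each row is the raw number to show or scale.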

Further plans for searching that I haven’t implemented yet: using MySQL’s IN BOOLEAN MODE modifier to allow advanced things like phrase searches (with quotes), required word matching (using the plus sign), and subexpressions using parentheses. It’s pretty cool stuff. Oh, and I want to be smarter about presenting excerpts: Google tries to show you content excerpts with your search terms in them, and I want to be able to do the same. Currently I’m just showing the first 250 or so characters of the text with HTML stripped out of it.
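One possible approach to the smarter excerpts, sketched in Python: strip the HTML, find the earliest place any search term shows up, and cut a window of about 250 characters around it, falling back to the start of the text when nothing matches. The function name and the choice to start the window a quarter of its length before the match are my own assumptions, and the regex tag stripping is deliberately crude; a real HTML parser would be more robust.

```python
import html
import re


def make_excerpt(text, terms, length=250):
    """Return roughly `length` characters of plain text around the first matching term."""
    # Crude tag stripping, then collapse runs of whitespace.
    plain = html.unescape(re.sub(r"<[^>]+>", " ", text))
    plain = re.sub(r"\s+", " ", plain).strip()

    # Case-insensitive position of each term's first occurrence, if any.
    found = (re.search(re.escape(t), plain, re.IGNORECASE) for t in terms if t)
    hits = [m.start() for m in found if m]
    if not hits:
        return plain[:length]  # no term found: fall back to the start

    # Start a little before the earliest hit so the term has some context.
    start = max(0, min(hits) - length // 4)
    end = min(len(plain), start + length)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(plain) else ""
    return prefix + plain[start:end].strip() + suffix


sample = "<p>" + "Some intro text. " * 30 + "Then MySQL fulltext search shows up.</p>"
print(make_excerpt(sample, ["fulltext"]))
```

This cuts on character positions, so an excerpt can begin or end mid-word; snapping the window to word boundaries would be a natural refinement.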

And since I’m developing my whole Personal Publishing System in an open process, I’ll write up a detailed technical article soon on how to effectively use MySQL fulltext searching and show Google-like results. All real-world; the code will be cribbed right out of my search.php file.

Comments

3 responses to “Updated Search”

  1. Mike Boone

    I tried a search on PHP and got no results. I thought that was weird, but then I realized the MySQL default minimum word length in fulltext is 4 characters.

    You could change that by adding a line to your my.cnf file:

    [mysqld]
    ft_min_word_len=3

    http://www.mysql.com/doc/en/Fulltext_Fine-tuning.html

  2. Jon

    D’oh! Yeah, that’s one of the limitations to fulltext searching: it only indexes words greater than 3 characters in length. Forgot about that 🙂

    But thanks for the tip, I’ll implement that.

  3. smiffy

    Hi
    I’ve set up FTS to show relevance but don’t know how to show it as a percentage. I’ve looked at http://dev.mysql.com/doc/mysql/en/Fulltext_Search.html; either I’m stupid (most likely) or it’s not very obvious. Can you tell me how to do it, please?

    Yours desperately – SMIFFY 🙂