Category: Online

  • Google’s AutoLink

    Lots of invective and rhetoric is being written about Google’s new Toolbar functionality, AutoLink. Originally I probably wasn’t going to write anything about it; it’s really such a non-issue. But I’m growing irritated by the number of bloggers—mostly A-listers—who are speaking out against it. I’m not irritated as a knee-jerk reaction in defense of Google, but because most of what I’m reading is just plain wrong.

    Quick background: Google’s new Toolbar (which is in beta, only runs on Internet Explorer for Windows, and which you have to knowingly install to use) has a new function called “AutoLink” which, when manually invoked, searches for certain types of text on a web page and will automagically turn them into links, if there weren’t any links there already. The types of text it searches for seem to be:

    • Addresses. These will create links to Google Maps.
    • ISBN numbers. These will create links to the product-specific page on Amazon.
    • Shipping tracking numbers.
    • Vehicle ID numbers (VINs).
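    Mechanically, this kind of autolinking is simple. Here’s a rough Python sketch of the ISBN case; the regex and the Amazon URL form are my own illustration, not Google’s actual implementation:

```python
import re

# A very rough ISBN-10 pattern -- my own sketch, not Google's actual matcher.
ISBN_RE = re.compile(r"\b(\d{9}[\dXx])\b")

def autolink_isbns(html):
    """Wrap bare ISBN-10s in links to the matching Amazon product page.

    The real AutoLink also skips text that's already inside a link;
    this sketch doesn't bother.
    """
    def to_link(match):
        isbn = match.group(1)
        return '<a href="http://www.amazon.com/exec/obidos/ASIN/%s">%s</a>' % (isbn, isbn)
    return ISBN_RE.sub(to_link, html)

print(autolink_isbns("My book's ISBN is 0596007124."))
```

    That’s the whole trick: find a pattern, wrap it in a link. Hard to see the end of civilization in it.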

    Right off, I have to say I agree 100% with what Cory Doctorow wrote about this on Boing Boing:

    It’s not a service I’d use, but I believe that it’s the kind of service that is vital to the Web’s health. The ability of end-users to avail themselves of tools that decompose and reassemble web-pages to their tastes is an issue like inlining, framing, and linking: it’s a matter of letting users innovate at the edge.

    I think I should be able to use a proxy that reformats my browsing sessions for viewing on a mobile phone; I think I should be able to use a proxy that finds every ISBN and links it to a comparison-shopping-engine’s best price for that book across ten vendors. I think I should be able to use a proxy that auto-links every proper noun to the corresponding Wikipedia entry.

    And so on — it’s my screen, and I should be able to control it; companies like Google and individuals should be able to provide tools and services to let me control it.

    Of all the sites I read, I think this was the lone voice of reason on the topic. Instead, you have people like Robert Scoble and Dave Winer calling this “evil” and a “slippery slope” that will lead to the end of the web as we know it and mass censorship by Google.

    I’m not kidding. This is what Winer wrote:

    And if links are changeable, is text subject to change as well? Might Google correct our spelling? Or might they correct our thinking? Where is the line?…

    What’s next? Could they link it to Gmail, and wherever the name of a Gmail user appears in a page, change it to a mailto link so you can send them mail? If you’re in the widget business, might they change the links to your widgets to links to your competitors’ widgets? (Aren’t they already doing that to Barnes and Noble?) Would they add discussion software so that any Internet user can mark up your page with their comments, no matter how inane or immature?…

    The AutoLink feature is the first step down a treacherous slope, that could spell the end of the Web as a publishing environment with integrity, and an environment where commerce can take place.

    What’s funny is that email programs already autolink email addresses and web addresses—often wrong, I might add—in messages I get. And—get this—on any blog with comment functionality on it (like mine), users can already mark up that page with their comments.

    (A note on the Barnes and Noble reference, though—yes, AutoLink does link a plain ISBN on Barnes and Noble’s site to Amazon. I confirmed it myself. Personally, I find it rather amusing; I know B&N will successfully lobby to get this fixed, so I’m not worried about it.)

    And here’s some of what Scoble’s written:

    I believe that anything that changes the linking behavior of the Web is evil. Anything that changes my content is evil. Particularly anything that messes with the integrity of the link system. And I do see this as a slippery slope….

    The fundamental building block of the Web is linking. Linking is MY EDITORIAL CONTENT….

    My editorial is sacrosanct. Linking is editorial.

    Ironically, Scoble runs a linkblog where he reposts other authors’ blog entries, with his name highlighted, and adds a “Related” and “Comments” link to other people’s writing even as he writes the above.

    It’s even more ironic that people like these guys who are all about innovation and are outspoken user advocates would come off like this. I see a “slippery slope” all right, but it’s going the other way.

    How? Well, AutoLink is basically simplifying this process:

    1. Highlighting a piece of text on a web page (like an address).
    2. Opening a new browser window, going to Google (or MapQuest or Amazon, etc.).
    3. Pasting that copied text into the search box, and clicking the search button.
    4. Done.

    No one should object to doing this, right? Well, the way I’m reading many of these arguments, pretty soon they will. There’s the slippery slope: pretty soon the “content producers” are going to object because you might be using their text to search somewhere else on the web. So, let’s ban copying text from the browser. But wait, someone could just retype the text without copying-and-pasting. Better take away the users’ keyboards so they don’t infringe on your content.

    See? It’s a fun game.

    The arguments almost all object to a third-party tool changing the content of their web pages by adding links. Okay, but what about the many pre-existing toolbars, plugins, extensions, and browsers themselves that already do this? Hell, the ability to do this is even built into the browser—you can turn off images, JavaScript, and stylesheets, and I guarantee doing that will alter the content of many, many sites—I’ve developed sites myself that depend on JavaScript and/or images, so I’m not exaggerating. This is a ridiculous argument.

    In fact, the only good argument I’ve seen comes from Rogers Cadenhead: the copyright issue. By essentially altering a work (a web page, in this case) that is copyrighted for public consumption, the AutoLink feature may in fact be violating the copyright of that page. That’s a reasonable, intelligent argument and is something that should be addressed.

    Until then, jeez. C’mon people, like Cory said, it’s healthy for the web. It’s innovation. Instead of whining about it, why not be productive? I’ve seen suggestions for an opt-out feature on web pages; that’s a good start: make it a META tag.

    Or what about this? Make the toolbar smart enough not to change copyrighted pages, only those that are using an appropriate Creative Commons license or are in the public domain. How would it know? META tags, again; Creative Commons licenses already embed RDF inside the content, so it’s not a stretch.
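    A toolbar-side check could be as simple as this sketch in Python. The function name and the exact heuristics are mine, but CC-licensed pages really do embed a license link pointing at creativecommons.org:

```python
import re

# Markers Creative Commons pages typically embed: a rel="license" link,
# or a URL pointing at creativecommons.org. Rough heuristics of my own.
CC_MARKERS = [
    re.compile(r'rel=["\']license["\']', re.I),
    re.compile(r'creativecommons\.org/licenses', re.I),
]

def may_rewrite(html):
    """Only allow autolinking on pages that declare a CC license."""
    return any(marker.search(html) for marker in CC_MARKERS)

page = '<a rel="license" href="http://creativecommons.org/licenses/by/2.0/">CC BY</a>'
print(may_rewrite(page))                          # True
print(may_rewrite('<p>All rights reserved</p>'))  # False
```

    A real implementation would parse the embedded RDF properly, but the point stands: the information is already in the page.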

    In fact, this is a good incentive to do something I’ve been meaning to do for a while: convert my blogs over to Creative Commons copyrights. I personally have no qualms about toolbars or other software altering my content for a particular user’s display, so I’ll make it totally legal for them to do so. Within the week.

    In the meantime, everyone complaining—take a breath and get over yourselves.

  • Wikipedia’s unusual articles

    One of my new favorite Wikipedia pages is the Unusual articles list. You gotta love that. Where else could you learn about such things as Heribert Illig, a German historian crank who claims the Dark Ages didn’t exist and the years 614 to 911 AD were invented? Or that some guy legally changed his name to Optimus Prime, after the Transformers character? Or that the smallest park in the world is in Portland, Oregon?

  • Google in The Dalles

    I first spotted the news a few days ago on Metroblogging Portland: Google in The Dalles. Then my wife read about it online this morning, and now it’s on Slashdot. Sounds interesting, but it seems like kind of a random place to plunk down a data center (if that’s what they intend to build). Well, it’s better than Medford or Umatilla, I guess.

    I wonder if this means The Dalles will be the next technology nexus in Oregon?

    …yeah, right.

  • Amazon Links

    Astute readers will notice that I now have Amazon-related links (books, actually) on some entries (spun out of my Amazon’s Web Services post). Hopefully they’re not too intrusive; I have them limited to a max of three results right now, and they’ll only show up on blog entries that I specifically keyword.

    All done with Amazon’s web services. It’s not completely automatic, since I have to keyword the entry, but it beats looking up items by hand. Using the web service interface is extremely easy: simply build a URL and send the request to Amazon, and you’ll get XML results. I’m using the excellent Snoopy PHP class for the communication piece, and PHP’s built-in XML parsing (using expat) to extract the information I want from the XML.

    Some tips, after trial-and-error: Use a “Power” search in the Amazon request, especially if you have multiple keyword sets. An example might look like:

    Power=keywords:(web services) or (xml) or (http programming)

    The regular “Keyword” search seems to turn useless after four or five words, and the “TextStream” search returned totally random results.
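    My code is PHP, but the shape of the whole thing fits in a few lines in any language. Here’s a rough Python sketch of the two pieces: building the Power search URL, and pulling titles out of the result XML with expat. The endpoint and parameter names are from my memory of the circa-2005 “lite” interface, so treat them as illustrative, not gospel:

```python
import urllib.parse
import xml.parsers.expat

def build_power_url(dev_key, associate_id, power_query):
    """Build a request URL for Amazon's old 'lite' XML interface.

    Parameter names here are illustrative; check Amazon's own
    documentation for the exact ones.
    """
    params = {
        "t": associate_id,           # associate ID
        "dev-t": dev_key,            # developer token
        "PowerSearch": power_query,  # e.g. keywords:(web services) or (xml)
        "mode": "books",
        "type": "lite",
        "f": "xml",
    }
    return "http://xml.amazon.com/onca/xml3?" + urllib.parse.urlencode(params)

def product_names(xml_text):
    """Collect every <ProductName> from a result document with expat,
    much the same way PHP's expat-based parser works."""
    names = []
    buf = None  # accumulates character data while inside <ProductName>
    def start(tag, attrs):
        nonlocal buf
        if tag == "ProductName":
            buf = []
    def end(tag):
        nonlocal buf
        if tag == "ProductName" and buf is not None:
            names.append("".join(buf))
            buf = None
    def chars(data):
        if buf is not None:
            buf.append(data)
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return names
```

    Fetch the URL however you like (Snoopy, in my case), feed the response to the parser, and you’re done.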

    I played around with having the results sorted by rating (“reviewrank”), but dropped this because I was finding that older editions of the same book (hardcover vs. paperback, for example) might have a higher rating but not actually be available. By dropping the sorting entirely, Amazon returns surprisingly relevant results.

    The results can include images, all hosted on Amazon’s servers. Use them! They come in three sizes.

    And finally, pick your keywords carefully. Or you’ll get some weird, totally unrelated items.

  • Amazon’s Web Services

    I’ve been playing around with Amazon’s web services because in my quest to make money off my blogs (quixotic? I don’t know yet), I thought it would be interesting to implement book recommendations based on keywords pulled from individual blog entries.

    What got me thinking about this is that my Amazon associate links have already generated three orders from books I’ve linked to (two from The Brew Site and one from here), which kind of surprised me since I haven’t had the Amazon affiliation for very long. But I don’t really want to spend all my time writing about books just to generate clickthroughs—seems to go too far on the “shill” side of things—so I figured I’d go more the route of the Google AdSense ads: automatically generating results from content.

    The web services are pretty straightforward, though I have to wonder why the PDF documentation you can download is over 400 pages long. Holy crap! Instead, I did a quick read through the HTML version they have and picked up enough in a half hour to get started.

    So, you might start seeing Amazon recommendations appearing on the individual entry pages. It’ll be an experiment; if I don’t like how they work, I’ll pull them.

  • php|tropics

    A bit over a year ago I blogged about the PHP Cruise. Well, this year there’s another PHP conference organized by the folks at php|architect, though it’s not a cruise this time: php|tropics!

    It’s in Cancun, Mexico, from May 11 through 15. Now, if I only had a few grand lying around and could convince work that it’s a business trip…

  • Much Ado About nofollow

    Watching the various debates about Google’s nofollow initiative has been enlightening. Ostensibly, it was supposed to be a way to fight comment spam on weblogs, but predictably it took no time at all for people to figure out how to game the system. Also predictably, anti-nofollow support launched equally quickly.

    I won’t use it. At all. Why? Mostly because it’s such a non-issue (it won’t do a thing to comment spam), but a large part of the reasoning is that I won’t be held hostage to what I can write and link to by any one search engine or technology. Nor am I going to let the ranking algorithm of one search engine make me do its work for it, especially if PageRank is broken like some people believe.

    It’s a misnamed attribute, actually. Google says links with it “won’t get any credit when we rank websites in our search results,” but the “nofollow” label makes it appear that Google won’t actually follow the link itself. Not so. Google will follow the link, it just will not confer ranking.
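    To be clear about what the attribute actually is: it’s just a rel value on a link, which nofollow-aware blog software adds to links in comments before output. A crude sketch of that rewriting (the regex is mine, not what any particular package ships):

```python
import re

def add_nofollow(html):
    """Add rel="nofollow" to every <a> tag.

    Crude: real software would parse the HTML and skip links that
    already carry a rel attribute. This is just to show the idea.
    """
    return re.sub(r"<a\s+", '<a rel="nofollow" ', html)

print(add_nofollow('<a href="http://example.com/">a comment link</a>'))
# -> <a rel="nofollow" href="http://example.com/">a comment link</a>
```

    That’s it. All the machinery is on the search engines’ side; the publisher just sprinkles an attribute and hopes.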

    More bothersome is the fact that other search engines (Yahoo and MSN, notably) have signed on to this. Why bothersome? Well, because Google’s PageRank algorithm is supposed to be a Trade Secret, and theoretically other search engines’ technologies are Trade Secrets also, so who knows how the others will actually implement processing of this attribute? Will they choose to actually not follow such links, allowing sites to potentially drop out of their indices? There are no guarantees. But if they’re all similar to PageRank, and PageRank is broken, then they may all be broken and this won’t fix things.

    Oh well. My various megalomaniacal rantings won’t change things in the world at large, so I’ll stick to what I can do on my own site. :)

  • Oregon tsunamis

    This article on Bend.com is interesting, about the occurrence (and likelihood) of tsunamis off the coast of Oregon.

    Some time between 9 and 10 p.m. on Jan. 26, 1700, a similar great earthquake, with the same estimated magnitude as the one in Asia, struck the Northwest, rocking the region with strong shaking for several minutes. The specific time can be told through a variety of evidence closely studied by scientists in recent years, such as land levels, sand deposits, the rings of ancient trees and historic records….

    Geological evidence indicates that mega-quakes have occurred in the zone at least seven times over the past 3,500 years, meaning they happen, on average, every 400 to 600 years.

    With a little digging, I found out this was the Cascadia Earthquake (thank you, Wikipedia), a magnitude 9 megathrust earthquake that slammed the Pacific Northwest. I also found this page which has a somewhat more consequential description:

    The earthquake collapsed houses of the Cowichan people on Vancouver Island and caused numerous landslides. The shaking was so violent that people could not stand and so prolonged that it made them sick. On the west coast of Vancouver Island, the tsunami destroyed the winter village of the Pachena Bay people, leaving no survivors. These events are recorded in the oral traditions of the First Nations people on Vancouver Island.

    Freaky. I knew the area was geologically active—volcanoes and such—but I had no idea it was this active.

  • Science night

    A bunch of science links tonight. Kind of a year-end thing. First, as reported by the BBC, Science Magazine has compiled its list of ten key scientific advances of 2004. The top three are the Mars rovers finding evidence of water on Mars, the discovery of the Indonesian “hobbits,” and the South Koreans announcing the cloning of human embryos.

    The next link, via Slashdot, is this New Scientist article about Mt. St. Helens:

    In late September 2004, a series of earthquakes signalled that the volcano was awakening. Since then, enough lava has oozed into the volcano’s crater to build a dome the size of an aircraft carrier. The new dome, standing 275 metres off the crater floor at its highest point, is now taller than a nearby dome built by a previous set of eruptions over the course of six years.

    “Something extraordinary is happening at Mount St Helens. We are scratching our heads about it,” says Dan Dzurisin of US Geological Survey’s Cascades Volcano Observatory (CVO) in Vancouver, Washington, US. The new dome has grown so quickly – almost four cubic metres every second – that it has bulldozed a 180-metres-thick glacier out of its way. If this rapid growth rate continues, there is a growing risk of a dome collapse which could trigger a major eruption, researchers warned at the American Geophysical Union meeting in San Francisco.

    Finally, via Boing Boing, The Top Cryptozoology Stories of 2004. These include the “hobbits” again, Ogopogo in Canada, and (good grief) Chupacabras.

  • Wikipedia amusement

    I love Wikipedia and all, but sometimes I really have to shake my head in amusement/amazement when you compare the amount of content in something like the Doctor Who article (and supporting articles) to the amount in the esotropia article. One of those things that really highlights the weird imbalance of content that critics are always going on about.