stakebait | (Reply)

From:

icedrake.livejournal.com

(This is just not my HTML day. Sorry for the notification spam)
I adore the Smithsonian tile story -- that's brilliant design, even if it was accidental. The same approach was used by a Russian architect when a new neighbourhood was constructed in Moscow -- his design had no pedestrian trails. Instead, he waited for a few months, and then had the builders pave trails where the pedestrians stamped out the grass.

Having said that -- notice that both our examples use unconscious user feedback? Now consider something massive and popular that relies on conscious user-controlled tagging. The only one I can think of is a service I've been using for quite a while, Blink List.

Choosing a tag that's likely to be fairly popular (it's present on BL's front page), we get this. Which, by itself, is great. Except... Suppose you're looking for stories on people defrauding the Google AdSense program. What exactly are you looking for? Is it fraud, clickfraud, click fraud, Google ad fraud, Adsense fraud, AdSense fraud, or is it cheating AdSense? All are valid, potentially relevant tag terms. I've yet to see a system that recognises their similarity.

An even worse example of the fault used to be Engadget. Despite the fact that the blog was entirely edited by a hired group of professionals, it was searched by random passers-by. Thus, tags had to cover every eventuality, both with spaces between words and without, with capitalisation and without... At least they didn't try to cover common misspellings! It seems that Engadget found this method didn't work, however; the recent site redesign has done away with tags completely.
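To make the spacing-and-capitalisation half of that complaint concrete, here is a minimal sketch in Python of how a system could fold such variants together. The names normalize_tag and group_tags are made up for illustration; this is not how Blink List or Engadget actually handle tags.

```python
import re
from collections import defaultdict

def normalize_tag(tag: str) -> str:
    """Lowercase the tag and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", tag.lower())

def group_tags(tags):
    """Group raw tags under their normalized form, keeping input order."""
    groups = defaultdict(list)
    for tag in tags:
        groups[normalize_tag(tag)].append(tag)
    return dict(groups)

# The tag variants from the AdSense example above.
tags = [
    "fraud", "clickfraud", "click fraud", "Google ad fraud",
    "Adsense fraud", "AdSense fraud", "cheating AdSense",
]
groups = group_tags(tags)
```

This merges "clickfraud" with "click fraud" and "Adsense fraud" with "AdSense fraud", but it still leaves five separate groups. Nothing this simple will tell you that "fraud", "Google ad fraud" and "cheating AdSense" are about the same thing, which is the harder part of the problem.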

I'd love to see visualisations of tag clouds, with similar tags being grouped together, the way Music Plasma does it for bands. But that works great in specialised, narrow areas (like music) and much less well when transformed into a generic, unspecialised system. In music, there are very specific ways an expert can define similarities. In the free-for-all that is the internet, no such clarity exists. Pandora exploits these rigidly structured similarities (and, in my opinion, quite well), whereas all you can count on Gnod Books to produce is a fairly expected listing of popular authors, most of whom you've already heard of, with the cloud itself representative of the common average and not at all tied to your personal tastes.
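One crude way to decide which tags are "similar" enough to sit together is to compare the sets of items they were applied to. The sketch below uses Jaccard similarity for that; it is my own illustration, not what Music Plasma, Pandora or Gnod do, and the tag data is entirely invented.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Size of the overlap divided by size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similar_pairs(tagged, threshold=0.5):
    """Return (tag1, tag2, score) for every pair scoring at or above the threshold."""
    pairs = []
    for t1, t2 in combinations(sorted(tagged), 2):
        score = jaccard(tagged[t1], tagged[t2])
        if score >= threshold:
            pairs.append((t1, t2, score))
    return pairs

# Hypothetical data: tag -> ids of the bookmarks carrying that tag.
tagged = {
    "clickfraud": {1, 2, 3, 4},
    "adsense": {2, 3, 4, 5},
    "knitting": {6, 7},
}
pairs = similar_pairs(tagged)
```

With this made-up data only "adsense" and "clickfraud" clear the threshold. Note that it also shows the popularity problem: tags used on only a handful of items have too little overlap with anything to be grouped reliably.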

I've touched on the problem of popularity, but I want to expand on it.
Here's a challenge: Try finding a listing of albums released by Muse. No, not the annoying alt-rock hoarse-male-voice band. The Scandinavian all-instrumental group. That's all I know about it, and I have yet to find even a bio in English. Oh, what would I give for a button that'd tell Google to exclude all references to the alt-rock one!

This isn't a new issue -- it's the problem with peer-reviewed scientific journals, and is being talked about quite a bit nowadays. The likelihood of truly challenging, revolutionary papers being published in the best-known journals is quite low. They don't want to take the risk of being wrong or even of offending the majority of their readership, who of course hold to the prevalent views.

You get the same thing with tag clouds and even the choice of tags. How many people you know are likely to correctly tag something as a simile as opposed to a metaphor? How many random web aficionados are likely to correctly distinguish between genetics and genomics when the very first definition Google brings up is incorrect? Step off the beaten path for even an instant, and you're lost forever, wandering through a maze of dead Geocities pages and vaguely-funny 404 errors.

(continued above)