When I was in Kiev a few months back, I realized that all web addresses are in Roman script. Now this may not be much of a revelation, I’ll grant you, but I was intrigued to imagine that, while much of the older generation in countries that don’t use the same alphabet as I do probably have very little understanding or recognition of Roman letters, the younger, Internet-savvy generation probably have to. And not just the ones who have learned English, either.
I was reminded of this the other day when I was searching for something on Google, but unusually (in fact maybe uniquely) I was looking up something in Hungarian. It was then that I realized that the way Google works in English may not be quite as successful in Hungarian. If I type an English word into it, I know Google will find all the instances of that word in its database. Exactly that word. But in Hungarian, a word will vary in its spelling depending on its role in a sentence and whether it has suffixes stuck on it. If I type in “tojás” (egg) for example, it will presumably return all instances of the word tojás. But, if tojás is the direct object of a verb (as in “I boiled an egg”) it will be “tojást”, or, as far as a search engine is concerned, a completely different word. And that’s just one possibility. For place names the range of possibilities is endless. Off the top of my head the name of this town could be rendered as Csikszereda, Csikszeredán, Csikszeredában, Csikszeredára, Csikszeredába, Csikszeredát, Csikszeredához, Csikszeredával, Csikszeredábol, Csikszeredárol, and almost certainly loads of others depending on whether you’re in the town, going to the town, coming from the town, or just hanging around in the general vicinity of the town.
I checked this out on Google, as I suspected that they may have worked something out for this – after all even in English you get plurals which are in essence different words – and it seems they have. They claim to use something they call “Stemming technology” (isn’t that what George Bush wants to ban?) to ensure that different variants of the same root word are recognized when you search. I wonder if this only works with English or it somehow crosses languages. Or if google.hu uses a different Magyar version of stemming technology? If not I fear there are a lot of searches that may miss their targets. But how does stemming technology work – is it a piece of software that guesses which words have the same roots? So if you type in station you might get hits for both stationary and stationery? And if not, then presumably the groups of related words have been programmed by someone.
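For the curious, here is roughly what the "software that guesses" option could look like. This is only a toy sketch of suffix stripping in Python, and emphatically not how Google or any real Hungarian stemmer does it: the suffix list is just the handful of endings from my Csikszereda examples above, and the names `SUFFIXES` and `toy_stem` are made up for illustration.

```python
# Toy suffix-stripping "stemmer". The suffix list is an assumption taken
# from the examples in this post, not a real description of Hungarian.
SUFFIXES = ["ban", "hoz", "val", "bol", "rol", "ba", "ra", "n", "t"]


def toy_stem(word):
    """Strip at most one known suffix and undo the a -> á lengthening."""
    word = word.lower()
    # Try longer suffixes first so "ban" wins over plain "n".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        # Keep at least three letters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    # A final "a" lengthens to "á" before a suffix; map it back.
    if word.endswith("á"):
        word = word[:-1] + "a"
    return word


variants = ["Csikszereda", "Csikszeredán", "Csikszeredában", "Csikszeredát"]
print({toy_stem(v) for v in variants})
```

With these rules all the Csikszereda variants I listed collapse to one key, and tojás and tojást meet in the middle too. The catch is that the same rules happily chop the final letter off any word that merely happens to end in n or t (station comes out as "statio"), which is more or less the stationary/stationery worry in miniature.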
If not (or before the miracles of stemming technology) I’m guessing use of a search engine is/was quite a different skill for a Hungarian than it is for me, for example. Thinking about it occupied my brain for a few minutes anyway, and now, thanks to the miracles of the internet, I've shared that inner monologue with all of you. My generosity knows no bounds.
Some new favourite Hungarian words: Kinel, which is a question word meaning (something like) “at whose place?”, and which is amusing because, well, it sounds like “kinel”. It may be only British readers who see why that’s remotely amusing to my puerile mind, but if you are really interested I’ll explain it in the comments. And Prezli, which means “breadcrumbs” and is amusing basically because it is pronounced exactly the same as the surname of the singer of Heartbreak Hotel, and one or two other songs. It amuses me to think of Elvis Breadcrumbs. Not sure why, but there you go. Hungary even has its own Elvis figure, a bloke called Fenyö Miklos (Nicholas Pine-Tree), who is very big on the circuit of rubbish variety programmes shown on New Year’s Eve.