My colleague Karl got ahold of a ton of interesting log data from a big site that I'm sure he'll write about soon enough. He shared this beauty of a useragent string with me today on IRC:
CETRIX-CS1280 Linux/3.0.13 Android/4.0.4 Release/↩ \xD9\xA1\xD9\xA2.\xD9\xA1\xD9\xA1.\xD9\xA2\xD9\xA0\xD9\xA1\xD9\xA2 ↩ Browser/AppleWebKit534.30 Profile/MIDP-2.0 ↩ Configuration/CLDC-1.1 Mobile Safari/534.30 Android 4.0.1;
I've added ↩ symbols to indicate where I've forced a line break (because I'm a designer and don't want to degrade the experience of this sweet blog theme, obviously).
So like, what is going on here? It looks like UTF-8-encoded Unicode in the UA string, which is... novel. If you paste
\xD9\xA1\xD9\xA2.\xD9\xA1\xD9\xA1.\xD9\xA2\xD9\xA0\xD9\xA1\xD9\xA2 into Hixie's UTF-8 decoder and select "Hexadecimal" as the input type, you'll get the following result:
As character names:
U+0661 ARABIC-INDIC DIGIT ONE character (١)
U+0662 ARABIC-INDIC DIGIT TWO character (٢)
U+0661 ARABIC-INDIC DIGIT ONE character (١)
U+0661 ARABIC-INDIC DIGIT ONE character (١)
U+0662 ARABIC-INDIC DIGIT TWO character (٢)
U+0660 ARABIC-INDIC DIGIT ZERO character (٠)
U+0661 ARABIC-INDIC DIGIT ONE character (١)
U+0662 ARABIC-INDIC DIGIT TWO character (٢)
As raw characters:
١٢١١٢٠١٢
OK, at this point you're probably feeling like that guy in the Da Vinci Code books (Tom Selleck maybe?), so you go and put that in Google Translate (selecting Arabic and English because Roman numerals doesn't seem to be an option, thanks Google).
...and you get 12112012, which just so happened to be 49 minutes after the winter solstice (Illuminati much?). Or just a version number. In Arabic. In Unicode. The key to that secret is probably figuring out why the UA string contains both Android/4.0.4 and Android 4.0.1. Current scholarship suggests that the 49-minute difference is derived from the 7 characters in the word "Android" times 7 (hence it being doubled up) and, well, you get the picture.
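If you'd rather not trust web forms (or the Illuminati) with your log data, the decoding steps above can be roughly sketched in Python. This is only a sketch: the helper name unescape_hex is made up for illustration, and it assumes the log really did write non-ASCII bytes as literal \xNN escapes, the way they appear in the string above.

```python
import re
import unicodedata

# Release field copied from the UA string above, still in its escaped form.
escaped = r"\xD9\xA1\xD9\xA2.\xD9\xA1\xD9\xA1.\xD9\xA2\xD9\xA0\xD9\xA1\xD9\xA2"


def unescape_hex(text: str) -> bytes:
    """Hypothetical helper: turn literal \\xNN escapes into raw bytes.

    Anything that is not an escape is assumed to be plain ASCII and is
    passed through unchanged.
    """
    out = bytearray()
    pos = 0
    for match in re.finditer(r"\\x([0-9A-Fa-f]{2})", text):
        out += text[pos:match.start()].encode("ascii")
        out.append(int(match.group(1), 16))
        pos = match.end()
    out += text[pos:].encode("ascii")
    return bytes(out)


# Interpret the recovered bytes as UTF-8.
decoded = unescape_hex(escaped).decode("utf-8")

# Same idea as the decoder's "As character names" view.
for ch in decoded:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Map each Unicode decimal digit to its ASCII equivalent; keep the dots as-is.
western = "".join(
    str(unicodedata.digit(ch)) if ch.isdigit() else ch for ch in decoded
)
print(western)
```

The last line prints 12.11.2012. The dots survive here because the sketch passes unescaped characters straight through, so no Google Translate required.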