Entries tagged as english

Languages and translation technology

Friday, November 9. 2012, 22:53
Chinese timetableJust recently, Microsoft research has made some progress in developing a device to do live translations from English into Mandarin. I'd like to share some thoughts with you about that.

If you read my blog on a regular basis, you will know that I traveled through Russia, Mongolia and China last year. If there's one big thing I learned on this trip, it's this: English language is - on a worldwide scale - much less prevalent than I thought. Call me a fool, but I just wasn't aware of that. I thought, okay, maybe many people won't understand English, but at least I'll always be able to find someone nearby who's able to translate. That just wasn't the case. I spent days in cities where I met nobody that shared any language knowledge with me.

I'm pretty sure that translation technologies will become really important in the not-so-distant future. For many people, they already are. I've learned about the opinions of swedish initiatives without any knowledge of swedish just by using Google translate. Google Chrome and the free variant Chromium show directly the option to send something through Google translate if it detects that it's not in your language (although that wasn't working with Mongolian when I was there last year). I was in hotels where the staff pointed me to their PC with an instance of Yandex translate or Baidu translate where I should type in my questions in English (Yandex is something like the russian Google, Baidu is something like the chinese Google). Despite all the shortcomings of today's translation services, people use them to circumvent language barriers.

Young people in those countries are often learning English today, but it's a matter of fact that this will only very slowly translate into a real change. Lots of barriers exist. Many countries have their own language and another language that's used as the "international communication language" that's not English. For example, you'll probably get along pretty well in most post-soviet countries with Russian, no matter if the countries have their own native language or not. This also happens in single countries with more than one language. People have their native language and learn the countries language as their first foreign language.
Some people think their language is especially important and this stops the adoption of English (France is especially known for that). Some people have the strange idea that supporting English language knowledge is equivalent to supporting US politics and therefore oppose it.

Yes, one can try to learn more languages (I'm trying it with Mandarin myself and if I'll ever feel I can try a fourth language it'll probably be Russian), but if you look on the world scale, it's a loosing battle. To get along worldwide, you'd probably have to learn at least five languages. If you are fluent in English, Mandarin, Russian, Arabic and Spanish, you're probably quite good, but I doubt there are many people on this planet able to do that. If you're one of them, you have my deepest respect (please leave a comment if you are).

If you'd pick two completely random people of the world population, it's quite likely that they don't share a common language.

I see no reason in principle why technology can't solve that. We're probably far away from a StarTrek-alike universal translator and sadly evolution hasn't brought us the Babelfish yet, but I'm pretty confident that we will see rapid improvements in this area and that will change a lot. This may sound somewhat pathetic, but I think this could be a crucial issue in fixing some of the big problems of our world - hate, racism, war. It's just plain simple: If you have friends in China, you're less likely to think that "the chinese people are bad" (I'm using this example because I feel this thought is especially prevalent amongst the left-alternative people who would never admit any racist thoughts - but that's probably a topic for a blog entry on its own). If you have friends in Iran, you're less likely to support your country fighting a war against Iran. But having friends requires being able to communicate with them. Being able to have friends without the necessity of a common language is a fascinating thought to me.

merkaartor, another editor for OpenStreetMap

Friday, February 29. 2008, 12:42
merkaartorAfter some more »out of memory«-messages by josm, I thought it's time to look out for alternatives.

For the openstreetmap-project, the two main editors are josm (java) and potlach (flash). I think using java probably wasn't a very wise decision (I still wonder how an app can get »out of memory« after loading about 5 MB of images, do they create a pixel class and store every pixel in an object?) and I don't like flash either.

There's another project called merkaartor and today I had a look. My first feeling is that it's promising. It has good performance, it does nice live-rendering and it supports the basic features (adding nodes and ways, up/downloading stuff to the osm-system).

Sure, comparing to the large list of plugin features josm has, it's limited. Maybe I'll try to hack in some bits I'm missing at the moment.

In my continuing effort to improve gentoo for geo-related stuff, I've just added a merkaartor package to portage.

Manually decrypting S/MIME mails

Tuesday, February 26. 2008, 21:05
I recently took the new CAcert assurer test. Afterwards, one has to send a S/MIME-signed mail to get a PDF-certificate.

Having the same problem like Bernd, the answer came in an RC2-encrypted S/MIME-mail. I'm using kmail, kmail uses gpgsm for S/MIME and that doesn't support RC2.

While this opens some obvious questions (Why is anyone in the world still using RC2? Why is anyone using S/MIME at all?), I was able to circumvent that without the hassle of installing thunderbird (which was Bernd's solution).

openssl supports RC2 and can handle S/MIME. And this did the trick:
openssl smime -decrypt -in [full mail] -inkey sslclientcert.key

It needed the full mail, which took me a while, because I first tried to only decrypt the attachment.

Cross Site Scripting (XSS) in the backend and in the installer

Tuesday, February 26. 2008, 14:23
I want to give some thoughts about some more advanced XSS-issues based on two vulnerability announcements I recently made.

First is backend XSS. I think this hasn't been adressed very much, most probably all CMS have this issue. If you have a CMS-System (a blog is also a CMS system in this case) with multiple users, there are various ways where users can XSS each other. The most obvious one is that it's common practice that a CMS gives you some way to publish raw html content.

Assuming you have a blog where multiple users are allowed to write articles. Alice writes an article, Eve doesn't like the content of that article. Eve can now write another article with some JavaScript adjusted to steal Alice's cookie. As soon as Alice is logged in and watches the frontpage with Eve's article, her cookie get's stolen, Eve can hijack her account and manipulate her articles.

Solution is not that simple. To prevent the XSS, you'd have to make sure that there's absolutely no way to put raw html code on the page. Serendipity for example has a plugin called trustxss which should do exactly that, though there are many ways to circumvent that (at least all I found should be fixed in 1.3-beta1, see advisory here: CVE-2008-0124). All fields like username, user-information etc. need to be escaped and it should be the default that users aren't allowed to post html. If a superuser enables html-posting for another user, he should be warned about the security implications.

A quite different way would be separating front- and backend on different domains. I don't know of any popular CMS currently doing that, but it would prevent a lot of vulnerabilities: The website content is, e.g., located on www.mydomain.com, while the backend is on edit.mydomain.com. It would add complexity to the application setup, especially on shared hosting environments.


Second issue is XSS/CSRF in the installer. I'm not really sure how I should classify these, as an open installer most probably has more security implications than just XSS. I recently discovered an XSS in the installer of moodle (CVE-2008-0123) which made me think about this.

I thought of a (real) scenario where I was sitting in a room with a group, discussing that we need a webpage, we would take Domain somedomain.de and install some webapp (in this case MediaWiki, but there I found no such issues) there. I suddenly started implementing that with my laptop. Other people in the room hearing the discussion could send me links to trigger some kind of XSS/CSRF there. This probably isn't a very likely scenario, but still, I'd suggest to prevent XSS/CSRF also in the installer of web applications.

Time-syncing external devices like cameras, mobiles

Friday, February 15. 2008, 13:18
I recently wrote about geo-tagged images. This makes use of the fact that different devices collect data and you can associate the data by the timestamp. It's most probably interesting for much more than gps/images.

While it's possible to get accurate timesetting by hand, it's usually not what you want. Preferably one wants to sync all devices with an internal clock automatically from the computer or some kind of network connection.

As a first step, we want to get our computer's time accurate. There are tons of tools out there, some linux distributions (and also windows xp) do this automatically on boot. I'm usually using rdate, it's small and simple:
rdate -s [any public timeserver]

There's a list of public time servers here. Other tools like netdate, ntpdate etc. will do it as well.

Now, my digital camera is a Canon Ixus 50. It uses PTP (picture transfer protocol) for data communication. If you have a PTP camera, most likely it supports time syncing. Syncing the camera time to the system time was recently added to libgphoto svn, but it's not yet available in a release. It also doesn't support any timezone management yet, so I'll get GMT time (while I live in the CET zone). The command to do it is:
gphoto2 --set-config synctime=on

If you don't have PTP, you're not completely lost. There's support for a lot of proprietary cameras in gphoto, some of them also support time syncing. Give it a try. I don't have information about usb storage devices (many cameras are just storage devices), links welcome.

Next device is my mobile phone, Nokia 6230i. As a mobile phone is permanently connected to the GSM-network, the obvious option would be time syncing over gsm. This protocol exists and most phones (including mine) support that. But bad luck, many mobile providers don't support it. So I'm out of luck here (vodafone, pointers to information about different provider support that are welcome).

Now, this device also speaks bluetooth, so timesetting via the computer should be possible. Both gammu and gnokii (the common applications to talk with all those proprietary mobiles out there) have a timesetting-option, but it rounds down the time to zero seconds, thus making it useless for exact time. I'm not yet sure if this is a limitation of the hardware or a bug in the software. An option would be to send the timesync-signal at the moment seconds turn to zero, but that would require application support, as there's a relevant diff between the application call and the moment the time get's set (because you have to ack the connection on the phone).
Though at the moment my phone needs manual timesetting, but the only data I'm collecting with it is gps-data, which get's it's timestamp via gps, so this is fine.

Zeitgeist

Wednesday, May 2. 2007, 03:08
I knew that kindergarden, blitzkrieg, gesundheit and umlaut are german foreign words in the english language. Today I noticed the word »zeitgeist« in an english talk.

Wikipedia has some more examples
(Page 1 of 1, totaling 6 entries)