As this particular project is coming to and end, I figured I’d do a quick blog post on it.
Wiki Loves Monuments (WLM) is a photo contest for European monuments, organized by Wikimedia this September. Last year some JavaScrip hacks on the regular Wikimedia Commons (the media repository for Wikipedia and other Wikimedia Foundation projects) upload interface where used for this contest. This year the new and completely awesome Upload Wizard (UW) will be used, with configuration optimized for WLM. I created a campaign-based configuration system from the UW and also added a bunch of new settings.
2 new special pages where added. One listing all campaigns, their status, and edit and delete links. This is at Special:UploadCampaigns.
The other special page handles the edit action and displays a list of all available settings that can be modified for the campaign. This is at Special:UploadCampaign/name.
A campaign can be applied to the UW by adding the “campaign” url parameter with as value the campaign name, ie ?campaign=wlm-be.
One fun thing about the architecture of the campaign system is that the setting support is very generic. I created a new settings class that pulls in the default settings, overrides these with the wikis config (ie PHP vars in LocalSettings.php), passed URL arguments and finally the upload campaign settings if a campaign is specified. I like this kind of setup, as it’s a lot nicer then dealing with over 9000 global variables, and in the meanwhile already applied some variation of it in Semantic Signup and in my new Surveys extension. And I wrote up a more general and powerful version of such setting handling in the Maps extension. Unfortunately this code uses late static bindings and thus requires PHP 5.3, making it not usable in actual code for quite a while
Another neat thing is that the upload campaign class only specifies a lift of settings that should be configurable for upload campaigns, together with what kind of HTML form input they should be displayed. That info is then merged with the settings obtained from the settings class and put into a FormSpecialPage, which uses HTMLForm to display anything without any further hassle
Not used the Upload Wizard before and curious how it works? Go upload some nice stuff to commons then
A few days ago I released version 1.2 of the Live Translate MediaWiki extension, which is a major update bringing mainly under-the-hood improvements. I’ve worked on this for about 3 days in my free time, mainly to try out some JavaScript techniques I had not utilized yet.
These are the changes for 1.2:
This post is about the first two.
Some background
The Live Translate (LT) extension allows live translation of content in wiki pages. For this it uses translation services such as Google Translate or Microsoft Translator. It also allows specifying your own translations for certain words within the wiki, which will then be left alone by the (remote) translation services. Such specifications of translations are called translation memories (TM), and are typically done in a special XML-based format called Translation Memory eXchange (TMX). LT also supports a more wiki-friendly format, custom written, which is DSV-based. Translation memories in both these formats can be embedded in wiki pages designated as TM or you can point to files hosted somewhere else. What it comes down to is that there are a set of local translations, which require special handling: local translation and be ignored by the remote translation service.
On every translation, the JavaScript needs to know which are the special words that have a local translation, so translations for these can be requested, and measures can be taken to not send them to the remote translation services. This means doing a call to the wikis API to obtain these words. In case of big translation memories, this requires several calls to obtain all words, often resulting in a few seconds wait before local translations are even requested. If there are words that have local translations on the page, a single request I send to another part of the API to obtain these translations for the language currently translated to. This usually bring the total time to complete local translation to somewhere between 2 and 5 seconds, after which, in version 1.1 and earlier, remote translation is kicked of.
The idea
Translation memories do not tend to change all the time, so it’s very inefficient to request all special words for every translation, and in somewhat lesser degree to always request the translations. The obvious answer to this is local caching, and since I wanted to play around with HTML5 localStorage a bit, this is exactly what I did. I also wanted to make use of JavaScript capabilities I was not really aware of back in last December, when writing Live Translate, so took this as an opportunity to also do some JavaScript refactoring. These being primarily prototypes, closure scopes and callbacks.
The realization
I made a whole bunch of client side changes (and some server side changes to the API), but the most significant ones are the creation of a translation memory object which takes care of all caching, and the rewrite of the translation control to a jQuery plugin.
Translation memory object
In file includes/ext.lt.tm.js.
The translation memory object class is named simply “memory” and resides in the “lt” namespace. It acts as abstraction layer via which special words and translations of those special words can be accessed. It takes care of all API interaction and caching and exposes 2 simple functions, getSpecialWords and getTranslations, which are called by the translation control.
When the cache is empty, the memory will request a new hash via the API, which indicates the “version” of the translation memories on the server, and is later used for cache invalidation. It the proceeds fetching the requested special words or translations of special words and returns these via a callback passed to either getSpecialWords or getTranslations. Before this last step is done, the obtained data is cached in memory (the words and translations fields, one lines 26 and 30, respectively), and, when available, also in HTML5 localStorage. The in memory caching only yields advantages when doing multiple translations on a single page, which is rather rare, so is not that much of a win. The data stored in localStorage on the other hand, persists when navigating to other pages, even when closing the browser and re-opening it. localStorage really isn’t a cache on it’s own, but the lt.memory class uses it as one.
When the cache is not empty, a single request to the API is made to compare the earlier obtained hash and see if any changes to the TMs have been made, and thus if the cache should be invalidated. If changes have been made, the stored data is discarded and pretty much the same as when the cache was empty happens. If no changes have been made, locally stored data is used where possible. In case of the list of special words, no requests will have to be made at all, since all such words are already known. For the translations of these words it’s a little trickier, since the needed data here varies from page to page, and also depends on both the source and destination language. The lt.memory class checks which of the needed data if available, and in case there is a remainder of non-known translations, requests these. The newly obtained translations are then of course also added to the cache.
jQuery plugin
In file jquery.liveTranslate.js.
This plugin contains a lot of already existing code from Live Translate 1.1, but is structured a lot better. It takes care of creating all the HTML needed for the control (while in 1.1, the HTML was provided, and only events where bound to it) in it’s setup function (line 147). The click event handler for the translation button calls the obatinAndInsetSpecialWords which uses the getSpecialWords function of the lt.memory class to obtain the words with local translations, and then inserts them, meaning that occurrences of these words are wrapped into notranslate spans, which then enables finding all words which should be translated locally, and makes them be ignored by the remote translation services. The click handler function passed doTranslations as completion callback to obatinAndInsetSpecialWords, which starts both local and remote translation in parallel. Local translation is done, as you can uncountably guess, by calling getTranslations function.
The results
Once the cache is warm (the user made a translation before) and valid (the TMs have not changed), local translation is practically instant (~0.4 seconds in my tests). Since remote translation now starts as soon as the special words are known, this can take as little as ~0.2 seconds, a huge difference compares to the earlier up to 5 seconds and possibly longer. All assuming you are using a modern browser of course
Not to forget, the code is a lot better structures now. It should be a lot easier to track the order of execution, and it’s now possible (JavaScript wise) to place multiple translation controls onto a single page (which has little practical value, but indicates a better design).
And, maybe most importantly for me, I now have a much better grasp of the earlier mentioned prototypes, callbacks and closure scopes. Perhaps most of the time I spend on this version was figuring out how to properly use these and debugging out misconceptions I had about how they worked
Live Translate

Categories
Tag Cloud
Blog RSS
Comments RSS
Last 50 Posts
Back
Void « Default
Life
Earth
Wind
Water
Fire
Light 