words & more

Bina's blog

All sorts of things - whatever is interesting to me. - Alle möglichen Sachen - alles, was mich interessiert. - Tante cose diverse - tutto quel che mi interessa.

Updated: 2018-03-02T18:17:42.435+01:00


Ladin Wikipedia on the incubator


A week ago I received the following post (originally in German, translated here):

Hello Sabine,

since you work with "less resourced languages", I wanted to ask whether you might know a few people from South Tyrol, Trentino, the province of Belluno or elsewhere who would like to write articles in Ladin. The Wikipedia has been sitting in the incubator for quite some time, but nobody wants to hatch this egg.

Since the Ladins are organized in many associations, I have been thinking of writing to some of them soon, for example the Ladin fire brigades and the Grupa per la defendura di uciei (bird protection association). If you could also do a bit of advertising, I would be grateful :)

Kind regards

In short, Andi is asking whether I can help recruit people to work on the Ladin Wikipedia, which is still in the incubator. On the one hand I am part of the LangCom and I am not sure if I should do this ... but on the other hand I am the CCO of Vox Humanitatis, which actively promotes less resourced languages and the related projects. Wikipedias in these languages are projects to be promoted and helped. Therefore I would like to ask anyone who is reading this to help connect me/us to people speaking/writing Ladin so that I can connect them with the right people on my side. I am also going to write to a number of associations that may be able to help. (E-Mail: s.cretella [at]

Please note that we will actively support your requests for help, but please consider that we only have a certain amount of time. Whatever needs to be done will be done, even if it often takes some days until we can actually move. Our to-do list is quite long :-)

Thanks for helping to spread the word!


Blogger's code of conduct


Actually I wanted to write this post some 10 days ago, but as so often things just get crazy around you and time gets too short ... but maybe it is good that there is some time in between, just to make people aware of it again.

Approx. two weeks ago there were some people who IMHO did not behave correctly. I mean: by blogging and by being connected to various organizations we also have a certain responsibility. What we need to be aware of is that often a simple statement, sometimes maybe joking around with friends, can be misunderstood and lead to really ... ehm ... not so nice results.

On Wikia the Blogger's Code of Conduct was published (as far as I understood, it came about thanks to the co-operation of Jimmy Wales and Tim O'Reilly). I would like to ask all of you, in the various foundations and associations and of course also all individuals who read this, to have a closer look and to adopt the Blogger's Code of Conduct when writing your blogs.

Thank you!

Apertium ... three lines to localize ...


Hi, Apertium is an MT (machine translation) tool which is particularly strong in translating between similar languages. It is also used for pre-translation of Wikipedia articles for certain language pairs (Spanish-Catalan, Spanish-Occitan etc.) and is to be included in the Ubuntu distribution. Therefore it would be nice to have the three required sentences localized in as many languages as possible.

To see which languages are actually there please have a look at

From there you can create the .po file for your language. Once you have translated the file please send it to: maybe also sending a copy to me at s.cretella (at) (just to make sure the file goes through - you never know).

Of course, I understand if you don't know how to work with .po files.
In that case please indicate the language code (and language) you are translating into in the subject of your e-mail, and send the translated lines below to the apertium-stuff list above with a copy to me. Please note that only the lines starting with msgstr need translation; the rest remains untouched.

If you work with languages that could cause UTF-8 encoding problems in e-mails, please just copy and paste the text into a file and send us the file with the translated lines.

msgid "Mark unknown words"
msgstr "Mark unknown words"

msgid "This program is licensed under the GPL."
msgstr "This program is licensed under the GPL."

msgid "Translator"
msgstr "Translator"
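
For those who are comfortable with a little scripting, the three entries above can also be filled in mechanically. What follows is only a minimal sketch, assuming the simple single-line msgid/msgstr layout shown in this post. The helper name `fill_po` and the German strings are illustrative assumptions, not part of Apertium. Real .po files (headers, multi-line strings, escaped quotes, plural forms) are better handled with proper gettext tooling.

```python
import re

# The three entries from this post, in the single-line layout shown above.
PO_TEMPLATE = '''msgid "Mark unknown words"
msgstr "Mark unknown words"

msgid "This program is licensed under the GPL."
msgstr "This program is licensed under the GPL."

msgid "Translator"
msgstr "Translator"
'''

# Illustrative German translations (an assumption made for this example only).
TRANSLATIONS = {
    "Mark unknown words": "Unbekannte Wörter markieren",
    "This program is licensed under the GPL.": "Dieses Programm steht unter der GPL.",
    "Translator": "Übersetzer",
}


def fill_po(template, translations):
    """Replace each msgstr with the translation of the msgid that precedes it.

    msgid lines are left untouched. Entries without a translation keep
    their original msgstr. Quotes inside translations are not escaped.
    """
    out = []
    current_id = None
    for line in template.splitlines():
        match = re.match(r'^msgid "(.*)"$', line)
        if match:
            current_id = match.group(1)
        elif line.startswith("msgstr ") and current_id in translations:
            line = 'msgstr "%s"' % translations[current_id]
        out.append(line)
    return "\n".join(out) + "\n"
```

Calling `fill_po(PO_TEMPLATE, TRANSLATIONS)` returns the text of the translated file. Saving it with UTF-8 encoding and sending it as an attachment avoids the e-mail encoding problems mentioned above.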

An example of the screenshot can be seen here.

Thank you in advance for your help :-)

Afrophonewikis and localizations of MediaWiki


I just learnt from the afrophonewikis list that three African languages finished the first localisation milestone, that is: Swahili, Sesotho sa Leboa and Amharic. That is really wonderful. Thanks Gerard for letting us know.

At this stage I would also like to thank those who actually did all the localization work, and in particular Siebrand, who is doing the organizational work. It mainly happens in the background and is key to whatever is done in terms of translation on Betawiki. Knowing how time consuming the organizational part is, a special thanks to him.

Of course: whoever feels able to contribute, please do so, and should you know people who could possibly help: tell them. The more the message gets out the better it is for all of your communities. Something that people often forget: the communities' work is their very own success - so please: write your very own success story by contributing to your community's projects.

Translating Wikipedia articles (2)


As I said yesterday, I am coming back to this topic today.

Apertium is already used in some projects, one of which is the Occitan Wikipedia. For those who are not familiar with wikis: they let you compare the version that has not been proofread with the proofread version, and that is what you will see by clicking here.

What you see on the left hand side is the text as it was after the machine translation and on the right hand side the proofread version of the text. The changes are highlighted in green on the left and in blue on the right hand side. There are even some parts of the text that were not changed at all.
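
The same kind of comparison can be reproduced offline for any pair of texts. Here is a minimal sketch using Python's standard `difflib` module. The two sentences are invented placeholders, not actual Apertium output or text from the Occitan Wikipedia, and the function name `word_changes` is my own.

```python
import difflib


def word_changes(machine_text, proofread_text):
    """List the word-level edits between raw MT output and its proofread form.

    Each item is (operation, words_before, words_after), where operation is
    "replace", "delete" or "insert" as reported by difflib.
    """
    before = machine_text.split()
    after = proofread_text.split()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, before[i1:i2], after[j1:j2]))
    return changes


# Placeholder sentences standing in for the two versions of an article.
machine = "the house are big and the garden is small"
proofread = "the house is big and the garden is small"
changes = word_changes(machine, proofread)
```

Counting the items in `changes` gives a rough measure of how much post-editing a text needed, which is one simple way to track whether a language pair is improving over time.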

The work on the glossary and the grammar rules (well I am not using the specific terminology here to make things understandable for all) has been going on for approximately one year now.

At a certain stage the problems arise from missing vocabulary and not so much from the rules. Of course these translations will probably never be 100% perfect, but the quality depends very much on us and on our adding terminology and classifying it.

Compared with the above result, what you would see for Spanish-Catalan is much better, since that pair has been under development for years.

You can find further reading about co-operation between Wikipedia and Apertium on the Apertium Wiki.

Language pairs that are right now available are:

  • Spanish←→Catalan
  • Spanish←→Galician
  • Spanish←→Portuguese (pt and pt_BR)
  • Catalan←→English
  • Catalan←→French
  • Catalan←→Occitan (oc and oc@aran)
  • Romanian→Spanish
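
Note that the arrows matter: a double arrow means both directions work, a single arrow only one. As a small illustration, the list above can be written down as directed pairs. This is just a sketch with names of my own choosing (`can_translate` is not an Apertium function), and it ignores the variants given in brackets.

```python
# Pairs marked with a double arrow in the list above.
BIDIRECTIONAL = [
    ("Spanish", "Catalan"),
    ("Spanish", "Galician"),
    ("Spanish", "Portuguese"),
    ("Catalan", "English"),
    ("Catalan", "French"),
    ("Catalan", "Occitan"),
]

# Pairs marked with a single arrow: source language first, target second.
ONE_WAY = [("Romanian", "Spanish")]


def build_directions(bidirectional, one_way):
    """Expand the pair lists into a set of (source, target) directions."""
    directions = set(one_way)
    for first, second in bidirectional:
        directions.add((first, second))
        directions.add((second, first))
    return directions


DIRECTIONS = build_directions(BIDIRECTIONAL, ONE_WAY)


def can_translate(source, target):
    """True if the list above offers translation from source into target."""
    return (source, target) in DIRECTIONS
```

With this, `can_translate("Romanian", "Spanish")` is true while `can_translate("Spanish", "Romanian")` is not.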

Many other language pairs are under development. Of course, you may start on any language combination that is comfortable for you. Please keep in mind: the more similar two languages are, the easier it is to program the rules and the sooner the translation engine will produce good translations.

If you want to start working on wordlists, please write to me at: s.cretella (at) and tell me which language pair you are interested in. You can also reach me on Skype at: sabinecretella

I will upload a wordlist to Google Docs and give you access. Please let me know if you have difficulty working online (for example if you work with a dial-up connection).

The Apertium Chat is on Freenode.

One more thing: I just received the criticism that machine translation would flatten the language. Well, any translated text, in particular when it comes to literary translation, is post-edited by a second person. A translation is never published directly, since during translation (and you can be the best translator in the world) there are always some bits and pieces that sound a little strange or that do not really carry the scene over into the other culture. And please allow me to introduce here the concept of cultural localization, which was coined by Dr. Martin Benjamin, who is part of the advisory board of Vox Humanitatis, and which will be explained in a future post. The concept of cultural localization immediately became part of the scope of the association.

And since I am adding notes here: please remember that the Fundraiser of the Wikimedia Foundation is still running and that you can help by donating and telling others that the fundraiser is on. For more information and to donate please click here.

Translating Wikipedia articles ...


... into less resourced languages. The time has come to start thinking about how to create content faster for the many small Wikipedias. As you all know, often we have just a handful of people creating, translating and then adapting articles. By combining various Open Source and Open Content projects we can now take a further step towards fast content creation, but that does not mean stub upload. This is a completely different way of doing things.

Apertium is a machine translation tool that works really well with similar languages. Approx. a year ago I had a translation from Spanish to Catalan done by Apertium through the online interface ( and asked some people of the Catalan Wikipedia to have a look at it. They told me that of course it was not perfect, but that proofreading it would be easy and much faster than actually translating it. In March I made a similar test during a master's course in translation studies in Pisa. I asked one of the students, who was bilingual in Spanish and Catalan, to have a look at the machine translation of a general text. The grammar was almost perfect, and so was the terminology. There were just 5 corrections in a bit more than half a page (A4).

Now what does this mean to us? If we have a bilingual wordlist for two similar languages under a free license, we can pass it on to the Apertium people. From there we are a step closer to getting machine translation for that specific language combination on its way.

One note in between for the Apertium people who might read this: please don't mind me not using specific terminology to describe what needs to be done. It could become too techy.

So the next step is to identify what a term is and how it needs to be handled. For example, a verb needs to be declared as such, and then it needs a tag that indicates which conjugation scheme is to be applied. This needs doing for all word types, that is verbs, nouns, adjectives etc. After that, grammar rules need to be considered. Step by step the correctness level will improve, and the time invested in completing the wordlists (which will be available as Google Docs spreadsheets) and in adding all the additional information will save a lot of time later. That is: now it will take longer, but once the engine has "learnt" how to deal with the terminology and grammar for that specific language combination, creating content will become much faster. This will help the small projects: their few editors can concentrate on proofreading and adapting, which will result in faster growth of fairly high-quality content.

This project, which is going to care about less resourced languages, will be one of the first led by Vox Humanitatis. Should you be interested in helping with the wordlists, please let us know which language combination you would like to work on (starting from English right now, and step by step from other languages, since most of the terminology is there in English). We will get you access to the online document. If you need to work offline, please let us know. You can contact me by e-mail: s.cretella (at) voxhumanitatis.org

I just received a list of the supported language combinations as well as an example for Catalan-Occitan and some notes on evaluating machine translation in co-operation with a Wikipedia community. This means I have quite a lot more to tell you. I'll post that info tomorrow, otherwise this post would become too long.

Please also note that the documents will be released under a CC-BY license and can therefore be integrated into any wiktionary.[...]
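
To make the tagging step described above a little more concrete (declaring a word as a verb, noun or adjective and giving it a tag for its inflection scheme), here is a rough sketch of what one row of such a wordlist could look like, together with a check for rows that still need work. This is not Apertium's dictionary format. The field names, the tag values such as "verb-ar" and the three Spanish-Catalan sample words are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Word types mentioned in this post. A real list would be longer.
WORD_TYPES = {"verb", "noun", "adjective"}


@dataclass(frozen=True)
class Entry:
    source: str     # term in the source language
    target: str     # term in the target language
    word_type: str  # "verb", "noun", "adjective", ...
    paradigm: str   # hypothetical tag naming the inflection scheme


def problems(entry):
    """Return a list of things still missing or wrong in one wordlist row."""
    issues = []
    if entry.word_type not in WORD_TYPES:
        issues.append("unknown word type: " + entry.word_type)
    if not entry.paradigm:
        issues.append("missing paradigm tag")
    return issues


# Sample rows (Spanish to Catalan). The paradigm tags are made up.
WORDLIST = [
    Entry("cantar", "cantar", "verb", "verb-ar"),
    Entry("casa", "casa", "noun", "noun-f"),
    Entry("grande", "gran", "adjective", ""),
]

incomplete = [(e.source, problems(e)) for e in WORDLIST if problems(e)]
```

Here `incomplete` lists only the row for "grande", because its paradigm tag has not been filled in yet. A check like this shows at a glance how much of a wordlist is ready to hand over.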

When things don't go as expected ...


Well, many of you know that I was supposed to deal with the Fundraiser 2007 and then at some stage, in October, I disappeared ... many probably asked themselves what happened and did not find an answer (I only sent a note to a very small group of people about what was going on). Well, my husband was in hospital and came out again some days ago. In the meantime my kids were ill and had to take antibiotics, and as the last member of the family I caught the same cold they had and have just finished taking my medicine. Then I got another call from the school saying that Marco was not well and that I should take him home ... I went there and took him home. He is not really ill, just coughing a bit ... but enough to create some trouble at school.

I have already tried to get back on track ... as so often, things simply don't want to go the way I want them to, and therefore I am not sure how much I will really be able to contribute during the next weeks.

Thanks to all those who supported me during this period, thanks to all who are actively helping the fundraiser.

Probably I will be able to do only very limited things, but I can see that things are going ahead and that is the most relevant thing.

Again: thanks to all who helped and are helping. I hope things will be back to normal, soon.

Wikimania 2008 ... where is it going to be?


Well, these days many of us talk privately about it ... no, I am not going to tell you what others say, I am going to tell you where I would "feel" it right.


A place where cultures and people meet, it is somewhat a central point when it comes to connecting the modern with the ancient world. It holds the biggest library of the ancient world, a place where wisdom is collected ... wisdom that now reaches up to our days.

Isn't this a perfect merge? The ancient world of knowledge combined with free knowledge for all?

The ancient centre of wisdom meets the centre of wisdom of the present and future ... I find it unique. And besides many other facts that also speak for Alexandria, this is my main point ... it can and probably will take us to the next level.

Well, one other thing is probably really relevant besides the "feeling": Wikimania 2008 in Alexandria can attract people from the Middle East and can also contribute to peace when people start to co-operate on projects about knowledge. The wiki world is a very particular one and I believe that many of you will agree when I say: it can change a life and how people think, since we all feel or felt it ourselves.

I personally favour Alexandria for Wikimania 2008.

Wikimedian of the hour ....


Well, it is not about being the best, the biggest, the whatever ... it is about contributing actively to the fundraiser ... a test-feature that shows a photo of the Wikimedian of the hour is running on two wikipedias for now - the Piemontese and Neapolitan Wikipedias. Besides that we decided that it made sense to have the donation page in our languages as well. Piemontese is already there ... Neapolitan still needs translation and then proof reading.

Now let's come to the point: the sense of this exercise is to get more people to look at the donation page during the fundraiser period. We all know that a picture or graphic attracts our eyes more than just a written line. The Wikimedia Foundation will need more and more funds since it keeps growing exponentially. This means we need and want to reach user groups that were not reached before, and there are plenty of them. Of course we cannot do everything "right now" since the time left until the next fundraiser is short, but at least we should start to do something. Showing pictures of the community creates a different feeling of being "part of real people". It welcomes people in a different way.

These "Wikimedian of the hour" pictures can be used anywhere, for example also in the village pumps. Well, I would like to see you ... yes, you who are reading ... among us as well. You are a Wikimedian, so you should be there. Of course, those who want to remain anonymous can send me their picture and I will simply upload it without information about who it is. Pictures I receive by e-mail for publication on Commons and flickr are released under the cc-by-sa 3.0 and GFDL licenses.

If you don't know how to get that picture of yourself, well you can do a collage or just a screenshot like the one included in this blog.

The picture cycle is updated regularly: whenever new pictures are added, a bot replaces the current ones.
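
For the curious, the idea of showing a different picture each hour can be sketched in a few lines. This is not the code of the actual bot or of the test feature running on the wikis. It is only an assumed, simplified illustration of hourly rotation through a list, and the file names are made up.

```python
from datetime import datetime


def picture_for(moment, pictures):
    """Pick one picture per hour, cycling through the list in order."""
    if not pictures:
        raise ValueError("no pictures available")
    # Count whole hours so that the cycle continues smoothly across midnight.
    hour_number = moment.toordinal() * 24 + moment.hour
    return pictures[hour_number % len(pictures)]


# Made-up file names standing in for the uploaded portraits.
PICTURES = ["wikimedian_a.jpg", "wikimedian_b.jpg", "wikimedian_c.jpg"]

shown_now = picture_for(datetime(2007, 10, 1, 10, 30), PICTURES)
shown_next = picture_for(datetime(2007, 10, 1, 11, 5), PICTURES)
```

Adding new pictures simply means extending the list, which matches the way the cycle grows whenever new photos come in.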

If you are on one of the projects that would already like to have the Wikimedian of the hour online, please let me know. I can pass you the file for the upload or possibly upload it with my bot.

I believe that for small projects (not only Wikipedias, but also Wiktionaries, Wikisource etc.) it will not only have a fundraising effect, but may also attract more people to look at and hopefully contribute to these projects.

You can reach me by e-mail at scretella (at) wikimedia (dot) org

So I hope to see your picture online soon or have it in my mailbox. :-)

Two who love wikimedia projects ....


... and regularly contribute were on holiday in Maiori on the Amalfi Coast ... it was great to meet them after knowing each other for such a long time from various projects. It was funny to listen to French, understand it, and answer in German or just talk German :-). This is also how my twins got their "Zuckertüte". I suppose they had fun here on the coast, and they visited a place where I have never been: the top of Mount Vesuvius ...
When we went to Positano to have a look at the shops and at the particular way the houses are built (and where, this time, I had the chance to meet a well-known artist from the Amalfi Coast, who will get an article on nap.wikipedia after the fundraiser), we met Michele Cinque in a bar, and so I took the chance to take some pictures ... of course also for our WikiLove campaign.

What? You don't know what that is? Well, we are trying to get pictures of many people who love our projects, to be used for drawing more attention to the upcoming fundraiser. There is a flickr group and a category on Commons. Well, yes ... it still needs to grow, and I wanted to take my photo today, but again my husband is not here to take it ... and really, one in pyjamas, when I wake up as he comes home after work, would not give the best impression ;-)
And what about you??? Is your picture already there? No? .... So what about adding it on Commons or flickr? So many will be able to see you during the fundraiser period ...

Have fun!!!

Wikimedia Foundation and China ... Beijing and Fundraiser ...


I had been toying with this idea for some days, and now it happened that Karl Siu added me as a friend on Facebook and so I saw his question: "Should XXIX Olympic Games in Beijing be boycottet?" ... My first answer is no: by boycotting them you don't achieve anything, you just create some more problems. Instead of boycotting we should support them.

And what does this have to do with the Wikimedia Foundation and the Fundraiser? At a first glance nothing ... at a second: it can make a huge difference, depending on how we bring the message over and how our community would like to adopt such a message.

What do journalists very likely anywhere in the world use to find background information on the news? Wikipedia, right? Which kind of information will reporters from anywhere in the world need? Well: all that is in some way connected to the Olympic games ....

Now what would happen, if we start a project now, making it public to the press, that is about creating background information on the Beijing Olympic Games? That is making Wikipedia the most relevant information resource for background information for that event? What if we actively ask journalists to tell us which kind of information they would like to see and that the improvement on the articles and new articles are based on these questions?

What if these articles then are translated (or written) in as many languages as possible?

I believe that we have enough people who can help with such a giant project ... I believe that we have such a great community that has enough of that knowledge or is able to research the needed bits and pieces ... and that will save journalists loads of time, right?

Now ... do you believe that journalists who are going to save quite some time would also help us in some way? I would say: the likelihood is very high. So what could they do for us? Well: help us to talk to people ... that's their job ... they can help us in different ways by telling the world that such a project is starting (now?) ...

1) they can and will attract readers and writers
2) they can talk about the fundraiser and ask people to remember that (donations) - don't forget: these people know how to get messages over ... so they could do it ... right?
3) the newspapers could grant some space to get our message in
4) people who read Wikipedia and follow the project, which should be very active during the fundraiser, will possibly (and hopefully) see that the fundraiser is on and donate ...

So: it is, in the end, all connected.

Maybe this is the way to involve a huge part of our community indirectly in the fundraiser?

And: it could also be a good opportunity to open a formal contact with the Chinese Olympic Committee ...

Btw. the press agency for Beijing will also follow the Earthrace, an event that wants to make biodiesel more attractive by attempting to break the world record for circumnavigating the globe. There are some YouTube videos around ... I also saw a presentation video a good week ago ... I can't find the link right now, just search for Earthrace on YouTube and you will find it ... it's quite interesting.