Recently, on my wife's Facebook page, I saw one of our nieces asking
for help in identifying a flower. I recognized it at once by its shape
as what we knew as butterfly gamon, which my late elder sister grew a
long time ago.
I googled ဂမုန်းလိပ်ပြာ
and found the following page on the Myanmar-language Wikipedia:
Searching Google with the image of the flower,
I found that its scientific name is Tacca chantrieri.
Looking that up, I found Wikipedia's entry in English:
Comparing the Myanmar version of the description of this flower with
the English version, I was unhappy because the Myanmar version seems
to rely too much on folklore and falls short on science, as if the
author(s) were entirely unaware of the English version. They could at
least have given its scientific name in the Myanmar version, I thought.
This reminds me of the way modern-day
researchers criticized the math genius Ramanujan when he wrote about
squaring the circle (Arndt and Haenel 2001, Pi Unleashed, p. 58).
What I would like to say is that
including elements of Myanmar folklore in a Wikipedia article
certainly makes it colorful and interesting. But they have to be
pointed out as such. For example, instead of plainly writing
statements like
- One who cares for the Zawgyi gamon is likely to win the lottery,
- Its leaves should not be cut off. If done, quarrels between husband and wife are likely to happen, or
- A tea-cupful of liquid extract obtained by grinding its leaves, taken for about ten days, cures the coughing up of blood (consumption),
quotation marks could be placed around them. Or a phrase like “Many
believe that ...” could be added to make it more explicit that we are
dealing with folklore.
Wikipedia's philosophy is that an article ranked as a Stub, like
Zawgyi Beard Gamon, will attract the contributions it calls for and
hopes for, so that it gets expanded and improved. That didn't
materialize in this particular case, and maybe in many more. I guess
that is because we Myanmars were so late in getting interested in, and
involved with, Wikipedia and other sites and services on the Internet,
except, of course, the immensely popular Facebook.
If we Myanmars are
not interested, would any non-Myanmars be? We don't know. But if
they were, most of them would probably try something like Google
Translate to make sense of an article like “Zawgyi Beard
Gamon” in the Myanmar language. This is what they would get now:
What you see is a
translation in which the original is not at all recognizable. It is
distorted and looks funny. But in reality, it isn't a laughing matter
at all. Yes, Google Translate has problems, but it may not be
entirely Google's fault, because it is successful with other
languages.
Google Translate
first added the Myanmar language in December 2014. According to the
official Google Translate Blog:
- Myanmar (Burmese, မြန်မာစာ) is the official language of Myanmar with 33 million native speakers. Myanmar language has been in the works for a long time as it's a challenging language for automatic translation, both from language structure and font encoding perspectives. While our system understands different Myanmar inputs, we encourage the use of open standards and therefore only output Myanmar translations in Unicode. ...
We’re just getting started with these new languages and have a long
way to go. You can help us by suggesting your corrections using
"Improve this translation" functionality on Translate and contributing
to Translate Community.
Well, Google Translate is using a 'neural machine translation engine
- Google Neural Machine Translation (GNMT) - which translates "whole
sentences at a time", rather than just piece by piece'. This sounded
to me like they are doing something very advanced and very good.
While the technology of the translation
engine is way over our heads, it is not hard to understand that it
needs data to work on in the process of grinding out translations. For
that, Google Translate seems to need a collection of the same text in
a pair of languages (Myanmar and English versions, for example) of at
least 150-200 million words, and separate collections of more than a
billion words each for Myanmar and English.
So it
seems we could improve translations from Myanmar to other languages
(not only with Google's GNMT, but possibly with other approaches)
generally by making available a wider range of material to work on.
That means making documents in the Myanmar language widely available
in digital format and in big volume - the bigger the better. That also
means making sure they are in Unicode format. Why? It seems obvious:
as the Google note quoted above says, they encourage the use of open
standards and output Myanmar translations only in Unicode.
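For anyone wondering how one might check whether a piece of Myanmar text is in Unicode order, here is a rough sketch in Python. It is only a heuristic of my own choosing, not anything Google or Wikipedia uses: in standard Unicode the vowel sign ေ (U+1031) is stored after its consonant (and any medial signs), while the popular non-Unicode font encoding (known as Zawgyi, nothing to do with the flower's name) stores it before the consonant. The function names are made up for illustration, and text that does not contain this vowel sign, or where it happens to follow a consonant from the previous syllable, will slip through.

```python
# Heuristic sketch only (an assumption, not an official detector):
# flag text where MYANMAR VOWEL SIGN E sits where Unicode order would not put it.

VOWEL_E = "\u1031"  # MYANMAR VOWEL SIGN E


def _can_precede_vowel_e(ch: str) -> bool:
    """True if ch may directly precede U+1031 in Unicode storage order."""
    cp = ord(ch)
    is_consonant = 0x1000 <= cp <= 0x1021 or cp == 0x103F
    is_medial = 0x103B <= cp <= 0x103E  # medial ya, ra, wa, ha
    return is_consonant or is_medial


def looks_non_unicode(text: str) -> bool:
    """Return True if the text shows a tell-tale non-Unicode ordering."""
    prev = ""
    for ch in text:
        if ch == VOWEL_E and (prev == "" or not _can_precede_vowel_e(prev)):
            return True
        prev = ch
    return False


if __name__ == "__main__":
    print(looks_non_unicode("\u1014\u1031"))  # consonant then vowel sign: False
    print(looks_non_unicode("\u1031\u1014"))  # vowel sign first: True
```

A check like this could be run over a folder of documents before publishing them, to get a first idea of which ones still need converting.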