This post is an afterthought to my exploration of dinari's question,
which was the subject of my last post. The solution in that post was
about reading in and writing out UTF-8 encoded Myanmar-Sar with R.
Even before I stumbled upon dinari's question, I had realized that
reading in and writing out Myanmar-Sar correctly in R was not for the
fainthearted, just from looking at a post like “R
on Windows: character encoding hell”,
for example. It was by sheer luck that I found the solution given in
my last post relatively quickly. However, I had dismissed the task
of displaying Myanmar-Sar on the console as too hard. That would be
desirable, but not that important so long as we can import
Myanmar-Sar correctly into R, process it, and export it correctly
to a text file. I have just seen the same idea expressed in an
answer to “Cannot
read unicode .csv into R”
on Stack Overflow.
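The import, process and export workflow mentioned above could be sketched roughly as follows. This is only a minimal illustration, not the code from my last post: the sample words, the temporary file, and the variable names are all made up for the example, and the text is written with \u escapes so that the script itself stays plain ASCII.

```r
# Hypothetical sample text, given as \u escapes so the script stays ASCII
txt <- c("\u1019\u103c\u1014\u103a\u1019\u102c", "\u1005\u102c")

# Hypothetical output file; any path would do
tmp <- tempfile(fileext = ".txt")

# Export: write the UTF-8 bytes as they are, without re-encoding
# them to the native code page of the machine
writeLines(enc2utf8(txt), tmp, useBytes = TRUE)

# Import: read the lines back and mark them as UTF-8
back <- readLines(tmp, encoding = "UTF-8")

# Process: the strings can be handled normally even if the console
# cannot display them
n_chars <- nchar(back)                 # code points per line
joined  <- paste(back, collapse = "")  # concatenate the two words
same    <- all(back == txt)            # did the round trip preserve the text?
```

Nothing here depends on how the console renders the strings, which is the point: the data can make the round trip intact even when the display is wrong.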
By the time I read that post by puslet88, I had already worked out
how to display Myanmar-Sar correctly on the R console. First, let's
revisit dinari's problem.
To begin with, we have to understand that we could see the
Myanmar-Sar in the text file we had written out only because the
Myanmar3 font (or, for that matter, any Unicode-compliant font such
as Tharlon, ThanLwin, or Padauk) was installed on our machine. In
my own experience, I was able to use the Myanmar3 font in the CSPro
software for developing data entry applications on Android phones.
So I looked around for a font that could be applied everywhere in
Windows and found the MyMyanmar Language System.
In my last post, displaying Myanmar-Sar
on the R console got as far as:
Then, after installing the Padauk font, we
get the kind of half Myanmar-Sar, half Unicode code point output that
dinari had complained about. We need to use the cat() function:
The Myanmar-Sar output looked as if
the wrong font had been used! I tried tweaking the Windows registry,
changing both MS Shell Dlg and MS Shell Dlg2
to Padauk under …\FontSubstitutes. That
didn't improve the R console display, but filenames in Windows
Explorer that had shown only small squares before were now
readable as Myanmar-Sar.
It looks like the names of the first two files were written in
Zawgyi. A word of caution here: make a registry backup file that you
can restore from before you try to modify the Windows registry.
Now that I've installed the MyMyanmar
Language System, I ran the part
of the script shown in my last screenshot above, and voilà!
So the
solution to dinari's problem is to have a Myanmar-Sar font installed
that can display system-wide, to use the list subscripts
correctly ([[1]][1]),
to convert the result to character, and to use the cat()
function. According to him, his problem occurred in a Mac environment,
of which I have no knowledge. But I guess the R syntax won't be that
different from R in the Windows environment.
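The subscripting and cat() steps of that recipe might look something like the sketch below. I don't have dinari's actual data, so the input line, the use of strsplit(), and the variable names are assumptions made up for illustration; only the [[1]][1] subscripts, the conversion to character, and cat() come from the recipe itself.

```r
# Hypothetical input: two Myanmar words separated by a space,
# written as \u escapes so the script stays ASCII
line <- "\u1019\u103c\u1014\u103a\u1019\u102c \u1005\u102c"

# strsplit() returns a list holding one character vector
words <- strsplit(line, " ", fixed = TRUE)

# [[1]] takes the vector out of the list, [1] takes its first element
first <- as.character(words[[1]][1])

# print() on the list may show <U+1019>-style escapes on a Windows console
print(words)

# cat() writes the string itself; whether it is readable still depends
# on a suitable font being available to the console
cat(first, "\n")
```

Note that the code only controls what R sends to the console. The font side of the problem is outside of R and has to be solved at the system level.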
After
I had installed MyMyanmar and successfully run the above code, I
discovered this post (too late?) on CNET,
http://download.cnet.com/MyMyanmar-Unicode-System/3000-2094_4-10578829.html:
The downside of MyMyanmar is
that when I became worried about using the MyMyanmar Language System
and uninstalled it, I was left with Windows popups whose text was
barely readable!
After that, no amount of safe tweaking with
Windows' “Personalize” settings or the registry could bring things
back to normal. I was about to reinstall Windows, but then I tried
installing the ThanLwin and Padauk fonts from SIL. Miraculously, the
problem disappeared! I don't know which of the two rescued me, but
everything was fine again.
Well, I won't blame the MyMyanmar
people for that. We all need encouragement. We are all in the same
boat and we have a long way to go. But they would do well to warn us
about this side effect and give us the remedy. Their uninstaller
should have handled this problem.
To play it safe, I installed MyMyanmar
on another laptop to run the last code fragment I've shown, just
for writing this post. For my own work, at present, I am
happy with R's ability to import, process, and export UTF-8 data
(including Myanmar-Sar) independently of how it is displayed on
the console.