Wednesday, August 3, 2016

Myanmar-Sar in R – II: Displaying on the Console


This post arises as an afterthought to my explorations relating to dinari's question, which was the subject of my last post. The solution in that post was about reading in and writing out UTF-8 encoded Myanmar-Sar with R. Even before I stumbled upon dinari's question, I had come to realize that reading in and writing out Myanmar-Sar correctly in R was not for the fainthearted, just by looking at a post like “R on Windows: character encoding hell”, for example. It was by sheer luck that I found the solution given in my last post relatively quickly. However, I had dismissed the task of displaying Myanmar-Sar on the console as too hard. That would be desirable, but not that important so long as we could import Myanmar-Sar correctly into R, process it, and export it correctly to a text file. Just now I've seen the same idea expressed in an answer to “Cannot read unicode .csv into R” on Stack Overflow.


By the time I read that post by puslet88, I had already worked out how to correctly display Myanmar-Sar on the R console. First, let's revisit dinari's problem. To begin with, we have to understand that we were able to see the Myanmar-Sar in the text file we had written out only because we had installed the Myanmar3 font (or, for that matter, any Unicode-compliant font such as Tharlon, ThanLwin, or Padauk) on our machine. In my own experience, I had been able to use the Myanmar3 font in CSPro for developing data entry applications on Android phones. So I looked around for a font that could be applied everywhere in Windows and found the MyMyanmar Language System.

In my last post, displaying Myanmar-Sar on the R console had got as far as:


Then, after installing the Padauk font, we get the kind of half Myanmar-Sar, half Unicode code point output that dinari had complained about. We need to use the cat( ) function:
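As a minimal sketch of the difference between print( ) and cat( ) (this is not the original screenshot, and the sample string is my own assumption, built from \u escapes so the script itself stays plain ASCII):

```r
# Hypothetical sample: six Myanmar code points written as \u escapes.
x <- "\u1019\u103c\u1014\u103a\u1019\u102c"

Encoding(x)  # "UTF-8": strings built from \u escapes are marked as UTF-8
nchar(x)     # 6 code points

# print() shows a quoted representation; depending on the locale and the
# R version it may fall back to <U+XXXX> escapes for some characters.
print(x)

# cat() writes the string itself, leaving the rendering to the console font.
cat(x, "\n")
```

Whether the glyphs then look right still depends on the font the console uses.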


The Myanmar-Sar output looked as if the wrong font had been used! I tried tweaking the Windows registry, for example setting both MS Shell Dlg and MS Shell Dlg2 to Padauk in \FontSubstitutes. That didn't improve the R console display, but file names in Windows Explorer, where only small squares were visible before, now showed readable Myanmar-Sar.


It looks like the names of the first two files were written in Zawgyi. In this context, though, I have to caution you to create a registry restore file before you try to modify the Windows registry.

Now that I've installed the MyMyanmar Language System, I run the part of the script shown in my last screenshot above, and voilà!


So the solution to dinari's problem is to have a Myanmar-Sar font installed that can display system-wide, to use the list subscripts correctly ([[1]][1]), to convert the result to character, and to use the cat( ) function. According to him, his problem was in a Mac environment, of which I have no knowledge. But I guess the R syntax won't be that different from R in the Windows environment.
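The subscripting, conversion and cat( ) steps might look like the sketch below. The sample string and the use of strsplit( ) to produce a list are my assumptions for illustration; this is not dinari's actual data or code.

```r
# Hypothetical input: two Myanmar words separated by a space,
# written with \u escapes so the script itself stays ASCII.
line <- "\u1019\u103c\u1014\u103a\u1019\u102c \u1005\u102c"

# strsplit() returns a list with one element per input string.
parts <- strsplit(line, " ", fixed = TRUE)

# [[1]] extracts the character vector from the list;
# [1] then picks its first element.
first <- as.character(parts[[1]][1])

# cat() writes the text itself rather than a quoted representation.
cat(first, "\n")
```

Printing parts directly would go through print( ) for lists, which is where escaped code points tend to show up.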

After I had installed MyMyanmar and successfully run the above code, I discovered this post (too late?) on CNET, http://download.cnet.com/MyMyanmar-Unicode-System/3000-2094_4-10578829.html:


The downside of MyMyanmar is that when I became worried about using the MyMyanmar Language System and uninstalled it, I was left with Windows popups whose text was barely readable!

Then no amount of safe tweaking with Windows' “Personalize” settings or the registry could bring things back to normal. I was about to reinstall Windows, but then I tried installing the ThanLwin and Padauk fonts from SIL. Miraculously, the problem disappeared! I don't know which one of them rescued me, but everything was fine.

Well, I won't blame the MyMyanmar people for that. We all need encouragement. We are all in the same boat and we have far to go. But they would do well to warn us about this side effect and give us the remedy. Their uninstaller should have handled this problem.

To play it safe, I installed MyMyanmar on another laptop to run the last code fragment I've shown, just for writing this post. For my work, at present, I am happy with R's ability to import, process, and export UTF-8 data (including Myanmar-Sar) independently of how it is displayed on the console.
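That independence from the console can be checked with a small round trip through a file. This is only a sketch under my own assumptions (a temporary file and a made-up sample string) and is not necessarily the method of my last post:

```r
# Hypothetical sample text, written with \u escapes.
x <- "\u1019\u103c\u1014\u103a\u1019\u102c\u1005\u102c"

# Write the UTF-8 bytes as they are, without re-encoding to the native locale.
path <- tempfile(fileext = ".txt")
con <- file(path, open = "wb")
writeLines(enc2utf8(x), con, useBytes = TRUE)
close(con)

# Read the line back and mark it as UTF-8.
y <- readLines(path, encoding = "UTF-8")

# The comparison works on the data, so it does not matter
# whether the console can draw the glyphs.
identical(x, y)
```

Opening the written file in an editor with a Unicode-compliant font installed is then the way to inspect the text visually.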


