Wednesday, October 28, 2015

JSON, who?


I read about Jason and the Golden Fleece in my schoolboy days, and now I remember it only vaguely as a piece of Greek mythology and nothing more. These helped me refresh my memory:

I guess JSON is pronounced the same but the difference is that it is very real and becoming more and more visible on the Web.

The first time I heard of JSON was about two years ago, when I was looking for large data files of one terabyte or more so that I could try playing with Big Data. Looking for sources of big data, I vaguely came to understand that big boys like Amazon or Google could let me get such data in a format called JSON. I had never heard the name before, and I thought it must be terribly hard to learn and use, so I dumped the idea of getting data that way. Instead I went for more traditional statistical data formats such as plain text, SPSS, or Stata, and ended up collecting a couple of sub-terabyte data files.

The official website (json.org) describes JSON as follows:

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language ... JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.
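To see what "easy for machines to parse and generate" looks like in practice, here is a minimal sketch in Python (my choice of language for illustration; the record is made up). It uses only the standard json module to turn a JSON text into ordinary Python objects and back again.

```python
import json

# A made-up record, only for illustration; it is not from any real dataset.
text = '{"name": "Jason", "quest": "Golden Fleece", "crew": ["Heracles", "Orpheus"], "ships": 1}'

record = json.loads(text)          # JSON text -> Python dict
crew_size = len(record["crew"])    # JSON arrays become Python lists

# Python dict -> JSON text again, indented for human readers
pretty = json.dumps(record, indent=2)
print(pretty)
print("crew size:", crew_size)
```

The same round trip exists in most languages, which is what the "completely language independent" claim in the description amounts to.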

Stat 545 getting data from the Web – part 2 gives the example of JSON,


and XML:



I'm still surprised at how many people are unaware that 22 of the top federal agencies have data inventories of their public data assets, available in the root of their domain as a data.json file. This means you can go to many agency domains, in the form example.gov/data.json, and find a machine-readable list of that agency's current inventory of public datasets.

I currently know of 22 federal agencies who have published data.json files
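As a rough idea of what "machine readable" buys you, here is a sketch of listing the datasets in such a catalog that are offered in a given format. It makes several assumptions: the field names ("dataset", "title", "distribution", "mediaType") are what I understand the Project Open Data metadata schema to use, the miniature catalog is made up, and the helper titles_with_media_type is a hypothetical name of my own. It works on an inline sample so that it runs without a network connection.

```python
import json

# Made-up miniature catalog. The field names ("dataset", "title",
# "distribution", "mediaType") are assumed to follow the Project Open Data
# metadata schema; check a real data.json before relying on them.
SAMPLE_CATALOG = """
{
  "dataset": [
    {"title": "Example dataset A",
     "distribution": [{"mediaType": "application/json"},
                      {"mediaType": "text/csv"}]},
    {"title": "Example dataset B",
     "distribution": [{"mediaType": "application/xml"}]},
    {"title": "Example dataset C"}
  ]
}
"""

def titles_with_media_type(catalog, media_type):
    """Return titles of datasets that offer at least one distribution of media_type."""
    titles = []
    for entry in catalog.get("dataset", []):
        types = {d.get("mediaType") for d in entry.get("distribution", [])}
        if media_type in types:
            titles.append(entry.get("title", "(untitled)"))
    return titles

catalog = json.loads(SAMPLE_CATALOG)
json_titles = titles_with_media_type(catalog, "application/json")
print(len(catalog["dataset"]), "datasets in the catalog")
print("available as JSON:", json_titles)
```

For a live catalog, the sample string could be replaced by the text fetched with urllib.request.urlopen from an agency's data.json address, provided the real file follows the assumed layout.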

It looks like JSON is somewhat new, even in the U.S. When I took a quick look at the open data situation of various nations in July 2015, and at Hong Kong and Singapore in particular, I found that 20 datasets from Hong Kong were available in JSON, but apparently none from Singapore, where the datasets were mostly in XML format.


Ten years ago, XML was the primary data interchange format. When it came on the scene, it was a breath of fresh air and a vast improvement over the truly appalling SGML (Standard Generalized Markup Language).

It enabled people to do previously unthinkable things, like exchange Microsoft Office documents across HTTP connections. With all the dissatisfaction surrounding XML, it’s easy to forget just how crucial it was in the evolution of the web in its capacity as a “Swiss Army Knife of the internet.”

But it’s no secret that in the last few years, a bold transformation has been afoot in the world of data interchange. The more lightweight, bandwidth-non-intensive JSON (JavaScript Object Notation) has emerged not just as an alternative to XML, but rather as a potential full-blown successor. A variety of historical forces are now converging and conspiring to render XML less and less relevant and to crown JSON as the privileged data format of the global digital architecture of the future. I think that the only question is how near that future is.

Well then, that inspired me to experiment with accessing JSON data and to play with it. I looked for R packages that would help me do it and found the jsonlite package. I installed it, and after some frustrating moments I was able to get two or three data files in JSON format converted to standard data frames in R.
Looking for JSON data, I found one convenient source in the Data.gov site hosted by the U.S. Government. From there I found the Biodiversity data by County for the State of New York. This is the R script I used for downloading the data and converting it into a data frame.


This exercise took me 91.3 seconds on my i5 laptop with 8GB RAM and Windows-7.
I dare you to mess with JSON. Who's afraid of JSON, anyway?
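For readers who want to try the JSON-to-table step without R, here is a minimal sketch in Python. It is not the R script mentioned above, and it rests on an assumption about how the portal lays out its JSON export: column names under meta.view.columns and rows as plain lists under data. The column names and values are invented, and to_records is a hypothetical helper of my own.

```python
import json

# Made-up sample. The layout (column names under meta.view.columns, rows as
# plain lists under data) is an assumption about how the portal exports JSON;
# the column names and values are invented for illustration.
SAMPLE = """
{
  "meta": {"view": {"columns": [{"name": "County"},
                                 {"name": "Category"},
                                 {"name": "Common Name"}]}},
  "data": [["Albany", "Animal", "American Toad"],
           ["Albany", "Plant", "Sugar Maple"],
           ["Erie", "Animal", "Wood Frog"]]
}
"""

def to_records(doc):
    """Turn the column list plus the row lists into one dict per row."""
    names = [col["name"] for col in doc["meta"]["view"]["columns"]]
    return [dict(zip(names, row)) for row in doc["data"]]

records = to_records(json.loads(SAMPLE))
for rec in records:
    print(rec)
```

With pandas installed, pandas.DataFrame(records) would give a table comparable to an R data frame.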






Wednesday, October 21, 2015

PVT tawla or lost in font jungle


As mentioned in my last post, Myanglish PVT, I was frustrated, very much frustrated, at failing to get a respectable version of my PVT data collection application in the Myanmar language. It must be because I committed the sin of borrowing the name Tawla (into the woods), a classical genre of Myanmar poetry, and using it for my other blog not under its time-honored name but as Sylvan Stroll. Sooner or later the gods had to punish me. Here's how my PVT got tangled in the font jungle.


Since I created the application with the Myanmar3 font, I could very well understand from those screenshots that the default font and Zawgyi One could not reproduce the original Myanmar3 text shown in the first screenshot at the top, because they are not fully compliant Unicode fonts.


Among the two Android phones (HTC Desire-X, Xiaomi Redmi) and one tablet (Samsung Galaxy Tab3), only the Samsung allowed the installation of Myanmar fonts, and I installed Myanmar3, Padauk Unicode, and Zawgyi One on it for the purpose of this test. The tablet already had Frozen Zawgyi keyboard pro and Frozen keyboard installed when my son gave it to his mom. From that I could guess that Zawgyi is the popular font for Facebook, and pardon me for not being a part of this immensely popular culture here.

The HTC Desire-X came with its own default Myanmar font. Comparing it with the last screenshot of the previous picture, I saw an exact match, so the font in the HTC must be Zawgyi One. This piece of information, laboriously found by poor me, and more besides, must be just common knowledge in the Myanmar font community.

But what surprised me most was the way the Myanmar3 font, displayed perfectly in the Windows environment, becomes garbled on Android. It is said that the Android platform is not as good as the Windows platform at rendering Myanmar Unicode fonts. But the difference between the Myanmar3 font on Windows and the Myanmar3 font on Android seems to lie in reasons deeper than that.

Looking at the screenshots, I hope the Myanmar font gurus may see at once what was wrong and then look for solutions. The following examples from Myanmar Fonts which follow Unicode rules, and some other fonts in the Unicode family not reproduced here, show the same Myanmar-sar, except for a few glitches here and there.


If I am right, those examples were for the Windows platform. Obviously, however, regardless of platform, if different Myanmar fonts comply with the same Unicode standard, they should display the same Myanmar-sar. Or would we be expecting too much of the Android platform at present?
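One way to see what "complying with the same Unicode standard" involves is to look at the stored text rather than the rendered glyphs. The sketch below, in Python, prints the code points of the word "Myanmar" in Unicode encoding. The word is my own choice for illustration and is not taken from the PVT application.

```python
import unicodedata

# The word "Myanmar" in Unicode encoding, written with escapes so that it
# does not depend on the fonts installed on the machine showing this post.
word = "\u1019\u103c\u1014\u103a\u1019\u102c"

names = [unicodedata.name(ch) for ch in word]
for ch, name in zip(word, names):
    print(f"U+{ord(ch):04X}  {name}")
```

Note that the medial ra (U+103C) is stored after the letter MA it attaches to, although it is drawn around that letter on screen. Moving it into place is the job of the font together with the platform's text rendering, which is one place where the same Unicode text could come out differently on Windows and Android. As far as I understand, fonts that are not Unicode compliant may expect a different stored order or different code points, so the same bytes look garbled under them.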

On my Windows-7 laptop, Myanmar3 and Tharlon are interchangeable. In fact, I believed that the list of political parties posted on the UEC website had been done in Myanmar3 because I could see the Myanmar-sar perfectly, without loss, with that font. Now, when I repeat the process by copying from the UEC website (here) and pasting into Excel, I notice only now that the font name appears as Tharlon. I must have missed that clue because I was influenced by information I got somewhere that the Myanmar government is using Myanmar3 as a de facto standard. That didn't seem to harm my PVT application, though.

As I said, of the three Android devices available to us, only the Samsung tablet allowed me to install Myanmar fonts without “root”. I installed the Myanmar3 and Padauk fonts using the iFont application available from the Google Play Store. Then I switched between the fonts shown in my screenshots through Settings > Display > Font Style. The result on my PVT data collection application is as you can see in the first set of screenshots shown above.

In the case of my Xiaomi Redmi phone, I can't get the Myanmar3 or Padauk font installed as yet. I came to know that Xiaomi phones available locally here have been customized to be able to install such fonts. Mine has the ROM Redmi 1 WCDMA Stable Version (Singapore) JHBMIBH25.0 and can't install application software designed for other ROMs, such as the one for Myanmar. I've started a thread on the official MIUI website asking for help, and the replies so far don't seem to promise any solution.

As for the HTC Desire-X phone, I couldn't find Settings > Display > Font Style on it, and so I didn't try installing the required Myanmar font for my PVT application.

All in all, even if I could find a solution for installing the desired Unicode fonts on any Android phone, it would be futile if I got the kind of garbled results that I have demonstrated. Anyway, it's out of my league to understand why it happened the way it happened. Do I blame Android or the font designers, or both?


So much for complexities. What I thought would have been a simple task of replacing English labels with Myanmar on the Android application turned out to be something not so simple at all. For now, I've gotten lost in the font jungle in the very first verse of my PVT tawla.

Tuesday, October 13, 2015

Myanglish PVT


Myanglish, though looked down upon by well-meaning adults, appeals to young people, my grandson for example. I had been thinking that they should just learn to use real English instead of stomping in trash. That was my verdict until I tried fashioning a Myanmar version of my PVT data collection program for Android hand-phones a few days back.

The truth is that I found it really hard to render the questions, instructions, and response categories of my PVT application from English into the Myanmar language. The CSEntry program for Android says it is multilingual, so I thought it wouldn't be much trouble at all. As it turned out, I had to struggle every step of the way. It's worth sparing at least one separate post for my experience on this, though it could then be more of blurting out my frustrations than anything.

Briefly, I first had to get my Windows 7 laptop to display the Myanmar language in the Myanmar 3 font. Why Myanmar 3 in particular? Because the UEC (Union Election Commission, in charge of the Myanmar Elections 2015) has posted the list of political parties that have registered for the election of November 8, 2015. This list of ninety-one parties is the essential part of my PVT application on Android phones, and it is in the Myanmar 3 font. Reading this list on my laptop was no problem because I had installed this font some time before.

Next I had to find a way to write in Myanmar on my laptop. Thanks to a Google group active in the Myanmar language, I found the “myanmar3-wins.zip” from here, “MyFontsSettings.xlsx” from here, and “MM3FontInstallationGuidewindow7.docx” from here. These were enough to let me type Myanmar words in Microsoft Word or Excel or in a text document in Open Office.

Since my real need was to make a Myanmar language version of my PVT data collection application, I had to do the language conversion first in the CSPro program and then use it to develop and compile the Android data collection application. Copy-pasting the UEC political party list one by one into the data entry application on the CSPro side was easy enough, only boring and tedious. But typing the questions that have to appear on the Android application was very difficult. You can type in the Myanmar 3 font, but once the text is there you can't edit it. Similarly, you can type in Word and then copy and paste into the application, but here again you can't edit. All you can do is delete it, type the questions in a document outside, and paste again.

Now the problem was typing in the Myanmar translations. Because I'd never typed in the Myanmar language, I could never have done the whole thing myself. So I looked for help, and a younger IT person came to the rescue. He had it done in a few hours, and with some laborious edits by me, here you can finally see the Myanmar version of the Parallel Vote Tabulation application “PVT_5”.

To play with my PVT_5 application, you need the PVT_5.pen and PVT_5.pff files. You also need to have CSEntry for Android installed on your Android phone or tablet. CSEntry can be downloaded from the Google Play store, and:

PVT_5.pen is available here.
PVT_5.pff is available here.

For a little tip on getting started with such data collection applications on Android, see my earlier post Yan Can Cook or More fun with PVT. The present post must have been quite dry because, unlike in my earlier one, I can't include screenshots of my application. The fact is that between my wife and me, an old couple, we have three hand-phones and two tablets, and yet none of them could display the Myanmar 3 font.

My hasty search on the Web shows that I might need to “root” my Android phones to allow installing the Myanmar 3 font. It's something like “jailbreaking” Apple phones and tablets, and not risky, they say. I still have to think about it and would like to learn more. But while I was developing this application, I saw it work on a young friend's Android phone that could display the Myanmar 3 font.


One last thing. I am far from being able to make everything perfect in my PVT application in the Myanmar language. It is still crude and incomplete, and some English entries remain unconverted. That's also the reason I call it a Myanglish application, in a sense different from the standard concept of Myanglish of my grandson and others. Nevertheless, we aren't that different in having to make do with whatever we have at the moment, whether for lack of knowledge on our part, for lack of command over resources, for a great many other things, or for all of them.