Thursday, November 5, 2015

Dummy meets JSON – II


Before I tried out the dumb approach shown in my last post, I had what I guess was a workable idea: flatten the second column, named “apis”, which is a list of 102 data frames, each with two variables, “name” and “url”, and a variable number of rows.

To do so, I needed to find a way to convert the list of 102 data frames into a single data frame and then combine this with the first column of the original data frame. For that I found this to work perfectly:
This single data frame consisted of 421 rows. To combine it with x$name, which has 102 rows, I could create an id column in the collapsed data frame and, using the row.names of x as integers, merge the two data frames. Then I could extract the appropriate columns of the merged data frame and write them to a csv file. The problem was that I didn't know how to add the id to the collapsed data frame. That's why I resorted to the dumb approach given in my last post.

Now I found that I could add id to the list of data frames x$apis (or x[[2]]) with:
That let me get a not-so-dumb solution as follows:

But wait, the R gurus would like to see something more elegant than this. Instead of the “for” loop in line number 9 of my script, they would like to see something that is not using a loop. Well, it is beyond my reach right now.
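One loop-free possibility, sketched on toy data rather than on the real file: Map() can tag each small data frame with the row number of its parent, do.call(rbind, ...) collapses the list, and merge() joins the result back to the first column. The two-row x below, its values, and the column name "agency" are made up for illustration; only the shape (a "name" column plus a list column "apis" of data frames with "name" and "url") follows the description above.

```r
# Toy stand-in for the parsed JSON (hypothetical values, same shape as x).
x <- data.frame(name = c("Agency A", "Agency B"), stringsAsFactors = FALSE)
x$apis <- list(
  data.frame(name = c("API 1", "API 2"),
             url  = c("http://example.gov/1", "http://example.gov/2"),
             stringsAsFactors = FALSE),
  data.frame(name = "API 3", url = "http://example.gov/3",
             stringsAsFactors = FALSE)
)

# Tag every small data frame with the row number of its parent, no loop.
tagged <- Map(function(d, i) cbind(id = i, d), x$apis, seq_along(x$apis))

# Collapse the list of data frames into a single data frame.
apis <- do.call(rbind, tagged)

# Join with the first column of x, keyed on the same row number.
agencies <- data.frame(id = seq_len(nrow(x)), agency = x$name,
                       stringsAsFactors = FALSE)
flat <- merge(agencies, apis, by = "id")

# Every column is now an ordinary vector, so write.csv has nothing to choke on.
out <- file.path(tempdir(), "individual_apis_flat.csv")
write.csv(flat, file = out, row.names = FALSE)
```

On the real data the same three steps should give one row per API (421 in all), each carrying the name from the first column.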


Wednesday, November 4, 2015

Dummy meets JSON - I


It must have been a not-so-intelligent autodidact's typical approach to something new for him.

There's this website posting some data in JSON format. I stumbled upon it. More accurately, it was a series of stumbles. I tried to find out more about open data on the web; stumbled upon APIs (application programming interfaces); couldn't understand much; looked for examples; found more about JSON; stumbled upon the Data.gov website; looked for ways to access data in JSON format; found and installed jsonlite, an R package; found some example scripts for accessing JSON with jsonlite; and had a couple of successes with simple JSON data like the one on the biodiversity of New York State demonstrated in my last post.

Well, that wasn't the end of my story. From the API Resources for Federal Agencies available here, I downloaded the individual_apis.json file. The following is a part of the file opened by notepad and you can read and understand everything written there.


With the following code I could access the JSON file from R.

The code head(x) gives the first six rows of x;


Moving the slider in the scroll bar to the right shows the remaining part.


All this seems fine. So I tried to write the whole of “x” to a comma separated value (csv) text file.

# write csv file
write.csv(x, file = "individual_apis.csv", row.names = FALSE)

It couldn't be done. The error message was:

Error in .External2(C_writetable, x, file, nrow(x), p, rnames, sep, eol, :
unimplemented type 'list' in 'EncodeElement'

I tried looking for a possible solution on the web and tried out many that seemed promising, but failed miserably. Finally I found a post by Mark Needham (R: write.csv – unimplemented type ‘list’ in ‘EncodeElement’) that showed the remedy for a problem whose error message exactly matched mine. He advised that:

“If we do have a list that we want to add to the data frame we need to convert it to a vector first so we don’t run into this type of problem”

In our case, as well as his, the data frame consisted of two columns and the second column happened to be a list instead of a vector. The difference between his data frame and mine was that his second column, when converted to a vector, had the same number of elements as the first column, whereas my second column consisted of a list of data frames, each with two columns and a variable number of rows. And I don't have enough knowledge of R to be able to adapt his method to my more complex situation.
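To make the simpler case concrete, here is a minimal sketch on made-up data, not the actual file: a two-column data frame whose second column is a list, followed by one way to turn that list into a plain character vector. Collapsing each element with paste() is my own assumption for illustration; it is not necessarily the method used in the post quoted above.

```r
# Toy data frame whose second column is a list (hypothetical values).
x <- data.frame(name = c("Agency A", "Agency B"), stringsAsFactors = FALSE)
x$apis <- list(c("API 1", "API 2"), "API 3")

out <- file.path(tempdir(), "list_column.csv")

# Writing the list column as it stands is expected to fail with the
# 'unimplemented type list' error; tryCatch keeps the script going
# either way and stores the message, if any.
res <- tryCatch(write.csv(x, file = out, row.names = FALSE),
                error = function(e) conditionMessage(e))

# Convert the list to a character vector: one string per row.
x$apis <- vapply(x$apis, paste, character(1), collapse = "; ")

# Now every column is an ordinary vector and the file can be written.
write.csv(x, file = out, row.names = FALSE)
```

This only works because each list element can sensibly be squeezed into one string; a list of data frames with several columns needs to be flattened into rows instead.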
Then as my second and third screenshots above showed, the whole of my data frame could be displayed on the screen properly. So in a typical dummy way I realized that if I could transfer that output to a text file it would be the solution. The only way I know of for that is to use the “sink” function and after some struggle I succeeded. My script runs:

Then the saved text file is read in two parts, cleaned, and combined into one data frame, and the csv file is written (credit due to f3lix for the answer to How to trim leading and trailing whitespace in R? on Stack Overflow).
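The general shape of that sink-and-read-back route can be sketched as follows. This is an illustration on a made-up two-row data frame, not the script itself; the file name, the width setting, and the data are assumptions. The trimming line uses a regular expression of the kind given in the Stack Overflow answer credited above.

```r
# Toy data frame standing in for the real one (hypothetical values).
x <- data.frame(name = c("Agency A", "B"), stringsAsFactors = FALSE)

out <- file.path(tempdir(), "console_dump.txt")

# Widen the console so that rows are less likely to wrap (an assumption;
# very wide data frames may still be printed in several blocks).
old <- options(width = 200)

sink(out)      # divert console output to the file
print(x)
sink()         # restore normal console output

options(old)   # put the original width back

# Read the dump back as plain lines and trim leading/trailing whitespace.
lines   <- readLines(out)
trimmed <- gsub("^\\s+|\\s+$", "", lines)
```

What comes back is text laid out for the eye, so each line still has to be split into fields by hand, which is where things can quietly go wrong.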

However, I couldn't be triumphant for long because when I compared contents of the csv file with the output on the console I found at once a discrepancy. The console displayed for the 8th row:

Cropland Data Layer, Quick Stats API, http://www.nass.usda.gov/research/Cropland/sarsfaqs2.html#_Cropscape1.2, http://quickstats.nass.usda.gov/api

while individual_apis.csv gives:

Cropland Data Layer, Quick Stats API, http://www.nass.usda.gov/research/Cropland/sarsfaqs2.html.


So it missed #_Cropscape1.2, http://quickstats.nass.usda.gov/api and I couldn't get that right. Besides, there could be a lot more misses or other errors, which means I'll have to look for a better way.
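One possible culprit, offered as a guess since it depends on how the text file was read back: read.table() treats "#" as the start of a comment by default (comment.char = "#") and drops the rest of the line, which would cut a URL exactly at "#_Cropscape1.2". The sketch below shows the effect on a single made-up line; the ";" separator is an assumption chosen for the example.

```r
# One line of text with a "#" inside a URL (separator ";" is hypothetical).
line <- paste0("Quick Stats API;",
               "http://www.nass.usda.gov/research/Cropland/sarsfaqs2.html#_Cropscape1.2")

# Default settings: everything from "#" onward is treated as a comment.
a <- read.table(text = line, sep = ";", stringsAsFactors = FALSE)

# Comments switched off: the URL survives intact.
b <- read.table(text = line, sep = ";", comment.char = "",
                stringsAsFactors = FALSE)

a$V2   # URL cut off just before "#"
b$V2   # full URL, including "#_Cropscape1.2"
```

If the file was read some other way, this explanation does not apply, but it is cheap to rule out by setting comment.char = "".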

Wednesday, October 28, 2015

JSON, who?


I had read about Jason and the golden fleece in my school boy days and now I remembered that vaguely as a piece of Greek mythology and nothing more. These help me refresh my memory:

I guess JSON is pronounced the same but the difference is that it is very real and becoming more and more visible on the Web.

The first time I had heard of JSON was about two years ago when I was looking for large data files of one Terabyte or more so that I could try playing with Big Data. Looking for sources of big data, I vaguely came to understand that big boys like Amazon or Google for example, could let me get such data in a format called JSON. It was the first time I heard of that name and I thought it must be terribly hard to learn and use it. So I dumped the idea of trying to get data that way. And I went for more traditional statistical data formats like text format, or SPSS format, or Stata format and ended up collecting a couple of sub-terabyte data files.

The official website (json.org) described JSON as

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language ... JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.

Stat 545 getting data from the Web – part 2 gives the example of JSON,


and XML:



I'm still surprised at how many people are unaware that 22 of the top federal agencies have data inventories of their public data assets, available in the root of their domain as a data.json file. This means you can go to many example.gov/data.json and there is a machine readable list of that agencies current inventory of public datasets.

I currently know of 22 federal agencies who have published data.json files

Looks like JSON is somewhat new, even in the U.S. A quick look at the situation of open data of various nations in July 2015 and looking at Hong Kong and Singapore I found that 20 datasets of Hong Kong were available in JSON, but looks like there's none in Singapore where the data sets were mostly in XML format.


Ten years ago, XML was the primary data interchange format. When it came on the scene, it was a breath of fresh air and a vast improvement over the truly appalling SGML (Standard Generalized Markup Language).

It enabled people to do previously unthinkable things, like exchange Microsoft Office documents across HTTP connections. With all the dissatisfaction surrounding XML, it’s easy to forget just how crucial it was in the evolution of the web in its capacity as a “Swiss Army Knife of the internet.”

But it’s no secret that in the last few years, a bold transformation has been afoot in the world of data interchange. The more lightweight, bandwidth-non-intensive JSON (JavaScript Object Notation) has emerged not just as an alternative to XML, but rather as a potential full-blown successor. A variety of historical forces are now converging and conspiring to render XML less and less relevant and to crown JSON as the privileged data format of the global digital architecture of the future. I think that the only question is how near that future is.

Well then, that inspired me to experiment with accessing JSON data and to play with it. I looked for R packages that would help me do it and I found the jsonlite package. I installed it, and after some frustrating moments I was able to get two or three data files in JSON format converted to the standard data frame in R.
Looking for JSON data, one convenient source was the Data.gov site hosted by the U.S. Government. From there I found the Biodiversity data by County for the State of New York. This is the R script I used for downloading the data and converting it into a data frame.


This exercise took me 91.3 seconds on my i5 laptop with 8GB RAM and Windows-7.
I dare you to mess with JSON. Who's afraid of JSON, anyway.






Wednesday, October 21, 2015

PVT tawla or lost in font jungle


As mentioned in my last post Myanglish PVT, I was frustrated, very much so, at failing to get a respectable version of the PVT data collection application in the Myanmar language. It must have been because I committed the sin of borrowing the name Tawla (into the woods), a classical genre of Myanmar poetry, and calling it not by its time-honored name but Sylvan Stroll for my other blog; sooner or later the gods would have to punish me. Here's how my PVT got tangled in the font jungle.


From those screenshots, since I created the application with the Myanmar3 font, I could very well understand that the default font and Zawgyi One could not reproduce the original Myanmar3 rendering shown in the first screenshot at the top, because they are not fully compliant Unicode fonts.


Among the two Android phones (HTC Desire-X, Xiaomi Redmi) and one tablet (Samsung Galaxy Tab3), only the Samsung allowed the installation of Myanmar fonts, and I had installed Myanmar3, Padauk unicode, and Zawgyi One for the purpose of this test. The tablet already had Frozen Zawgyi keyboard pro and Frozen keyboard installed when my son gave it to his mom. From that I could guess Zawgyi is the popular font for Facebook, and pardon me for not being a part of this immensely popular culture here.

For HTC Desire-X, it came with its own default Myanmar font. Comparing it with the last screen shot of previous picture I saw an exact match. So the font in HTC must be Zawgyi One. This laboriously found piece of information by poor me, and more, must have been just some common knowledge in the Myanmar font community.

But what surprised me most was the way the Myanmar3 font, displayed perfectly in the Windows environment, became garbled on Android. It was said that the Android platform is not as good as the Windows platform in rendering Myanmar Unicode fonts. But the difference between the Myanmar3 font on Windows and the Myanmar3 font on Android seems to lie in reasons deeper than that.

Looking at the screenshots, I hope the Myanmar font gurus may see at once what was wrong, and then they would look for solutions. The following examples from Myanmar Fonts which follow Unicode rules, and some other fonts in the Unicode family not reproduced here, show the same Myanmar-sar, except for a few glitches here and there.


If I am right those examples were for Windows platform. Obviously however, regardless of platform, if different Myanmar fonts comply with the same unicode standard, they should display the same Myanmar-sar. Or would we be expecting too much for the Android platform at present?

On my Windows-7 laptop, Myanmar3 and Tharlon are interchangeable, and in fact when I believed that the list of political parties posted on the UEC website had been done in Myanmar3, it was because I could see Myanmar-sar perfectly, without loss, with that font. Now when I try repeating the process by copying from the UEC website (here) and pasting it into Excel, I notice only now that the font name appears as Tharlon. Well, I must have been influenced by the information I got somewhere that the Myanmar government is using Myanmar3 as a de facto standard, and so missed that clue. That didn't seem to harm my PVT application, though.

As I said, from among the three Android phones and tablets available to us, only the Samsung tablet allowed me to install Myanmar fonts without “root”. I installed the Myanmar3 and Padauk fonts using the iFont application available from the Google Play store. Then the views by different fonts shown in my screenshots were enabled by making changes with Settings > Display > Font Style. The result on my PVT data collection application is as you can see in the first set of screenshots shown above.

In the case of my Xiaomi Redmi phone, I can't get the Myanmar3 or Padauk font installed as yet. I came to know that Xiaomi phones available locally here have been customized to be able to install such fonts. Mine is with the ROM Redmi 1 WCDMA Stable Version (Singapore) JHBMIBH25.0 and can't install application software designed for other ROMs, such as for Myanmar. I've started a thread on the official MIUI website asking for help and replies on it don't seem to promise any solution as yet.

As for the HTC Desire-X phone, I couldn't find Settings > Display > Font Style on it and so didn't try installing the required Myanmar font for my PVT application.

All in all, even if I could find a solution to install the desired Unicode fonts on any Android phone, it would be futile if I got the kind of garbled results that I have demonstrated. Anyway, it's out of my league to understand why it happened the way it happened. Do I blame Android or the font designers, or both?


So much for complexities. What I thought would have been a simple task of replacing English labels with Myanmar on the Android application turned out to be something not so simple at all. For now, I've gotten lost in the font jungle in the very first verse of my PVT tawla.

Tuesday, October 13, 2015

Myanglish PVT


Myanglish, though looked down upon by well-meaning adults, appeals to young people like my grandson, for example. I have been thinking that they should just learn to use real English instead of stomping in trash. That was my verdict until I tried fashioning a Myanmar version of my PVT data collection program for Android hand-phones a few days back.

The truth is that I found it really hard to render questions, instructions, and response categories in English of my PVT application to Myanmar language. The CSEntry program for Android says it is multilingual and so I thought it wouldn't be much trouble at all. As it turned out, I had to struggle with every step of the way. It's worth sparing at least one separate post for my experience on this and it could then be more of blurting out my frustrations than anything.

Briefly, first, I had to get my Windows 7 laptop to display the Myanmar language in the Myanmar 3 font. Why Myanmar 3 in particular? Because I found that the UEC (Union Election Commission, in charge of the Myanmar Elections 2015) had posted the list of political parties that have registered for the election of November 8, 2015. This list of ninety-one parties is the essential part of my PVT application on Android phones and it is in the Myanmar 3 font. As for reading this list on my laptop, I had no problem because I had installed this font some time ago.

Next I have to find a way to write in Myanmar on my laptop. Thanks to a Google group active in Myanmar language, I found the “myanmar3-wins.zip” from here, “MyFontsSettings.xlsx” from here, and “MM3FontInstallationGuidewindow7.docx” from here. These were enough to let me type Myanmar words in Microsoft Word or Excel or in a text document in Open Office.

Since my real need was to make a Myanmar language version of my PVT data collection application, I had to do the language conversion first in the CSPro program and then use it to develop and compile the Android data collection application. It was easy enough copy-pasting the UEC political party list one by one into the data entry application on the CSPro side. Only boring and tedious. But typing the questions that have to appear on the Android application was very difficult. You can type in with the Myanmar 3 font, but once it is there you can't edit it. Similarly, you can type in Word, then copy and paste into the application, but here again you can't edit. All you can do is delete it, type the questions in a document outside, then paste it again.

Now the problem is typing in with Myanmar translations. Because I'd never typed in Myanmar language, I could never do the whole thing. So I looked for help and a younger IT person came to the rescue. He had it done in a few hours, and with some laborious edits by me, here you are able to see finally the Myanmar version of the Parallel Vote Tabulation application “PVT_5”.

For playing with my PVT_5 application, you need to get PVT_5.pen and PVT_5.pff files. You also need to have the CSEntry for Android installed on your Android phone or tablet. CSEntry could be downloaded from the Google Play store and:

PVT_5.pen is available here.
PVT_5.pff is available here.

For a little tip on getting started with using such data collection applications on Android see my earlier post Yan Can Cook or More fun with PVT. The present post must have been quite dry, because I can't include screen shots of my application unlike in my earlier one. The fact is that between my wife and I, an old couple, we have three hand-phones, and two tablets and yet none of them could display the Myanmar 3 font.

My hasty search on the Web shows that I might need to “root” my Android phones to allow for installing Myanmar 3 font. It's something like “jail breaking” of Apple phones and tablets, and not risky they say. I still have to think about it and would like to learn more. But while I have been developing this application, I saw it worked on a young friend's Android phone that could display the Myanmar 3 font.


One last thing. I am far from being able to make everything perfect on my PVT application in Myanmar language. It is yet crude, incomplete, and some English entries remain unconverted. That's also the reason I call it a Myanglish application in a sense different from the standard concept of Myanglish of my grandson and others. Nevertheless, we weren't that different in having to make do with whatever we have at the moment, either for lack of knowledge on our part, or for the lack of command over resources, or for great many other things, or for just all of them.  

Thursday, September 24, 2015

R vs. XXX


Comparing R with some well-known commercial statistical software like SPSS, SAS, or Stata comes up from time to time. From the answers on Stack Overflow five years ago, I like this one by Greg Snow and it still looks relevant:

“When talking about user friendlyness of computer software I like the analogy of cars vs. busses:

Busses are very easy to use, you just need to know which bus to get on, where to get on, and where to get off (and you need to pay your fare). Cars on the other hand require much more work, you need to have some type of map or directions (even if the map is in your head), you need to put gas in every now and then, you need to know the rules of the road (have some type of drivers licence). The big advantage of the car is that it can take you a bunch of places that the bus does not go and it is quicker for some trips that would require transfering between busses.

Using this analogy programs like SPSS are busses, easy to use for the standard things, but very frustrating if you want to do something that is not already preprogrammed.
R is a 4-wheel drive SUV (though environmentally friendly) with a bike on the back, a kayak on top, good walking and running shoes in the pasenger seat, and mountain climbing and spelunking gear in the back.

R can take you anywhere you want to go if you take time to leard how to use the equipment, but that is going to take longer than learning where the bus stops are in SPSS.”

And he added:

“There are GUIs for R that make it a bit easier to use, but also limit the functionality that can be used that easily. SPSS does have scripting which takes it beyond being a mere bus, but the general phylosophy of SPSS steers people towards the GUI rather than the scripts.”

The rest of the discussions you could read on, but my favorite is this one from a student I'd quoted in one of my earlier posts in connection with using R for econometrics. I am repeating here the answer that appeared on Quora in 2014.

Karem Tuzcuoglo, a PhD candidate in economics at Columbia explains:
"One-Click" Programs ((almost) no coding required, results obtained by one click)
  • STATA: Most of the econ undergrad programs use STATA. It is the best program (even at the PhD level) if you want to estimate panel data (i.e., where the data hava both cross sectional and time series dimension. Typical examples are surveys and international trade data sets).
  • Eviews: Less famous than Stata, but provides much better time series analysis. If you don't want to do time series forget about Eviews.
  • SPSS: I don't have much information about it. But I can tell that it's not widely used.
"Semi-Coding" Programs
  • SAS: It used to be a big deal 10-20 years ago. Right now not as famous as before - though there are some companies that still strictly prefer using SAS.
  • R: Maybe the most popular program nowadays. First of all it's free! R network and R packages (pre-written algorithms by others) are getting larger and larger. Actually, R can be listed in the next section as well because one can definitely code everything in R. However, the fact that there are so many ready-to-use packages in R makes it also Semi-Coding program if one wants to.
"Pure-Coding" Programs
  • MATLAB: The most famous program among (high level) econometricians. Many applied economics have been done by Matlab. A lot of researchers put their Matlab codes online. It has a good Econometrics package - one still needs to code though.
  • PYTHON: It's more powerful and faster than Matlab. However, it's a very new language; it's still developing.
  • C++: If one wants to do hardcore coding, then C++ is the ultimate program. It's extremely fast in terms of computation (once, my simulation took 25 hours in Matlab, whereas C++ ran the code in 3-4 hours).
  • FORTRAN: Professors above 55+ age will know this program. It's (almost) not used anymore - though we should show some respect to the Father of Coding Programs!
BONUS: There are several other programming languages of course. If you are in UK (especially in Oxford), you will end up using a program called Ox., which is an optimized program for matrix algebra and, thus, for econometrics. Gretl is an extremely easy to use - but less to offer- program.

Among all of the options, I would suggest you to learn R regardless of whether you want to work in academia or in industry (more and more companies begin using R by the way). But if you want to stay away from coding, then go for STATA.

A heated debate on R vs. SAS started in August 2012 on Cross Validated with the question “R vs SAS, why is SAS preferred by private companies?” The last entry was in April 2014. Among the posters, Frank Harrell is the maintainer of the Hmisc package in R. “The package contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX code, and recoding variables.”

Below I've biased myself by selecting only Frank Harrell's comments from the debate and have them taken out of context. However, I just wanted them to serve as teasers for the whole discussion.

  • I'm not sold by @PeterFlom 's point either. There are about 4000 packages in R. Not all have to be of the highest quality for add-on packages to have a net positive value. The number of reliable add-on packages exceeds the capabilities of SAS by a huge margin. (Aug 6 '12 at 20:06)
  • True, but it's hard to penalize a statistical computing system for its comprehensiveness. Or to say it another way, R's way of doing something is better than another system's way of not doing it. (Aug 7 '12 at 12:29)
  • I think these comments are not correct. In the server world, open source rules, and the Apache web server is the most popular web server. (Aug 11 '12 at 13:47)
  • I'm hoping that the 2nd edition of RMS will be available in just over a year. (Aug 12 '12 at 13:48)
  • I'm not familiar with that world but I suspect that scientists have more freedom than they think. (Aug 12 '12 at 13:49)
  • There is nothing that needs to be done to redo regulatory approval for the sake of switching to R. (Aug 11 '12 at 18:58)
  • What is the alternative to downloading a package that provides new capabilities (as most R packages do)? Is it to home grow those capabilities? Is that more reliable? (Aug 11 '12 at 18:59)
  • SAS comes with the same warranty as R: none. (Jan 8 '13 at 13:26)
  • Yes, people have to some extent discover R on their on. But much of the issue comes down to inertia of learning a new language. New languages are always coming out that have advantages over older languages yet users cling to the old languages (witness COBOL). Programming in SAS is hugely inefficient, requiring perhaps double the number of programmers to do the same job as R, but SAS experts are happy to hum along on their merry way and companies are afraid of the kind of disruption that would save them millions of dollars in salaries. (Jan 8 '13 at 13:33)
  • I don't follow your reasoning. The amount of money wasted paying programmers to program in an archaic language (SAS) vs. modern free languages is stunning. (Apr 15 '13 at 15:25)
  • Having used SAS for 23 years and S-Plus/R for 22 years I can say that a highly experienced SAS programmer can be highly productive, but that an experienced R programmer can be easily three times as productive. (Jun 10 '13 at 3:18)
Among the comments the following anti open source remark from the SAS representative was most remarkable and received an uproar. You should not miss reading the links provided.

... I think the worst anti open source quote I've heard was from SAS saying soemthing like 'would you trust a jumbo jet designed in open source, an engine might drop off' (PaulHurleyuk, Aug 7 '12 at 12:37)

@PaulHurleyuk: +1 The quote was “We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet.” by a SAS marketing director in this New York Times article on R. The SAS representative clarified her remarks in a later blog post. (jthetzel, Aug 7 '12 at 12:53)

Note: The first link in the above quote doesn't work. I looked for it and found it here.

As usual, I would say the best way will be to read all these and more and then make up your mind on your own. As poor small guys we hardly need a second look.




Tuesday, September 22, 2015

Thakhin spirit and expansion to thousand lights


Only yesterday a friend told me that some young people from his nonpartisan research unit recently had the opportunity to learn SPSS. SPSS, as you know, is the Statistical Package for the Social Sciences and is close to a household word in Myanmar among people in some way connected with social sciences, or surveys, or statistics. Most of the time when I mentioned R, a complete environment for statistical computing, or CSPro, a survey data processing software, they would just ask: how does it compare with SPSS?

The point of the story is that this friend told me that the training was offered by some professionals from an industrial powerhouse nation in Asia, entirely free, and I suppose, with no strings attached. However, I was a bit worried. We have seen that when we began opening up not too long ago, a lot of swindlers disguised as businessmen came in by swarms (to make it a little more dramatic). I don't know how much they've squeezed out of our people, but later, searching on the Web with the clues I had heard of, I could identify that they were using pyramid and Ponzi schemes, maybe more. I suspect they are still lurking somewhere.

Anyway I was carried away talking about our people being cheated. In the present context relating to our friend I am positive they are receiving a genuine transfer of know-how. Nevertheless, my concern with it is sustainability and the prospect for an expansion of the knowledge base. We should be aware that while learning to use SPSS or any other commercial software could be made free, the software itself is not free and successive sharing of knowledge would call for proportional expansion of financial resources. While such software could be free in the sense that some funding agency undertakes to license it for you for the time being, clearly the agency could not do it indefinitely, or respond to ever growing needs.

So, you have two options: (A) when the honeymoon is over, or when you want to expand significantly beyond the number of computers on which SPSS was initially installed, you could resort to using pirated copies of the software, or (B) use some open-source or free software from the beginning. Looking at the price list of SPSS just now, I found a lot of complicated arrangements for licensing. As far as I could understand, there are four package configurations with starting prices (in US dollars per user) of: Base ($1140); Standard ($2530); Professional ($5090); and Premium ($7590). With each of them it seems you get software support for only 12 months, but you could use the software indefinitely. Here, it should be noted that for complex samples, which covers virtually all sample surveys, you need to use the complex samples module for data analysis. Among the packages mentioned, only the Premium package comes with it included. For the others, a license per user for an indefinite time will cost an extra US$1450.

Let's do a little bit of calculation. Let's assume that you need at least 10 of your computers installed with SPSS to be realistically operational. Then we have (i) Base + Complex Samples, total cost US$25,900 and (ii) Premium, total cost US$75,900. Well, these don't look like much for an international donor, right? But think again. If you are mandated to share, or you intend by yourselves to share knowledge with people outside of your organization, that won't work.
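The two totals follow directly from the per-user prices quoted above. As a quick check in R (the variable names are mine; the prices and the count of 10 computers are the figures already given):

```r
# Per-user prices in US dollars, as quoted from the price list above.
base_price      <- 1140
complex_samples <- 1450   # add-on module, needed with every package but Premium
premium_price   <- 7590

users <- 10               # assumed minimum number of computers

# (i) Base plus the Complex Samples module for every user.
cost_base_cs <- users * (base_price + complex_samples)

# (ii) Premium, which already includes Complex Samples.
cost_premium <- users * premium_price

c(base_plus_complex = cost_base_cs, premium = cost_premium)
```

Changing users to the number of people you would eventually like to reach shows how quickly the bill grows.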

This reminds me of a political joke of many years ago about a zawgyi, a mythical magician who at the height of his powers could fly in the air or bore through the earth. Apparently, this particular rookie zawgyi couldn't quite finish building up his magic. So he ended up neither flying nor walking but hovering at a man's height in midair.


For R, the base module and any or all packages are free. For example, the “survey” package is a dedicated package for the analysis of complex samples and you could download it any time you want to.

Obviously, my choice is option (B), and I have been advocating the R statistical environment in my earlier posts:

  • Spigot algorithm for calculation of pi (in Teashop PI-I)
  • Pigeon half, five-for-duck, quarter a-sparrow
  • An Unclaimed CD on Psychometrics with R or Intro to Anything with R
  • Big data: small guys could do it?
  • Big data: hands-on correlation, old and new
  • Correlates of labor productivity growth
  • Blind leading the 20/20
  • Econometrics for the Masses, Blind Boy, and Courage
  • Fooling around and having fun with PVT

I call the option (B) mentioned earlier as “Thakhin spirited” and the open-source model an expansion to thousand lights model, as lighting thousand candles from a single source would only increase the sum total of available lights and won't take away the light from the donor candle.

On the other hand, if you like option (A), take it, and then you may like to name it yourselves.  

Sunday, September 20, 2015

Myanmar Land tidbits: Did we miss the boat?


We used to have our headquarters on the middle floor of the Gandhi Hall building on the corner of Merchant Street and Bo Aung Gyaw Street in downtown Yangon. Apart from my other duties, I had to take care of the library. It was not much of a library, though, and consisted only of four or five large bookcases lined across the hallway. Those days I was quite familiar with the contents of these bookcases but would be at a loss to describe them now, save for one.

It was a leather-bound book three fingers thick, and the title on the book said “Torrens System”. It was in the style of leather binding we find on the religious books that the bookbinders on the steps of our great Shwedagon pagoda used to make. I have no doubt therefore that it must have been a prized volume some time before, probably when the British Settlement Officers or the Commissioner were still in office. I went through the book and thought I was able to grasp and appreciate the idea behind the Torrens System. Thinking about its contents now, I was impressed most of all with the spirit of Torrens to make land transactions easy in contrast to the deeds registration system, its principle of indefeasibility of the title, and the attendant cadastral system that could accurately reconstruct the boundaries of a given land holding on the ground in case of disputes.

Talking of the Torrens system, I was really surprised when a senior agricultural economist turned political economist, a Myanmar living Down Under, told me that we do in fact have the Torrens system in Myanmar. In my experience of working in the government agency that specifically deals with land administration, including assessment of agricultural land tax, maintaining cadastral maps and registers, collecting agricultural statistics, and handling land disputes, I had never heard of or read about our cadastral system being seen as a Torrens system. I had worked there for 26 years, half of that in the districts and the other half at our headquarters in Yangon.

This friend told me that I could find the reference to the Torrens system in Maung Htin's well-known work “Myanma-le-yar-myay-sanit” (Agricultural land system of Myanmar) and, if I heard him right, he said this system was used particularly in the “Colony lands”. I was doubly surprised because I am quite familiar with this work and I was sure I hadn't noticed anything about the Torrens system in there. Afterward I looked for Maung Htin's book, read through it carefully, and yet couldn't find anything on Torrens!

Later, looking for the possible source of reference for Torrens system in Myanmar I found the following in Housing, Land, and Property Rights in Burma, 2004, by Nancy Hudson-Rodd:

“The Land Records and Settlement Department in Burma adopted a modified Torrens System of land registration, for all areas settled by the colonial state. British. Burma was conquered in two stages, 1826 Lower Burma and 1886 in Upper Burma, becoming a colony of the British Empire. To suit these different jurisdictions, the Land and Revenue Act 1874 and the Upper Burma Land Revenue Act 1889 were two acts that effected the imposition of a tax to cover the cost of administration and governance by the British colonial government on settled and alienated land in both Lower and Upper Burma. Legal control and classification of land in Burma was initiated by the British in 1876 as part of their introduction of a revenue collection and taxation system. Cadastral surveys were conducted to classify all land according to ownership and use.” (p. 18)

Consulting resources on the Torrens system on the Web, as of now, shows that Thailand, Malaysia, Singapore, and the Philippines are using the system. A survey on the earlier adoption of the system by J E Hogg entitled Registration of the Title to Land Throughout the Empire, 1920, cited 17 statutes including that of the “Federated Malay States”. However, I found nothing on “Burma” when I hastily looked through it.

Going back to Nancy Hudson-Rodd's statement, the historical evidence for Myanmar shows that cadastral surveys initiated earlier on a holding basis were superseded in 1878 “by field to field surveys on professional lines followed up by regular settlements.” According to the Wikipedia entry on “Torrens Title”, the system originated in 1858 in South Australia:

“A boom in land speculation and a haphazard grant system resulted in the loss of over 75% of the 40,000 land grants issued in the colony (now state) of South Australia in the early 1800s. To resolve the deficiencies of the common law and deeds registration system, Robert Torrens, a member of the colony's House of Assembly, proposed a new title system in 1858, and it was quickly adopted. The Torrens title system was based on a central registry of all the land in the jurisdiction of South Australia, embodied in the Real Property Act 1886 (SA).”

Recall that by the time cadastral surveys on professional lines were adopted in Myanmar in 1878, the Torrens system had already been in place in South Australia for 20 years, so it seems unthinkable that the colonial professionals taking care of cadastral surveys in Myanmar would have been entirely ignorant of the Torrens system. However, it is truly odd that, as far as I can ascertain, no historical documents on land revenue administration in Myanmar ever mentioned the Torrens system. Besides, the cadastral system in Myanmar has not changed significantly from those days till now. From my personal experience, I never knew any of my seniors or juniors to discuss anything about the Torrens system, and I may safely boast that I could have been the only one around that time who had looked through the Torrens book I talked about.

Perhaps Hudson-Rodd was passing her judgment on the characteristics of the rural land registration system in Myanmar as “Torrens like” and did not mean to say anything about its origins. Perhaps my elder economist friend, a collaborator of Hudson-Rodd, had misread Maung Htin. Or was it a quirk of memory lapse?


To me, the real issue is that whether we call the current system “Torrens like”, “Embryonic Torrens”, or by any other name, we should be doing a reality check. Should we not critically examine the successes of the Torrens system as practiced in Thailand, Malaysia, Singapore, and the Philippines to see if we have missed the boat and act accordingly?

Friday, September 18, 2015

Myanmar Land tidbits


When I came across the claim that “The British did not give full proprietorship title to land therefore they called the dues collected on land as Land Revenue instead of Land Tax” I wasn't satisfied. I thought it must have been just a play on words. To me revenue sounded like referring to what the government got out of the taxation process, while tax is the burden that falls on taxpayers. Nevertheless, it was the conventional wisdom among fellow officers, based on that assertion, that land ownership recognized by the British Government had been some form of inferior ownership. It has been some twenty-five years since I left that government agency, yet some of the papers written by my younger friends in recent times still carry that assertion, without scrutiny, as truth.

I thought I had found this assertion in a booklet or a report by some high official of the Land Nationalization Department while I was a government employee. I am not sure, though. Looking back, I wonder if it carries the overtones of the ruling BSPP (Burma Socialist Programme Party). Too bad I didn't discuss the merit of this assertion with my seniors. Anyway, most of us would have been wise enough those days not to be inquisitive.

By luck I happened to call up one of my retired younger co-workers a few days ago and he was able to email me a scanned page of the source for that assertion. The following excerpt was from the booklet explaining the prevailing settlement procedures for fixing rates of land revenue with the sponsorship of the Revolutionary Government. The booklet was distributed by the Settlements and Land Records Department in 1966.


It reads:
“2. At the time of the British government, they did not give full possession of land (proprietorship) to the people. It was rather the right to hold land (Land Holder's Right). If it were proprietorship, the dues collectable on land have to be “Tax on Land”. If the people were treated only as tenants, the dues have to be called “Rent”. As the right on land given to people by the British government was not as good as proprietary, but still better than the mere rights of tenants, neither the term “Tax on Land” nor “Rent” was used and the compromise “Land Revenue” was coined. That was how land revenue came to exist.”

The concluding words of the excerpt seem to say that the term Land Revenue originated in Myanmar, thanks to the ingenuity of the British administrators. However, the British had used this term in India before us for the purpose of land taxation. This is from Full text of "Report Of The Land Revenue Commission Bengal Vol I".

14. ... All Governments in India have considered themselves entitled to a share of the produce, and this share of the produce, whether collected direct, or through farmers of revenue, or through subordinates or intermediate landlords, is called "land revenue".

On the other hand, the notion of Land Revenue as some halfway concept is easily disproved by this excerpt [The Land and Revenue Act (India Act II, 1876), in The Lower Burma Land Revenue Manual, 1945]:


Here we can see that instead of collecting land revenue for taungya cultivation (slash-and-burn cultivation), “tax” will be collected. In this connection, we could easily see that the land used for taungya was hardly held with full proprietorship, yet the dues were called “tax”.

Whether Land Holder's Right is a proprietorship title would have been a deep and controversial topic. I guess it would have been hotly debated by Myanmar intellectuals and activists at least in the latter part of colonial rule and particularly after Myanmar's independence. Students of Myanmar land systems and historians would have something concrete to say on this topic.

Here, I am aware of the accounts in which Buddhist monks contested the King's confiscation of their religious lands in the Bagan period and won at the courts or tribunals. These episodes could be found in historian Dr Than Tun's Studies in Burmese History Number One, 1969 (pp. 164-166). They seem to signify that the King is not “The Lord paramount over, and the chief proprietor of, the soil” in Myanmar, at least in relation to religious lands.

A stronger judgment comes from a younger generation of Myanmar historians. This is from the Google Books entry on Thant Myint-U's “The making of modern Burma”:


I assume “A structure of genuinely private ownership, entirely free of gentry or aristocratic control or involvement” effectively means “allodial title”, maybe with some restrictions.

In contrast, on page 148 of the BSPP's publication “History of Myanmar Land, vol. 1” of 1970, it was stated that “right to ownership in land in reality was a mere right to hold land”.


On page 85, however, it was stated that “Nevertheless, under the 1876 Land and Revenue Act, squatters who had worked their lands for 12 years without interruption are entitled to possess the land. So long as they pay land revenue regularly, government cannot evict them. Moreover, such an owner has the right to treat the land as his or her privately owned land and use it as he or she wishes.”


To me, these two statements look contradictory.

Sunday, August 30, 2015

Yan Can Cook or More fun with PVT


I've been a great fan of Martin Yan since 1985 or 86 when I was lucky enough to get a chance to visit the U.S. and watch his TV program "Yan Can Cook". Then there was a lapse for a decade or so when I returned home. After that I was again able to watch him in the Marshall Islands in the Pacific for a year or two. Now, I am not sure if I have watched his old or new programs on this side of the new millennium. Anyway, not so long ago I was suddenly nudged by a curiosity to know if the familiar Yan accent is for real or not. I looked that up on the Web. Well, for now I'll leave it to you to find out what I found, or to guess it. That's up to you.
                                              
In one of Yan's programs, I was really amazed watching him separate meat from bone and cut up a whole chicken in a snap, just with his big chopper. In another one he showed how to slice onions really fast with this big chopper again. Anyone would have been scared stiff at the idea of slicing onions with a big, heavy, and razor-sharp chopper, but in reality the big broad blade itself is the key to superfast slicing while keeping your fingers safe!

Then, after watching so much of Yan, did I learn to cook or slice vegetables with a chopper like him? No, simply because there is someone with me all the time to handle genuine Myanmar day-to-day cuisine really well, or not so. Anyway, if I try to emulate Yan would I do well? Honestly, I don't think so. Yet, I did pick up Yan's philosophy for good: If Yan can cook, so can you.

With these words of Yan's encouragement I tried recently to start learning about creating data collection applications with mobile phones. Among the different software options available, I picked CSEntry because I know a bit of CSPro, the parent software of which CSEntry is the data entry module. With CSPro you could develop data entry applications for the Windows platform or for Android.

The idea is to develop and test the CAPI (computer assisted personal interviewing) application with CSPro software that is running on a Windows computer. For Android phone data collection you need to develop the CSEntry application with the CSPro software version 6.1. Then you would do most of the testing on the Windows machine and finalize the application going back and forth between your desktop and the phone.

After that you compile the data entry application on the Windows machine to get the pen file (say xxx.pen). When you test-run the pen file on the Windows machine, you will get a pff file (say xxx.pff). These two files are all you need to run a data collection application on your Android phone or tablet. Of course you need to have the CSEntry program for Android installed on the phone or tablet in the first place.

  • The required CSPro 6.1 software and manuals could be downloaded from the U.S. Census Bureau website here.
  • CSEntry for Android could be downloaded to your phone from the Google Play Store.
  • Visit the CSPro Users website for goodies on CSPro and CSEntry for Android.

This is how I worked. To make head or tail of a CAPI application, I played with the "simpleCAPI" application that comes with CSEntry for Android. After graduating from it, I worked through the data entry application in the "Examples\CAPI" folder installed with the CSPro 6.1 program on my PC (I was lucky to have some experience working on regular data entry applications on the PC). Then I tried developing a PVT CAPI data entry application for Android on my own. Here, as I have already been posting about parallel vote tabulation on my Bayanathi blog, I felt that a PVT data collection application would not be too hard to do.

As the idea of the exercise is to get a working model for PVT mobile data collection and not much more, I based my application content almost entirely on the PVT sample observer forms given on pages 89–90 of the handbook for quick count/PVT by NDI (The Quick Count and Election Observation: An NDI Handbook for Civic Organizations and Political Parties, Estock, Nevitte, and Cowan, 2002).

Here are some screen shots of my PVT_1 application on an Android phone.


Working on an Android CSEntry application gives you a refreshing experience you don't get with the desktop application. The checkbox for entering multiple answers to a single question, as in the screen shots above, is a beautiful example. Below you can see how it works the same way as a paper questionnaire, but is far more convenient because it can instantly show you any previously entered data that you want to look up.


In the first screen on the left of the screen shots above, the question was "Which parties contested the vote counting results?" For an earlier question, the list of political parties present at vote counting had already been entered. The program performed a check on the answers to these two questions and returned the message shown in the middle screen shot. Now tapping the CSEntry logo on the top-left corner of the screen brings up the list of all questions and answers entered (known as the Case Tree) and there you can find the previous answer. Then you can correct either or both of the answers as necessary.
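For readers who want to see the idea of such a cross-check spelled out, here is a minimal sketch in Python. It is not the CSPro logic used in my application, and it rests on an assumption: that the rule is simply "a party can contest the results only if it was listed as present at the count". The function name and the party names are hypothetical, made up for illustration.

```python
def parties_not_present(present, contesting):
    """Return the parties marked as contesting that were not marked as present."""
    present_set = set(present)
    return [party for party in contesting if party not in present_set]


# Hypothetical answers to the two checkbox questions.
present = ["Party A", "Party B"]
contesting = ["Party B", "Party C"]

unexpected = parties_not_present(present, contesting)
if unexpected:
    # In the real application this is where the warning message would appear.
    print("Please check: contesting but not listed as present:", ", ".join(unexpected))
```

An empty result means the two answers agree; otherwise the enumerator goes back and corrects one or both of them, as described above.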

If you want to try out my application, follow these steps:
  1. Install CSEntry on your Android phone/tablet.
  2. By doing so you will also get the application "Simple CAPI" installed in the folder "csentry" on the SD card.
  3. Download the pen file from this link: PVT_capi.pen.
  4. Download the pff file from this link: PVT_capi.pff.
  5. If you've opened my blog post with your Android phone/tablet, both files will normally be stored in the "Download" folder. Cut and paste them into your "csentry" folder on your Android phone/tablet. Now, if you run CSEntry on your phone/tablet you will see my application "PVT_1". Tap on it and you are on your way to Start New Case and enter data.

I have created this application for fun (and maybe some use).

I don't know if it works perfectly or not. I simply don't have the expertise to guarantee anything. Learn CSPro and try to do things on your own, or pick the software of your choice from among other free/open source software available for mobile data-collection application development.

Now it's my turn to say: If Bayanathi can do it, so can you.