It must have been a not-so-intelligent
autodidact's typical approach to something new for him.
There's this website posting some data
in JSON format. I stumbled upon it. More accurately it was a series
of stumbling. I tried to find out more about open data on the web;
stumbled upon APIs (application programming interface); couldn't
understand much; looked for examples; found more about JSON; stumbled
upon Data.gov website; looked for ways to access data in JSON format;
found and installed jasonlite an
R package; found some example scripts for accessing JSON with
jasonlite; got a couple of successes with simple JSON data like the
one on biodiversity of New York State demonstrated in my last post.
Well, that wasn't the end of my story.
From the API Resources for Federal Agencies available here,
I downloaded the individual_apis.json file.
The following is a part of the file opened by notepad and you can
read and understand everything written there.
All this seems
fine. So I tried to write the whole of “x” to a comma separated
value (csv) text file.
#
write csv file
write.csv(x,
file = "individual_apis.csv", row.names = FALSE)
It couldn't be done. The error message
was:
Error
in .External2(C_writetable, x, file, nrow(x), p, rnames, sep, eol, :
unimplemented
type 'list' in 'EncodeElement'
I tried looking for possible solution
on the web and tried out many that seemed promising. But failed
miserably. Finally I found a post by Mark Needham (R:
write.csv – unimplemented type ‘list’ in ‘EncodeElement’)
that showed the remedy for a problem with the error message that
exactly matches mine. He advised that:
“If
we do have a list that we want to add to the data frame we need to
convert it to a vector first so we don’t run into this type of
problem”
In our case, as well as his, the data
frame consisted of two columns and the second column happened to be a
list instead of a vector. The difference between his data frame and
mine was that his second column consisted of data elements when
converted to a vector has the same number of elements as the first
column, whereas my second column consisted of a list of data frames
each with two columns and variable number of rows. And I don't have
enough knowledge of R to be able to adapt his method to my more
complex situation.
Then as my second and third screenshots
above showed, the whole of my data frame could be displayed on the
screen properly. So in a typical dummy way I realized that if I could
transfer that output to a text file it would be the solution. The
only way I know of for that is to use the “sink” function and
after some struggle I succeeded. My script runs:
Then the saved text file is read in two
parts, cleaned, combined into one data frame, and csv file
written and is done (credit due to f3lix for the answer to How
to trim leading and trailing whitespace in R? on Stack Overflow).
.
However, I couldn't be triumphant for
long because when I compared contents of the csv file with the
output on the console I found at once a discrepancy. The console
displayed for the 8th row:
Cropland
Data Layer, Quick Stats API,
http://www.nass.usda.gov/research/Cropland/sarsfaqs2.html#_Cropscape1.2,
http://quickstats.nass.usda.gov/api
while individual_apis.csv gives:
Cropland
Data Layer, Quick Stats API,
http://www.nass.usda.gov/research/Cropland/sarsfaqs2.html.
|
So it missed “#_Cropscape1.2,
http://quickstats.nass.usda.gov/api”
and I couldn't get that right. Besides there could be
lot more misses or some other errors which means I'll have to look
for a better way.
Nov. 7: Found unreadable piece of code after the first screenshot (has to be replaced with an image); and missing third screenshot. Corrected. My apologies.
ReplyDelete