Big data has been called a data tsunami. It has been
described as data exhaust or found data. Perhaps the key distinction between
big data and little data is that with the latter you have the option to make your
data represent the population you are targeting in your research, as in a
sample survey, for example.
A lazy way to get some idea of how little data measures up to
big data (with all the hype) is, I suppose, to do a Google search.
Google Search: little data vs big data (first page of results, by time frame)

All time:
1. David Vs. Goliath: Why Little Data Will Win Over Big Data
2. Market Research - Little Data vs. Big Data: Nine Types of ...
3. Small data vs big data: the battle that never was ...
4. You May Not Need Big Data After All - HBR
5. Forget big data, small data is the real revolution | News | The ...
6. Little Data vs. Big Data: Does Size Matter? | 6Sense
7. Is Little Data The Next Big Data? | Jonah Berger | LinkedIn
8. Is Little Data The Next Big Data? | Jonah Berger
9. Big Data vs. Small Data - Is there a Difference ...
10. Big data - Wikipedia, the free encyclopedia

Past year:
1. David Vs. Goliath: Why Little Data Will Win Over Big Data
2. Market Research - Little Data vs. Big Data: Nine Types of ...
3. Small data vs big data: the battle that never was ...
4. Little Data vs. Big Data: Does Size Matter? | 6Sense
5. Big Data vs. Small Data - Is there a Difference ...
6. Why Companies Need to Focus on 'Little Data' - WSJ Blogs
7. Little privacy in the age of big data - The Guardian
8. Forget Big Data. Use Little Data for Incremental Self ...
9. Big Data vs. CRM: How Can They Help Small Businesses?
10. Big Data vs Little Data - Sales Initiative

Past month:
1. Little Data vs. Big Data: Does Size Matter? | 6Sense
2. What's Holding Us Back From Big Data? Daniel Burrus ...
3. Big data - Wikipedia, the free encyclopedia
4. The Big Buzz About Big Data | UKFast Blog
5. Microsoft vs US.gov, Internet of Stuff, Big Data - The Channel
6. Our Future: Free Will vs. Predictions with Data - Lutz Finger
7. AllAnalytics - Matthew Brodsky - Big Chief Data, Little Chief ...
8. 6sense | LinkedIn
9. Hype vs Reality regarding Big Data? | James McGovern ...
10. Data Informed | Big Data and Analytics in the Enterprise
These were the first pages of search results for three
different time frames. Without looking at their contents, I felt that the idea
that little data could hold its ground would be the dominant opinion.
That impression could be quite wrong, based as it is upon just the titles from the first
pages of results that are themselves a product of big data! So I had better stay non-committal
and say "use both, suitably".
For that matter, the title "Small data vs big data: the
battle that never was" of the June 2, 2014 post by Pam Baker on the
FierceBigData site makes me feel like I've found a sympathizer of this view.
However, she was thinking about little data as subsets of big data:
Every so often media reports come blasting the
message that little data wins over big data. Give it a minute and more media
reports will come out saying the opposite. So which is winning in the business
arena--big or small data? Neither. This is the battle that never was. There is
a time for big data and a time for little data. Further, big data is made of
little data and it's ridiculous to pit the piece against the whole and declare
one the all-occasion winner. Further still, one almost always drills down to
little data after gaining the big data, big picture insight. Why would one step
be superior to another in the same process?
To use another metaphor to make the point: when
you pit small data against big data you are not comparing apples to oranges but
a bushel of apples against a planet of orchards.
And all the search results as well as her post were talking
about business applications, while we are interested in the use of big data for
development.
On the other hand, it is said that the future of big data is
all about predictions. Time and again we learn that the sheer size of data is
no substitute for relevant data. Lutz Finger, in "Our Future: Free Will vs.
Predictions with Data", contrasted an example from the big data era with a
prediction from the ancient past:
But often it is not the amount of data that matters to
create a good prediction. For example, the Incas predicted the best time to
plant crops. Their dataset might have been as little as 3560 data points (= 10
years) – nothing in our big data world. 500 years later we have companies like
Google that measure a lot about our online behavior. But despite all this data,
predictions are not necessarily easy. For example, New York Times bestselling business
author Carol Roth once complained
in her blog that Google infers that she is a male over age 65, when in
fact she is a woman decades younger.
Why is this? Because not all of the data Google has aggregated
is really helpful for the specific prediction they try to make.
Back to our theme: traditionally, data for development comes
from the research community and official statistics, and comprises experimental
data, observational data, survey data and administrative records. These are
the little data I'm thinking about.
I may simply say that little data is the kind of data we had before big
data came around, and most people probably became aware of big data only after
2011 or so.
So, before the "big buzz about big data" there was
little data, and it has long been
recognized as the basis for evidence-based policymaking and monitoring in all
countries, especially developing countries. In the area of little
data, the Paris21 consortium is a partnership
of policymakers, analysts, and statisticians from all countries of the world,
focusing on promoting high-quality statistics, making these data meaningful,
and designing sound policies. It was established in November 1999 in response
to the UN Economic and Social Council resolution on the goals of
the UN Conference on Development. A significant current project of Paris21
is Informing a Data Revolution (IDR), funded by a grant from the Bill and
Melinda Gates Foundation. Paris21 asked "Are developing countries ready
for the data revolution?"
Are we ready for the data revolution? In the old days we
would jokingly answer: "it's good; spicy hot, though". Now, I remember my days as a youngster
fascinated by the little whirlwinds we call lay-bway.
You can't guess where one is going, and it is this that makes them so
fascinating. If one brushes you, all the leaves, dust and sand floating
around in it sting your eyes. I remember that one of our writers of the old generation,
Ze-ya, imaginatively called it the one-legged little wind, which we would
have expected from a writer like Dagon-taya
and not from him.
But what's this data revolution anyway? In their report
"A World that Counts: Mobilising the Data Revolution for Sustainable
Development" of November 2014
(http://www.undatarevolution.org/wp-content/uploads/2014/11/A-World-That-Counts.pdf),
the Independent Expert Advisory Group gives the rationale:
As the
world embarks on an ambitious project to meet new Sustainable Development Goals
(SDGs), there is an urgent need to
mobilise the data revolution for all people and the whole planet in order to
monitor progress, hold governments accountable and foster sustainable
development. More diverse, integrated, timely and trustworthy information can
lead to better decision-making and real-time citizen feedback. (Executive
summary, p. 3)
And it defines the data revolution this way:
The data revolution is:
□ An explosion in the volume of data, the speed with which data are
produced, the number of producers of data, the dissemination of data, and the
range of things on which there is data, coming from new technologies such as
mobile phones and the “internet of things”, and from other sources, such as
qualitative data, citizen-generated data and perceptions data;
□ A growing demand for data from all parts of society.
After all, it reads like what you see in any writing about big
data these days. Maybe I could summarize it for dummies: (i) let there be
big data, and (ii) witness the surge in demand for data.
Then they link the data revolution with the sustainable development
goals. There were three bullets, but it seems to me that the first is the
essential one.
The data revolution for sustainable development is:
□ The integration of these new data with traditional data to produce high-quality
information that is more detailed, timely and relevant for many purposes and
users, especially to foster and monitor sustainable development;
So now, (iii) Let's arrange a marriage of the little data
with the most eligible big data.
I am glad that that is what I arrived at vaguely (or more
plainly, through guesswork), though I am not sure that it is not a marriage of
convenience. But how do you actually get the little data married to the big data
(I guess they may just have been working on the match-making), and specifically for
the stewardship of sustainable development?
The executive summary indicates how the data revolution for
sustainable development could be used: (i) directly, by making it possible to "monitor progress", and
(ii) complementarily, through "... hold(ing) governments accountable ... (and getting) real-time citizen
feedback." The second part could also be seen as a revolution for equality between the data
rich and the data poor:
... the
data revolution can be a revolution for equality. More, and more open, data can
help ensure that knowledge is shared, creating a world of informed and empowered
citizens, capable of holding decision-makers accountable for their actions. (p.
8)
But where's the eye-stinging part? It seems that nations with
a lot of catching up to do could find coping with the data revolution a bit spicy-hot. In particular, governments
with creaking national data infrastructures will have to face quite formidable
tasks like these:
National
statistical offices, the traditional guardians of public data for the public
good, will remain central to the whole of government efforts to harness the
data revolution for sustainable development. To fill this role, however, they
will need to change, and more quickly than in the past, and continue to adapt,
abandoning expensive and cumbersome production processes, incorporating new
data sources, including administrative data from other government departments,
and focusing on providing data that is human and machine-readable, compatible
with geospatial information systems and available quickly enough to ensure that
the data cycle matches the decision cycle.
Anyway, when you open your windows and this sudden gust of lay-bway hits your face and stings your
eyes, you need not panic. Think of it as ventilation a bit stronger than
usual.
Things that need to be done have to be done somehow and, as usual, the
UN post-2015 development agenda does not come without a package to assist: in this case, a partnership to catalyze global solidarity
for sustainable development. Also, you could look for
technical assistance from projects like Informing a Data
Revolution (IDR) and others.
We are glad to know that Myanmar already has good relations
with Paris21. It is one of eleven countries of Southeast Asia, South Asia,
and North Asia that successfully completed the first National Strategy for
the Development of Statistics (NSDS) Training Course in the Asian Region in December
2014, organized by Paris21 in collaboration with the Statistical Institute for
Asia and the Pacific (SIAP).
Paris21 notes on its website the opportunity for
the voice of developing countries to be heard in the debate on the data revolution, which
we should at least be aware of:
In the months leading up to
September 2015 there will be a comprehensive process to involve as many people
as possible in discussions about the data revolution, what it should do, who
should be involved and how it should be put into action. It is essential that
the voice of developing countries is heard in this debate and that the
discussion is not hijacked by special interests or those with the deepest
pockets.