Comparisons of R with well-known commercial statistical software such
as SPSS, SAS, or Stata come up from time to time. Of the answers
posted on Stack Overflow five years ago, I like this one by Greg Snow,
and it still looks relevant:
“When talking about user friendliness of computer software I like the
analogy of cars vs. busses:
Busses are very easy to use, you just need to know which bus to get on,
where to get on, and where to get off (and you need to pay your
fare). Cars on the other hand require much more work, you need to
have some type of map or directions (even if the map is in your
head), you need to put gas in every now and then, you need to know
the rules of the road (have some type of drivers licence). The big
advantage of the car is that it can take you a bunch of places that
the bus does not go and it is quicker for some trips that would
require transferring between busses.
Using this analogy programs like SPSS are busses, easy to use for the
standard things, but very frustrating if you want to do something
that is not already preprogrammed.
R is a 4-wheel drive SUV (though environmentally friendly) with a bike
on the back, a kayak on top, good walking and running shoes in the
passenger seat, and mountain climbing and spelunking gear in the back.
R can take you anywhere you want to go if you take time to learn how to
use the equipment, but that is going to take longer than learning
where the bus stops are in SPSS.”
And he added:
“There are GUIs for R that make it a bit easier to use, but also limit the
functionality that can be used that easily. SPSS does have scripting
which takes it beyond being a mere bus, but the general philosophy of
SPSS steers people towards the GUI rather than the scripts.”
You can read the rest of that discussion yourself, but my favorite
take is one from a student whom I quoted in one of my earlier posts in
connection with using R for econometrics. I am repeating here the
answer that appeared on Quora in 2014.
Karem Tuzcuoglo, a PhD candidate in
economics at Columbia, explains:
"One-Click" Programs ((almost) no coding required, results obtained by one click)
- STATA: Most of the econ undergrad programs use STATA. It is the best program (even at the PhD level) if you want to estimate panel data (i.e., where the data have both cross sectional and time series dimension. Typical examples are surveys and international trade data sets).
- Eviews: Less famous than Stata, but provides much better time series analysis. If you don't want to do time series forget about Eviews.
- SPSS: I don't have much information about it. But I can tell that it's not widely used.
"Semi-Coding" Programs
- SAS: It used to be a big deal 10-20 years ago. Right now not as famous as before - though there are some companies that still strictly prefer using SAS.
- R: Maybe the most popular program nowadays. First of all it's free! R network and R packages (pre-written algorithms by others) are getting larger and larger. Actually, R can be listed in the next section as well because one can definitely code everything in R. However, the fact that there are so many ready-to-use packages in R makes it also Semi-Coding program if one wants to.
"Pure-Coding" Programs
- MATLAB: The most famous program among (high level) econometricians. Many applied economics have been done by Matlab. A lot of researchers put their Matlab codes online. It has a good Econometrics package - one still needs to code though.
- PYTHON: It's more powerful and faster than Matlab. However, it's a very new language; it's still developing.
- C++: If one wants to do hardcore coding, then C++ is the ultimate program. It's extremely fast in terms of computation (once, my simulation took 25 hours in Matlab, whereas C++ ran the code in 3-4 hours).
- FORTRAN: Professors above 55+ age will know this program. It's (almost) not used anymore - though we should show some respect to the Father of Coding Programs!
BONUS: There are several other programming languages of course. If you are in UK (especially in Oxford), you will end up using a program called Ox, which is an optimized program for matrix algebra and, thus, for econometrics. Gretl is an extremely easy to use - but less to offer - program.
Among all of the options, I would suggest you to learn R regardless of whether you want to work in academia or in industry (more and more companies begin using R by the way). But if you want to stay away from coding, then go for STATA.
A heated debate on R vs. SAS started in August 2012 on Cross Validated
with the question “R vs SAS, why is SAS preferred by private
companies?” The last entry was in April 2014. Among the posters is
Frank Harrell, the maintainer of the Hmisc package in R. “The
package contains many functions useful for data analysis, high-level
graphics, utility operations, functions for computing sample size and
power, importing and annotating datasets, imputing missing values,
advanced table making, variable clustering, character string
manipulation, conversion of R objects to LaTeX code, and recoding
variables.”
Below I have been deliberately biased: I selected only Frank Harrell's
comments from the debate and took them out of context. I just want
them to serve as teasers for the whole discussion.
- I'm not sold by @PeterFlom 's point either. There are about 4000 packages in R. Not all have to be of the highest quality for add-on packages to have a net positive value. The number of reliable add-on packages exceeds the capabilities of SAS by a huge margin. (Aug 6 '12 at 20:06)
- True, but it's hard to penalize a statistical computing system for its comprehensiveness. Or to say it another way, R's way of doing something is better than another system's way of not doing it. (Aug 7 '12 at 12:29)
- I think these comments are not correct. In the server world, open source rules, and the Apache web server is the most popular web server. (Aug 11 '12 at 13:47)
- I'm hoping that the 2nd edition of RMS will be available in just over a year. (Aug 12 '12 at 13:48)
- I'm not familiar with that world but I suspect that scientists have more freedom than they think. (Aug 12 '12 at 13:49)
- There is nothing that needs to be done to redo regulatory approval for the sake of switching to R. (Aug 11 '12 at 18:58)
- What is the alternative to downloading a package that provides new capabilities (as most R packages do)? Is it to home grow those capabilities? Is that more reliable? (Aug 11 '12 at 18:59)
- SAS comes with the same warranty as R: none. (Jan 8 '13 at 13:26)
- Yes, people have to some extent discover R on their own. But much of the issue comes down to inertia of learning a new language. New languages are always coming out that have advantages over older languages yet users cling to the old languages (witness COBOL). Programming in SAS is hugely inefficient, requiring perhaps double the number of programmers to do the same job as R, but SAS experts are happy to hum along on their merry way and companies are afraid of the kind of disruption that would save them millions of dollars in salaries. (Jan 8 '13 at 13:33)
Among the comments, the following anti-open-source remark from a SAS
representative was the most remarkable and caused an uproar. Do not
miss the links provided.
...
I think the worst anti open source quote I've heard was from SAS
saying something like 'would you trust a jumbo jet designed in open
source, an engine might drop off' (PaulHurleyuk, Aug 7 '12 at 12:37)
@PaulHurleyuk: +1 The quote was “We have customers who build engines
for aircraft. I am happy they are not using freeware when I get on a
jet.” by a SAS marketing director in this New York Times article on R.
The SAS representative clarified her remarks in a later blog post.
(jthetzel, Aug 7 '12 at 12:53)
Note: The first link in the above quote no longer works. I looked for
the article and found it here.
As usual, I would say the best way is to read all of these and more,
and then make up your own mind.
As poor small guys, we hardly need a second look.