I have embarked on a journey to learn Data Science with Codecademy. During my research I stumbled upon lots of discussions about choosing Python or R for learning Data Science. Choosing between the two seems to matter a lot.
As a newbie, I’d like to understand the experts’ view: why is this choice important? I have heard from people that R is an important language to learn for Data Science. Is that true?
Which one should I choose, and why? If I learn Python, will I also have to learn R, or vice versa?
Any guidance would be appreciated.
There’s so much out there on the web about this. If there is anything approaching a consensus, it’s that R is for you if you have a good understanding of statistics, or find your studies carrying you in that direction. Most people agree that the native statistical ability of R is the better of the two.
For a gentler learning curve, but still with great packages for visualization (and, yes, statistics) in a general-purpose language in which you can also learn object-oriented programming, I’d go for Python.
I am not an expert in data science, so take this as a non-expert’s view: begin with Python. The basics of coding are the same in any language (a basic result of computer science is that all general-purpose programming languages are equivalent in what they can compute), and it will be a while until you attain the proficiency you’ll need to delve into data. By then, R will come easily to you if you decide that you need it; you’ll have lost nothing by having studied beginning and intermediate Python.
As someone who knows both Python and R and used to work as a data scientist of sorts, I fully agree with what patrickd314 says.
R is designed for statistical analysis. Academic researchers in political science (and increasingly in economics) are using R for their data analysis.
Python is a general-purpose language, allowing you to do many things other than statistical analysis. It is also designed to be easy to learn and less error-prone.
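To make the "general-purpose, but still capable of statistics" point concrete, here is a minimal sketch using only Python's standard-library statistics module. The scores list and the summarize helper are made up for illustration; they are not from any real dataset or course.

```python
import statistics

# Hypothetical sample data, invented purely for illustration.
scores = [72, 85, 90, 66, 78, 95, 88]


def summarize(values):
    """Return a few descriptive statistics as a dictionary."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


summary = summarize(scores)
for name, value in summary.items():
    # Ordinary string formatting and loops sit right next to the statistics.
    print(f"{name}: {value:.2f}")
```

For anything beyond basics like these, people usually reach for third-party packages such as pandas or SciPy, or switch to R.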
Personally I found Python easier to learn than R. One problem with learning R is that there are several alternative ways to do the same thing, and for beginners this can be very confusing.
If you choose R, I’d recommend Garrett Grolemund & Hadley Wickham’s R for Data Science, available for free online at https://r4ds.had.co.nz/
Unlike many other R tutorials, it starts with data visualization. You learn how to produce final results first, which motivates you to work through the rather tedious data-cleaning parts.
Thank you, Patrick. Sincerely appreciated.
Yes, it’s true that there’s a lot on the web about it, but it’s more authentic to hear it from Codecademy users who have explored the space on their own. It helps.
Thanks again for the input. Very helpful.
Thank you, masakudamatsu.
Sincerely appreciated. So it seems that it’s good to begin with Python and then keep exploring with R, especially when it comes to statistics.
Thanks a ton for the help and the link to the book too.