Should I Learn Python or R?

This is without doubt one of the greatest dilemmas for anyone who wants to get into the data science realm. You just don’t know which tool will get you there. I remember experiencing the same thing back when I was trying to find my way into data analytics. I spent months (and I mean a lot of months) wondering which programming language or tool would take me into the industry. The good thing is that I finally got in. Better still, I got to learn data analytics using both giants of the data analytics world: Python and R. In this article, I am going to give you my honest review of which language is good for you. If you want to learn data analytics and you are wondering whether you should learn Python or R, this article is for you.

The way I want to help you decide is by showing you the strengths of each language, from my own experience of using both. Then you can decide which one suits your needs.

To give you a hint of the power of each of the two languages, we shall look into the following:

  1. Access to learning resources: materials and learning network
  2. Purpose of learning: statistical analysis, machine learning and data science in general
  3. The working environment for the tool itself (IDEs)
  4. Their visualization power
  5. Ease of learning the language

Alright. From my own experience, I believe the best language for you to learn depends on how easily you can access its learning resources. Let me expound on that point below.

1. Access to Learning Resources

By learning resources, I mean the open support you can find while you are learning the tool. Between Python and R, which one gives you enough resources to learn from? When I was getting into data analytics, I learnt Python first and R later. From my experience of learning the two languages, I found that Python gave me more access to resources than R did. Openly available learning materials were easier to find for Python than for R. Even in terms of community, there are more people using Python than R. That made learning Python faster for me, because I had a large, active community around the world that I could learn with.

There are Python forums on almost every social media platform and across the internet. You will find more regional groups, institutional groups and other study groups for Python than for R. And as I found out later, this makes sense. Python is a slightly older language and, more importantly, a general-purpose one: even before it became popular in data analytics, it was already being used in software, system and web development. That means it has had a large and varied user base for a long time, and its ecosystem has kept growing. R, on the other hand, was used mainly in mathematics and statistics. And because there are always fewer mathematicians than developers, it did not attract as much development attention as a programming language. So Python has had more room to grow than R, and it therefore has more learning materials, resources and community than R.

Okay. You now know that there are more resources for Python than for R. But that’s not enough to decide which language you should learn for data analytics. So let’s look at something more fundamental. What is your goal in learning data analytics? What exactly do you want to focus on?

2. What’s Your Purpose of Learning?

Is it for statistical analysis? Is it for machine learning? Or just data science in general? When I was first learning data analytics, I was more into data science in general and machine learning. And as I have already told you, I was using Python at the time. Later on, when I was taking statistics classes in my junior years of college, I got to use R. From then on, I found myself actively using the two languages interchangeably. So I will tell you the strengths of each language in each of these 3 areas (let’s just focus on these 3 for now).

When it comes to handling simple data and doing simple data analysis, the two languages have almost the same power. So if you have been analyzing your data in Excel and you want to do that in a more advanced tool, you don’t need to debate whether to go for R or Python. Either of them will do. If you are more advanced in data analytics, however, you might want to focus on a specific area. And that, I believe, is why you are debating whether you should learn Python or R.

If you want to get into data science and machine learning in general,

I find that Python has more tools for that. There are more well-developed tools for working with large datasets in Python than in R. For example, Python has two well-developed libraries for data analysis and manipulation: pandas and NumPy. R has its own answer in the Tidyverse, a collection of packages for data analysis and manipulation. However, I find the Python libraries more robust and simply up to the task. If you want to get into machine learning, I will confidently tell you that Python is the language to go for. In my experience, Python is more powerful for machine learning. It has a large library called scikit-learn, built specifically for machine learning. R does have machine learning packages, but from what I have seen, that side of R is less mature than Python’s.
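To make the three libraries above a little more concrete, here is a minimal sketch that touches each of them once. The toy numbers, the column names `hours` and `score`, and the choice of a linear regression model are my own assumptions for illustration; they do not come from any real dataset.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: hours studied vs. exam score (made up for illustration).
df = pd.DataFrame({
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0],
    "score": [52.0, 54.0, 56.0, 58.0, 60.0],
})

# pandas: summarize a column.
mean_score = df["score"].mean()

# NumPy: pandas columns convert to NumPy arrays, which support whole-array math.
hours = df["hours"].to_numpy()
centered = hours - hours.mean()

# scikit-learn: fit a simple model that predicts score from hours.
model = LinearRegression()
model.fit(df[["hours"]], df["score"])
slope = float(model.coef_[0])
intercept = float(model.intercept_)

# Predict the score for a new, unseen value of hours.
predicted = float(model.predict(pd.DataFrame({"hours": [6.0]}))[0])

print(mean_score)
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

Because the toy scores rise by exactly 2 points per hour, the fitted slope comes out at about 2 and the intercept at about 50. Real data will never be this tidy, but the steps stay the same.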

Now what if we look at statistical analysis?

Which one will be suitable for you? Having been a Math and Statistics student at some point, I find that R has more tools for mathematical and statistical work. And this makes sense to me, because R was originally designed by statisticians for statistical computing. So I will give this one fully to R. No debate.

3. The Working Environment for the Tool Itself

Every programming tool has an environment where you write and run your code. In technical terms, the full-featured version of this is called an Integrated Development Environment (IDE). For example, if you want to write HTML, you can use an IDE like Visual Studio, or a plain text editor like Sublime Text or even your PC’s Notepad. R and Python have their own environments too. The most common IDE for R is RStudio. Python has more than one IDE to choose from. However, when it comes to data analytics, the most commonly used Python environment is Jupyter Notebook (strictly speaking a browser-based notebook rather than a full IDE, but I will treat it as one here). If you have never heard of this IDE thing, please just bear with me (at least read the words without overthinking them; it will save you the headache).

So, which tool is better when it comes to its working environment: Python or R? Well, I will show you screenshots of the two workspaces, and then explain how they are designed.

RStudio Workspace vs. Jupyter Notebook Workspace

[Screenshot: R code inside the RStudio workspace]
[Screenshot: Python code inside the Jupyter Notebook workspace]

From the above comparison, you can see that the RStudio (IDE for R) workspace has four windows to work with. Basically, what you are seeing on the screen is a window for writing your code, a window (the console) that executes the code, a window showing your environment and the history of the code you have run, and a window for displaying your visualizations. As you get to use RStudio yourself, you will realize that each of the four windows has different tabs inside. I will not go further into that in this article. For now, let’s move to the Jupyter Notebook (IDE for Python) workspace. You must have noticed that its workspace shows everything in one space. It’s like a single document showing all your work in the same place: your code, your graphs, and everything else you are doing. In short, the Jupyter Notebook shows all your input and output in one single space.

From my own experience working with both languages,

I find the Jupyter Notebook more interactive and easier to work with. This is because I love tracking my work progress, and the Jupyter Notebook makes it easy for me to document my work. Another reason I prefer the Jupyter Notebook to RStudio is work sharing. Because it’s easy to document your work in a Jupyter Notebook, it’s also easy to share that work with anybody you like. And when you are guiding somebody else who is learning to code in Python, or learning together with a friend, it’s much easier to do that with Jupyter Notebook. You can just send them your notebook, they open it in their own Jupyter Notebook, and follow along with ease (as if it were a PDF tutorial).

Now, I don’t mean you cannot document your work if you are coding in R. You can. It’s just that when you are starting out, Jupyter Notebook is easier to work with than RStudio. It might take you some time before you know how to produce a well-documented file in RStudio that can match a single Jupyter Notebook. The way the Jupyter Notebook is designed, everything appears by default as if it were a PDF document. Therefore, for an easier working environment, I prefer Python to R.

4. The Power of Visualization

I intentionally didn’t include this in the second point above, because data visualization is a key part of every data analytics use case. Whether you are doing simple data analysis, statistical analysis, machine learning or data science in general, you are always going to visualize data. And the truth is, data visualization is the area that makes your data analytics world more fun and intuitive. So let’s look at the power of data visualization. Which of the two, R or Python, is more powerful and better for visualization?

I feel that both R and Python are great for data visualization. They are powerful enough, especially for visualizing big data. However, I love the RStudio data visualization interface more than Python’s Jupyter Notebook interface. RStudio has a separate window for displaying visualizations. From that window, you can easily export your visualizations as images in case you want to share them somewhere else, maybe for a presentation or a report. Even when you are just inside the workspace, you can zoom the image and it pops up on your computer’s screen as its own window. So if you want to present the visualization alone to your friends or any interested party, you can do exactly that with a single click from inside the workspace. I find this more tedious and less convenient with Python’s Jupyter Notebook. A lot of the time, saving your visualizations requires you to write some Python code or go through the longer process of saving the image from your browser.
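For what it’s worth, the Python code needed to save a chart is usually short. Here is a minimal sketch, assuming Matplotlib as the plotting library (other Python plotting libraries have their own ways of saving). The sales figures and the file name `monthly_sales.png` are made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # lets the script run without a screen; not needed inside Jupyter
import matplotlib.pyplot as plt

# Hypothetical toy data, invented for this example.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 150, 160]

# Draw a simple bar chart.
fig, ax = plt.subplots()
ax.bar(months, sales)
ax.set_title("Monthly sales (toy data)")
ax.set_ylabel("Units sold")

# Save the chart as a PNG image. The temp folder is used here only so the
# example runs anywhere; any writable path works.
output_path = os.path.join(tempfile.gettempdir(), "monthly_sales.png")
fig.savefig(output_path, dpi=150, bbox_inches="tight")
plt.close(fig)

print("Saved chart to", output_path)
```

It is still one extra step compared with clicking a button in RStudio, which is the inconvenience I am talking about.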

So, in terms of data visualizations, I find R better than Python.

5. Which Language Is Easier to Learn and Understand?

If I am to give you a simple, short and honest answer to that, I will say neither. From my own experience of learning the two languages, I find that R and Python are about equally easy. For one, both are high-level languages. They are also both dynamically typed languages, if I am to be technical. In layman’s terms, I find Python and R commands quite similar. I will not tell you much about that now. All I want you to do now is decide which language you would love to learn first, based on the 4 comparisons above. Go through the comparisons and choose what works best for you. I have only given you my opinions from my own experience, and what is not good for me might be good for you. So take what is best for you. The truth is, if I were to go back in time and decide which one is best for me, I would still start with Python.

Now, I want to tell you something interesting about the two languages and why they are both giants in the data analytics space. What makes R and Python great for data analytics is that they can handle very large data. I’m talking about millions and millions of rows that you would struggle to open with Excel or SPSS. They can also handle different kinds of data: numbers, text, images, videos. All of them. And because a lot of this kind of data is generated today (especially with the rise of social media), there are few better options for analyzing it than R and Python. They are your go-to tools for big data analytics. So if you want to know how they work, go ahead and start learning your preferred choice. I would recommend picking the one language you are most comfortable starting with as a beginner. Later on, when you are comfortable with your chosen language, you can start learning the other one. By then you will know why I am telling you Python and R are almost the same.
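One reason row counts matter less in code than in a spreadsheet is that a script can read a file one row at a time, instead of loading every row onto the screen at once. Here is a minimal sketch using only Python’s standard library. The function name `summarize_column`, the column names, and the tiny in-memory "file" are hypothetical stand-ins for a real CSV with millions of rows.

```python
import csv
import io


def summarize_column(rows, column):
    """Go through rows one at a time and return (row count, mean of the column)."""
    count = 0
    total = 0.0
    for row in rows:
        total += float(row[column])
        count += 1
    mean = total / count if count else float("nan")
    return count, mean


# A tiny in-memory stand-in for a large CSV file (made-up values).
# With a real file you would pass open("your_file.csv", newline="") instead.
fake_file = io.StringIO("user_id,amount\n1,10.0\n2,20.0\n3,30.0\n")

# csv.DictReader yields one row at a time, so memory use stays small
# no matter how many rows the file has.
count, mean = summarize_column(csv.DictReader(fake_file), "amount")
print(count, mean)  # prints: 3 20.0
```

In practice you would usually reach for pandas (or R) rather than writing the loop yourself, but the idea underneath is the same.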

Alright.

I hope this article has helped you decide which tool to start learning, to get you into the data analytics world. I know it was a long article, but it was worth sharing what you needed to know.

If you know someone who might need this article, tell them not to overthink it anymore, because this is exactly what they need to know. Share the article with them, and they will love you for it.
