Learn the Basics of Data Analysis with R
Introduction
Once someone decides to shift careers to data analysis, or to start a new career in the field, they get showered with suggestions to use Python. I was one of those who believed that no other software or tool tops Python when it comes to data analysis.
On the other hand, R is a programming language built for statistical computing and graphics, and it is widely used by data miners, statisticians, and data analysts.
Why do I need to learn R?
Being a data analyst doesn't mean always sticking to one tool; it means being able to work with multiple tools to get your work done. For example, say you worked for Company A as an entry-level data analyst using Python and SQL (for databases), and you decide to move to another company for a higher salary or to gain more experience. Company B, however, uses SQL (which you're already familiar with) and R. What would you do? Of course, you would need to adapt to the company's stack, especially if you're going to work in a team. That's why scratching the surface of R and learning a little of its syntax is a big plus for you as an analyst. Moreover, R's syntax is simple and concise, so a single short line of code can express a lot.
A short comparison between analysis with Python and R
Python provides many libraries and packages for specific tasks, and so does R. For example, the best-known data manipulation package in Python is pandas. Its closest counterpart in R is dplyr.
dplyr offers functions not only to explore your data but also to draw insights from it, with syntax that is similar to pandas and often simpler, which makes the transition easier if you are already familiar with Python.
Examples
Let's start with a dataset stored in a data frame named "df". First, we load the necessary libraries, read the data, and look at the first 5 rows:
Python
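Here is a minimal sketch of what the Python side could look like. The sample rows and the column names (product, unit_price, quantity) are made up for illustration and are not from the original dataset. In practice you would pass a file path to pd.read_csv instead of the in-memory text used here.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real dataset;
# the column names are assumptions made for this sketch.
csv_text = """product,unit_price,quantity
Pen,1.50,10
Notebook,5.00,3
Backpack,25.00,1
Mug,7.50,2
Lamp,12.00,4
Stapler,5.00,2
"""

# Read the data into a data frame named df
# (a real script would typically pass a file path here)
df = pd.read_csv(io.StringIO(csv_text))

# Look at the first 5 rows
first_rows = df.head(5)
print(first_rows)
```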
________________________________________
R
You see! Very similar, and arguably easier. Let's try another one. Assume that the dataset "df" contains data about products, with one column for the unit price of each product and another for the quantity sold. Management has asked you to pull the rows where the unit price is greater than or equal to $5 and the quantity is greater than 2 units. Take a look at how we handle the task in both languages:
Python
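The Python version of this task might look like the sketch below, which uses boolean indexing in pandas. As before, the data and the column names (product, unit_price, quantity) are hypothetical placeholders, not the article's actual dataset.

```python
import pandas as pd

# Hypothetical product data; the column names are assumptions for this sketch
df = pd.DataFrame(
    {
        "product": ["Pen", "Notebook", "Backpack", "Mug", "Lamp", "Stapler"],
        "unit_price": [1.50, 5.00, 25.00, 7.50, 12.00, 5.00],
        "quantity": [10, 3, 1, 2, 4, 2],
    }
)

# Keep rows where the unit price is at least $5 AND more than 2 units were sold.
# Each condition needs its own parentheses because & binds tighter than >= and >.
filtered = df[(df["unit_price"] >= 5) & (df["quantity"] > 2)]
print(filtered)
```

With this sample data, only the Notebook and Lamp rows satisfy both conditions.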
________________________________________
R
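I guess you will agree with me that the R code is easy to comprehend and simple to write. You write the data frame name, followed by %>% (called a "pipe"), and then pass your conditions, separated by commas, to dplyr's filter() function. The result is something along the lines of df %>% filter(unit_price >= 5, quantity > 2), where the column names are placeholders for whatever your dataset uses.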
Other features
As you know, managers don't have the time to go through every number and calculation to figure out whether the business is doing well. R gives you the opportunity to steal the show in the meeting room with an excellent data visualization package called ggplot2, comparable to matplotlib in Python.
Python users are usually familiar with Jupyter Notebook as a working environment. R has its own dedicated IDE (Integrated Development Environment), known as RStudio, where you can write R code as well as HTML and CSS.
R also lets you create customizable R Markdown files, which allow you to build readable, flexible, and engaging reports about your data in HTML or PDF form by combining HTML and CSS with R code in the same file.
So, is R better than Python?
The answer is "it depends". It depends on where you work, what fits you best, and more. The bottom line is that you should try the tools you come across as a data analyst, so that you build a strong background that lets you adapt and improve quickly wherever you go.
Do you remember the case of moving from Company A to Company B? If you stick to one tool and leave it at that, you will certainly get most of your work done, but you will be limited if you try to move to Company B. The move is going to be complicated and frustrating for you.
On the other hand, if you have gained some knowledge of the various tools used in the business, moving from Python to R won't be as hard as in the first scenario. It's going to be interesting and challenging instead.
Thank You.
- Ahmed Yasser
- Mar 27, 2022