Overview of Python in Data analytics

Introduction:

Python is one of the most important programming languages used in data analytics today. Many data scientists prefer it for its simplicity and its helpful libraries, such as pandas, Matplotlib, scikit-learn, BeautifulSoup, and PyTorch. It helps in processing complex data in an era where datasets can reach the petabyte scale.

Introduction to essential Python libraries:

NumPy:

NumPy is short for Numerical Python. It is fundamental to scientific computing: it provides an efficient multidimensional array object (ndarray) and linear algebra operations.

Pandas:

The name pandas comes from "panel data". It makes working with structured data easier by providing data structures and functions designed especially for it.

Matplotlib:

Matplotlib is the best-known Python library for producing plots, including interactive ones, and many other 2D data visualizations.

IPython:

IPython provides a robust and productive environment for interactive and exploratory computing, a Mathematica-like notebook for connecting to IPython through a web browser, and an infrastructure for interactive parallel and distributed computing.

SciPy:

SciPy is a collection of packages addressing different scientific computing problems.

 

Python can be used in web applications as well as desktop applications. It is well known for its simplicity, thanks in part to its libraries. It is an object-oriented, high-level programming language with dynamic semantics.


Python in Data analytics applications:

RFM:

Recency, frequency, and monetary value (RFM) analysis is important in the business analytics industry.

Recency helps in determining the last time the customer made a purchase.

Frequency helps in determining how often the customer makes purchases.

Monetary value helps in determining how much the customer spends when purchasing from us.

RFM helps business analysts know their customers better and so increase profits. The customers are then ranked according to their RFM values.

Python helps in calculating RFM with its libraries (pandas, datetime, and NumPy).

pandas helps in reading the data, datetime helps in calculating the difference between dates to determine the recency and the frequency, and NumPy helps in ranking the customers to identify the most and least loyal ones. Once they are identified, the business analysts can decide how to gain more customers.
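The RFM steps described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the transactions, the column names (customer_id, order_date, amount), the snapshot date, and the rank_scores helper are all assumptions made for the example, and summing the three scores is only one of several possible ranking schemes.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Hypothetical transactions; the column names are assumptions for this sketch.
orders = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C", "C"],
    "order_date": pd.to_datetime([
        "2022-01-05", "2022-03-20", "2021-11-02",
        "2022-02-14", "2022-03-01", "2022-03-25",
    ]),
    "amount": [40.0, 60.0, 25.0, 10.0, 15.0, 20.0],
})

# Assumed reference date from which recency is measured.
snapshot = datetime(2022, 3, 28)

# One row per customer: days since last purchase, number of purchases, total spend.
rfm = orders.groupby("customer_id").agg(
    recency_days=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)


def rank_scores(values, higher_is_better=True):
    """Hypothetical helper: score 1..n, where n marks the best customer.

    Ties are broken by position, which is good enough for a sketch.
    """
    keys = values if higher_is_better else -values
    order = np.argsort(keys, kind="stable")
    scores = np.empty(len(values), dtype=int)
    scores[order] = np.arange(1, len(values) + 1)
    return scores


# A recent purchase is better, so a lower recency earns a higher score.
rfm["r_score"] = rank_scores(rfm["recency_days"].to_numpy(), higher_is_better=False)
rfm["f_score"] = rank_scores(rfm["frequency"].to_numpy())
rfm["m_score"] = rank_scores(rfm["monetary"].to_numpy())
rfm["rfm_total"] = rfm[["r_score", "f_score", "m_score"]].sum(axis=1)

# Most loyal customers first.
ranked = rfm.sort_values("rfm_total", ascending=False)
print(ranked)
```

In practice the data would be read from a file with a pandas reader such as read_csv, and the scores are often bucketed into quantiles rather than ranked one by one.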


Web scraping:


Web scraping is the process of collecting raw data from the web using automated methods. Some websites forbid scraping, and they have good reasons to protect their data.

Python provides easy ways to make web scraping more powerful.

urllib is a package in the Python standard library that helps in working with URLs so that we can access the site we want to scrape, and BeautifulSoup helps in extracting the information from the page.
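As a rough sketch of these ideas, the example below checks a site's robots.txt rules before keeping any links and then extracts links from a page. To stay runnable without extra installs or network access, it substitutes the standard library's html.parser for BeautifulSoup and uses inline strings instead of downloaded content; the site URL, the robots.txt rules, the page markup, and the LinkCollector class are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# Hypothetical site, robots.txt rules, and page content (illustration only).
# A real scraper would download these, for example with urllib.request.urlopen.
BASE_URL = "https://example.com/"
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""
PAGE_HTML = """
<html><body>
  <h1>Products</h1>
  <a href="/products/tea">Tea</a>
  <a href="/private/admin">Admin</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's base URL.
                self.links.append(urljoin(self.base_url, href))


# Load the site's scraping rules from the robots.txt text.
robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

# Extract every link from the page.
collector = LinkCollector(BASE_URL)
collector.feed(PAGE_HTML)

# Keep only the links that the site allows any crawler ("*") to fetch.
allowed_links = [url for url in collector.links if robots.can_fetch("*", url)]
print(allowed_links)
```

With BeautifulSoup installed, the LinkCollector class could be replaced by parsing the page and looping over its a tags; the robots.txt check would stay the same.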


Market basket analysis:


It is one of the best applications in the retail industry. Businesses need to mine and analyze their databases to understand the patterns in their data. Correlation relationships among the data are very helpful in analyzing transactions, making decisions, and recognizing customer behavior in a large data set.

Market basket analysis uses the purchase history to determine:

Products that are likely to be purchased together

Products that are likely to be purchased sequentially

Products that are purchased seasonally

It helps in choosing the best promotions, increasing revenue, and decreasing expenses.

First, we calculate the support of the pair: the number of transactions that contain both items, divided by the total number of transactions.

Next, we calculate the confidence of item 1 to item 2: the number of transactions that contain both items, divided by the number of transactions that contain item 1.

After that, we can calculate the lift by dividing the confidence of item 1 to item 2 by the support of item 2. A lift above 1 suggests that the two items are bought together more often than they would be if they were independent.
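The three measures can be worked through on a small example. The sketch below uses only plain Python; the baskets and the function names are invented for illustration.

```python
# Hypothetical baskets; each set is one transaction.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"butter", "milk"},
]


def support(items, transactions):
    """Share of transactions that contain every item in `items` (a set)."""
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of the rule divided by the support of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


bread, butter = {"bread"}, {"butter"}
pair_support = support(bread | butter, transactions)     # 2 of 5 baskets
pair_confidence = confidence(bread, butter, transactions)  # 2 of 3 bread baskets
pair_lift = lift(bread, butter, transactions)

print(f"support={pair_support:.2f}")
print(f"confidence={pair_confidence:.2f}")
print(f"lift={pair_lift:.2f}")
```

Here bread and butter appear together in 2 of 5 baskets, and 2 of the 3 bread baskets also contain butter, so the lift comes out slightly above 1.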

Limitations:

1-It takes a long time to implement and may require skills in regression, decision tree analysis, and other techniques.

2-It is sometimes hard to determine the product groupings.

3-Complexity grows exponentially with the number of distinct items.

Association Rule for Market basket Analysis:

Marketplaces use association rules to learn which item is likely to be purchased after another. A rule is written as a pair (antecedent, consequent).

An association rule has an associated population, which consists of instances.

MBA with Python:

We use pandas, NumPy, and apyori (which provides an implementation of the Apriori algorithm).
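To show what rule mining produces, here is a small stand-in written with only the standard library. It is not apyori and not the Apriori algorithm itself: it checks every ordered pair of items by brute force, which is only practical for tiny examples. The baskets, the pair_rules function, and the threshold values are assumptions chosen for illustration.

```python
from itertools import permutations

# Hypothetical baskets; each set is one transaction.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "jam"},
    {"bread", "butter", "milk"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence, lift) tuples.

    Brute-force check of every ordered item pair; Apriori would prune
    infrequent candidates instead of testing all of them.
    """
    total = len(baskets)
    items = sorted(set().union(*baskets))

    def count(itemset):
        # Number of baskets that contain every item in `itemset`.
        return sum(itemset <= basket for basket in baskets)

    rules = []
    for antecedent, consequent in permutations(items, 2):
        both = count({antecedent, consequent})
        support = both / total
        if support < min_support:
            continue
        confidence = both / count({antecedent})
        if confidence < min_confidence:
            continue
        lift = confidence / (count({consequent}) / total)
        rules.append((antecedent, consequent, support, confidence, lift))

    # Strongest associations first.
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


rules = pair_rules(baskets)
for antecedent, consequent, support, confidence, lift in rules:
    print(
        f"{antecedent} -> {consequent}: "
        f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}"
    )
```

On real data, a library such as apyori takes the list of transactions and similar minimum thresholds and handles larger itemsets as well as pairs.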

  • Salma Allam
  • Mar, 28 2022
