SciPy for Data Science
What is Sci-Py?
SciPy is pronounced as "Sigh Pi". SciPy is a Python open-source library for resolving challenges in mathematics, science, engineering, and technology. It gives users the ability to alter and view data using a selection of high Python commands. SciPy is founded on the NumPy Python extension.
SciPy Origin History
SciPy Installation
Using the pip command, we can install the SciPy library. Pip stands for 'Pip Installs Packages,' which is a recursive acronym. It is a common package manager that can be found on almost all operating systems. Run the following command in the terminal to install:
#Install SciPy
pip install SciPy
Checking SciPy Version
Once you have installed the library, to check the version of the library code is below.
#SciPy Version
Import scipy
print(scipy.__version__)
Numpy VS SciPy Difference
Numpy | SciPy |
Numpy has a quicker processing speed. | SciPy has slower execution speed. |
There are a lot of functions available, however they aren't all well defined. | Contains fully featured versions of comprehensive versions of functions such as linear algebra. |
NumPy Arrays are multi-dimensional arrays containing objects of the same type, sometimes known as homogenous objects. | SciPy, on the other hand, does not contain any array ideas since it is more functional. It is not bound by any homogeneity limitations. |
SciPy Graph
Graphs are an important type of data structure.
For working with such data structures, SciPy includes the scipy.sparse.csgraph module.
Adjacency Matrix
A graph can be represented as a matrix of booleans (0s and 1s) using an adjacency matrix. On a computer, a finite graph may be represented as a square matrix, with the boolean value of the matrix indicating whether there is a direct path between two vertices.
For example, we have a graph below.
The connections in a graph like this, with components A, B, and C, are:
Weight 1 is linked to A & B.
Weight 2 is linked to A and C.
C and B are not linked.
This is how the Adjency Matrix might look:
A B C
A:[0 1 2]
B:[1 0 0]
C:[2 0 0]
Dijkstra
Dijkstra's algorithm is a method for determining the shortest pathways between nodes in a graph, which might be used to depict road networks. Edsger W. Dijkstra, a computer scientist, created it in 1956 and had it published three years later. Let’s see code for the Dijkstra.
Example
Find the shortest path from element 1 to 2:
Code:
import numpy as np
from scipy.sparse.csgraph import dijkstra
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(dijkstra(newarr, return_predecessors=True, indices=0))
Output:
Floyd Warshall
Floyd's algorithm, Roy-Floyd algorithm, Roy-Warhshall algorithm, or WFI algorithm are all names for the Floyd-Warhshall algorithm.
The Floyd-Warshall Algorithm is a weighted graph algorithm that finds the shortest path between all pairs of vertices. This approach works for weighted graphs that are both directed and undirected. It does not, however, operate for graphs with negative cycles (where the sum of the edges in a cycle is negative). To identify the shortest pathways, this technique uses a dynamic programming approach.
Let’s see code for the Flyod Warshall
Example
Find the shortest path between all pairs of elements:
Code:
import numpy as np
from scipy.sparse.csgraph import floyd_warshall
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(floyd_warshall(newarr, return_predecessors=True))
Output:
Bellman Ford
The method was initially presented by Alfonso Shimbel (1955), but is instead called after Richard Bellman and Lester Ford Jr., who published it in 1958 and 1956, respectively. Edward F. Moore also published a variant of the method in 1959, and for this reason it is sometimes named the Bellman–Ford–Moore algorithm. The Bellman–Ford algorithm finds the shortest pathways in a weighted digraph from a single source vertex to all of the other vertices. It is slower than Dijkstra's method for the same issue, but it is more adaptable since it can handle graphs with some negative edge weights.
Example
Find the shortest path from element 1 to 2 with given graph with a negative weight:
Code:
import numpy as np
from scipy.sparse.csgraph import bellman_ford
from scipy.sparse import csr_matrix
arr = np.array([
[0, -1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(bellman_ford(newarr, return_predecessors=True, indices=0))
Output:
Depth First Order
The depth-first search (DFS) technique is used to traverse or explore data structures such as trees and graphs. The algorithm starts from the root node (in the case of a graph, any random node can be used as the root node) and examines each branch as far as feasible before retracing.
In the 19th century, French mathematician Charles Pierre Trémaux[1] investigated a version of depth-first search as a maze-solving strategy.
Example
Transverse the graph depth first for given adjacency matrix:
Code:
import numpy as np
from scipy.sparse.csgraph import depth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 0, 1],
[1, 1, 1, 1],
[2, 1, 1, 0],
[0, 1, 0, 1]
])
newarr = csr_matrix(arr)
print(depth_first_order(newarr, 1))
Output:
Breadth First Order
The breadth-first search (BFS) technique is used to search a tree data structure for a node that meets a set of criteria. It begins at the root of the tree and examines all nodes at the current depth level before moving on to nodes at the next depth level. To maintain track of the child nodes that have been encountered but not yet inspected, more memory, generally a queue, is required.
Example
Transverse the graph breadth first for given adjacency matrix:
Code:
import numpy as np
from scipy.sparse.csgraph import breadth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 0, 1],
[1, 1, 1, 1],
[2, 1, 1, 0],
[0, 1, 0, 1]
])
newarr = csr_matrix(arr)
print(breadth_first_order(newarr, 1))
Output:
- Avani Popat
- Mar, 11 2022