
Python Pandas Series basics with Examples

Python Pandas:

pandas is a Python library that provides fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the basic high-level building block for doing practical, real-world data analysis in Python. The two primary data structures of Python Pandas, Series (one-dimensional) and DataFrame (two-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. Pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other third-party libraries. Pandas is well suited for inserting and deleting columns in a DataFrame, easy handling of missing data (represented as NaN), automatic or explicit alignment of data to a set of labels, converting data held in other Python and NumPy data structures into DataFrame objects, intelligent label-based slicing, indexing, and subsetting of large data sets, merging and joining of data sets, and flexible reshaping. Additionally, it has robust input/output tools for loading data from CSV files, Excel files, databases, and other formats. You have to import the pandas library to make use of the functions and data structures it defines.

 import pandas as pd

By convention, pandas is imported under the alias pd.



Python Pandas Series:

Series is a 1-dimensional labeled array capable of holding any data type (integers, strings, floating-point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. A Python Pandas Series is created using the Series() constructor, and its syntax is,

s = pd.Series(data, index=None)

Here, s is the Pandas Series, and data can be a Python dict, an ndarray, or a scalar value (like 5). The passed index is a list of axis labels. Both integer and label-based indexing are supported. For array-like data, if no index is passed, the index defaults to the integers 0 to n - 1, where n is the length of data.

Create Series from ndarrays

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
>>> type(s)
<class 'pandas.core.series.Series'>
>>> s
a -0.367740
b 0.855453
c -0.518004
d -0.060861
e -0.277982
dtype: float64
>>> s.index
Index(['a', 'b', 'c', 'd', 'e'], dtype='object')
>>> s.values
array([-0.367740, 0.855453, -0.518004, -0.060861, -0.277982])
>>> pd.Series(np.random.randn(5))
0 0.334947
1 -2.184006
2 -0.209440
3 -0.492398
4 -1.507088
dtype: float64

Import the NumPy and Pandas libraries. Create a Series from an ndarray, which is NumPy's array class, using the Series() constructor, which returns a Pandas Series object s. You can also specify axis labels for the index, i.e., index=['a', 'b', 'c', 'd', 'e']. When data is an ndarray, the index must be the same length as data. In Series s, the dtype of the values is float64 because np.random.randn() returns floating-point numbers. You can find the index of a Series using the index attribute. The values attribute returns an ndarray containing only the values, with the axis labels removed. If no labels are passed for the index, one is created with the values [0, ..., len(data) - 1].
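
The length rule described above can be checked with a small script. This is only a sketch: the array contents and the labels 'x', 'y', 'z' are made-up sample values, not data from the session above.

```python
import numpy as np
import pandas as pd

values = np.array([10.0, 20.0, 30.0])  # hypothetical sample data

# With explicit labels, the index length must equal the data length.
labeled = pd.Series(values, index=['x', 'y', 'z'])

# Without labels, pandas generates the positions 0, 1, 2 as the index.
unlabeled = pd.Series(values)

# A length mismatch between data and index raises ValueError.
try:
    pd.Series(values, index=['x', 'y'])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(labeled.index.tolist())      # ['x', 'y', 'z']
print(unlabeled.index.tolist())    # [0, 1, 2]
print(mismatch_error is not None)  # True
```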



Python Pandas Create Series from Dictionaries

>>> import numpy as np
>>> import pandas as pd
>>> d = {'a' : 0., 'b' : 1., 'c' : 2.}
>>> pd.Series(d)
a 0.0
b 1.0
c 2.0
dtype: float64
>>> pd.Series(d, index=['b', 'c', 'd', 'a'])
b 1.0
c 2.0
d NaN
a 0.0
dtype: float64

A Series can be created from a dictionary. Create a dictionary and pass it to the Series() constructor. When a Series is created from a dictionary, the keys become the index labels by default. If labels are passed for the index, the values corresponding to those labels are pulled out of the dictionary, and the order of the passed index labels is preserved. If a label has no associated value in the dictionary, NaN is stored for it. NaN (not a number) is the standard missing data marker used in pandas.
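
Once a Series contains NaN entries like the one above, it is often useful to locate or drop them. The following is a minimal sketch using the isna() and dropna() methods; the dictionary prices and the list wanted are hypothetical names chosen for illustration.

```python
import pandas as pd

prices = {'apple': 1.5, 'banana': 0.5}   # hypothetical data
wanted = ['banana', 'cherry', 'apple']   # 'cherry' has no key in prices

s = pd.Series(prices, index=wanted)

missing = s.isna()         # True where no dictionary key matched the label
present_only = s.dropna()  # keep only the labels that received a value

print(missing.tolist())             # [False, True, False]
print(present_only.index.tolist())  # ['banana', 'apple']
```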


Create Series from Scalar data

>>> import numpy as np
>>> import pandas as pd
>>> pd.Series(5., index=['a', 'b', 'c', 'd', 'e'])
a 5.0
b 5.0
c 5.0
d 5.0
e 5.0
dtype: float64

You can create a Python Pandas Series from a scalar value. Here the scalar value is 5.0. If data is a scalar value, pass an index to set the length of the Series. The value is repeated to match the length of the index.
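
One possible use of a scalar Series is to initialize every label with a default and then overwrite individual entries. The sketch below assumes made-up quarter labels 'q1' to 'q3' and an arbitrary replacement value.

```python
import pandas as pd

# Every label starts with the same default value.
totals = pd.Series(0.0, index=['q1', 'q2', 'q3'])  # hypothetical labels

# Overwrite one entry by label; the others keep the default.
totals['q2'] = 7.5

print(totals.tolist())  # [0.0, 7.5, 0.0]
print(len(totals))      # 3
```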

 

Python Pandas Series Indexing and Slicing

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
>>> s
a 0.481557
b 2.053330
c -1.799993
d -0.396880
e -1.270751
dtype: float64
>>> s.iloc[0]  # by position; plain s[0] on a labeled index is deprecated in recent pandas versions
0.48155677569897515
>>> s[1:3]
b 2.053330
c -1.799993
dtype: float64
>>> s[:3]
a 0.481557
b 2.053330
c -1.799993
dtype: float64
>>> s[s > .5]
b 2.05333
dtype: float64
>>> s.iloc[[4, 3, 1]]  # list of positions; plain s[[4, 3, 1]] is deprecated on a labeled index
e -1.270751
d -0.396880
b 2.053330
dtype: float64
>>> s['a']
0.48155677569897515
>>> s['e']
-1.270750548062543
>>> 'e' in s
True
>>> 'f' in s
False

You can index or slice a Python Pandas Series by integer position. Boolean array indexing is also supported, as in s[s > .5]. Multiple positions are specified as a list. The index can be an integer position or a label. Values associated with a label are extracted with expressions such as s['a'] and s['e']. Use the in operator to check whether a label is present in a Series.
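
Position-based and label-based access can be made explicit with the iloc and loc accessors, which avoids any ambiguity about how an integer key is interpreted. The following is a small sketch with made-up integer data.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])  # hypothetical data

first = s.iloc[0]             # by position: the value at label 'a'
by_label = s.loc['c']         # by label
pos_slice = s.iloc[1:3]       # position slice, end excluded: labels b, c
label_slice = s.loc['b':'d']  # label slice, both ends included: labels b, c, d
picked = s.iloc[[3, 0]]       # list of positions: labels d, a

print(first, by_label)             # 10 30
print(pos_slice.index.tolist())    # ['b', 'c']
print(label_slice.index.tolist())  # ['b', 'c', 'd']
print(picked.tolist())             # [40, 10]
```

Note the difference between the two slices: a position slice excludes its end, while a label slice includes it.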


Python Pandas Working with Text Data

The Pandas Series supports a set of string processing methods that make it easy to operate on each element of the array. These methods are accessible via the str attribute and they generally have the same name as that of the built-in Python string methods.

>>> import numpy as np
>>> import pandas as pd
>>> empires_ds = pd.Series(["Vijayanagara", "Roman", "Chola", "Mongol", "Akkadian"])
>>> empires_ds.str.lower()
0 vijayanagara
1 roman
2 chola
3 mongol
4 akkadian
dtype: object
>>> empires_ds.str.upper()
0 VIJAYANAGARA
1 ROMAN
2 CHOLA
3 MONGOL
4 AKKADIAN
dtype: object
>>> empires_ds.str.len()
0 12
1 5
2 5
3 6
4 8
dtype: int64
>>> tennis_ds = pd.Series([' Seles ', ' Graph ', ' Williams '])
>>> tennis_ds.str.strip()
0 Seles
1 Graph
2 Williams
dtype: object
>>> tennis_ds.str.contains(' ')
0 True
1 True
2 True
dtype: bool
>>> marvel_ds = pd.Series(['Thor_loki', 'Thor_Hulk', 'Gamora_Storm'])
>>> marvel_ds.str.split('_')
0 [Thor, loki]
1 [Thor, Hulk]
2 [Gamora, Storm]
dtype: object
>>> planets = pd.Series(["Venus", "Earth", "Saturn"])
>>> planets.str.replace("Earth", "Mars")
0 Venus
1 Mars
2 Saturn
dtype: object
>>> letters_ds = pd.Series(['a', 'b', 'c', 'd'])
>>> letters_ds.str.cat(sep=',')
'a,b,c,d'
>>> names_ds = pd.Series(['Jahnavi', 'Adelmo', 'Pietro', 'Alejandro'])
>>> names_ds.str.count('e')
0 0
1 1
2 1
3 1
dtype: int64
>>> names_ds.str.startswith('A')
0 False
1 True
2 False
3 True
dtype: bool
>>> names_ds.str.endswith('O')
0 False
1 False
2 False
3 False
dtype: bool
>>> names_ds.str.find('J')
0 0
1 -1
2 -1
3 -1
dtype: int64

The examples above demonstrate several of the string methods available on a Python Pandas Series through the str attribute.
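
The str methods can also be chained, and they skip over missing entries instead of raising an error. The following sketch assumes a made-up Series raw that contains one missing value.

```python
import pandas as pd

raw = pd.Series([' Alice ', 'BOB', None, 'carol_smith'])  # hypothetical data

# Chain two string methods; the missing entry stays missing.
cleaned = raw.str.strip().str.lower()

# na=False makes the missing entry count as "no match" in the result.
has_underscore = cleaned.str.contains('_', regex=False, na=False)

print(cleaned.dropna().tolist())  # ['alice', 'bob', 'carol_smith']
print(cleaned.isna().tolist())    # [False, False, True, False]
print(has_underscore.tolist())    # [False, False, False, True]
```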

Engr Fahad

My name is Shahzada Fahad and I am an Electrical Engineer. I have been doing Job in UAE as a site engineer in an Electrical Construction Company. Currently, I am running my own YouTube channel "Electronic Clinic", and managing this Website. My Hobbies are * Watching Movies * Music * Martial Arts * Photography * Travelling * Make Sketches and so on...
