Pandas Simple Programs

Customarily, we import pandas, together with NumPy and Matplotlib, as follows:
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: import matplotlib.pyplot as plt
Use head() and tail() to view the top and bottom rows of the frame.
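A minimal sketch of both calls, assuming a small made-up frame df (the index dates and the columns A to D are illustrative only and are not taken from Pandas.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical data: six dated rows and four numeric columns
dates = pd.date_range("2013-01-01", periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

top = df.head()      # first 5 rows by default
bottom = df.tail(3)  # last 3 rows
print(top)
print(bottom)
```

Both methods take an optional row count and return a new frame, so the original df is left unchanged.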

describe() shows a quick statistical summary of your data.
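For illustration, here is describe() on a short, made-up Series of marks (the values are hypothetical):

```python
import pandas as pd

# Hypothetical marks for five students
marks = pd.Series([15.0, 18.0, 16.0, 13.0, 20.0], name="marks")

summary = marks.describe()  # count, mean, std, min, quartiles, max
print(summary)
print(summary["mean"])      # individual statistics can be looked up by label
```

The result is itself a Series, so a single statistic can be pulled out by its label.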

To get a cross section using a label:

df.loc[dates[0]]
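The line above assumes that df and dates already exist. One way to set them up, as a sketch with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical frame indexed by dates
dates = pd.date_range("2013-01-01", periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

row = df.loc[dates[0]]  # one row, returned as a Series indexed by column name
print(row)

# Label slices include both endpoints
subset = df.loc[dates[0]:dates[2], ["A", "B"]]
print(subset)
```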

Select by position using the passed integers:

In [32]: df.iloc[3]
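As a sketch on a made-up frame (df and dates are illustrative), iloc[3] returns the fourth row, because positions start at 0:

```python
import numpy as np
import pandas as pd

# Hypothetical frame indexed by dates
dates = pd.date_range("2013-01-01", periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

fourth = df.iloc[3]  # position 3 is the fourth row
print(fourth)
print(fourth.name)   # the index label of that row
```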

Select by integer slices, which behave like NumPy/Python slices (the end position is excluded):

In [33]: df.iloc[3:5,0:2]
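A short sketch of positional slicing on a made-up frame (the data is hypothetical). Lists of positions work as well as slices:

```python
import numpy as np
import pandas as pd

# Hypothetical 6x4 frame holding the numbers 0..23
df = pd.DataFrame(np.arange(24).reshape(6, 4), columns=list("ABCD"))

block = df.iloc[3:5, 0:2]              # rows 3 and 4, columns 0 and 1
picked = df.iloc[[1, 2, 4], [0, 2]]    # explicit row and column positions
print(block)
print(picked)
```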

pandas primarily uses the value np.nan to represent missing data. By default, missing values are excluded from computations.
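A small illustration with made-up values. Passing skipna=False shows what happens when the missing value is not skipped:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value
s = pd.Series([15.0, np.nan, 18.0])

print(s.mean())              # NaN is skipped: (15 + 18) / 2
print(s.mean(skipna=False))  # NaN propagates into the result
print(s.isna().sum())        # number of missing values
```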

Examples:

Example1:

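import pandas as pd

data = pd.read_csv("Pandas.csv")
# drop_duplicates() returns a new frame, so assign the result back
data = data.drop_duplicates()
# Replace missing marks with the mean of the marks column
mark_mean = data["marks"].mean()
data["marks"] = data["marks"].fillna(mark_mean)
data["rollno"].describe()
data["rollno"].value_counts().head(10)
# Correlate the numeric columns only (the name column holds text)
data.corr(numeric_only=True)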

Output:

          rollno     marks
rollno  1.000000 -0.028862
marks  -0.028862  1.000000
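Methods such as drop_duplicates() and fillna() return a new object unless inplace=True is passed, so a bare call leaves the frame unchanged. A sketch with made-up records (these are not the contents of Pandas.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical records with one duplicate row and one missing mark
data = pd.DataFrame({
    "name": ["A", "B", "B", "C"],
    "rollno": [1, 2, 2, 3],
    "marks": [10.0, 20.0, 20.0, np.nan],
})

data.drop_duplicates()            # result discarded, data still has 4 rows
deduped = data.drop_duplicates()  # keep the returned frame instead

# Fill the missing mark with the mean of the marks that are present
filled = deduped.assign(marks=deduped["marks"].fillna(deduped["marks"].mean()))

# Select the numeric columns explicitly before correlating
corr = filled[["rollno", "marks"]].corr()
print(corr)
```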


Example2:

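import pandas as pd

data = pd.read_csv("Pandas.csv")
# Other things to try:
# mark = data[["marks", "rollno"]]   # select several columns at once
# type(mark)
# print(mark.head())
# print(data)
# Rows at positions 1 to 4 (the end position 5 is excluded)
xy = data.iloc[1:5]
xy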

Output:


      name  rollno  marks
1     John    1206   15.0
2     Jane    1207   18.0
3     Kane    1250   16.0
4  William    1236   13.0

Example3:

import pandas as pd

data = pd.read_csv("Pandas.csv")
# Alternative filters to try:
# data[data["marks"] >= 13]
# cond = data["rollno"] == 1236
# cond
# First 10 rows whose roll number is at most 1210
data[data["rollno"] <= 1210].head(10)
# con = data.iloc[1:5]
# print(con)

Output:



Example4:


import pandas as pd
data=pd.read_csv("Pandas.csv")
data[(data["rollno"]==1201) & (data["marks"]>=15)]


Output:


      name  rollno  marks
0   Shashi  1201.0   20.0
10  Shashi  1201.0   20.0
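Each condition in such a filter needs its own parentheses, because & binds more tightly than == and >=. A sketch on made-up rows (the data is hypothetical) that also shows | for "or" and ~ for "not":

```python
import pandas as pd

# Hypothetical records
data = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "rollno": [1201, 1201, 1206, 1207],
    "marks": [20.0, 12.0, 15.0, 18.0],
})

both = (data["rollno"] == 1201) & (data["marks"] >= 15)    # both conditions hold
either = (data["rollno"] == 1201) | (data["marks"] >= 18)  # at least one holds

print(data[both])
print(data[either])
print(data[~both])  # rows where the combined condition is False
```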




Example5:


import pandas as pd

data = pd.read_csv("Pandas.csv")
# dropna() and drop_duplicates() return new frames, so keep the result
data = data.dropna().drop_duplicates()
# data[data["name"].isin(["Shashi", "Jane"])]
# data[(data["marks"] > 15) & (data["age"] > 17)]

def topper(x):
    # Map a mark to a suggestion using the thresholds 15 and 18
    if x >= 18.0:
        return "Excellent"
    elif x >= 15.0:
        return "Good"
    else:
        return "Improve Next"

data["Suggestions"] = data["marks"].apply(topper)
data
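As an alternative to apply(), the same kind of banding can be sketched with pd.cut, which assigns every value to a labelled interval in one call. The marks below are made up, and the band edges (15 and 18, lower edge included) are an assumption about the intended grading:

```python
import numpy as np
import pandas as pd

# Hypothetical marks
marks = pd.Series([20.0, 15.0, 12.0, 18.0, 17.5])

# Intervals are [-inf, 15), [15, 18) and [18, inf) because right=False
suggestions = pd.cut(
    marks,
    bins=[-np.inf, 15.0, 18.0, np.inf],
    labels=["Improve Next", "Good", "Excellent"],
    right=False,
)
print(suggestions)
```

pd.cut returns a categorical Series, and a missing mark stays missing instead of being given a label.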


Output:




Example6:

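import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("Pandas.csv")
# Drop rows with missing values and duplicate rows before plotting
data = data.dropna().drop_duplicates()
# data.plot(kind="bar", x="rollno", y="marks", title="rollno vs marks")
# Draw each plot on its own figure so that they do not overlap
data["marks"].plot(kind="hist", title="marks")
plt.figure()
data["age"].plot(kind="box")
plt.show()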


Output: