Hello everyone,
Let's continue with data frame techniques using pandas in Python.
We will keep working with the student marks example discussed previously.
If you have not read part 1 of data frame techniques, please go through this link:
https://spacewidget.blogspot.com/2021/09/python-pandas-data-frame-techniques-for.html
import pandas as pd
# data holds the student marks defined in part 1
df = pd.DataFrame(data)
df
1. To get the information of the data frame
Syntax:
dataframevariable.info()
df.info()
Output:
It prints a summary of the data frame that includes the number of rows (entries) with their index range, the number of columns, the non-null count and datatype of each column, and the memory usage.
2. To get the number of rows and columns of the data frame
a) Syntax:
dataframevariable.shape
df.shape
It returns the number of rows and columns in the data frame. The first element in the result indicates the number of rows, and the second element indicates the number of columns in the data frame.
Output:
(9, 5)
Here 9 indicates the number of rows, and 5 indicates the number of columns in the data frame.
b) Accessing the first element in df.shape --> Rows in data frame
df.shape[0]
Output:
9
c) Accessing the second element in df.shape --> Columns in data frame
df.shape[1]
Output:
5
d) The number of rows in the data frame can also be obtained using len()
len(df)
Output:
9
e) The number of columns in the data frame can also be obtained using len() on df.columns
len(df.columns)
Output:
5
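Because df.shape is an ordinary tuple, both counts can be unpacked in one step. The sketch below uses a small made-up data frame (the names and marks are hypothetical, not the data from part 1) to show that the tuple elements agree with len().

```python
import pandas as pd

# Hypothetical marks for four students; not the data frame from part 1
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Maths": [52, 80, 65, 94],
    "English": [61, 74, 58, 90],
})

rows, cols = df.shape            # unpack the (rows, columns) tuple
print(rows, cols)                # 4 3
print(rows == len(df))           # True
print(cols == len(df.columns))   # True
```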
3. To get the total number of elements in the data frame
Syntax:
dataframevariable.size
df.size
Output:
45
Number of elements = Number of rows * Number of columns
Here we have 9 rows and 5 columns, so the size of the data frame is 9 * 5 = 45. Therefore the data frame has 45 elements.
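The relation between size and shape can be checked directly. This is a minimal sketch on a hypothetical data frame (made-up values and assumed column names); it also shows that size counts every cell, including missing ones.

```python
import pandas as pd

# Hypothetical data with one missing mark (NaN)
df = pd.DataFrame({
    "Maths": [52, 80, 65],
    "Science": [70, float("nan"), 88],
})

rows, cols = df.shape
print(df.size)                 # 6
print(df.size == rows * cols)  # True, the NaN cell is still counted
```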
4. To access a particular column in the data frame
Syntax:
dataframevariable['column name']
Ex:
Accessing the Maths marks of all the students
df['Maths']
Output: a Series holding the Maths marks of every student, one value per row.
5. To access two or more columns in the data frame
Syntax:
dataframevariable[['column name1','column name2']]
Ex:
Accessing the Maths and English marks of all the students
df[['Maths','English']]
Output: a data frame containing only the Maths and English columns.
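The two selection styles return different types, which matters when chaining further operations. The sketch below assumes a small hypothetical data frame (made-up marks) and shows that single brackets give a Series while a list of names gives a DataFrame, even when the list has only one name.

```python
import pandas as pd

# Hypothetical marks; not the data frame from part 1
df = pd.DataFrame({
    "Maths": [52, 80, 65],
    "English": [61, 74, 58],
    "Science": [70, 45, 88],
})

one_column = df["Maths"]               # single brackets -> Series
two_columns = df[["Maths", "English"]] # list of names -> DataFrame
one_as_frame = df[["Maths"]]           # one-name list -> still a DataFrame

print(type(one_column).__name__)       # Series
print(type(two_columns).__name__)      # DataFrame
print(two_columns.shape)               # (3, 2)
print(one_as_frame.shape)              # (3, 1)
```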
6. To get the unique values of a column in the data frame
Syntax:
dataframevariable['column name'].unique()
unique() returns each distinct value in the specified column once, in order of first appearance. That means repeated values in that column are not shown again.
Ex: Finding the unique values in Maths marks
df['Maths'].unique()
Output:
array([52, 80, 65, 94, 36, 96, 72], dtype=int64)
In the Maths marks column, 52 and 94 each occur twice (check the data frame at the start of the page). Because unique() returns each value only once, these values are not repeated in the result.
7. To get the number of unique elements in a column of the data frame
Syntax:
dataframevariable['column name'].nunique()
Ex: Finding the number of unique elements in the Maths marks column
df['Maths'].nunique()
Output:
7
Here we have 7 unique values: [52, 80, 65, 94, 36, 96, 72], the same values returned by df['Maths'].unique().
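The link between unique() and nunique() can be verified with a short sketch. The marks below are the Maths values recoverable from the unique() output above; the row order is an assumption, chosen so that the first appearances match that output.

```python
import pandas as pd

# Maths marks taken from the unique() output above; row order is assumed
df = pd.DataFrame({"Maths": [52, 80, 65, 94, 36, 96, 72, 52, 94]})

distinct = df["Maths"].unique()   # NumPy array, in order of first appearance
print(distinct)                   # [52 80 65 94 36 96 72]
print(len(distinct))              # 7
print(df["Maths"].nunique())      # 7
```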
8. To get the number of non-missing elements in a column of the data frame
Syntax:
dataframevariable['column name'].count()
Ex: Finding the total number of elements in the Maths marks column
df['Maths'].count()
Output:
9
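count() only counts non-missing values, so it matches len() only when the column has no gaps. The sketch below assumes a hypothetical column with one missing mark to make the difference visible.

```python
import pandas as pd

# Hypothetical column with one missing mark (NaN)
df = pd.DataFrame({"Maths": [52, 80, float("nan"), 94]})

print(len(df["Maths"]))      # 4, every row is counted
print(df["Maths"].count())   # 3, the NaN entry is skipped
```

In the student marks example both give 9, which suggests the Maths column has no missing values.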
9. To get the values of a column and their frequency of occurrence in the data frame
Syntax:
dataframevariable['column name'].value_counts()
Ex: Finding how often each value occurs in the Maths marks column
df['Maths'].value_counts()
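Output: a Series with each distinct Maths mark as the index and its number of occurrences as the value, sorted from most to least frequent (52 and 94 occur twice, the other marks once).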
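A minimal sketch of what value_counts() produces, assuming the Maths values recoverable from the unique() output earlier (the row order is an assumption). The normalize=True call is an optional extra that returns proportions instead of raw counts.

```python
import pandas as pd

# Maths marks reconstructed from the article; row order is assumed
maths = pd.Series([52, 80, 65, 94, 36, 96, 72, 52, 94], name="Maths")

counts = maths.value_counts()     # index = mark, value = number of occurrences
print(counts)
print(counts.loc[52])             # 2
print(counts.loc[80])             # 1

# normalize=True returns the share of rows for each mark
shares = maths.value_counts(normalize=True)
print(shares.loc[94])             # 2 out of 9 rows
```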
10. To get the statistics of a particular column in the data frame
a) Finding the minimum value in a column of a data frame
Syntax:
dataframevariable['column name'].min()
Ex: Finding the minimum marks obtained in Maths subject
df['Maths'].min()
Output:
36
b) Finding the maximum value in a column of a data frame
Syntax:
dataframevariable['column name'].max()
Ex: Finding the maximum marks obtained in Social subject
df['Social'].max()
Output:
98
c) Finding the average value in a column of a data frame
Syntax:
dataframevariable['column name'].mean()
Ex: Finding the average marks obtained in Science subject
df['Science'].mean()
Output:
59.888888888888886
d) Finding the standard deviation of a column in the data frame
Syntax:
dataframevariable['column name'].std()
Ex: Finding the standard deviation of marks obtained in Maths subject
df['Maths'].std()
Output:
21.643577440997237
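By default, std() in pandas computes the sample standard deviation (ddof=1, dividing by n - 1). The sketch below reuses the Maths values recoverable from the unique() output earlier (row order assumed, though it does not affect these statistics) and also shows describe(), which gathers several of the statistics above in one call.

```python
import pandas as pd

# Maths marks reconstructed from the article; row order is assumed
maths = pd.Series([52, 80, 65, 94, 36, 96, 72, 52, 94], name="Maths")

sample_std = maths.std()              # ddof=1 by default (divide by n - 1)
population_std = maths.std(ddof=0)    # divide by n instead
print(sample_std)
print(population_std)                 # slightly smaller than the sample value

# count, mean, std, min, quartiles and max in one Series
summary = maths.describe()
print(summary)
```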
I hope this helps beginners get started with analyzing data. Keep going...