Python Tutorial

Python Introduction Python History & Versions Python Installation Python Interactive Shell Python Self Help Python3.8 Features & Updates Python Variables Python Data types Python Control Statements Python If Statements Python For loop Python While Loop Python Break Statement Python Continue Statement Python Functions Python Functions Python Default Arguments Python Keyword Arguments Python Positional Arguments Python Arbitrary Arguments Python Lambda Expression Python Variable Scopes Python Data Structures Python List Python List Comprehension Python Tuple Python Nested Tuple Python Set Python FrozenSet Python Dictionary Python miscellaneous Topics Python Operators Python Del Statement Python String Python String Formatting Python Date Python Regex Python Exception Handling Python Programming Exercise Learn Python Programs Python Library Learn Pandas Pandas Interview Questions Python DataBase Handling Python MySQL Connectivity Install MongoDB Python MongoDB Connectivity Python Built-In Methods/Functions Python built-in methods

How To Delete Rows From A Pandas DataFrame Based On A Conditional Expression?

A DataFrame is the primary data structure of the Pandas library in Python and is commonly used for storing and working with tabular data.

A common operation that could be performed on such data is to conditionally delete rows from the DataFrame in order to remove unwanted information.

To start working with Pandas, we first need to import it :

import pandas as pd

Running Example

Let us understand this operation with the help of an example. Consider the following DataFrame containing 3 students with names A, B and C and their corresponding marks (out of 10) for two subjects, Mathematics and Physics.

Delete Rows From A Pandas DataFrame

Code snippet for generating the above DataFrame :

Python 3 Code :

import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7, 9, 8]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Printing the DataFrame
print(df)

Here, data is a dictionary we created to initialize the DataFrame. For this, we use the DataFrame() function of the Pandas library which takes the dictionary as an argument and returns the required DataFrame.

Now, let’s say we need to delete the rows where the corresponding students have 8 marks in Physics.

Since the student with the name as C meets this condition, the corresponding row would be deleted. The resulting DataFrame would look like this :

Delete Rows From A Pandas DataFrame

Let us look at different ways of performing this operation on a given DataFrame :

Using the DataFrame.drop() function to delete dataframe rows

In this method, we use the DataFrame.drop() function to delete rows from a given DataFrame based on a specified condition.

Here, we want to delete the corresponding row of the students that have exactly 8 marks in physics. We do this by passing the corresponding index values that are obtained by using the DataFrame.index property as a parameter to this function.

The required index values are extracted using the condition inside square brackets along with the name of the DataFrame.

Here the condition becomes df[‘Physics’] == 8. df[‘Physics’] is used to access the column containing the physics marks of the students that are stored as a Pandas Series.

By default, the update in the DataFrame does not occur in an inplace manner so reassignment of the resulting DataFrame is required.

Let us take a look at the corresponding code snippet and generated output for this method :

# Importing pandas
import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7, 9, 8]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Performing the operation
df = df.drop(df[df['Physics'] == 8].index)

# Printing the resultant DataFrame
print(df)

Output :

Delete Rows From A Pandas DataFrame

The above operation can also be performed in an inplace manner, by setting the inplace parameter of the DataFrame.drop() function as True.

This method does not require reassignment of the DataFrame. Let us take a look at the corresponding code snippet and generated output for this method:

# Importing pandas
import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7, 9, 8]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Performing the operation
df.drop(df[df['Physics'] == 8].index, inplace = True)

# Printing the resultant DataFrame
print(df)

Output :

Delete Rows From A Pandas DataFrame

Instead of using df[‘Physics’] to access the physics column of the DataFrame as shown earlier, we can also use df.Physics to access it instead and get the same results.

Let us take a look at the corresponding code snippet and generated output for this method :

# Importing pandas
import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7, 9, 8]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Performing the operation
df = df.drop(df[df.Physics == 8].index)

# Printing the resultant DataFrame
print(df)

Output :

Delete Rows From A Pandas DataFrame

Using filters to delete dataframe rows in pandas

In this method, we use the concept of filters in pandas to delete rows from a given DataFrame based on a specified condition.

Here we want to delete the corresponding rows of the students with exactly 8 marks in physics.

This means that we want to retain all the students whose marks in physics were not equal to 8.

We can specify this using the condition df[‘Physics’] != 8 where df[‘Physics’] is used to access the physics column of the DataFrame.

# Importing pandas
import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7, 9, 8]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Performing the operation
df = df[df['Physics'] != 8]

# Printing the resultant DataFrame
print(df)

Output :

Delete Rows From A Pandas DataFrame

Instead of using df[‘Physics’] to access the physics column of the DataFrame as shown earlier, we can also use df.Physics to access it instead and get the same results. Let us take a look at the corresponding code snippet and generated output for this method :

# Importing pandas
import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7, 9, 8]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Performing the operation
df = df[df.Physics != 8]

# Printing the resultant DataFrame
print(df)

Output :

Delete Rows From A Pandas DataFrame

Conclusion

In this topic, we have learned to delete rows from an existing Pandas DataFrame based on certain conditions, following a running example of test scores of students in different subjects, thus giving us an intuition of how this concept could be applied in real-world situations. Feel free to reach out to info.javaexercise@gmail.com in case of any suggestions.

Trending

Python | Pandas DataFrame add_prefix() Method

Python Program To Get Product Of List Elements

Python Program to Check Palindrome Number

How to Create a List of Lists or Nested List in Python?