Javaexercise.com

How to Convert Floats To Ints In Pandas Dataframe

A DataFrame is the primary data structure of the Pandas library and is commonly used for storing and working with tabular data. A common operation on such data is converting a column from the float data type to the int data type, for example when the values are whole numbers (such as marks or counts) that were read in as floats.

To start working with Pandas, we first need to import it into our Python code:

Python 3 Code :

import pandas as pd

Running Example

Let us understand this operation with the help of an example. Consider the following DataFrame containing 3 students with names A, B, and C and their corresponding marks (out of 10) for two subjects, Mathematics and Physics. Note that the entries in the column Physics are all of data type float.

[Image: the example DataFrame with columns Name, Mathematics, and Physics]

Code snippet for generating the above DataFrame : 

Python 3 Code : 

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7.0, 9.0, 8.0]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Printing the DataFrame
print(df)

Here, data is a dictionary we created to initialize the DataFrame. Its keys become the column labels and its lists become the column values. We pass it to the pd.DataFrame() constructor of the Pandas library, which returns the required DataFrame.
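Before converting anything, it can help to confirm which data type each column actually has. The following is a minimal sketch that uses the dtypes attribute and the pd.api.types helper functions; the variable name physics_is_float is only illustrative and not part of the original example.

```python
import pandas as pd

# Same data as in the running example
data = {'Name': ['A', 'B', 'C'], 'Mathematics': [8, 5, 10], 'Physics': [7.0, 9.0, 8.0]}
df = pd.DataFrame(data)

# dtypes lists the data type pandas inferred for each column
print(df.dtypes)

# Check the Physics column explicitly (it holds floats, so this is True)
physics_is_float = pd.api.types.is_float_dtype(df['Physics'])
print(physics_is_float)
```

Because every value in the Physics list is written with a decimal point, pandas stores that column as a float type, while Mathematics is stored as an integer type.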

Now, let’s say we need to convert the entries in the column Physics from float data type to int data type. The resulting DataFrame would look like this :

[Image: the resulting DataFrame, with the Physics column shown as integers]

Let us look at different ways of performing this operation on a given DataFrame : 

Convert a DataFrame column from float to int using the Series.astype() function

This method is pretty straightforward and is the most commonly used one. In this method we use the Series.astype() function with the required data type, int here, passed as a parameter.

Here, df['Physics'] is used to access the column with the label Physics in the DataFrame. astype() returns a new Series and does not modify the column in place, so we need to assign the result back to the column.

Let us look at the Python code and corresponding output for this method.

Python 3 Code : 

import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7.0, 9.0, 8.0]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Converting the data type of column Physics from float to int
df['Physics'] = df['Physics'].astype('int')

# Printing the new dataframe
print(df)

Output : 

[Image: the printed DataFrame, with the Physics column shown as integers]
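The point about reassignment can be checked directly. The sketch below, which uses a one-column DataFrame and illustrative variable names (converted, still_float, now_int) that are not part of the original example, calls astype() without reassigning first and then stores the result back.

```python
import pandas as pd

df = pd.DataFrame({'Physics': [7.0, 9.0, 8.0]})

# astype() returns a new Series; df itself is unchanged at this point
converted = df['Physics'].astype('int')
still_float = pd.api.types.is_float_dtype(df['Physics'])

# Assigning the result back replaces the column in the DataFrame
df['Physics'] = converted
now_int = pd.api.types.is_integer_dtype(df['Physics'])

print(still_float, now_int)
```

Both checks are expected to be True: the column is still float after the bare astype() call and becomes an integer column only after the assignment.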

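Another way to perform the same operation is to use attribute access, df.Physics, instead of df['Physics']. This works only when the column already exists and its label is a valid Python identifier that does not clash with a DataFrame attribute or method. Let us look at the Python 3 code and corresponding output for this method.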

Python 3 Code : 

import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7.0, 9.0, 8.0]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Converting the data type of column Physics from float to int
df.Physics = df.Physics.astype('int')

# Printing the new dataframe
print(df)

Output : 

[Image: the printed DataFrame, with the Physics column shown as integers]
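The running example only contains whole-number floats. As a hedged sketch of what happens otherwise, the snippet below uses made-up values (not from the example) to show that astype('int') discards the fractional part instead of rounding, and that a missing value (NaN) makes the conversion fail unless a nullable integer type such as 'Int64' is used.

```python
import pandas as pd

# Hypothetical marks with fractional parts (not part of the running example)
scores = pd.Series([7.9, 8.4, -2.7])

# astype('int') drops the fractional part (truncates toward zero)
truncated = scores.astype('int')

# Rounding first gives the nearest whole number instead
rounded = scores.round().astype('int')

# A column with a missing value cannot be cast to a plain int type
with_gap = pd.Series([7.0, float('nan'), 8.0])
cast_failed = False
try:
    with_gap.astype('int')
except ValueError as err:
    cast_failed = True
    print('astype failed:', err)

# The nullable 'Int64' type keeps the gap as a missing value
nullable = with_gap.astype('Int64')

print(truncated.tolist(), rounded.tolist())
print(nullable)
```

If the column may contain fractional or missing values, it is worth deciding explicitly between truncating, rounding, and using a nullable type before converting.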

Convert a DataFrame column from float to int using the Series.apply() function

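This method is an alternative to astype(). Here we use the Series.apply() function and pass it Python's built-in int function, which is then called on every entry of the column. Because the conversion happens one element at a time, this approach is generally slower than astype() on large columns.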

Here, df['Physics'] is used to access the column with the label Physics in the DataFrame. apply() also returns a new Series rather than changing the column in place, so we need to assign the result back to the column.

Let us look at the Python 3 code and corresponding output for this method.

Python 3 Code : 

import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7.0, 9.0, 8.0]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Converting the data type of column Physics from float to int
df['Physics'] = df['Physics'].apply(int)

# Printing the new dataframe
print(df)

Output : 

[Image: the printed DataFrame, with the Physics column shown as integers]

Another way to perform the same operation would be to use df.Physics to access the column labeled as Physics instead of using df['Physics']. Let us look at the Python code and corresponding output for this method.

Python 3 Code : 

import pandas as pd

# Dictionary for our data
data = {'Name' : ['A', 'B', 'C'], 'Mathematics' : [8, 5, 10], 'Physics' : [7.0, 9.0, 8.0]}

# DataFrame for the dictionary
df = pd.DataFrame(data)

# Converting the data type of column Physics from float to int
df.Physics = df.Physics.apply(int)

# Printing the new dataframe
print(df)

Output : 

[Image: the printed DataFrame, with the Physics column shown as integers]
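To see that both approaches lead to the same values for the running example, the two conversions can be compared side by side. This is a minimal sketch on a standalone Series; the names via_astype, via_apply, and same_values are illustrative only.

```python
import pandas as pd

# The Physics column from the running example as a standalone Series
physics = pd.Series([7.0, 9.0, 8.0], name='Physics')

# Cast of the whole column in one call
via_astype = physics.astype('int')

# Python's int() called on each element in turn
via_apply = physics.apply(int)

# Both conversions should produce the same integer values
same_values = via_astype.tolist() == via_apply.tolist()
print(same_values)
print(via_apply.dtype)
```

Since the results match for this data, astype() is usually the simpler choice, while apply() is useful when each element needs custom handling before conversion.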

Conclusion

In this topic, we learned how to convert a column of an existing DataFrame from the float data type to the int data type using Series.astype() and Series.apply(). We followed a running example of students' test scores in different subjects to show how this could be applied in real-world situations. Feel free to reach out to info.javaexercise@gmail.com with any suggestions.