Let's say you have a pandas DataFrame with several columns, and you want to run a calculation on one of those columns or update its values.
There are many ways to achieve this, but my personal preference is to put the data manipulation logic in a function and then apply that function to the required column.
First, let’s take a look at the data that I’m using for this article. My sample data has three rows with the columns StudentName and Marks, along with an index column. This is how it looks:
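If you want to follow along without the CSV file, a frame of the same shape can be built in memory. This is only a minimal sketch: the column names StudentName and Marks come from the description above, while the student names and marks are made up for illustration and are not the values from my file.

```python
import pandas as pd

# Hypothetical stand-in for Sample.csv: the names and marks are invented,
# only the column names come from the article.
sample = pd.DataFrame(
    {
        "StudentName": ["Asha", "Ben", "Chen"],
        "Marks": [40, 35, 45],
    }
)

# pandas adds the default integer index (0, 1, 2) automatically.
print(sample)
```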
The idea is to perform some calculation on the Marks column. To keep this simple, I’ll just double the marks of every student.
Below is the function to double the marks:
def Multiplier(val):
    return val * 2
Once the function is ready, the next step is simply to call it on a column. The apply method of a column passes each value to the function and collects the results. That can be done as shown below:
import pandas as pd
df = pd.read_csv('Sample.csv')
df['Marks'] = df['Marks'].apply(Multiplier)
print(df)
The print call will show the DataFrame with the marks of every student doubled:
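To try the whole flow without Sample.csv, here is a small end-to-end sketch of the same steps. It makes a few assumptions: the CSV content is hypothetical (made-up names and marks) and is read from an in-memory string through io.StringIO instead of a file. It also keeps the result of plain column arithmetic, df['Marks'] * 2, for comparison, since for a simple operation like doubling that is one of the other ways to get the same values.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for Sample.csv (invented values).
csv_text = "StudentName,Marks\nAsha,40\nBen,35\nChen,45\n"


def Multiplier(val):
    """Return double the given value."""
    return val * 2


df = pd.read_csv(io.StringIO(csv_text))

# Keep a vectorized result for comparison before overwriting the column.
vectorized = df["Marks"] * 2

# Series.apply calls Multiplier once for each value in the column.
df["Marks"] = df["Marks"].apply(Multiplier)

print(df)
```

A function passed to apply pays off when the logic is more involved than a single arithmetic step, because the rule lives in one named place and can be reused on other columns.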
I hope you find this tip useful.
Do not forget to check out the recording of this article on my YouTube channel.