Update Dataframe column value based on other Dataframe column value

Problem description

I have a pandas DataFrame with several columns. I want to get the first 3 elements of the Information column based on the value in the Protocol column.

For example: I want the first 3 elements in Information if the protocol is TCP.

Using the code below I can separate the columns needed for my operation, but I have no clue how to adapt the next piece of code to this.

chunk[['Protocol', 'Information']] = chunk[['Protocol', 'Information']]

EDIT:

I wish to update the values, not separate them.

Tags: python, pandas

Solution


You can use something like this:

import pandas

data = {'Name':['first', 'second', 'third', 'fourth'],
        'Age':[27, 27, 22, 32],
        'Address':['New York', 'ABC', 'XYZ', 'Nowhere'],
        'Qualification':['Msc', 'MA', 'MA', 'Phd']}

# Make a dataframe object
df = pandas.DataFrame(data)

# Your condition
# for example, we want to get the rows with `Qualification == 'MA'`
is_MA_qualified = df['Qualification'] == 'MA'

# Now to filter your data
MA_qualified = df[is_MA_qualified]

# You can use `head(n)` to get first three rows
first_three_MA_qualified = MA_qualified.head(3)

# And finally, to get any desired columns
first_three_MA_qualified[['Age','Address']]
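
The filter, `head(3)`, and column selection above can also be written as a single `.loc` lookup. This is a minimal sketch using the same sample data; the variable name `first_three` is chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['first', 'second', 'third', 'fourth'],
    'Age': [27, 27, 22, 32],
    'Address': ['New York', 'ABC', 'XYZ', 'Nowhere'],
    'Qualification': ['Msc', 'MA', 'MA', 'Phd'],
})

# The boolean mask picks the rows, the list picks the columns;
# head(3) then keeps at most the first three matching rows.
first_three = df.loc[df['Qualification'] == 'MA', ['Age', 'Address']].head(3)
print(first_three)
```

Only two rows have `Qualification == 'MA'` in this sample, so `head(3)` returns both of them.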

Update: to update cells, you can iterate over the rows and change the value of each cell that meets the condition:

...
for index, row in df.iterrows():
    if row['Age'] >= 18:
        df.at[index, 'Qualification'] = 'Verified'
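
Looping with `iterrows()` works, but it tends to be slow on large frames. A vectorized alternative is to assign through `.loc` with a boolean mask. The sketch below applies that idea to the column names from the question. The sample rows are made up for illustration, and it assumes "first 3 elements" means the first three characters of each `Information` string (if the column held lists, `.str[:3]` would slice the first three items instead).

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
chunk = pd.DataFrame({
    'Protocol': ['TCP', 'UDP', 'TCP'],
    'Information': ['ACK Seq=1 Len=0', 'Standard query', 'SYN Seq=0 Win=64240'],
})

# Boolean mask marking the rows whose protocol is TCP.
is_tcp = chunk['Protocol'] == 'TCP'

# Overwrite Information only on the masked rows; other rows stay unchanged.
# Assumption: "first 3 elements" = first three characters of the string.
chunk.loc[is_tcp, 'Information'] = chunk.loc[is_tcp, 'Information'].str[:3]
print(chunk)
```

For the `Age` example above, the equivalent one-liner would be `df.loc[df['Age'] >= 18, 'Qualification'] = 'Verified'`.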
