How to replace values in all pandas rows by list

Problem description

I have a list:

a = [a,te,re,edf,c,sa,da,wq,rw...]

And a DataFrame with 5888 rows:

name  sex snps1 snps2 snps3 snps4 ... snps338
aas   M    a     te    re    dd   ... ...
aab   M    a     ga    re    af   ... ...
...

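I need to replace the values based on the list.

The first value in the list corresponds to the first SNP column of the DataFrame, the second value to the second column, and so on. So I need to compare the first value in the list with the whole column "snps1" and replace each value with True/False.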

Expected result:

  name  sex snps1 snps2 snps3 snps4 ... snps338
 sample1   M  TRUE  TRUE   TRUE  FALSE   ... ...
 sample2   M  TRUE  FALSE  TRUE  FALSE   ... ...
     ...

I wrote some code:

Two for loops: the first iterates over len(list), the second over the length of the DataFrame, with if statements inside. But that means the loop body runs 5888 x 338 times, and it takes too much time.
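For reference, the double loop described above might look roughly like the sketch below. This is a reconstruction on a two-row toy frame, not the original code; the names `snp_columns` and `slow` are made up for illustration, and the list is shortened to four entries to match the four sample columns.

```python
import pandas as pd

# Toy data: two rows and four SNP columns instead of 5888 x 338
df = pd.DataFrame(
    [['aas', 'M', 'a', 'te', 're', 'dd'],
     ['aab', 'M', 'a', 'ga', 're', 'af']],
    columns=['name', 'sex', 'snps1', 'snps2', 'snps3', 'snps4'])
a = ['a', 'te', 're', 'edf']

snp_columns = ['snps1', 'snps2', 'snps3', 'snps4']  # hypothetical helper name
slow = df.copy()

# Outer loop over the list entries / columns, inner loop over the rows:
# one Python-level comparison per cell, which is what makes it slow.
for j, col in enumerate(snp_columns):
    for i in range(len(slow)):
        slow.at[i, col] = (df.at[i, col] == a[j])

print(slow)
```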

How can I do this in a better way? I tried to find a solution, but none of the posts I found fit my problem.

Can someone help me with it?

Tags: python, pandas, dataframe

Solution


You can use isin, for example:

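import pandas as pd

# Two sample rows taken from the question
data = [['aas', 'M', 'a', 'te', 're', 'dd'],
        ['aab', 'M', 'a', 'ga', 're', 'af']]

df = pd.DataFrame(data=data, columns=['name', 'sex', 'snps1', 'snps2', 'snps3', 'snps4'])

a = ['a', 'te', 're', 'edf', 'c', 'sa', 'da', 'wq', 'rw']
columns = ['snps1', 'snps2', 'snps3', 'snps4']

# Map each column name to a one-element tuple holding its reference value;
# zip stops at the shorter input, so the extra entries in `a` are ignored
lookup = {key: (value,) for key, value in zip(columns, a)}
# Given a dict, isin checks each column against the values stored under its name
df.loc[:, columns] = df.loc[:, columns].isin(lookup)
print(df)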

Output

  name sex snps1  snps2 snps3  snps4
0  aas   M  True   True  True  False
1  aab   M  True  False  True  False
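An alternative that avoids building the dict is to compare the SNP columns against a Series of reference values with DataFrame.eq, which aligns the Series index with the column labels. The sketch below is only one possible variant; it assumes that the SNP columns are exactly those whose name starts with "snps" and that they appear in the same order as the list. The names `snp_columns`, `reference`, `matches` and `result` are chosen here for illustration.

```python
import pandas as pd

data = [['aas', 'M', 'a', 'te', 're', 'dd'],
        ['aab', 'M', 'a', 'ga', 're', 'af']]
df = pd.DataFrame(data, columns=['name', 'sex', 'snps1', 'snps2', 'snps3', 'snps4'])

a = ['a', 'te', 're', 'edf', 'c', 'sa', 'da', 'wq', 'rw']

# Assumption: every SNP column name starts with "snps" and the columns
# are in the same order as the entries of `a`.
snp_columns = [c for c in df.columns if c.startswith('snps')]

# One reference value per SNP column, labelled by the column name
reference = pd.Series(a[:len(snp_columns)], index=snp_columns)

# eq() matches the Series index against the column labels, so every
# column is compared element-wise with its own reference value.
matches = df[snp_columns].eq(reference, axis='columns')

# Write the boolean results into a copy, leaving the original frame untouched
result = df.copy()
result[snp_columns] = matches
print(result)
```

Like isin, this does the comparison column by column inside pandas rather than cell by cell in a Python loop.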
