pandas apply method with keyword arguments

Problem description

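I have a DataFrame of strings. I use two of its columns as input to an external Python module that takes command-line input flags. How do I pass those flags as arguments to the .apply function? The method works when called from Python, but I don't want to generate a lot of intermediate files. Instead, I want to add new columns to the DataFrame.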

#Working code
rec = ('VH', 'EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKGRFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS')

kabRaw, align, hit = anarci([rec], scheme = "kabat", output = False, assign_germline = True)

print hit


#Non-working pandas implementation
df_result = pd.DataFrame({'Sample':['VH'], 'Protein Sequence':['EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKGRFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS']})

df_result[['kabRaw', 'align', 'hit']] = df_result[['Sample', 'Protein Sequence']].apply(anarci, args= ('-s kabat output= False assign_germline= True'))

print df_result.head()




When called directly as a Python module, this is the output:
[[['id', 'description', 'evalue', 'bitscore', 'bias', 'query_start', 'query_end'], ['human_H', '', 1.9e-60, 193.6, 0.5, 0, 120], ['pig_H', '', 7.9e-59, 188.5, 1.4, 0, 120], ['rat_H', '', 8.8e-58, 184.8, 0.4, 0, 120], ['mouse_H', '', 3.8e-57, 182.8, 0.1, 0, 120], ['rabbit_H', '', 1.9e-49, 158.0, 0.4, 2, 120], ['rhesus_H', '', 1.7e-41, 132.0, 1.0, 1, 120]]]

This is the error from the pandas version:
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 6014, in apply
    return op.get_result()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/apply.py", line 318, in get_result
    return super(FrameRowApply, self).get_result()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/apply.py", line 142, in get_result
    return self.apply_standard()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/apply.py", line 248, in apply_standard
    self.apply_series_generator()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/apply.py", line 277, in apply_series_generator
    results[i] = self.f(v)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/apply.py", line 74, in f
    return func(x, *args, **kwds)
TypeError: ('anarci() takes at most 12 arguments (45 given)', u'occurred at index Sample')


What are the 45 arguments? How do I pass arguments to the function when using pandas apply?

Tags: python, pandas, python-2.7, dataframe

Solution
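The 45 arguments come from how `args` is unpacked. The last traceback frame shows pandas calling `func(x, *args, **kwds)`. The value `('-s kabat output= False assign_germline= True')` has no trailing comma, so it is a plain string rather than a tuple, and `*args` spreads it into one positional argument per character: 44 characters plus the column `x` makes 45. A minimal sketch of that effect, using a hypothetical stand-in function `count_args` (not part of pandas or ANARCI):

```python
def count_args(*positional, **keywords):
    """Hypothetical stand-in: report how many positional arguments arrive."""
    return len(positional)


# Parentheses without a trailing comma do not make a tuple; this is a str.
flags = ('-s kabat output= False assign_germline= True')

# Stands in for the column that pandas passes as the first argument.
column = ['VH']

# Same call shape as the last traceback frame: func(x, *args, **kwds)
n_given = count_args(column, *flags)

print(type(flags).__name__, len(flags), n_given)  # str 44 45
```

Adding a trailing comma would make `args` a one-element tuple, but the function would then receive the whole flag string as a single positional argument, which is still not how the working call passes `scheme`, `output` and `assign_germline`.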

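To pass options through `apply`, give them as keyword arguments: `DataFrame.apply` forwards any extra keywords to the function. The message `occurred at index Sample` also shows the function being called once per column (the default `axis=0`); `axis=1` calls it once per row instead. Because the working call hands `anarci` a list of `(name, sequence)` tuples, a small wrapper is one way to bridge the two. The sketch below is not tested against ANARCI itself: `fake_anarci` is a hypothetical stand-in that only mimics the three-value return shape of the working call, and `annotate_row` is an illustrative name.

```python
import pandas as pd


def fake_anarci(sequences, scheme="imgt", output=False, assign_germline=False):
    """Hypothetical stand-in for anarci(): returns three parallel lists."""
    numbered = ["{}:{}".format(name, scheme) for name, _ in sequences]
    details = [len(seq) for _, seq in sequences]
    hits = [assign_germline for _ in sequences]
    return numbered, details, hits


def annotate_row(row, **options):
    """Run the stand-in on one row and label its three results."""
    record = (row['Sample'], row['Protein Sequence'])
    kab_raw, align, hit = fake_anarci([record], **options)
    return pd.Series({'kabRaw': kab_raw[0], 'align': align[0], 'hit': hit[0]})


# Shortened sequence, for illustration only.
df_result = pd.DataFrame({'Sample': ['VH'], 'Protein Sequence': ['EVQLVESGGG']})

# axis=1 applies the function per row; the extra keywords reach annotate_row.
new_columns = df_result.apply(annotate_row, axis=1, scheme="kabat",
                              output=False, assign_germline=True)

# Attach the new columns to the original frame by index.
df_result = df_result.join(new_columns)
print(df_result)
```

With the real module, replacing `fake_anarci` by `anarci` inside the wrapper should follow the same pattern, assuming it accepts the keywords used in the working call above.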
