
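Problem description

I want to generate numbers from the strings in a column of a DataFrame, with one number representing each unique string.

Below is an example and the desired result.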

String  Desired outcome
   A    1
   A    1
   B    2
   C    3
   D    4

The code below does not work, because it creates many columns.

dummies = pd.get_dummies(df['String'])

Tags: python

Solution


You can use the built-in ord() function to get the integer code of a character (its ASCII value for plain letters), for example:

ord('A')

The call above returns 65. If you want the numbering to start at one, a small helper such as ordFromOne(character) works:

def ordFromOne(c):
    return ord(c) - 64

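Then apply that function to each of your characters. If your example values are single-character strings in a list, you can map the function over them. In Python 3, map returns an iterator, so wrap it in list() to see the values. Note that ord() only accepts a single character, and subtracting 64 only gives 1, 2, 3, ... for the uppercase letters A, B, C, ...

list(map(ordFromOne, example))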
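
The ord() approach above relies on each value being a single uppercase letter. If the strings can be arbitrary, one alternative is to number them in order of first appearance with a dictionary. This is only a minimal sketch, not the method described above; the name encode_strings and the example list are assumptions made for illustration.

```python
def encode_strings(values):
    """Assign 1, 2, 3, ... to each distinct string, in order of first appearance."""
    codes = {}   # maps each string to the number it was given
    result = []
    for value in values:
        # setdefault stores the next unused number the first time a string is seen
        # and returns the already stored number on later occurrences
        result.append(codes.setdefault(value, len(codes) + 1))
    return result


# Hypothetical data mirroring the 'String' column from the question
example = ['A', 'A', 'B', 'C', 'D']
print(encode_strings(example))  # [1, 1, 2, 3, 4]
```

If the data lives in a pandas DataFrame as in the question, pd.factorize(df['String'])[0] + 1 should give the same numbering for the column without an explicit loop.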
