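python - How do I fix an "unhashable type" error in Python?
Problem description
I have two text datasets. I ran a cleaning step on each of them, and now I want to block them into groups on the name column, but running the code raises the error unhashable type: 'list'.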
import pandas as pd
import recordlinkage as rl

# remove_punct, tokenizer, remove_stopWords and word_lemmatiser are helpers
# defined elsewhere (not shown in the question).

def cleanDataD(path='data1.csv'):
    df = pd.read_csv(path, encoding="ISO-8859-1")
    df['name'] = df['name'].fillna(' ')
    df['name'] = df['name'].apply(lambda x: remove_punct(x))
    # From here on, each 'name' value is a list of tokens, not a string.
    df['name'] = df['name'].apply(lambda x: tokenizer.tokenize(x.lower()))
    df['name'] = df['name'].apply(lambda x: remove_stopWords(x))
    df['name_CV'] = df['name'].apply(lambda x: word_lemmatiser(x))
    df['name_CV'] = df['name_CV'].apply(lambda x: ['none'] if len(x) == 0 else x)
    df['city'] = df['city'].fillna(' ')
    df['city'] = df['city'].apply(lambda x: remove_punct(x))
    df['city'] = df['city'].apply(lambda x: tokenizer.tokenize(x.lower()))
    df['city'] = df['city'].apply(lambda x: remove_stopWords(x))
    df['city_CV'] = df['city'].apply(lambda x: word_lemmatiser(x))
    df['city_CV'] = df['city_CV'].apply(lambda x: ['none'] if len(x) == 0 else x)
    df = df.fillna(0)
    return df

def cleanDataH(path='data2.csv'):
    # Same cleaning steps as cleanDataD; only the file encoding differs.
    df = pd.read_csv(path, encoding="utf_8")
    df['name'] = df['name'].fillna(' ')
    df['name'] = df['name'].apply(lambda x: remove_punct(x))
    df['name'] = df['name'].apply(lambda x: tokenizer.tokenize(x.lower()))
    df['name'] = df['name'].apply(lambda x: remove_stopWords(x))
    df['name_CV'] = df['name'].apply(lambda x: word_lemmatiser(x))
    df['name_CV'] = df['name_CV'].apply(lambda x: ['none'] if len(x) == 0 else x)
    df['city'] = df['city'].fillna(' ')
    df['city'] = df['city'].apply(lambda x: remove_punct(x))
    df['city'] = df['city'].apply(lambda x: tokenizer.tokenize(x.lower()))
    df['city'] = df['city'].apply(lambda x: remove_stopWords(x))
    df['city_CV'] = df['city'].apply(lambda x: word_lemmatiser(x))
    df['city_CV'] = df['city_CV'].apply(lambda x: ['none'] if len(x) == 0 else x)
    df = df.fillna(0)
    return df

df_D = cleanDataD(path='data1.csv')
df_H = cleanDataH(path='data2.csv')

# Block the two frames on the 'name' column.
indexer = rl.Index()
indexer.block('name')
ff = indexer.index(df_D, df_H)
TypeError Traceback (most recent call last)
<ipython-input-35-c9ee905d6674> in <module>
----> 1 ff = indexer.index(df_H, df_D)
TypeError: unhashable type: 'list'
How can I fix this error?
Solution
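The traceback points at the blocking step. After tokenisation, every value in the name column is a Python list, and blocking has to match rows on equal key values, which relies on hashing. Lists are mutable and therefore not hashable, so the lookup fails. The following is a minimal sketch in plain Python, not the recordlinkage implementation: it reproduces the error and shows that joining the tokens into a string gives a usable blocking key. The names block_key, block_pairs, names_d and names_h are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def block_key(tokens):
    """Turn a token list into a hashable blocking key (a plain string)."""
    return " ".join(tokens)


def block_pairs(left, right):
    """Pair up row positions whose blocking keys are equal, via a dict lookup."""
    buckets = defaultdict(list)
    for j, tokens in enumerate(right):
        buckets[block_key(tokens)].append(j)  # string keys hash without trouble
    return [
        (i, j)
        for i, tokens in enumerate(left)
        for j in buckets.get(block_key(tokens), [])
    ]


# Hypothetical tokenised 'name' values standing in for the cleaned columns.
names_d = [["john", "smith"], ["none"], ["acme", "corp"]]
names_h = [["acme", "corp"], ["john", "smith"], ["jane", "doe"]]

# Hashing a list directly reproduces the error from the traceback.
try:
    hash(names_d[0])
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# With string keys, matching rows are paired as (position in D, position in H).
pairs = block_pairs(names_d, names_h)
print(pairs)
```

Applied to the code in the question, the same idea would be to add a string column in both cleaning functions, for example df['name_key'] = df['name'].apply(' '.join), and then call indexer.block('name_key') instead of indexer.block('name'). This is untested against the original data. The list columns can stay as they are for any later token-based comparison.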