python - TypeError: '<' not supported between instances of 'int' and 'str', although the tokenized strings yield 100% integers
Problem description
Traceback (most recent call last):
File "Users", line 50, in <module>
length = len_c / (len_a_b - len_c)
File "\venv\lib\site-packages\pandas\core\ops\common.py", line 65, in new_method
return method(self, other)
File "\venv\lib\site-packages\pandas\core\arraylike.py", line 97, in __sub__
return self._arith_method(other, operator.sub)
File "\venv\lib\site-packages\pandas\core\series.py", line 4994, in _arith_method
self, other = ops.align_method_SERIES(self, other)
File "\venv\lib\site-packages\pandas\core\ops\__init__.py", line 147, in align_method_SERIES
left, right = left.align(right, copy=False)
File "\lib\site-packages\pandas\core\series.py", line 4220, in align
return super().align(
File "\venv\lib\site-packages\pandas\core\generic.py", line 8825, in align
return self._align_series(
File "\venv\lib\site-packages\pandas\core\generic.py", line 8934, in _align_series
join_index, lidx, ridx = self.index.join(
File "\venv\lib\site-packages\pandas\core\indexes\range.py", line 690, in join
return self._int64index.join(other, how, level, return_indexers, sort)
File "\venv\lib\site-packages\pandas\core\indexes\base.py", line 3669, in join
return this.join(other, how=how, return_indexers=return_indexers)
File "\venv\lib\site-packages\pandas\core\indexes\base.py", line 3679, in join
return self._join_monotonic(
File "\venv\lib\site-packages\pandas\core\indexes\base.py", line 4014, in _join_monotonic
join_index, lidx, ridx = self._outer_indexer(sv, ov)
File "\venv\lib\site-packages\pandas\core\indexes\base.py", line 219, in _outer_indexer
return libjoin.outer_join_indexer(left, right)
File "pandas\_libs\join.pyx", line 556, in pandas._libs.join.outer_join_indexer
TypeError: '<' not supported between instances of 'int' and 'str'
Process finished with exit code 1
The problem occurs in the line that starts with dict1 =
b = df2.apply(set)
a = df1.apply(set)
#print('a', a.columns)
c = pd.concat([b.apply(lambda x : s.intersection(x)) for s in a], axis=1)
len_a_b = b.apply(lambda x : len(x) + len(a))
len_c = c.apply(lambda x : len(x))
dict1 = {'length' : len_c / (len_a_b - len_c) , 'b' : b , 'c' : c}
This is what the data frames look like:
0 [Tom, eats, pineapple]
1 [Tom, eats, pineapple]
2 [Eva, eats, apple]
3 [Eva, eats, pineapple]
Name: all, dtype: object
0 [Tom, eats, pineapple]
1 [Tom, eats, pineapple]
2 [Eva, eats, apple]
3 [Eva, eats, pineapple]
Name: sentence, dtype: object
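print(len_c): Length: 550, dtype: int64
print(len_a_b): Length: 6646, dtype: int64
As you can see, after tokenization both Series contain 100% integers, yet Python still complains about a str. The same function works when the inputs are not two full data frames.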
Solution
Instead of this:
len_c = c.apply(lambda x : len(x))
Use this:
len_c = c.apply(lambda x : len(x)).reset_index(drop=True)
Finally:
dict1 = {'length' : len_c / (len_a_b - len_c) , 'b' : b , 'c' : c}
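Why this helps: the traceback never touches the values. It fails inside index.join, which pandas calls to align the two Series by label before subtracting. len_a_b carries a RangeIndex (integers), while len_c, built from the columns of c, most likely carries string labels, so it is the labels that get compared with '<', not the int64 values. The sketch below uses small made-up Series as stand-ins for len_c and len_a_b (the values and the 'sentence' labels are assumptions for illustration, not the real data) to show what reset_index(drop=True) changes:

```python
import pandas as pd

# Hypothetical stand-in for len_c: integer values, but string labels as the index.
len_c = pd.Series([1, 2, 3], index=["sentence", "sentence", "sentence"])

# Hypothetical stand-in for len_a_b: integer values with the default RangeIndex.
len_a_b = pd.Series([4, 5, 6, 7])

# The values are integers in both cases; only the index labels differ in type.
print(len_c.index)
print(len_a_b.index)

# Drop the string labels so both Series are labelled by integer position.
len_c_fixed = len_c.reset_index(drop=True)

# Arithmetic now aligns on 0, 1, 2, ...; positions missing from one side become NaN.
length = len_c_fixed / (len_a_b - len_c_fixed)
print(length)
```

Note that alignment still happens by label after the reset. Since the two Series in the question have different lengths (550 and 6646), every position that exists in only one of them ends up as NaN in the result, so it is worth checking that a position-by-position match is really what is intended.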