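python - ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types - trying to downcast integer columns
Question
I am trying to optimize my DataFrame by downcasting the columns of dtype 'int64' to 'unsigned', except for one column that actually contains negative values. That column I want to convert separately from 'int64' to 'int8'. However, when I run the code below, I get this error: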
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
The code I am using is:
import pandas as pd

def mem_usage(pandas_obj):
    if isinstance(pandas_obj, pd.DataFrame):
        usage_b = pandas_obj.memory_usage(deep=True).sum()
    else:  # we assume if not a df it's a series
        usage_b = pandas_obj.memory_usage(deep=True)
    usage_mb = usage_b / 1024 ** 2  # convert bytes to megabytes
    return "{:03.2f} MB".format(usage_mb)

dtypes = data_pitch_2017_1.drop('launch_angle',axis=1).dtypes
data_pitch_2017_1_int = data_pitch_2017_1.select_dtypes(include=['int'])
converted_int = data_pitch_2017_1_int.apply(pd.to_numeric,downcast='unsigned', errors ='ignore')
print(mem_usage(data_pitch_2017_1_int))
print(mem_usage(converted_int))
compare_ints = pd.concat([data_pitch_2017_1_int.dtypes,converted_int.dtypes],axis=1)
compare_ints.columns = ['before','after']
compare_ints.apply(pd.Series.value_counts)
The full error traceback I get is:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-77-8486ea5172f1> in <module>
8 dtypes = data_pitch_2017_1.drop('launch_angle',axis=1).dtypes
9 data_pitch_2017_1_int = data_pitch_2017_1.select_dtypes(include=['int'])
---> 10 converted_int = data_pitch_2017_1_int.apply(pd.to_numeric,downcast='unsigned', errors ='ignore')
11 print(mem_usage(data_pitch_2017_1_int))
12 print(mem_usage(converted_int))
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
7766 kwds=kwds,
7767 )
-> 7768 return op.get_result()
7769
7770 def applymap(self, func, na_action: Optional[str] = None) -> DataFrame:
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py in get_result(self)
183 return self.apply_raw()
184
--> 185 return self.apply_standard()
186
187 def apply_empty_result(self):
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py in apply_standard(self)
274
275 def apply_standard(self):
--> 276 results, res_index = self.apply_series_generator()
277
278 # wrap results
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py in apply_series_generator(self)
288 for i, v in enumerate(series_gen):
289 # ignore SettingWithCopy here in case the user mutates
--> 290 results[i] = self.f(v)
291 if isinstance(results[i], ABCSeries):
292 # If we have a view on v, we need to make a copy because
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py in f(x)
108
109 def f(x):
--> 110 return func(x, *args, **kwds)
111
112 else:
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/pandas/core/tools/numeric.py in to_numeric(arg, errors, downcast)
182 for dtype in typecodes:
183 if np.dtype(dtype).itemsize <= values.dtype.itemsize:
--> 184 values = maybe_downcast_to_dtype(values, dtype)
185
186 # successful conversion
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in maybe_downcast_to_dtype(result, dtype)
203 return PeriodArray(result, freq=dtype.freq)
204
--> 205 converted = maybe_downcast_numeric(result, dtype, do_round)
206 if converted is not result:
207 return converted
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in maybe_downcast_numeric(result, dtype, do_round)
284 return new_result
285 else:
--> 286 if np.allclose(new_result, result, rtol=0):
287 return new_result
288
<__array_function__ internals> in allclose(*args, **kwargs)
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/numpy/core/numeric.py in allclose(a, b, rtol, atol, equal_nan)
2187
2188 """
-> 2189 res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
2190 return bool(res)
2191
<__array_function__ internals> in isclose(*args, **kwargs)
/Volumes/DATASTORE_L/opt/anaconda3/lib/python3.7/site-packages/numpy/core/numeric.py in isclose(a, b, rtol, atol, equal_nan)
2286
2287 xfin = isfinite(x)
-> 2288 yfin = isfinite(y)
2289 if all(xfin) and all(yfin):
2290 return within_tol(x, y, atol, rtol)
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
I did try passing errors='ignore' so that any NaN or non-numeric values that might cause the error would be ignored, but I still had no success.
The DataFrame info is:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 721244 entries, 0 to 721243
Data columns (total 85 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 pitcher 721244 non-null int64
1 key_retro 721244 non-null object
2 pitch_type 718006 non-null object
3 Game_Date 721244 non-null datetime64[ns]
4 release_speed 717883 non-null Float64
5 release_pos_x 717862 non-null Float64
6 release_pos_z 717862 non-null Float64
7 player_name 721244 non-null object
8 batter 721244 non-null Int64
9 events 184565 non-null object
10 description 721244 non-null object
11 zone 717862 non-null Int64
12 des 721244 non-null object
13 game_type 721244 non-null object
14 stand 721244 non-null object
15 p_throws 721244 non-null object
16 home_team 721244 non-null object
17 away_team 721244 non-null object
18 type 721244 non-null object
19 hit_location 161307 non-null Int64
20 bb_type 127555 non-null object
21 balls 721244 non-null Int64
22 strikes 721244 non-null Int64
23 game_year 721244 non-null Int64
24 pfx_x 717862 non-null Float64
25 pfx_z 717862 non-null Float64
26 plate_x 717862 non-null Float64
27 plate_z 717862 non-null Float64
28 on_3b 67052 non-null Int64
29 on_2b 132754 non-null Int64
30 on_1b 220388 non-null Int64
31 outs_when_up 721244 non-null Int64
32 inning 721244 non-null Int64
33 inning_topbot 721244 non-null object
34 hc_x 124661 non-null Float64
35 hc_y 124661 non-null Float64
36 fielder_2 719473 non-null Int64
37 vx0 717862 non-null Float64
38 vy0 717862 non-null Float64
39 vz0 717862 non-null Float64
40 ax 717862 non-null Float64
41 ay 717862 non-null Float64
42 az 717862 non-null Float64
43 sz_top 717862 non-null Float64
44 sz_bot 717862 non-null Float64
45 hit_distance_sc 188244 non-null Int64
46 launch_speed 199437 non-null Float64
47 launch_angle 199441 non-null Int64
48 effective_speed 716330 non-null Float64
49 release_spin_rate 703603 non-null Int64
50 release_extension 717862 non-null Float64
51 game_pk 721244 non-null Int64
52 pitcher.1 721244 non-null Int64
53 fielder_2.1 719473 non-null Int64
54 fielder_3 719473 non-null Int64
55 fielder_4 719473 non-null Int64
56 fielder_5 719473 non-null Int64
57 fielder_6 719473 non-null Int64
58 fielder_7 719473 non-null Int64
59 fielder_8 719473 non-null Int64
60 fielder_9 719473 non-null Int64
61 release_pos_y 717862 non-null Float64
62 estimated_ba_using_speedangle 125403 non-null Float64
63 estimated_woba_using_speedangle 125403 non-null Float64
64 woba_value 184565 non-null Float64
65 woba_denom 184565 non-null Int64
66 babip_value 184565 non-null Int64
67 iso_value 184565 non-null Int64
68 launch_speed_angle 125403 non-null Int64
69 at_bat_number 721244 non-null Int64
70 pitch_number 721244 non-null Int64
71 pitch_name 718006 non-null object
72 home_score 721244 non-null Int64
73 away_score 721244 non-null Int64
74 bat_score 721244 non-null Int64
75 fld_score 721244 non-null Int64
76 post_away_score 721244 non-null Int64
77 post_home_score 721244 non-null Int64
78 post_bat_score 721244 non-null Int64
79 post_fld_score 721244 non-null Int64
80 if_fielding_alignment 717029 non-null object
81 of_fielding_alignment 717029 non-null object
82 spin_axis 717862 non-null Int64
83 delta_home_win_exp 721243 non-null Float64
84 delta_run_exp 721121 non-null Float64
dtypes: Float64(26), Int64(40), datetime64[ns](1), int64(1), object(17)
memory usage: 1.1 GB
Can anyone help explain the problem and how to fix it in the code?
Thank you.
Solution
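A likely cause, judging from the info output: almost all integer columns are the nullable extension dtype Int64 (capital I), and many of them contain missing values. In the pandas version shown in the traceback, the downcast step of pd.to_numeric ends up in np.allclose, which appears to fail on those missing values. errors='ignore' does not help because it only covers parsing failures, not the downcast step. As far as I know, pandas 1.3 and later support downcasting nullable dtypes in pd.to_numeric, so upgrading may be enough. Note also that downcast='unsigned' will not turn a column with negative values into int8; that needs downcast='integer' or 'signed'.

If upgrading is not an option, one workaround is to pick the target dtype by hand from each column's min and max. The following is a minimal sketch under those assumptions, not a definitive fix; the helper name downcast_nullable_int and the toy frame are made up for illustration.

```python
import numpy as np
import pandas as pd

def downcast_nullable_int(series):
    """Cast `series` to the smallest nullable integer dtype that holds its values.

    Hypothetical helper for illustration; not part of pandas.
    """
    lo, hi = series.min(), series.max()  # missing values are skipped
    if pd.isna(lo):
        return series  # column is entirely missing, nothing to measure
    # Try unsigned dtypes only when the column has no negative values.
    names = ["UInt8", "UInt16", "UInt32"] if lo >= 0 else ["Int8", "Int16", "Int32"]
    for name in names:
        limits = np.iinfo(np.dtype(name.lower()))
        if limits.min <= lo and hi <= limits.max:
            return series.astype(name)
    return series  # nothing smaller fits; leave the column unchanged

# Toy stand-in for the real frame (values are invented).
demo = pd.DataFrame({
    "zone": pd.array([1, 14, None], dtype="Int64"),
    "launch_angle": pd.array([-80, 45, None], dtype="Int64"),
    "game_pk": pd.array([490099, 492524, 491000], dtype="Int64"),
})

converted = demo.copy()
for col in converted.columns:
    converted[col] = downcast_nullable_int(converted[col])

print(converted.dtypes)
```

On the real frame, the loop could be restricted to integer columns, for example by checking pd.api.types.is_integer_dtype on each column first. The nullable target dtypes keep the missing values as they are, so launch_angle can become Int8 and the non-negative columns can become unsigned without filling or dropping anything.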