python - Why does pandas show "?" instead of NaN
Question
I am learning pandas, and when I display a DataFrame it shows "?" instead of NaN. Why does this happen?
Code:
import pandas as pd
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data"
df = pd.read_csv(url, header=None)
print(df.head())
headers = ["symboling", "normalized-losses", "make", "fuel-type", "aspiration",
"num-of-doors", "body-style", "drive-wheels", "engine-location",
"wheel-base", "length", "width", "height", "curb-weight",
"engine-type", "num-of-cylinders", "engine-size", "fuel-system",
"bore", "stroke", "compression-ratio", "hoursepower", "peak-rpm",
"city-mpg", "highway-mpg", "price"]
df.columns=headers
print(df.head(30))
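Answer
The data file marks missing values with "?", so pandas reads them as ordinary strings and displays them unchanged. Pass na_values='?' to read_csv to have them converted to NaN while the file is parsed. You can also pass the list of column names through the names parameter of read_csv, so there is no need to assign df.columns afterwards: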
import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data"
headers = ["symboling", "normalized-losses", "make", "fuel-type", "aspiration",
"num-of-doors", "body-style", "drive-wheels", "engine-location",
"wheel-base", "length", "width", "height", "curb-weight",
"engine-type", "num-of-cylinders", "engine-size", "fuel-system",
"bore", "stroke", "compression-ratio", "hoursepower", "peak-rpm",
"city-mpg", "highway-mpg", "price"]
df = pd.read_csv(url, header=None, names=headers, na_values='?')
print(df.head(10))
   symboling  normalized-losses         make  fuel-type  aspiration  \
0          3                NaN  alfa-romero        gas         std
1          3                NaN  alfa-romero        gas         std
2          1                NaN  alfa-romero        gas         std
3          2              164.0         audi        gas         std
4          2              164.0         audi        gas         std
5          2                NaN         audi        gas         std
6          1              158.0         audi        gas         std
7          1                NaN         audi        gas         std
8          1              158.0         audi        gas       turbo
9          0                NaN         audi        gas       turbo

   num-of-doors   body-style  drive-wheels  engine-location  wheel-base  ...  \
0           two  convertible           rwd            front        88.6  ...
1           two  convertible           rwd            front        88.6  ...
2           two    hatchback           rwd            front        94.5  ...
3          four        sedan           fwd            front        99.8  ...
4          four        sedan           4wd            front        99.4  ...
5           two        sedan           fwd            front        99.8  ...
6          four        sedan           fwd            front       105.8  ...
7          four        wagon           fwd            front       105.8  ...
8          four        sedan           fwd            front       105.8  ...
9           two    hatchback           4wd            front        99.5  ...

   engine-size  fuel-system  bore  stroke  compression-ratio  hoursepower  \
0          130         mpfi  3.47    2.68                9.0        111.0
1          130         mpfi  3.47    2.68                9.0        111.0
2          152         mpfi  2.68    3.47                9.0        154.0
3          109         mpfi  3.19    3.40               10.0        102.0
4          136         mpfi  3.19    3.40                8.0        115.0
5          136         mpfi  3.19    3.40                8.5        110.0
6          136         mpfi  3.19    3.40                8.5        110.0
7          136         mpfi  3.19    3.40                8.5        110.0
8          131         mpfi  3.13    3.40                8.3        140.0
9          131         mpfi  3.13    3.40                7.0        160.0

   peak-rpm  city-mpg  highway-mpg    price
0    5000.0        21           27  13495.0
1    5000.0        21           27  16500.0
2    5000.0        19           26  16500.0
3    5500.0        24           30  13950.0
4    5500.0        18           22  17450.0
5    5500.0        19           25  15250.0
6    5500.0        19           25  17710.0
7    5500.0        19           25  18920.0
8    5500.0        17           20  23875.0
9    5500.0        16           22      NaN

[10 rows x 26 columns]
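To see the effect of na_values without downloading anything, here is a minimal sketch that parses the same text twice. The three rows and the shortened column list are made up for illustration in the style of imports-85.data; they are not taken from the real file.

```python
import io

import pandas as pd

# Hypothetical sample rows in the style of imports-85.data (not the real file).
text = (
    "3,?,alfa-romero,13495\n"
    "2,164,audi,13950\n"
    "0,?,audi,?\n"
)
cols = ["symboling", "normalized-losses", "make", "price"]

# Without na_values, "?" is kept as text, so the column cannot be numeric.
raw = pd.read_csv(io.StringIO(text), header=None, names=cols)

# With na_values="?", the marker is parsed as NaN and the column becomes float64.
clean = pd.read_csv(io.StringIO(text), header=None, names=cols, na_values="?")

print(raw["normalized-losses"].tolist())   # still contains the "?" strings
print(clean["normalized-losses"].dtype)    # numeric dtype after conversion
print(clean.isna().sum())                  # number of missing values per column
```

Checking isna().sum() after loading is a quick way to confirm that every "?" was picked up.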
This is documented in the data set description:
https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.names
- Missing Attribute Values: (denoted by "?")
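If the DataFrame has already been loaded with the "?" strings in place, one possible alternative to re-reading the file is to convert the affected column afterwards with pd.to_numeric. This is a sketch under that assumption, again using made-up sample rows; note that errors="coerce" turns every unparseable value into NaN, not only "?".

```python
import io

import pandas as pd

# Hypothetical sample rows; "?" marks a missing value as in imports-85.data.
text = "3,?\n2,164\n0,?\n"
raw = pd.read_csv(io.StringIO(text), header=None,
                  names=["symboling", "normalized-losses"])

# Convert after loading: anything that is not a number (here "?") becomes NaN.
fixed = raw.copy()
fixed["normalized-losses"] = pd.to_numeric(fixed["normalized-losses"],
                                           errors="coerce")

print(fixed)
```

Passing na_values to read_csv is usually the simpler choice, since it handles all columns in one step.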