My dataset has more than 10,000 items, so why isn't that reflected in my lists?

Problem description

# open, read, close output.txt (smaller version)
f = open("10thousand.txt", "r")
data = f.read()
f.close()

# clean the data
data = data.replace('\n', '\t')
data = data.split('\t')

ageList = []

# append the data (ages) into the list
for i in data:
    ageList.append(i)

data.sort()

# print(ageList)

gen1 = []
gen2 = []
gen3 = []
gen4 = []
gen5 = []

# cycle through, add ages to our generation groups
for i in range(len(ageList)):
    if i >= 16 and i < 18:
        gen1.append(i)
    elif i > 17 and i < 34:
        gen2.append(i)
    elif i > 33 and i < 54:
        gen3.append(i)
    elif i > 53 and i < 73:
        gen4.append(i)
    elif i > 72 and i <= 101:
        gen5.append(i)
    else:
        pass

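Even though my input file has more than 10,000 entries, each of my lists only ends up with 10-30 data points. I'm writing this for a school final, but I can't figure out where the problem is.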

Tags: python, list

Solution


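You are checking the list's indices, not its values, and appending those indices to the generations. range(len(ageList)) yields 0, 1, 2, ..., so i is a position in the list, and only the 86 indices from 16 to 101 can match any branch. That is why each list ends up with a few dozen entries at most, no matter how large the file is. Adjust the code to loop over and append the ages themselves, converting each one to an int when you build the list, because values read from a file are strings: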

#append the data (ages) into the list
for i in data:
    if i.strip():  # skip empty tokens, e.g. the one left by a trailing newline
        ageList.append(int(i))

...

#cycle through, add ages to our generation groups
for age in ageList:
    if age >= 16 and age < 18:
        gen1.append(age)
    elif age > 17 and age < 34:
        gen2.append(age)
    elif age > 33 and age < 54:
        gen3.append(age)
    elif age > 53 and age < 73:
        gen4.append(age)
    elif age > 72 and age <= 101:
        gen5.append(age)
    else:
        pass
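
The fix above can be sketched end to end as follows. This is a minimal illustration, not the asker's final program: parse_ages, group_by_generation, and the in-memory sample_text are hypothetical names introduced here, and the sketch assumes every token in the file is a whole number. It also counts how many indices could ever match in the original loop, which shows why the lists stayed small.

```python
def parse_ages(text):
    """Turn tab/newline-separated text into a list of ints.

    str.split() with no arguments splits on any run of whitespace and
    drops empty tokens, so a trailing newline does not produce ''.
    A non-numeric token would raise ValueError (assumed not to occur).
    """
    return [int(token) for token in text.split()]


def group_by_generation(ages):
    """Bucket ages using the same boundaries as the code above."""
    groups = {"gen1": [], "gen2": [], "gen3": [], "gen4": [], "gen5": []}
    for age in ages:  # iterate over the values, not over indices
        if 16 <= age < 18:
            groups["gen1"].append(age)
        elif 18 <= age < 34:
            groups["gen2"].append(age)
        elif 34 <= age < 54:
            groups["gen3"].append(age)
        elif 54 <= age < 73:
            groups["gen4"].append(age)
        elif 73 <= age <= 101:
            groups["gen5"].append(age)
        # ages outside 16..101 are ignored, as in the original
    return groups


# Hypothetical sample standing in for the contents of 10thousand.txt.
sample_text = "16\t25\t40\n60\t80\t17\n"
ages = parse_ages(sample_text)
groups = group_by_generation(ages)
for name, members in groups.items():
    print(name, members)

# The original loop tested indices: out of 10,000 indices,
# only those from 16 to 101 can fall into any generation.
matching_indices = [i for i in range(10_000) if 16 <= i <= 101]
print(len(matching_indices))
```

To run this against the real file, one option is to read it inside a with block (with open("10thousand.txt") as f: text = f.read()) and pass text to parse_ages, which also closes the file automatically.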
