List still shows as empty despite operations on it

Problem description

I actually need to plot only the changes that occurred in October 2012, so I am counting the 30 rows in order to use them in xlim for the plot.

import pandas as pd
from pandas import Series,DataFrame
import numpy as np
poll_df=pd.read_csv('http://elections.huffingtonpost.com/pollster/2012-general-election-romney-vs-obama.csv')
row_in=0
xlimit=[]
poll_df=poll_df[poll_df['Start Date'].str[:7] == '2012-10']
for date in poll_df['Start Date']:
    if date[0:7] == '2012-10':
        xlimit.append(row_in)
        row_in += 1
    else:
        row_in+=1
print(min(xlimit))
print(max(xlimit))

But I don't understand why xlimit is still empty even though I operate on it.

Tags: python-3.x, pandas

Solution


After downloading the file from that URL, I can load it with np.genfromtxt:

In [232]: data = np.genfromtxt('../Downloads/2012-general-election-romney-vs-oba
     ...: ma.csv',dtype=None,delimiter=',',names=True,invalid_raise=False,encodi
     ...: ng=None)
/usr/local/bin/ipython3:1: ConversionWarning: Some errors were detected !
    Line #77 (got 13 columns instead of 17)
    Line #238 (got 13 columns instead of 17)
    Line #460 (got 18 columns instead of 17)
    Line #488 (got 18 columns instead of 17)
    Line #493 (got 13 columns instead of 17)
    Line #507 (got 18 columns instead of 17)
    Line #515 (got 18 columns instead of 17)
    Line #538 (got 18 columns instead of 17)
    Line #550 (got 18 columns instead of 17)
  #!/usr/bin/python3

It is not as forgiving as pandas when it comes to lines that are shorter or longer than expected.

In [233]: data.shape
Out[233]: (577,)
In [234]: data.dtype
Out[234]: dtype([('Pollster', '<U56'), ('Start_Date', '<U10'), ('End_Date', '<U10'), ('Entry_DateTime_ET', '<U20'), ('Number_of_Observations', '<i8'), ('Population', '<U26'), ('Mode', '<U15'), ('Obama', '<f8'), ('Romney', '<f8'), ('Undecided', '<f8'), ('Other', '<f8'), ('Pollster_URL', '<U113'), ('Source_URL', '<U189'), ('Partisan', '<U11'), ('Affiliation', '<U5'), ('Question_Text', '?'), ('Question_Iteration', '<i8')])

The Start_Date field looks like this:

In [235]: data['Start_Date'][:10]
Out[235]: 
array(['2012-11-04', '2012-11-03', '2012-11-03', '2012-11-03',
       '2012-11-03', '2012-11-03', '2012-11-03', '2012-11-01',
       '2012-11-02', '2012-11-02'], dtype='<U10')

I can search it with where. I use astype to truncate the field to 7 characters.

In [236]: np.where(data['Start_Date'].astype('U7')=='2012-10')[0]
Out[236]: 
array([18, 19, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
       36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
       53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
       70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86,
       87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
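
The truncate-then-compare step can be tried in isolation. The following is a minimal sketch on a small made-up array (the dates and the names start_dates, months and october_rows are illustrative assumptions, not values from the poll file):

```python
import numpy as np

# Made-up dates standing in for data['Start_Date']; not taken from the poll file.
start_dates = np.array(
    ['2012-11-04', '2012-10-30', '2012-10-15', '2012-09-28'], dtype='U10'
)

# Casting to a shorter fixed-width string dtype keeps only the first 7 characters.
months = start_dates.astype('U7')

# np.where on a 1-D boolean mask returns a one-element tuple of index arrays,
# so [0] extracts the array of matching row positions.
october_rows = np.where(months == '2012-10')[0]

print(months)
print(october_rows)
```

Comparing the truncated strings gives a boolean mask, and np.where turns that mask into row positions, which is the form needed for plot limits.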

I can use usecols to get around the variable line lengths, assuming the "bad" lines differ only in the later fields.

In [237]: data = np.genfromtxt('../Downloads/2012-general-election-romney-vs-oba
     ...: ma.csv',dtype=None,delimiter=',',names=True,invalid_raise=False,encodi
     ...: ng=None,usecols=range(10))
In [238]: data.shape
Out[238]: (586,)
In [239]: np.where(data['Start_Date'].astype('U7')=='2012-10')[0]
Out[239]: 
array([ 18,  19,  21,  22,  23,  24,  25,  26,  27,  28,  29,  30,  31,
        32,  33,  34,  35,  36,  37,  38,  39,  40,  41,  42,  43,  44,
        45,  46,  47,  48,  49,  50,  51,  52,  53,  54,  55,  56,  57,
        58,  59,  60,  61,  62,  63,  64,  65,  66,  67,  68,  69,  70,
        71,  72,  73,  74,  75,  76,  77,  78,  79,  80,  81,  82,  83,
        84,  85,  86,  87,  88,  89,  90,  91,  92,  93,  94,  95,  96,
        97,  98,  99, 100])
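
The effect of usecols on ragged lines can be reproduced without the downloaded file. This is a sketch under stated assumptions: the CSV text below is made up (one line with an extra field, one with a missing field), and load is a hypothetical helper that only wraps the genfromtxt call used above.

```python
import io
import warnings
import numpy as np

# Made-up CSV text (not the real poll file): line B has an extra field,
# line C is missing its last field.
csv_text = (
    "Pollster,Start Date,Obama,Romney,Note\n"
    "A,2012-11-01,48,47,x\n"
    "B,2012-10-30,49,46,x,extra\n"
    "C,2012-10-15,47,47\n"
    "D,2012-09-28,50,45,x\n"
)

def load(**kwargs):
    """Hypothetical helper: same genfromtxt options as in the session above."""
    with warnings.catch_warnings():
        # Silence the ConversionWarning about skipped lines for this demo.
        warnings.simplefilter("ignore")
        return np.genfromtxt(
            io.StringIO(csv_text), dtype=None, delimiter=",", names=True,
            invalid_raise=False, encoding=None, **kwargs
        )

all_cols = load()                    # lines B and C are dropped as invalid
first_four = load(usecols=range(4))  # every line has at least 4 fields

print(all_cols.shape, first_four.shape)
print(np.where(first_four['Start_Date'].astype('U7') == '2012-10')[0])
```

Restricting the load to the leading columns keeps the lines that would otherwise be skipped, which is why the row count and the indices change between the two runs above.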

I can get the same list with an iterative search like yours:

In [244]: alist = []
In [245]: for i,date in enumerate(data['Start_Date']):
     ...:     if date[:7] == '2012-10':
     ...:         alist.append(i)
     ...:         
In [246]: len(alist)
Out[246]: 82
In [247]: np.array(alist)
Out[247]: 
array([ 18,  19,  21,  22,  23,  24,  25,  26,  27,  28,  29,  30,  31,
        32,  33,  34,  35,  36,  37,  38,  39,  40,  41,  42,  43,  44,
        45,  46,  47,  48,  49,  50,  51,  52,  53,  54,  55,  56,  57,
        58,  59,  60,  61,  62,  63,  64,  65,  66,  67,  68,  69,  70,
        71,  72,  73,  74,  75,  76,  77,  78,  79,  80,  81,  82,  83,
        84,  85,  86,  87,  88,  89,  90,  91,  92,  93,  94,  95,  96,
        97,  98,  99, 100])
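
The same idea carries over to a pandas column such as the one in the question. The sketch below uses a small made-up frame in place of poll_df (the dates are illustrative assumptions), compares the enumerate loop with a vectorized mask, and guards the min/max step against an empty result:

```python
import numpy as np
import pandas as pd

# Made-up frame standing in for poll_df; only the 'Start Date' column matters here.
poll_df = pd.DataFrame(
    {"Start Date": ["2012-11-04", "2012-10-30", "2012-10-15", "2012-09-28"]}
)

# Loop version: enumerate supplies the row position, so no manual counter is needed.
loop_rows = [
    i for i, date in enumerate(poll_df["Start Date"]) if date[:7] == "2012-10"
]

# Vectorized version: build a boolean mask, then take the positions where it is True.
mask = poll_df["Start Date"].str[:7].eq("2012-10").to_numpy()
mask_rows = np.flatnonzero(mask)

# min() and max() raise on empty input, so check the size before using them.
xlimit = (int(mask_rows.min()), int(mask_rows.max())) if mask_rows.size else None

print(loop_rows, mask_rows, xlimit)
```

Note that these are positions in the unfiltered frame. If the frame is filtered to October rows first, positions counted afterwards refer to the filtered frame instead.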
