List of string "integers" to integers accounting for "non-numeric" strings Python

Problem description

I am fetching data from an online database. It returns dates and numeric values as strings in a list, e.g. ['87', '79', '50', 'M', '65']. These are the values for the y axis of a plot; the x axis values are the years associated with them, e.g. ['2018', '2017', '2016', '2015', '2014']. Before I can plot these values I need to convert them to integers. I have done this with maxT_int = list(map(int, maxTList)). The problem is that data is sometimes missing, which is indicated by an 'M' as in the example above, and the conversion fails on it.

What I would like to do is remove the 'M' or account for it somehow and be able to plot the values.

I can plot the values just fine when I have no 'M' in the list. Any suggestions on how best to deal with this issue?
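
For reference, the failure can be reproduced without the download step. This is only a minimal sketch using the sample list from above; max_t_list is a stand-in name for maxTList.

```python
# Sample values copied from the description above; 'M' marks a missing reading.
max_t_list = ['87', '79', '50', 'M', '65']

try:
    max_t_int = list(map(int, max_t_list))
except ValueError as exc:
    # int('M') raises ValueError, so the whole conversion fails.
    error_message = str(exc)
    print(error_message)
```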

My full code is listed below

import ast
import datetime
import urllib.parse
import urllib.request

import numpy as np
# output_file and show are called at the end of the script
from bokeh.io import output_file, show
from bokeh.plotting import figure



# Get user input for day
# in the format of mm-dd
print("Enter a value for the day that you would like to plot.")
print("The format should be mm-dd")
dayofmonth = input("What day would you like to plot? ")


# testing out a range of years
y = datetime.datetime.today().year

# get starting year
ystart = int(input("What year would you like to start with? "))
# get number of years back
ynum = int(input("How many years would you like to plot? "))
# calculate the number of years back to start from current year
diff = y - ystart
#assign values to the list of years
years = list(range(y-diff,y-(diff+ynum), -1))

start = y - diff
endyear = y - (diff+ynum)

i = 0
dateList=[]
minTList=[]
maxTList=[]
for year in years:
    sdate = (str(year) + '-' + dayofmonth)
    #print(sdate)

    url = "http://data.rcc-acis.org/StnData"

    values = {
    "sid": "KGGW",
    "date": sdate,
    "elems": "maxt,mint",
    "meta": "name",
    "output": "json"
    }

    data = urllib.parse.urlencode(values).encode("utf-8")


    req = urllib.request.Request(url, data)
    response = urllib.request.urlopen(req)
    results = response.read()
    results = results.decode()
    results = ast.literal_eval(results)

    if i < 1:
        n_label = results['meta']['name']
        i = 2
    for x in results["data"]:
        date, maxT, minT = x
        # Keep only the year part of the date string and parse it
        date = date[0:4]
        date_obj = datetime.datetime.strptime(date, '%Y')
        dateList.append(date_obj)
        minTList.append(minT)
        maxTList.append(maxT)

maxT_int = list(map(int,maxTList))


# setting up the array for numpy
x = np.array(years)
y = np.array(maxT_int)


p = figure(title="Max Temps by Year for the day " + dayofmonth + " " + n_label, x_axis_label='Years',
           y_axis_label='Max Temps', plot_width=1000, plot_height=600)

p.line(x,y,  line_width=2)
output_file("temps.html")
show(p)

Tags: python, numpy, weather

Solution


You could use numpy.nan and a function:

import numpy as np

lst = ['87', '79', '50', 'M', '65']

def convert(item):
    if item == 'M':
        return np.nan
    else:
        return int(item)

new_lst = list(map(convert, lst))
print(new_lst)

Or, if you prefer list comprehensions (note the comparison uses != rather than "is not", since "is" tests object identity, not string equality):

new_lst = [int(item) if item != 'M' else np.nan for item in lst]


Both will yield

[87, 79, 50, nan, 65]
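
If the missing entries should be dropped together with their years rather than kept as NaN, the NaN values can serve as a mask. The following is a sketch under a few assumptions: to_number is a hypothetical helper (not part of the question's code) that treats any string int() cannot parse as missing, not only 'M', and the years list is the sample from the question.

```python
import numpy as np

def to_number(item):
    """Return item as an int, or NaN if it is not a valid integer string."""
    try:
        return int(item)
    except ValueError:
        return np.nan

# Sample data from the question; 'M' marks a missing reading.
years = ['2018', '2017', '2016', '2015', '2014']
values = ['87', '79', '50', 'M', '65']

x = np.array([int(year) for year in years])
# NaN is a float, so the values are stored in a float array.
y = np.array([to_number(v) for v in values], dtype=float)

# Keep only the positions where a value is present.
present = ~np.isnan(y)
x_clean = x[present]
y_clean = y[present]

print(x_clean.tolist())
print(y_clean.tolist())
```

Here x_clean and y_clean no longer contain the 2015 entry, so they stay the same length and can be passed to p.line together. Passing the unmasked x and y instead should, as far as I know, leave a gap in the Bokeh line at the NaN.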
