Filling missing values in a specified format - Python

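Problem description

I ran into a problem that explicitly requires me not to use numpy or pandas.

Question:

Given a string containing numbers and '_' (missing value) symbols, you have to replace the '_' symbols as explained below.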

Ex 1: _, _, _, 24 ==> 24/4, 24/4, 24/4, 24/4 i.e. we have distributed the 24 equally to all 4 places

Ex 2: 40, _, _, _, 60 ==> (60+40)/5,(60+40)/5,(60+40)/5,(60+40)/5,(60+40)/5 ==> 20, 20, 20, 20, 20 i.e. the sum of (60+40) is distributed equally to all 5 places

Ex 3: 80, _, _, _, _  ==> 80/5,80/5,80/5,80/5,80/5 ==> 16, 16, 16, 16, 16 i.e. the 80 is distributed equally over all 5 places: itself and the 4 missing values to its right

Ex 4: _, _, 30, _, _, _, 50, _, _  
==> we will fill the missing values from left to right 
    a. first we will distribute the 30 to left two missing values (10, 10, 10, _, _, _, 50, _, _)
    b. now distribute the sum (10+50) over the missing values in between (10, 10, 12, 12, 12, 12, 12, _, _) 
    c. now we will distribute 12 to right side missing values (10, 10, 12, 12, 12, 12, 4, 4, 4)
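
The three steps of Ex 4 can be traced with a small sketch. The helper name `spread` and the use of `None` in place of `_` are assumptions made for illustration only; they are not part of the problem statement.

```python
def spread(values, lo, hi):
    """Share the sum of the known numbers in values[lo..hi] equally over that span."""
    total = sum(v for v in values[lo:hi + 1] if v is not None)
    share = total / (hi - lo + 1)
    for i in range(lo, hi + 1):
        values[i] = share

# None stands in for '_' (assumption for this sketch)
seq = [None, None, 30, None, None, None, 50, None, None]
spread(seq, 0, 2)  # step a: 30 shared over the first three places
spread(seq, 2, 6)  # step b: 10 + 50 shared over the five places in between
spread(seq, 6, 8)  # step c: 12 shared over the last three places
print(seq)  # [10.0, 10.0, 12.0, 12.0, 12.0, 12.0, 4.0, 4.0, 4.0]
```

Note that each step reuses the value left behind by the previous one, which is why the order (left to right) matters.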

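You are given a string of comma-separated values that contains both numbers and missing values, e.g. "_, _, x, _, _, _", and you need to fill in the missing values. Q: your program reads a string such as "_, _, x, _, _, _" and returns the filled sequence. Ex: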

Input1: "_,_,_,24"
Output1: 6,6,6,6

Input2: "40,_,_,_,60"
Output2: 20,20,20,20,20

Input3: "80,_,_,_,_"
Output3: 16,16,16,16,16

Input4: "_,_,30,_,_,_,50,_,_"
Output4: 10,10,12,12,12,12,4,4,4

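I am trying to split the string into a list with the split function. I then scan for blanks on the left and count them; once I reach a non-blank, I divide that number by the total count (the blanks seen before the number, plus the number itself), spread the resulting value, and replace the blanks to the left of the number.

I then check for blanks between two numbers and apply the same logic, and finally do the same for the blanks on the right.

However, the code I share below throws all sorts of errors, and I believe there are gaps in the logic described above, so I would appreciate any insight into solving this problem.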

def blanks(S):

  a= S.split()
  count = 0
  middle_store = 0
  #left
  for i in range(len(a)):
    if(a[i]=='_'):
      count = count+1  #find number of blanks to the left of a number
    else:
      for j in range(0,i+1):
        #if there are n blanks to the left of the number speard the number equal over n+1 spaces
        a[j] = str((int(a[i])/(count+1)))
        middle_store= i
    break  

  #blanks in the middle
  denominator =0
  flag = 0
  for k in len(middle_store+1,len(a)):
    if(a[k] !='_'):
      denominator = (k+1-middle_store)
      flag=k
    break

  for p in len(middle_store,flag+1):
    a[p] = str((int(a[p])/denominator))

  #blanks at the right 
  for q in len(flag,len(a)):
    a[q] = str((int(a[q])/(len(a)-flag+1)))

S=  "_,_,30,_,_,_,50,_,_"
print(blanks(S))

Tags: python, python-3.x

Solution


A modular solution

# Takes a list x and two indices a, b; -1 means "no number on that side".
# Fills every position from a to b with the equal share (x[a]+x[b])/(b-a+1).
def fun(x, a, b):
    if a == -1:
        v = float(x[b])/(b+1)
        for i in range(a+1,b+1):
            x[i] = v
    elif b == -1:
        v = float(x[a])/(len(x)-a)
        for i in range(a, len(x)):
            x[i] = v
    else:
        v = (float(x[a])+float(x[b]))/(b-a+1)
        for i in range(a,b+1):
            x[i] = v
    return x

def replace(text):
    # Create array from the string
    x = text.replace(" ","").split(",")
    # Indices of the cells that hold a number
    y = [i for i, v in enumerate(x) if v != '_']
    # No numbers at all: nothing to distribute
    if not y:
        return x
    # Starting with _ ?
    if y[0] != 0:
        y = [-1] + y
    # Ending with _ ?
    if y[-1] != len(x)-1:
        y = y + [-1]
    # run over all the pairs
    for (a, b) in zip(y[:-1], y[1:]): 
        fun(x,a,b)          
    return x

# Test cases
tests = [
    "_,_,_,24",
    "40,_,_,_,60",
    "80,_,_,_,_",
    "_,_,30,_,_,_,50,_,_"]

for t in tests:
    print(replace(t))
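
The code above returns a list of floats, whereas the expected outputs in the question are comma-separated strings such as 6,6,6,6. A possible variant that produces that format is sketched below. The name `fill_missing` is hypothetical, and the sketch assumes well-formed input (every cell is either `_` or a number); it is one way to do it, not the only one.

```python
def fill_missing(text):
    """Fill '_' cells left to right and return a comma-separated string."""
    cells = [c.strip() for c in text.split(",")]
    values = [None if c == "_" else float(c) for c in cells]
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        # Assumption: with no number to distribute, return the input unchanged.
        return ",".join(cells)

    # Span boundaries: start of the list, every known index, end of the list.
    bounds = sorted(set([0] + known + [len(values) - 1]))
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Sum whatever numbers the span currently holds and share them equally.
        total = sum(v for v in values[lo:hi + 1] if v is not None)
        share = total / (hi - lo + 1)
        for i in range(lo, hi + 1):
            values[i] = share

    # Print whole numbers without a trailing ".0".
    return ",".join(str(int(v)) if v.is_integer() else str(v) for v in values)

print(fill_missing("_,_,30,_,_,_,50,_,_"))  # 10,10,12,12,12,12,4,4,4
```

Because the spans are processed left to right and share their boundary cell, the value written at the end of one span becomes the starting value of the next, which matches steps a to c of Ex 4.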
