How do I use the StringIO function on a DataFrame in Python?

Question

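I have a data frame with two columns (Service Name, Port Number), where the values in Service Name are of type object and the values in Port Number are int. When I try to convert the data frame to StringIO, I get TypeError: initial_value must be str or None, not DataFrame.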

I tried converting the data frame to a string first with StringIO(str(data)), but when I then try to loop over it I get the following error: ValueError: not enough values to unpack (expected 2, got 1).
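The second error can be reproduced without pandas. The sketch below assumes that str(data) yields whitespace-aligned display text similar to the hypothetical `display_text` string (the exact spacing pandas prints may differ). Because that text contains no commas, `csv.reader` returns a single field per row, and unpacking into two names fails.

```python
import csv
from io import StringIO

# Hypothetical stand-in for str(data): columns are separated by spaces, not commas.
display_text = (
    "  Service Name  Port Number\n"
    "0       Port_0            0\n"
    "1       tcpmux            1\n"
)

rows = list(csv.reader(StringIO(display_text), delimiter=","))
# No comma was found, so every row holds exactly one field.
field_counts = [len(row) for row in rows]

try:
    service_name, port_number = rows[1]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(field_counts)
print(error_message)
```

The TypeError has a simpler cause: StringIO only accepts a str (or None) as its initial value, so a DataFrame has to be turned into text before it can be wrapped.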

Here are the first rows of my file (ports 0 to 12):

Service Name    Port Number
Port_0  0
tcpmux  1
compressnet 2
compressnet 3
Unassigned  4
rje 5
Unassigned  6
echo    7
Unassigned  8
discard 9
Unassigned  10
systat  11
Unassigned  12

This is the loop I want to run:

# convert each "-" port range into individual ports and add them back to the data frame

import csv

def extend_ports(file, delim=','):
    handle = csv.reader(file, delimiter=delim)
    yield next(handle)  # pass the header row through unchanged
    for row in handle:
        try:
            service_name, port_number = row
        except ValueError:
            print(f"Could not parse line '{row}'")
            raise
        if '-' not in port_number:
            yield [service_name, port_number]  # simple result
        else:
            start, end = map(int, port_number.split('-'))
            for port in map(str, range(start, end+1)):
                yield [service_name, port]  # expanded result

# get the result
result = list(extend_ports(data3))

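This code converts each "-" into a range and adds every port number back to the data frame with its service name. For example, 272-276 mapped to "portx" expands to 272, 273, 274, 275 and 276, each mapped to "portx".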
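The expansion step on its own can be sketched as a small helper. The name `expand_port_range` is hypothetical and not part of the code above; it assumes the port value is a string holding either a single port or a START-END pair.

```python
def expand_port_range(port_number):
    """Return the port strings covered by 'N' or 'START-END' (end inclusive)."""
    if '-' not in port_number:
        return [port_number]
    start, end = map(int, port_number.split('-'))
    return [str(port) for port in range(start, end + 1)]

# Pair every expanded port with the same service name.
expanded = [("portx", port) for port in expand_port_range("272-276")]
print(expanded)
```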

I think the error message I get when I try to loop matters more than the code itself.

I worked around the problem the hard way, by typing the input in by hand:

from io import StringIO

data = StringIO("""\
Service Name,Port Number
pt-tls,271
pt-tls,271
Unassigned,272-279
http-mgmt,280
http-mgmt,280
personal-link,281
personal-link,281
cableport-ax,282
cableport-ax,282
rescap,283
rescap,283
corerjd,284
corerjd,284
Unassigned,285
fxp,286
fxp,286
k-block,287
k-block,287
Unassigned,288-307
novastorbakcup,308
novastorbakcup,308
""")

With the code above, I get this result:

['Service Name', 'Port Number']
['pt-tls', '271']
['pt-tls', '271']
['Unassigned', '272']
['Unassigned', '273']
['Unassigned', '274']
['Unassigned', '275']
['Unassigned', '276']
['Unassigned', '277']
...
['Unassigned', '306']
['Unassigned', '307']
['novastorbakcup', '308']
['novastorbakcup', '308']

The result above is what I want to get from the data frame. Thanks in advance.

Tags: python-3.x, string

Solution


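The same function also works on a CSV file. csv.reader accepts any iterable that yields lines, so an open file object can be passed in place of the StringIO object.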

Demo:

import csv

def extend_ports(file, delim=','):
    handle = csv.reader(file, delimiter=delim)
    yield next(handle)  # pass the header row through unchanged
    for row in handle:
        try:
            service_name, port_number = row
        except ValueError:
            print(f"Could not parse line '{row}'")
            raise
        if '-' not in port_number:
            yield [service_name, port_number]  # simple result
        else:
            start, end = map(int, port_number.split('-'))
            for port in map(str, range(start, end+1)):
                yield [service_name, port]  # expanded result

# get the result; filename is the path to the CSV file
with open(filename, "r", newline="") as f:  # open the file for reading
    result = list(extend_ports(f))
print(result)

Output:

[['Service Name', 'Port Number'],
 ['pt-tls', '271'],
 ['pt-tls', '271'],
 ['Unassigned', '272'],
 ['Unassigned', '273'],
 ['Unassigned', '274'],
 ['Unassigned', '275'],
 ['Unassigned', '276'],
 ['Unassigned', '277'],
 ['Unassigned', '278'],
 ['Unassigned', '279'],
 ['http-mgmt', '280'],
 ['http-mgmt', '280'],
 ['personal-link', '281'],
 ['personal-link', '281'],
 ['cableport-ax', '282'],
 ['cableport-ax', '282'],
 ['rescap', '283'],
 ['rescap', '283'],
 ['corerjd', '284'],
 ['corerjd', '284'],
 ['Unassigned', '285'],
 ['fxp', '286'],
 ['fxp', '286'],
 ['k-block', '287'],
 ['k-block', '287'],
 ['Unassigned', '288'],
 ['Unassigned', '289'],
 ['Unassigned', '290'],
 ['Unassigned', '291'],
 ['Unassigned', '292'],
 ['Unassigned', '293'],
 ['Unassigned', '294'],
 ['Unassigned', '295'],
 ['Unassigned', '296'],
 ['Unassigned', '297'],
 ['Unassigned', '298'],
 ['Unassigned', '299'],
 ['Unassigned', '300'],
 ['Unassigned', '301'],
 ['Unassigned', '302'],
 ['Unassigned', '303'],
 ['Unassigned', '304'],
 ['Unassigned', '305'],
 ['Unassigned', '306'],
 ['Unassigned', '307'],
 ['novastorbakcup', '308'],
 ['novastorbakcup', '308']]
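
To start from a DataFrame rather than a file, one option is to serialize it to CSV text first. With pandas, that text would come from data.to_csv(index=False), which returns a string when no path is given. The sketch below uses a literal `csv_text` as a stand-in for that output so that it runs without pandas, and a condensed helper with the hypothetical name `expand_rows` in place of extend_ports.

```python
import csv
from io import StringIO

# Stand-in for data.to_csv(index=False), assuming data is the DataFrame.
csv_text = "Service Name,Port Number\npt-tls,271\nUnassigned,272-274\n"

def expand_rows(lines, delim=","):
    """Yield the header, then one [service, port] row per individual port."""
    reader = csv.reader(lines, delimiter=delim)
    yield next(reader)  # header row, passed through unchanged
    for service_name, port_number in reader:
        if "-" not in port_number:
            yield [service_name, port_number]
        else:
            start, end = map(int, port_number.split("-"))
            for port in range(start, end + 1):
                yield [service_name, str(port)]

# StringIO now receives a str, so the TypeError from the question does not occur.
result = list(expand_rows(StringIO(csv_text)))
print(result)
```

Unlike str(data), the CSV text is comma-separated, so every row splits into exactly two fields.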
