Is it possible to write NaN values as blanks to a txt file?

Problem description

I have a df with the following columns:

    DueDate
0   <cbc:DueDate>2020-10-18</cbc:DueDate>
1   <cbc:DueDate>2020-01-08</cbc:DueDate>
2   NaN
3   NaN

     Streetname
0    <cbc:StreetName>Xerox GmbH</cbc:StreetName>            
1    <cbc:StreetName>Rompslomp.nl B.V.</cbc:StreetName>     
2    <cbc:StreetName>STAS picture</cbc:StreetName>          
3    <cbc:StreetName>Rex International B.V.</cbc:StreetName>

     PostalAdress
0    </cac:PostalAddress>
1    </cac:PostalAddress>
2    </cac:PostalAddress>
3    </cac:PostalAddress>
Name: PostalAdressClose, dtype: object

I try to write it to a text file using the following templates:

# xml document to be expanding with per row details
fac_doc_template = """<?xml version="1.0"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/maindoc/UBL-Invoice-2.1.xsd">
  <cbc:UBLVersionID>2.1</cbc:UBLVersionID>
  <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns010:ver2.0:extended:urn:www.peppol.eu:bis:peppol4a:ver2.0:extended:urn:www.simplerinvoicing.org:si:si-ubl:ver1.1.x</cbc:CustomizationID>
  <cbc:ProfileID>urn:www.cenbii.eu:profile:bii04:ver2.0</cbc:ProfileID>
  {fac_details}"""

# per row details
# todo: expand for all of the column values you want
fac_details_xml_template = """{Streetname}
  {DueDate}
  """

Then I iterate over the rows and write each row to a separate file with the following code:

def series_to_fac_details_xml(s):
    return fac_details_xml_template.format(**s)

for index, row in df3.iterrows():
    details = series_to_fac_details_xml(row)
    with open(fr"C:\Users\Max12\Desktop\xml\pdfminer\UiPath\output\{index}.xml", "w") as f:
        f.write(fac_doc_template.format(fac_details=details))

I have one problem: I want NaN values to be skipped, but when I convert NaN to an empty string with:

df3 = df3.replace(np.nan, '', regex=True)

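I get blank lines in the output file. When a NaN occurs, the desired output is for the next column to follow immediately in the file, with no blank line in between. Can you help me?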

Tags: python, regex, pandas, nan

Solution


Suppose you have this DataFrame:

import numpy as np
import pandas as pd
df = pd.DataFrame({'DueDate':   ['2020-01-01','2020-01-02',np.nan], 
                   'Streetname':['Main Street 1', 'Main Street 2', 'Main Street 3']
                  })

df
>>>
      DueDate     Streetname
0  2020-01-01  Main Street 1
1  2020-01-02  Main Street 2
2         NaN  Main Street 3

Then you can replace NaN the way you already did: df = df.replace(np.nan, '', regex=True).

After that, I suggest you use apply to create a new Series that holds each row in the format you want.

z = df.apply(lambda x: x['Streetname'] + ' ' + x['DueDate'], axis=1)
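The apply call above still emits the separator when DueDate is empty, so row 2 ends with a trailing space. A minimal sketch of a variation that joins only the values that are present follows. It assumes NaN is left in place rather than replaced with '', and the column selection and the `text` variable are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "DueDate": ["2020-01-01", "2020-01-02", np.nan],
    "Streetname": ["Main Street 1", "Main Street 2", "Main Street 3"],
})

# Join only the values that are present: dropna() removes NaN entries,
# so a missing DueDate leaves no separator behind.
z = df.apply(lambda x: " ".join(x[["Streetname", "DueDate"]].dropna()), axis=1)

# One line per row, ready to be written to a file.
text = "\n".join(z)
```

Joining the Series directly keeps each value exactly as it is, whereas to_string formats values for display and may pad them.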

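You can then call z.to_string(index=False) and write the result to your file. If you don't want the line breaks, you can remove them with z.to_string(index=False).replace('\n',''). I think this cleans up your code a bit, because you no longer have to iterate over all the rows.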
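If you still need one file per row, as in the question, the same idea can be applied inside the loop. The following is only a sketch under some assumptions: `row_to_details` is a hypothetical helper, the document template is a shortened stand-in for the full one in the question, and the output directory is a temporary placeholder. Instead of a fixed template with one line per column, it builds the details from the non-missing fields only, so a NaN produces no line at all.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

df3 = pd.DataFrame({
    "Streetname": ["<cbc:StreetName>Xerox GmbH</cbc:StreetName>",
                   "<cbc:StreetName>STAS picture</cbc:StreetName>"],
    "DueDate": ["<cbc:DueDate>2020-10-18</cbc:DueDate>", np.nan],
    "PostalAdress": ["</cac:PostalAddress>", "</cac:PostalAddress>"],
})

# Shortened stand-in for the full document template in the question.
fac_doc_template = '<?xml version="1.0"?>\n<Invoice>\n  {fac_details}\n</Invoice>\n'


def row_to_details(row, indent="  "):
    """Hypothetical helper: one line per non-missing field, NaN fields skipped."""
    return ("\n" + indent).join(row.dropna().astype(str))


out_dir = Path(tempfile.mkdtemp())  # placeholder output directory
for index, row in df3.iterrows():
    doc = fac_doc_template.format(fac_details=row_to_details(row))
    (out_dir / f"{index}.xml").write_text(doc, encoding="utf-8")
```

Because the fields are written in column order, reordering the DataFrame columns (for example df3[["Streetname", "DueDate"]]) controls the order of the lines in the file.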

I really hope this helps and answers your question.

