Python string manipulation without pandas

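Problem description

How can I manipulate this dataset in Python without using the pandas package? I can do this with pandas, but doing it with plain string manipulation is new to me and I don't know how.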

    text = """
        series_id                       year    period         value    footnote_codes

        LASST180000000000003            1971    M01          6.6    R

        LASST180000000000003            1971    M02          6.6    R

        LASST180000000000003            1977    M03          6.5    R

        LASST180000000000003            1976    M04          6.3    R

        LASST180000000000003            1978    M05          6.0    R

        LASST180000000000003            1979    M06          5.8    R

        LASST180000000000003            1976    M07          5.7    R

        """

    # Do not use pandas.
    #
    # 1. Replace the footnote_codes column with a month_year column that holds
    #    the month and year combined. For example, if a row has month 06 and
    #    year 2007, this column should contain the string "06_2007".
    # 2. Only keep the data from 1976 to 1979.

Tags: python, string

Solution


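I'm not sure exactly what you are looking for, but this code builds a dictionary whose keys are the column headers and whose values are lists of the values in each column.

It also adds a footnotes entry that holds the month_year strings (for example 06_1979). It does not filter the rows by year.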

    text = """
        series_id                       year    period         value    footnote_codes

        LASST180000000000003            1971    M01          6.6    R

        LASST180000000000003            1971    M02          6.6    R

        LASST180000000000003            1977    M03          6.5    R

        LASST180000000000003            1976    M04          6.3    R

        LASST180000000000003            1978    M05          6.0    R

        LASST180000000000003            1979    M06          5.8    R

        LASST180000000000003            1976    M07          5.7    R

        """

    # Split on any whitespace: the first five tokens are the headers,
    # and every following group of five tokens is one data row.
    values = text.split()
    headers = values[0:5]

    # Build {column name: list of values}, skipping footnote_codes (the last header).
    columns = {col_name: values[idx + 5::5] for idx, col_name in enumerate(headers[:-1])}

    # Combine month and year, e.g. period "M06" and year "1979" give "06_1979".
    columns['footnotes'] = [period[1:] + '_' + year for year, period in zip(columns['year'], columns['period'])]

    print(columns)
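
If both requirements from the question are needed (a month_year column in place of footnote_codes, and only the years 1976 to 1979), a row-by-row variant may be easier to follow. The following is a minimal sketch, not the answer's original method. The names `transform`, `first_year`, `last_year` and `sample` are made up for illustration, and the code assumes every data row has exactly five whitespace-separated fields, so a row with an empty footnote code would need extra handling.

```python
def transform(text, first_year=1976, last_year=1979):
    """Return rows with footnote_codes replaced by month_year, filtered by year."""
    # Drop blank lines, then split each remaining line on whitespace.
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]

    # Replace the last header (footnote_codes) with month_year.
    result = [header[:-1] + ["month_year"]]

    # Assumption: each row unpacks into exactly five fields.
    for series_id, year, period, value, _footnote in rows:
        if first_year <= int(year) <= last_year:
            # period looks like "M06"; drop the leading "M" to get the month.
            result.append([series_id, year, period, value, f"{period[1:]}_{year}"])
    return result


# A shortened copy of the question's data, used only to show the call.
sample = """
series_id                       year    period         value    footnote_codes
LASST180000000000003            1971    M01          6.6    R
LASST180000000000003            1976    M04          6.3    R
LASST180000000000003            1979    M06          5.8    R
"""

for row in transform(sample):
    print("\t".join(row))
```

Keeping each record as one list means a row can be dropped in a single step, whereas the column dictionary above would need the same indices removed from every list.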
