Adding numpy arrays from a dictionary to a DataFrame

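Problem description

I am trying to compare the results of two time-series filters. One is the Hodrick-Prescott filter, which I have used successfully in this code: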

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
plt.style.use(['seaborn-paper'])
import quantecon as qe
import statsmodels.api as sm
from statsmodels.tsa.api import VAR
from sklearn.preprocessing import PolynomialFeatures 
from statsmodels.tsa.base.datetools import dates_from_str
import datetime as dt

hp1 = {}
for reg in regions_only:
    for var in variables_some1:
        x = seasonal[f'{var}sa_{reg}'].seasadj
        temp_cycle, temp_trend = sm.tsa.filters.hpfilter(np.log(x).dropna(), lamb = 1600)
        hp1[f'{var}sa_{reg}_c'] = temp_cycle

for var in variables_some1:
    x = seasonal[f'{var}sa_espana'].seasadj
    temp_cycle, temp_trend = sm.tsa.filters.hpfilter(np.log(x).dropna(), lamb = 1600)
    hp1[f'{var}sa_espana_c'] = temp_cycle

# HP-filter the variables that are available only for some regions
hp2 = {}
for reg in regions_only_some:
    for var in variables_some2:
        x = seasonal[f'{var}sa_{reg}'].seasadj
        temp_cycle, temp_trend = sm.tsa.filters.hpfilter(np.log(x).dropna(), lamb = 1600)
        hp2[f'{var}sa_{reg}_c'] = temp_cycle

for var in variables_some2:
    x = seasonal[f'{var}sa_espana'].seasadj
    temp_cycle, temp_trend = sm.tsa.filters.hpfilter(np.log(x).dropna(), lamb = 1600)
    hp2[f'{var}sa_espana_c'] = temp_cycle

hp_ir = {}
for reg in regions_only:
    x = seasonal[f'irsa_{reg}']
    temp_cycle, temp_trend = sm.tsa.filters.hpfilter(x.dropna(), lamb = 1600)
    hp_ir[f'irsa_{reg}_c'] = temp_cycle    

hp_filtered = {**hp1, **hp2, **hp_ir}  

# Convert the dictionary into a DataFrame for ease of manipulation
hp_filtered = pd.DataFrame.from_dict(hp_filtered)

This produces a DataFrame that I can later explore graphically. I tried to do the same thing with similar code for the Hamilton filter:

import xlsxwriter
import quantecon as qe
hp1h = {}
for reg in regions_only:
    for var in variables_some1:
        x = seasonal[f'{var}sa_{reg}'].seasadj
        temp_cycle, temp_trend = qe.hamilton_filter(np.log(x), 8,4)
        hp1h[f'{var}sa_{reg}_c'] = temp_cycle

for var in variables_some1:
    x = seasonal[f'{var}sa_espana'].seasadj
    temp_cycle, temp_trend = qe.hamilton_filter(np.log(x), 8,4)
    hp1h[f'{var}sa_espana_c'] = temp_cycle

# Hamilton-filter the variables that are available only for some regions
hp2h = {}
for reg in regions_only_some:
    for var in variables_some2:
        x = seasonal[f'{var}sa_{reg}'].seasadj
        temp_cycle, temp_trend = qe.hamilton_filter(np.log(x), 8,4)
        hp2h[f'{var}sa_{reg}_c'] = temp_cycle  # store in hp2h, not the HP-filter dict hp2

for var in variables_some2:
    x = seasonal[f'{var}sa_espana'].seasadj
    temp_cycle, temp_trend = qe.hamilton_filter(np.log(x), 8,4)
    hp2h[f'{var}sa_espana_c'] = temp_cycle


hp_irh = {}
for reg in regions_only:
    x = seasonal[f'irsa_{reg}']
    temp_cycle, temp_trend = qe.hamilton_filter(x, 8,4)
    hp_irh[f'irsa_{reg}_c'] = temp_cycle  

hp_datah = {**hp1h, **hp2h, **hp_irh}  

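This gives me a dictionary of numpy arrays rather than a dictionary of pandas objects, and I am really struggling to figure out how to combine them into a DataFrame like the filtered one above. In particular, I cannot see how to attach the same datetime index that the entries of the previous dictionary had.

So far, I have managed to get the data out with: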

workbook = xlsxwriter.Workbook(path_to_data + 'HamiltonFiltered.xlsx')
worksheet = workbook.add_worksheet()

row = 0
col = 0
worksheet.write(0, 0, 'variable')
worksheet.write(0, 1, 'data')
for key in hp_datah.keys():
    row += 1
    worksheet.write(row, col, key)
    for item in hp_datah[key]:
        worksheet.write(row, 0, key)
        worksheet.write(row, col + 1, np.nan_to_num(item))
        row += 1

workbook.close()

But this requires a lot of editing work afterwards.

Thanks

Tags: python, pandas, numpy, statsmodels

Solution
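One possible approach, offered as a minimal sketch rather than a definitive answer: wrap each array in a `pd.Series` that reuses the index of the series it was computed from, then build the DataFrame from the dictionary of Series. This assumes the filter returns an array with the same length as its input (leading entries being NaN), which is worth checking against your own output. The helper `arrays_to_frame`, the `indexes` dictionary and the sample keys below are hypothetical names used only for illustration; in the loops above you would record something like `indexes[f'{var}sa_{reg}_c'] = x.index` next to each `temp_cycle`.

```python
import numpy as np
import pandas as pd


def arrays_to_frame(arrays, indexes):
    """Wrap each numpy array in a Series that reuses the index of its source series."""
    series = {}
    for key, values in arrays.items():
        # Flatten in case the filter returns a column vector of shape (T, 1)
        values = np.asarray(values).ravel()
        index = indexes[key]
        if len(values) != len(index):
            raise ValueError(f"{key}: {len(values)} values but {len(index)} index labels")
        series[key] = pd.Series(values, index=index, name=key)
    # Series covering different date ranges are aligned on the union of their indexes
    return pd.DataFrame(series)


# Stand-in data (not real filter output): two quarterly series of different lengths
idx_a = pd.date_range("2000-01-01", periods=6, freq="QS")
idx_b = pd.date_range("2000-07-01", periods=4, freq="QS")
cycles = {
    "gdpsa_madrid_c": np.array([np.nan, np.nan, 0.1, -0.2, 0.3, 0.0]),
    "gdpsa_espana_c": np.array([np.nan, 0.5, -0.5, 0.2]),
}
indexes = {"gdpsa_madrid_c": idx_a, "gdpsa_espana_c": idx_b}

hamilton_filtered = arrays_to_frame(cycles, indexes)
print(hamilton_filtered)
```

Because the result carries a `DatetimeIndex`, it can be plotted or compared with `hp_filtered` directly, and `hamilton_filtered.to_excel(...)` could replace the manual `xlsxwriter` loop if an Excel file is still needed.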

