numpy_financial irr function returns "ValueError: Input must be a rank-1 array"

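Problem description

I have a DataFrame populated from a mySQL query. I convert it to an array to try to calculate the IRR. The sort and groupby steps work as expected, but I can't get past this error. I looked at a related question, but it doesn't seem to apply to my case.

I have tried passing a list, a DataFrame, and a pandas table, but the function is looking for an array, so that is what I am sending. Any help with this is appreciated.

Thanks!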

This is the dataset produced as the flows_by_year variable
                     sum
EffectiveDate           
2017          -3660000.0
2018          -5520000.0
2019          -2460213.0
2020           1600000.0


import pandas as pd
import sqlite3
from sqlite3 import Error
import numpy_financial as npf

def IRR(fund, pool):
    database = r'C:\Transaction_History.db'
    conn = create_connection(database)
    sql_flows = ('''SELECT EFFECTIVEDATE, TRANSACTIONAMOUNT FROM Trans_Hist WHERE FUND=''' + 
                 '"' + fund + '"' + ' AND POOL=' + '"' + pool + '"' + 
                 ' AND (TRANSACTIONTYPE = "Funding" OR TRANSACTIONTYPE = "Cash");')
    flows = pd.read_sql_query(sql_flows, conn, parse_dates=['EffectiveDate'])
    flows.sort_values(by=['EffectiveDate'], inplace=True, ascending=True)
    flows_by_year = flows.groupby(flows['EffectiveDate'].dt.year)['TransactionAmount'].agg(['sum'])
    print(flows_by_year)
    irr = round(npf.irr(flows_by_year.values), 4) * 100
    irr = f'{irr:,.2f}'
    print(irr)
    return irr

Tags: python, numpy

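Solution

According to the docs, numpy_financial.irr() takes a one-dimensional array-like object. However, you are passing flows_by_year.values, which is two-dimensional: a single-column DataFrame yields an array of shape (n, 1). Use np.squeeze(flows_by_year.values) or flows_by_year['sum'] instead.

Demo: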

In [193]: import pandas as pd
     ...: import numpy_financial as npf
     ...: import numpy as np

In [194]: data = np.asarray([(2017, -3660000.0),
     ...:                    (2018, -5520000.0),
     ...:                    (2019, -2460213.0),
     ...:                    (2020, 1600000.0)])

In [195]: df = pd.DataFrame(data, columns=['EffectiveDate', 'sum'])

In [196]: flows_by_year = df.set_index('EffectiveDate')

In [197]: flows_by_year.values
Out[197]: 
array([[-3660000.],
       [-5520000.],
       [-2460213.],
       [ 1600000.]])

In [198]: np.squeeze(flows_by_year.values)
Out[198]: array([-3660000., -5520000., -2460213.,  1600000.])

In [199]: flows_by_year['sum']
Out[199]: 
EffectiveDate
2017.0   -3660000.0
2018.0   -5520000.0
2019.0   -2460213.0
2020.0    1600000.0
Name: sum, dtype: float64

In [200]: round(npf.irr(np.squeeze(flows_by_year.values)), 4) * 100
Out[200]: -66.25

In [201]: round(npf.irr(flows_by_year['sum']), 4) * 100
Out[201]: -66.25
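
The extra dimension in the question appears to come from the groupby step: calling agg with a list of function names returns a DataFrame, while calling sum() directly returns a Series. The following is a minimal sketch of that difference, assuming the column names from the question; the individual transaction rows are hypothetical and were chosen only so that the yearly totals match the printed output above.

```python
import pandas as pd

# Hypothetical sample transactions standing in for the SQL query result.
# Column names follow the question; the rows themselves are made up so
# that the yearly totals match the question's printed flows_by_year.
flows = pd.DataFrame({
    "EffectiveDate": pd.to_datetime([
        "2017-03-01", "2018-06-15", "2018-11-30", "2019-05-20", "2020-01-10",
    ]),
    "TransactionAmount": [
        -3660000.0, -3000000.0, -2520000.0, -2460213.0, 1600000.0,
    ],
})

# Group the amounts by calendar year (groupby sorts the year keys by default).
by_year = flows.groupby(flows["EffectiveDate"].dt.year)["TransactionAmount"]

# agg with a list of function names returns a DataFrame, so .values is 2-D.
as_frame = by_year.agg(["sum"])
print(type(as_frame).__name__, as_frame.values.shape)    # DataFrame (4, 1)

# sum() returns a Series, so .values is already 1-D.
as_series = by_year.sum()
print(type(as_series).__name__, as_series.values.shape)  # Series (4,)
```

With the Series form, passing as_series (or as_series.values) to npf.irr should behave like the flows_by_year['sum'] call in the demo, with no squeeze needed.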
