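python - Adding rows to a DataFrame that are products of the DataFrame's cells
Problem description
I have a DataFrame with 10 rows and 4 columns. The rows are job categories 1-10, and every job belongs to exactly one of these categories. The table gives the probability that a person in the database held a job in category 1-10 as their first job, second job, and so on: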
import pandas as pd

prob_all_dict = {'prob_1': {1.0: 0.03409090909090909,
2.0: 0.022727272727272728,
3.0: 0.045454545454545456,
4.0: 0.5340909090909091,
5.0: 0.06818181818181818,
6.0: 0.011363636363636364,
7.0: 0.13636363636363635,
8.0: 0.06818181818181818,
9.0: 0.045454545454545456,
10.0: 0.03409090909090909},
'prob_2': {1.0: 0.045454545454545456,
2.0: 0.011363636363636364,
3.0: 0.03409090909090909,
4.0: 0.4659090909090909,
5.0: 0.11363636363636363,
6.0: 0.045454545454545456,
7.0: 0.1590909090909091,
8.0: 0.045454545454545456,
9.0: 0.03409090909090909,
10.0: 0.045454545454545456},
'prob_3': {1.0: 0.1111111111111111,
2.0: 0,
3.0: 0.06349206349206349,
4.0: 0.3968253968253968,
5.0: 0.07936507936507936,
6.0: 0,
7.0: 0.19047619047619047,
8.0: 0.1111111111111111,
9.0: 0,
10.0: 0.047619047619047616},
'prob_4': {1.0: 0,
2.0: 0,
3.0: 0.043478260869565216,
4.0: 0.391304347826087,
5.0: 0.13043478260869565,
6.0: 0,
7.0: 0.08695652173913043,
8.0: 0.2608695652173913,
9.0: 0,
10.0: 0.08695652173913043}}
prob_all = pd.DataFrame.from_dict(prob_all_dict)
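From "prob_all", the DataFrame "out" is created by multiplying some cells with others. I already have the first-job probabilities as the first row of the DataFrame, followed by the conditional probabilities for the second job depending on which category people worked in for their first job, e.g. the probability of having job category 2 given that job 1 was in category 3, and so on. The rows after that repeat the pattern for the third job.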
# Column 0: first-job probabilities; columns 1-10: prob_2 scaled by each first-job probability
out = [prob_all['prob_1']] + [prob_all['prob_2'] * prob_all['prob_1'].iloc[x] for x in range(0, 10)]
out = pd.concat(out, axis=1)
# For each of columns 1-10, append ten columns of prob_3 scaled by that column's entries
out = (out.join(pd.concat([prob_all['prob_3'] * out.iloc[x, 1] for x in range(0, 10)], axis=1))
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 2] for x in range(0, 10)], axis=1), rsuffix='x')
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 3] for x in range(0, 10)], axis=1), rsuffix='x')
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 4] for x in range(0, 10)], axis=1), rsuffix='x')
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 5] for x in range(0, 10)], axis=1), rsuffix='x')
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 6] for x in range(0, 10)], axis=1), rsuffix='x')
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 7] for x in range(0, 10)], axis=1), rsuffix='x')
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 8] for x in range(0, 10)], axis=1), rsuffix='x')
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 9] for x in range(0, 10)], axis=1), rsuffix='x')
       .join(pd.concat([prob_all['prob_3'] * out.iloc[x, 10] for x in range(0, 10)], axis=1), rsuffix='x')
       ).values
# Transpose so that every combination becomes a row (1 + 10 + 100 = 111 rows, 10 columns)
out = pd.DataFrame(out).T
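The row layout this produces may be easier to see on a small case. The sketch below uses a hypothetical 3-category table (the numbers are made up for illustration and are not the data above) and rebuilds the same row order with NumPy outer products: row 0 is the first-job vector, the next n rows are indexed by the first job, and the following n*n rows by the (first job, second job) pair. The names p1, p2, p3 and toy_out are assumptions for this example only.

```python
import numpy as np

# Hypothetical 3-category probabilities (illustration only, not the question's data)
p1 = np.array([0.2, 0.5, 0.3])
p2 = np.array([0.1, 0.6, 0.3])
p3 = np.array([0.4, 0.4, 0.2])
n = len(p1)

# Rows 1..n: row a holds p1[a] * p2[b] for b = 0..n-1
level2 = np.outer(p1, p2)

# Rows n+1..n+n*n: row (a * n + b) holds p1[a] * p2[b] * p3[c] for c = 0..n-1
level3 = np.outer(level2.ravel(), p3)

# Same stacking order as "out": first-job row, second-job rows, third-job rows
toy_out = np.vstack([p1, level2, level3])
print(toy_out.shape)
```

With n = 10 the same stacking gives the 1 + 10 + 100 rows of "out", so row 11 corresponds to first job 1 and second job 1, row 12 to first job 1 and second job 2, and so on.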
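In the third step, I want to determine the probability of job categories 1-10 for the fourth job, based on what a person did in their first, second and third jobs. In the third code block I do this by hand, but I would like to do it "automatically" for all 1000 combinations: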
out.iloc[11,0]*prob_all['prob_4'][1]
out.iloc[11,0]*prob_all['prob_4'][2]
out.iloc[11,0]*prob_all['prob_4'][3]
out.iloc[11,0]*prob_all['prob_4'][4]
out.iloc[11,0]*prob_all['prob_4'][5]
out.iloc[11,0]*prob_all['prob_4'][6]
out.iloc[11,0]*prob_all['prob_4'][7]
out.iloc[11,0]*prob_all['prob_4'][8]
out.iloc[11,0]*prob_all['prob_4'][9]
out.iloc[11,0]*prob_all['prob_4'][10]
out.iloc[11,1]*prob_all['prob_4'][1]
out.iloc[11,1]*prob_all['prob_4'][2]
out.iloc[11,1]*prob_all['prob_4'][3]
out.iloc[11,1]*prob_all['prob_4'][4]
out.iloc[11,1]*prob_all['prob_4'][5]
out.iloc[11,1]*prob_all['prob_4'][6]
out.iloc[11,1]*prob_all['prob_4'][7]
out.iloc[11,1]*prob_all['prob_4'][8]
out.iloc[11,1]*prob_all['prob_4'][9]
out.iloc[11,1]*prob_all['prob_4'][10]
out.iloc[11,2]*prob_all['prob_4'][1]
out.iloc[11,2]*prob_all['prob_4'][2]
out.iloc[11,2]*prob_all['prob_4'][3]
out.iloc[11,2]*prob_all['prob_4'][4]
out.iloc[11,2]*prob_all['prob_4'][5]
out.iloc[11,2]*prob_all['prob_4'][6]
out.iloc[11,2]*prob_all['prob_4'][7]
out.iloc[11,2]*prob_all['prob_4'][8]
out.iloc[11,2]*prob_all['prob_4'][9]
out.iloc[11,2]*prob_all['prob_4'][10]
out.iloc[11,3]*prob_all['prob_4'][1]
out.iloc[11,3]*prob_all['prob_4'][2]
out.iloc[11,3]*prob_all['prob_4'][3]
out.iloc[11,3]*prob_all['prob_4'][4]
out.iloc[11,3]*prob_all['prob_4'][5]
out.iloc[11,3]*prob_all['prob_4'][6]
out.iloc[11,3]*prob_all['prob_4'][7]
out.iloc[11,3]*prob_all['prob_4'][8]
out.iloc[11,3]*prob_all['prob_4'][9]
out.iloc[11,3]*prob_all['prob_4'][10]
out.iloc[11,4]*prob_all['prob_4'][1]
out.iloc[11,4]*prob_all['prob_4'][2]
out.iloc[11,4]*prob_all['prob_4'][3]
out.iloc[11,4]*prob_all['prob_4'][4]
out.iloc[11,4]*prob_all['prob_4'][5]
out.iloc[11,4]*prob_all['prob_4'][6]
out.iloc[11,4]*prob_all['prob_4'][7]
out.iloc[11,4]*prob_all['prob_4'][8]
out.iloc[11,4]*prob_all['prob_4'][9]
out.iloc[11,4]*prob_all['prob_4'][10]
out.iloc[11,5]*prob_all['prob_4'][1]
out.iloc[11,5]*prob_all['prob_4'][2]
out.iloc[11,5]*prob_all['prob_4'][3]
out.iloc[11,5]*prob_all['prob_4'][4]
out.iloc[11,5]*prob_all['prob_4'][5]
out.iloc[11,5]*prob_all['prob_4'][6]
out.iloc[11,5]*prob_all['prob_4'][7]
out.iloc[11,5]*prob_all['prob_4'][8]
out.iloc[11,5]*prob_all['prob_4'][9]
out.iloc[11,5]*prob_all['prob_4'][10]
out.iloc[11,6]*prob_all['prob_4'][1]
out.iloc[11,6]*prob_all['prob_4'][2]
out.iloc[11,6]*prob_all['prob_4'][3]
out.iloc[11,6]*prob_all['prob_4'][4]
out.iloc[11,6]*prob_all['prob_4'][5]
out.iloc[11,6]*prob_all['prob_4'][6]
out.iloc[11,6]*prob_all['prob_4'][7]
out.iloc[11,6]*prob_all['prob_4'][8]
out.iloc[11,6]*prob_all['prob_4'][9]
out.iloc[11,6]*prob_all['prob_4'][10]
out.iloc[11,7]*prob_all['prob_4'][1]
out.iloc[11,7]*prob_all['prob_4'][2]
out.iloc[11,7]*prob_all['prob_4'][3]
out.iloc[11,7]*prob_all['prob_4'][4]
out.iloc[11,7]*prob_all['prob_4'][5]
out.iloc[11,7]*prob_all['prob_4'][6]
out.iloc[11,7]*prob_all['prob_4'][7]
out.iloc[11,7]*prob_all['prob_4'][8]
out.iloc[11,7]*prob_all['prob_4'][9]
out.iloc[11,7]*prob_all['prob_4'][10]
out.iloc[11,8]*prob_all['prob_4'][1]
out.iloc[11,8]*prob_all['prob_4'][2]
out.iloc[11,8]*prob_all['prob_4'][3]
out.iloc[11,8]*prob_all['prob_4'][4]
out.iloc[11,8]*prob_all['prob_4'][5]
out.iloc[11,8]*prob_all['prob_4'][6]
out.iloc[11,8]*prob_all['prob_4'][7]
out.iloc[11,8]*prob_all['prob_4'][8]
out.iloc[11,8]*prob_all['prob_4'][9]
out.iloc[11,8]*prob_all['prob_4'][10]
out.iloc[11,9]*prob_all['prob_4'][1]
out.iloc[11,9]*prob_all['prob_4'][2]
out.iloc[11,9]*prob_all['prob_4'][3]
out.iloc[11,9]*prob_all['prob_4'][4]
out.iloc[11,9]*prob_all['prob_4'][5]
out.iloc[11,9]*prob_all['prob_4'][6]
out.iloc[11,9]*prob_all['prob_4'][7]
out.iloc[11,9]*prob_all['prob_4'][8]
out.iloc[11,9]*prob_all['prob_4'][9]
out.iloc[11,9]*prob_all['prob_4'][10]
out.iloc[12,0]*prob_all['prob_4'][1]
out.iloc[12,0]*prob_all['prob_4'][2]
out.iloc[12,0]*prob_all['prob_4'][3]
out.iloc[12,0]*prob_all['prob_4'][4]
out.iloc[12,0]*prob_all['prob_4'][5]
out.iloc[12,0]*prob_all['prob_4'][6]
out.iloc[12,0]*prob_all['prob_4'][7]
out.iloc[12,0]*prob_all['prob_4'][8]
out.iloc[12,0]*prob_all['prob_4'][9]
out.iloc[12,0]*prob_all['prob_4'][10]
out.iloc[12,1]*prob_all['prob_4'][1]
out.iloc[12,1]*prob_all['prob_4'][2]
out.iloc[12,1]*prob_all['prob_4'][3]
out.iloc[12,1]*prob_all['prob_4'][4]
out.iloc[12,1]*prob_all['prob_4'][5]
out.iloc[12,1]*prob_all['prob_4'][6]
out.iloc[12,1]*prob_all['prob_4'][7]
out.iloc[12,1]*prob_all['prob_4'][8]
out.iloc[12,1]*prob_all['prob_4'][9]
out.iloc[12,1]*prob_all['prob_4'][10]
Can anyone help me? I have been stuck on this for a long time. Thanks!
Solution
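You can use loops to vary the indices and append each row to the DataFrame:
# Rows 11 and 12 match the manual example; range(11, 111) would cover all 100 third-job rows
for i in range(11, 13):
    for j in range(10):
        # One new row: out.iloc[i, j] times each of the ten prob_4 probabilities
        out.loc[len(out)] = [out.iloc[i, j] * prob_all['prob_4'][k] for k in range(1, 11)]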
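Appending one row at a time with out.loc[len(out)] can get slow once there are 1000 rows to add. As a possible alternative, the sketch below computes all new rows in one step with a NumPy outer product and checks the result against the explicit double loop. The helper name next_level and the 2-category numbers are hypothetical and only serve as an illustration.

```python
import numpy as np

def next_level(block, p_next):
    """Return one row per entry of `block` (row-major order), each scaled by `p_next`."""
    block = np.asarray(block, dtype=float)
    p_next = np.asarray(p_next, dtype=float)
    return np.outer(block.ravel(), p_next)

# Hypothetical 2-category example (illustration only)
third_level = np.array([[0.10, 0.15],
                        [0.30, 0.45]])  # stands in for out.iloc[11:111]
p4 = np.array([0.25, 0.75])             # stands in for prob_all['prob_4']

vectorized = next_level(third_level, p4)

# The same rows produced by an explicit double loop, in the same order
looped = np.array([[third_level[i, j] * p4[k] for k in range(len(p4))]
                   for i in range(third_level.shape[0])
                   for j in range(third_level.shape[1])])

print(np.allclose(vectorized, looped))
```

With the data from the question, the call would presumably be next_level(out.iloc[11:111].to_numpy(), prob_all['prob_4'].to_numpy()), which should give a 1000 x 10 array that can be wrapped in pd.DataFrame and appended to "out" with a single pd.concat.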
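The computation of out can also be simplified:
# Start from the 10 x 11 frame produced by the first pd.concat in the question
for i in range(1, 11):
    out = out.join(pd.concat([prob_all['prob_3'] * out.iloc[x, i] for x in range(0, 10)], axis=1), rsuffix='x')
out = pd.DataFrame(out.values).T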
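If the goal is mainly to look up the product for a given sequence of jobs, a different layout may be simpler than tracking row numbers in "out". The sketch below is an alternative, not the method from the answer: it multiplies all columns of a probability table together and labels each result with its category sequence through a MultiIndex. The function name sequence_probabilities and the 2-category table are hypothetical.

```python
from functools import reduce

import numpy as np
import pandas as pd

def sequence_probabilities(prob_table):
    """Product of the per-position probabilities for every sequence of categories."""
    columns = list(prob_table.columns)
    vectors = [prob_table[c].to_numpy(dtype=float) for c in columns]
    # Repeated outer products give an array with one axis per job position
    joint = reduce(np.multiply.outer, vectors)
    # from_product varies the last level fastest, matching row-major ravel()
    index = pd.MultiIndex.from_product([prob_table.index] * len(columns), names=columns)
    return pd.Series(joint.ravel(), index=index)

# Hypothetical 2-category table with three job positions (illustration only)
toy = pd.DataFrame({'prob_1': [0.2, 0.8],
                    'prob_2': [0.5, 0.5],
                    'prob_3': [0.1, 0.9]}, index=[1, 2])

seq = sequence_probabilities(toy)
# First job 2, second job 1, third job 2: 0.8 * 0.5 * 0.9
print(seq.loc[(2, 1, 2)])
```

Applied to prob_all, this should return one value per four-job sequence (10,000 entries), the same products that the row-by-row approach spreads over 1000 rows of 10 columns.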