How do I get the sum of Var_c by Var_a and Var_b?

Problem description

I am trying to compute a sum grouped by two variables.

Suppose I have the following data:

Name   Commodity        Amount_cmdt

Alex       apple           5
Ben        orange          10
Chris      apple           25
Alex       orange          10
Alex       apple           10
Chris      orange          10
Ben        apple            5  

I would like a final dataset that looks like this:

Name   Commodity      Amount_cmdt       total_apple    total_orange

Alex       apple           5                   15              10
Ben        orange          10                  5               10
Chris      apple           25                  25              20
Alex       orange          10                  15              10
Alex       apple           10                  15              10
Chris      orange          10                  25              20
Ben        apple            5                   5              10 
Chris      orange          10                  25              20   

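Eventually, once I have the number of apples and oranges each person has, I can drop the duplicates. But how do I formulate the statement:

if Name = Chris and Commodity = orange, then total_orange = sum(Amount_cmdt)?

I wrote the following, but it sums all apples or all oranges, regardless of name: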

foreach var of varlist Name {
    foreach var of varlist Commodity {
        replace total_apple = sum( Amount_cmdt) if Commodity == "apple"
        replace total_orange = sum( Amount_cmdt) if Commodity == "orange"
    }
}

list

Tags: stata

Solution

Using your toy example:

clear

input strL(name commodity) amount total_apple total_orange
Alex       apple           5                   15              10
Ben        orange          10                  5               10
Chris      apple           25                  25              20
Alex       orange          10                  15              10
Alex       apple           10                  15              10
Chris      orange          10                  25              20
Ben        apple            5                   5              10 
Chris      orange          10                  25              20 
end

The following works for me:

bysort name commodity: egen totals = total(amount)
bysort name (commodity): generate totalapple = totals[1]
bysort name (commodity): generate totalorange = totals[_N]

list name commodity amount total_apple totalapple total_orange totalorange, abbreviate(15)

     +------------------------------------------------------------------------------------+
     |  name   commodity   amount   total_apple   totalapple   total_orange   totalorange |
     |------------------------------------------------------------------------------------|
  1. |  Alex       apple        5            15           15             10            10 |
  2. |  Alex       apple       10            15           15             10            10 |
  3. |  Alex      orange       10            15           15             10            10 |
  4. |   Ben       apple        5             5            5             10            10 |
  5. |   Ben      orange       10             5            5             10            10 |
     |------------------------------------------------------------------------------------|
  6. | Chris       apple       25            25           25             20            20 |
  7. | Chris      orange       10            25           25             20            20 |
  8. | Chris      orange       10            25           25             20            20 |
     +------------------------------------------------------------------------------------+
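
Note that `totals[1]` and `totals[_N]` pick up the apple and orange totals only because "apple" sorts before "orange" and every person in the toy data has both commodities. Also, in `generate` and `replace`, `sum()` is a running sum, whereas `egen`'s `total()` returns a group total. A minimal sketch of a variant that does not depend on the sort order, assuming the same toy data plus one hypothetical person (Dana, added here purely for illustration) who has oranges only:

```stata
clear

* Toy data from above; the Dana row is hypothetical and only
* illustrates a person who lacks one of the commodities
input str10 name str10 commodity amount
Alex   apple   5
Ben    orange  10
Chris  apple   25
Alex   orange  10
Alex   apple   10
Chris  orange  10
Ben    apple   5
Chris  orange  10
Dana   orange  7
end

* Within each name, add up amount on rows of the given commodity;
* rows of any other commodity contribute 0
bysort name: egen total_apple  = total(cond(commodity == "apple",  amount, 0))
bysort name: egen total_orange = total(cond(commodity == "orange", amount, 0))

list, sepby(name) abbreviate(15)
```

With this form, a person who has no apples gets a total of 0 rather than inheriting the orange total.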

Edit:

You can generalize this to more than two commodities as follows:

clear

input strL(name commodity) amount 
Alex       apple           5     
Ben        orange          10                 
Chris      apricot         3
Alex       apricot         4
Ben        apricot         2
Chris      apple           25         
Alex       orange          10              
Alex       apple           10         
Chris      orange          10          
Ben        apple            5             
Chris      apricot         15
Alex       apricot         6
Chris      orange          10                
end

bysort name commodity: egen totals = total(amount)
egen commodities = group(commodity)

levelsof commodity, local(allcommodities) clean
local i 0

foreach var of local allcommodities {
    local ++i
    generate `var' = .
    bysort name (commodity): replace `var' = totals if commodities == `i'
    bysort name (commodity): egen total`var' = min(`var')
    drop `var'
}

drop commodities

The modified code snippet produces the desired output:

list name commodity amount total*, abbreviate(15)

     +-------------------------------------------------------------------------------+
     |  name   commodity   amount   totals   totalapple   totalapricot   totalorange |
     |-------------------------------------------------------------------------------|
  1. |  Alex       apple        5       15           15             10            10 |
  2. |  Alex       apple       10       15           15             10            10 |
  3. |  Alex     apricot        6       10           15             10            10 |
  4. |  Alex     apricot        4       10           15             10            10 |
  5. |  Alex      orange       10       10           15             10            10 |
     |-------------------------------------------------------------------------------|
  6. |   Ben       apple        5        5            5              2            10 |
  7. |   Ben     apricot        2        2            5              2            10 |
  8. |   Ben      orange       10       10            5              2            10 |
  9. | Chris       apple       25       25           25             18            20 |
 10. | Chris     apricot        3       18           25             18            20 |
     |-------------------------------------------------------------------------------|
 11. | Chris     apricot       15       18           25             18            20 |
 12. | Chris      orange       10       20           25             18            20 |
 13. | Chris      orange       10       20           25             18            20 |
     +-------------------------------------------------------------------------------+
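
If the end goal is one row per person (the "drop the duplicates" step mentioned in the question), a possible shortcut is to aggregate and reshape directly rather than build the totals and then de-duplicate. This is a sketch under the assumption that the per-row amounts are no longer needed afterwards; the stub name `total` is chosen here for illustration:

```stata
clear

input str10 name str10 commodity amount
Alex   apple    5
Ben    orange   10
Chris  apricot  3
Alex   apricot  4
Ben    apricot  2
Chris  apple    25
Alex   orange   10
Alex   apple    10
Chris  orange   10
Ben    apple    5
Chris  apricot  15
Alex   apricot  6
Chris  orange   10
end

* collapse replaces the data in memory; wrap in preserve/restore
* if the original rows are still needed
collapse (sum) total = amount, by(name commodity)

* One row per name, one column per commodity:
* totalapple, totalapricot, totalorange
reshape wide total, i(name) j(commodity) string

list, abbreviate(15)
```

A person with no rows for a given commodity ends up with a missing value in that column rather than 0.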
