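stata - How do I get the sum of Var_c by Var_a and Var_b?
Question
I am trying to total one variable within groups defined by two other variables.
Suppose I have the following data: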
Name Commodity Amount_cmdt
Alex apple 5
Ben orange 10
Chris apple 25
Alex orange 10
Alex apple 10
Chris orange 10
Ben apple 5
I would like a final dataset that looks like this:
Name Commodity Amount_cmdt total_apple total_orange
Alex apple 5 15 10
Ben orange 10 5 10
Chris apple 25 25 20
Alex orange 10 15 10
Alex apple 10 15 10
Chris orange 10 25 20
Ben apple 5 5 10
Chris orange 10 25 20
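Eventually, once I have the number of apples and oranges each person owns, I can drop the duplicates. But how do I formulate the statement:
If Name = Chris and Commodity = orange, then total_orange = sum(Amount_cmdt)?
I wrote the following, but it sums all apples or all oranges regardless of name: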
foreach var of varlist Name {
foreach var of varlist Commodity {
replace total_apple = sum( Amount_cmdt) if Commodity == "apple"
replace total_orange = sum( Amount_cmdt) if Commodity == "orange"
}
}
list
Solution
Using your toy example:
clear
input strL(name commodity) amount total_apple total_orange
Alex apple 5 15 10
Ben orange 10 5 10
Chris apple 25 25 20
Alex orange 10 15 10
Alex apple 10 15 10
Chris orange 10 25 20
Ben apple 5 5 10
Chris orange 10 25 20
end
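The following works for me:
* Total amount within each name-commodity pair
bysort name commodity: egen totals = total(amount)
* Within each name, sorted by commodity, the first row is apple and the last is orange
* (this assumes every name has both commodities)
bysort name (commodity): generate totalapple = totals[1]
bysort name (commodity): generate totalorange = totals[_N]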
list name commodity amount total_apple totalapple total_orange totalorange, abbreviate(15)
     +------------------------------------------------------------------------------------+
     |  name   commodity   amount   total_apple   totalapple   total_orange   totalorange |
     |------------------------------------------------------------------------------------|
  1. |  Alex       apple        5            15           15             10            10 |
  2. |  Alex       apple       10            15           15             10            10 |
  3. |  Alex      orange       10            15           15             10            10 |
  4. |   Ben       apple        5             5            5             10            10 |
  5. |   Ben      orange       10             5            5             10            10 |
     |------------------------------------------------------------------------------------|
  6. | Chris       apple       25            25           25             20            20 |
  7. | Chris      orange       10            25           25             20            20 |
  8. | Chris      orange       10            25           25             20            20 |
     +------------------------------------------------------------------------------------+
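As a possible alternative (a sketch, not part of the original answer): `totals[1]` and `totals[_N]` pick the right rows only when every name has both commodities. Because `egen`'s `total()` accepts an expression, the condition can be multiplied into the amount so that non-matching rows contribute 0. The variable names `tot_apple` and `tot_orange` are made up for illustration.

```stata
clear
input str10 name str10 commodity amount
Alex  apple  5
Ben   orange 10
Chris apple  25
Alex  orange 10
Alex  apple  10
Chris orange 10
Ben   apple  5
Chris orange 10
end

* (commodity == "apple") evaluates to 1 or 0, so only apple rows add to the sum.
* tot_apple and tot_orange are hypothetical names, not from the original answer.
egen tot_apple  = total(amount * (commodity == "apple")),  by(name)
egen tot_orange = total(amount * (commodity == "orange")), by(name)

sort name commodity
list, sepby(name)
```

With this form a name that has no rows for a commodity gets a total of 0 rather than another commodity's total, and no particular sort order is required.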
Edit:
You can generalize this to more than two commodities as follows:
clear
input strL(name commodity) amount
Alex apple 5
Ben orange 10
Chris apricot 3
Alex apricot 4
Ben apricot 2
Chris apple 25
Alex orange 10
Alex apple 10
Chris orange 10
Ben apple 5
Chris apricot 15
Alex apricot 6
Chris orange 10
end
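* Total amount within each name-commodity pair
bysort name commodity: egen totals = total(amount)
* Number the commodities 1, 2, ... in sorted order, the same order levelsof returns
egen commodities = group(commodity)
levelsof commodity, local(allcommodities) clean
local i 0
foreach var of local allcommodities {
    local ++i
    * Copy the pair total onto the rows of this commodity, then spread it across the name
    generate `var' = .
    bysort name (commodity): replace `var' = totals if commodities == `i'
    bysort name (commodity): egen total`var' = min(`var')
    drop `var'
}
drop commodities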
The modified snippet produces the desired output:
list name commodity amount total*, abbreviate(15)
     +-------------------------------------------------------------------------------+
     |  name   commodity   amount   totals   totalapple   totalapricot   totalorange |
     |-------------------------------------------------------------------------------|
  1. |  Alex       apple        5       15           15             10            10 |
  2. |  Alex       apple       10       15           15             10            10 |
  3. |  Alex     apricot        6       10           15             10            10 |
  4. |  Alex     apricot        4       10           15             10            10 |
  5. |  Alex      orange       10       10           15             10            10 |
     |-------------------------------------------------------------------------------|
  6. |   Ben       apple        5        5            5              2            10 |
  7. |   Ben     apricot        2        2            5              2            10 |
  8. |   Ben      orange       10       10            5              2            10 |
  9. | Chris       apple       25       25           25             18            20 |
 10. | Chris     apricot        3       18           25             18            20 |
     |-------------------------------------------------------------------------------|
 11. | Chris     apricot       15       18           25             18            20 |
 12. | Chris      orange       10       20           25             18            20 |
 13. | Chris      orange       10       20           25             18            20 |
     +-------------------------------------------------------------------------------+
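Since the question mentions dropping duplicates once the totals exist, one row per name can also be reached directly. The following is only a sketch, assuming the commodity values are valid as variable-name suffixes. It uses `collapse` and `reshape wide`, neither of which appears in the original answer, and the `total_` prefix is chosen here to mirror the question's naming.

```stata
clear
input str10 name str10 commodity amount
Alex  apple   5
Ben   orange  10
Chris apricot 3
Alex  apricot 4
Ben   apricot 2
Chris apple   25
Alex  orange  10
Alex  apple   10
Chris orange  10
Ben   apple   5
Chris apricot 15
Alex  apricot 6
Chris orange  10
end

* One row per name-commodity pair, holding the summed amount
collapse (sum) amount, by(name commodity)

* One row per name, with one column per commodity (amountapple, amountapricot, ...)
reshape wide amount, i(name) j(commodity) string

* Rename to total_apple, total_apricot, total_orange (prefix is an assumption)
rename amount* total_*

list
```

Unlike the `egen` approach, this replaces the data in memory with the collapsed version, and a name with no rows for some commodity ends up with a missing value in that column.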