首页 > 解决方案 > 重建链接观察链

问题描述

我有一个数据集,其中公司可以由人(类型 0)或其他公司(类型 1)拥有。这是一个例子:


* Example generated by -dataex-. To install: ssc install dataex
clear
input str1(firmid ownerid) str7 last_name byte type
"A" "B" ""        1
"A" "C" ""        1
"A" "D" ""        1
"B" "E" ""        1
"B" "F" ""        1
"E" "I" "Smith"   0
"E" "J" "Johnson" 0
"F" "G" ""        1
"G" "I" "Smith"   0
"G" "J" "Johnson" 0
"G" "H" ""        1
end

公司可以由其他公司拥有,而其他公司又由其他公司拥有,依此类推。例如,A 由 B 拥有,B 由 E 和 F 拥有,等等。我想重建这些所有权链,直到我到达最终所有者(一个人,如 I 或 J,或一家公司数据中没有所有者,例如 C、D 或 H)。最终输出如下所示:


* Example generated by -dataex-. To install: ssc install dataex
clear
input str1(firmid ownerid1 ownerid2 ownerid3) str5 last_name1 str7 last_name2 byte(type1 type2 type3) str1(ultimate_owner1 ultimate_owner2 ultimate_owner3 ultimate_owner4 ultimate_owner5)
"A" "B" "C" "D" ""      ""        1 1 1 "I" "J" "H" "C" "D"
"B" "E" "F" ""  ""      ""        1 1 . "I" "J" "H" ""  "" 
"E" "I" "J" ""  "Smith" "Johnson" 0 0 . "I" "J" ""  ""  "" 
"F" "G" ""  ""  ""      ""        1 . . "I" "J" "H" ""  "" 
"G" "I" "J" "H" "Smith" "Johnson" 0 0 1 "I" "J" "H" ""  "" 
end

我认为vlookup这将是一个很好的起点。使用以下代码,我检索所有中间和最终所有者(所以我接近目标),但代码很麻烦,我看不出如何将其推广到任意长的所有权链。

ssc inst missings
by firmid, sort: g rep=_n
reshape wide ownerid last_name share type, i(firmid) j(rep)

qui forv j=1/3{
    g rep_ownerid`j'=ownerid`j'
forv i=1/3{
    vlookup rep_ownerid`j', gen(ownerid`j'`i') key(firmid) value(ownerid`i')
forv k=1/3{
    vlookup ownerid`j'`i', gen(ownerid`j'`i'`k') key (firmid) value(ownerid`k')
forv m=1/3{
    replace ownerid`m'="99" if ownerid`m'==""
    vlookup ownerid`j'`i'`k', gen(ownerid`j'`i'`k'`m') key (firmid) value(ownerid`m')
}
}
}
}

qui foreach x of varlist ownerid* {
replace `x'="" if `x'=="99"
}

qui missings dropvars, force
drop rep*

我想知道是否有更好/更通用的方法来实现我的目标?原始数据相当大(3M 观察)。

标签: statavlookup

解决方案


推荐阅读