Editing an hbar graph (Stata)

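Problem description

My code below produces the attached graph. However, I am trying to make two adjustments without success. 1- I would like to organize the Y axis so that November comes before December for every industry, rather than being arranged by which month has more jobs, as in the current graph. 2- I also tried to add labels on the Y axis that show only "Nov" and "Dec" without the additional text. Stata does not produce any error, but the graph does not change.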

preserve
drop if total_jobs_industry<15
graph hbar (count) total_jobs_industry, over(month) over(industry, sort(1)) subtitle("Jobs by Industry and month", span) 
restore 

I know I can manually edit minor details of the graph in Stata, but I would prefer to automate the process if possible.

Data example:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float total_jobs_industry str39 industry str8 month
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Nov_2020"
38 "Computer Hardware & Software"         "Dec_2020"
12 "Consulting"                           "Dec_2020"
63 ""                                     "Dec_2020"
32 "IT Services"                          "Dec_2020"
32 "IT Services"                          "Nov_2020"
38 "Computer Hardware & Software"         "Nov_2020"
12 "Aerospace & Defense"                  "Nov_2020"
12 "Accounting"                           "Nov_2020"
12 "Accounting"                           "Dec_2020"
end

When I use sum instead of count, I get the following graph:

preserve
drop if total_jobs_industry<15
graph hbar (sum) total_jobs_industry, over(month) over(industry, sort(1)) subtitle("Jobs by Industry and month", span) 
restore 

In addition, this is how I created the variable that counts the number of jobs in each industry:

// The variable id contains observation number running from 1 to X and nt is the total number of observations
generate id = _n
generate nt = _N

// Sorting by industry. Now n1 is the observation number within each industry group and total_jobs_industry is the total number of observations for each industry group.
sort industry 
by industry: generate n1 = _n
by industry: generate total_jobs_industry = _N
order total_jobs_industry, a(industry)

Tags: stata

Solution


This is a very puzzling question. The following list of reasons is not exhaustive.

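  1. The post appears to mix old and new versions of itself and is inconsistent. You cannot reasonably expect us to decode such a convoluted story reliably. The standard here is to provide a minimal verifiable example, and this thread falls short of that standard. See the site's guidance on such examples.

  2. None of the graphs shown corresponds to the data given.

  3. I find it hard to believe that (count) makes sense for your data. As documented, it counts non-missing values, but your key variable appears to be total_jobs_industry, which already holds totals. On the other hand, applying (sum) to rows that mix different jobs with per-industry observation counts seems to confuse two quite different kinds of calculation.

  4. There appear to be duplicate observations in your example data.

  5. You state that you also tried to add labels on the Y axis showing only "Nov" and "Dec", but nothing in your code shows any such attempt.

  6. You expect Nov_2020 to sort before Dec_2020. That will not happen because, as far as Stata is concerned, month is just a string variable, so the fact that D sorts before N is what decides the order. That is why December sorts before November. It has nothing to do with sorting on industry values, which only affects the order of the groups of bars. You are not using Stata's date variable functionality.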
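
The sorting behaviour described in point 6 can be checked with a small sketch. The two-row toy dataset below is made up for illustration and is not the poster's data. It uses monthly() with the "MY" mask, as in the code further down, and the variable name mdate is just a convenient choice.

* Toy data: one string month per row
clear
input str8 month
"Nov_2020"
"Dec_2020"
end

* String sort is alphabetical, so Dec_2020 is listed before Nov_2020
sort month
list month, noobs

* A numeric monthly date sorts chronologically, so Nov_2020 now comes first
generate mdate = monthly(month, "MY")
format mdate %tm
sort mdate
list month mdate, noobs

Any command that orders categories by the values of month, including over(month) in graph hbar, sees the string order, which is why converting to a numeric variable is the first step of the fix.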

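Apart from the last point, I doubt I can make sense of any of these issues. It seems to be a limitation of graph hbar that it ignores the display format of time variables, so I have used value labels to ensure that Nov and Dec sort in the order you want.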

clear
input float total_jobs_industry str39 industry str8 month
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Nov_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Dec_2020"
11 "Architectural & Engineering Services" "Nov_2020"
38 "Computer Hardware & Software"         "Dec_2020"
12 "Consulting"                           "Dec_2020"
63 ""                                     "Dec_2020"
32 "IT Services"                          "Dec_2020"
32 "IT Services"                          "Nov_2020"
38 "Computer Hardware & Software"         "Nov_2020"
12 "Aerospace & Defense"                  "Nov_2020"
12 "Accounting"                           "Nov_2020"
12 "Accounting"                           "Dec_2020"
end 

duplicates drop 

gen mdate = monthly(month, "MY")

levelsof mdate, local(months)
tokenize "`c(Mons)'" 
foreach m of local months { 
    local month = month(dofm(`m'))
    label def mdate `m' "``month''", modify 
}
label val mdate mdate 

set scheme s1color 
graph hbar (asis) total_jobs_industry, over(mdate) over(industry, sort(1) descending) 
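
If the data only ever contain these two months, a shorter hard-coded variant is possible. This is a sketch under that assumption, not part of the approach above: the variable and label name morder is hypothetical, and the four rows are taken from the example data so that the block runs on its own.

clear
input float total_jobs_industry str39 industry str8 month
32 "IT Services"                  "Dec_2020"
32 "IT Services"                  "Nov_2020"
38 "Computer Hardware & Software" "Dec_2020"
38 "Computer Hardware & Software" "Nov_2020"
end

* Assumption: month takes only the values "Nov_2020" and "Dec_2020"
generate byte morder = cond(month == "Nov_2020", 1, 2)
label define morder 1 "Nov" 2 "Dec"
label values morder morder

* Bars within each industry follow the numeric order 1 (Nov), 2 (Dec)
graph hbar (asis) total_jobs_industry, over(morder) over(industry, sort(1) descending)

The loop over levelsof in the code above is the more general choice, since it builds the labels from whatever months are present rather than assuming them.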
