Transforming a non-continuous height variable

Problem description

I can't find an efficient way to convert a miscoded height variable in my sample.

The variable is coded as follows:

     Height 
ID1    601   
ID1    601   
ID1    601   
ID3    409   
ID3    410   
ID4    511
.      .
.      .
.      .
ID100  400

As you can see, the variable encodes feet and inches together in one number: 601 means 6 feet 1 inch, 511 means 5 feet 11 inches, and so on.

My goal is to convert these numbers to inches:

replace Height = 48 if Height == 400 
replace Height = 49 if Height == 401
replace Height = 50 if Height == 402
replace Height = 51 if Height == 403
.
.
.
replace Height = 83 if Height == 611

How can I write this efficiently with a loop?

Tags: stata, transformation

Solution


The code below assumes that Height contains exactly three digits in every observation:

tostring Height, generate(Height_string)
    /* Generate a new variable which is a string-version of Height
       (so that we can get the individual digits) */

generate feet = substr(Height_string, 1, 1)
    /* From the first character in the string, select one character */

generate inch = substr(Height_string, 2, 2)
    /* Starting at the second character in the string, select two characters.
       An equivalent alternative would have been
       generate inch = substr(Height_string, -2, 2)
       which selects two characters starting at the second-to-last character */

destring feet inch, replace
    /* Convert these two new variables to numeric */

generate tot_inch = feet * 12 + inch
    /* Generate a new variable which measures height in inches only. */
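
Under the same three-digit assumption, the conversion can probably also be done with arithmetic alone, with no string conversion and no loop. The sketch below is one possible alternative, not part of the original answer. It uses a small toy dataset built from the values in the question, and the variable name tot_inch2 is hypothetical. Note that `clear` drops whatever data is in memory, so on real data skip the `clear` and `input` lines.

```stata
* Toy data taken from the question (assumption: feet * 100 + inches coding)
clear
input Height
601
409
410
511
400
end

* Verify the assumed coding before converting: three digits, inches part 0-11.
* This assert also fails on missing values, which would need separate handling.
assert inrange(Height, 100, 999) & mod(Height, 100) < 12

* Feet are the hundreds digit; inches are the remainder after dividing by 100
generate tot_inch2 = floor(Height / 100) * 12 + mod(Height, 100)

list Height tot_inch2
```

Because `generate` evaluates the expression for every observation at once, this replaces the whole series of `replace` statements from the question.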
