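Ming-chi Chen, Social Statistics

Page.1
Stata Tutorial
Lecture 4: Comparing Two Samples

Page.2
Open 85q1family.dta, the Stata data file for the family questionnaire of the Social Change Survey (社會變遷基本資料調查), Phase 3, Wave 2.
Because of Chinese encoding compatibility problems, some text is garbled and hard to read.
Open 85q1_format.txt to look up the variable names and value labels.
Take j2 and j3 as examples:
j2 asks the respondent (item 拾.2): "On average, about how many hours per week do you usually spend on housework? ___ hours"
j3 asks the respondent (item 拾.3): "On average, about how many hours per week does your spouse usually spend on housework? ___ hours"

Page.3
The data contain variable labels, but because of the compatibility problem some of them are garbled.
To check for garbled labels: Data > Data Editor. Click once on the variable name j2 so that the whole column of values below it is highlighted, then right-click > Variable > Properties > Label.
The Chinese label that appears is 通常您平均牢週大約花多少時間做家務工作 (牢 should be 每).
Correct the garbled character, and correct the garbled j3 variable label in the same way.

Page.4
Checking variables for anomalous values
Close the Data Editor window.
Use a box plot to look for extreme values: Graphics > Easy graphs > Box plot > Main, then type j2 in the Variable field.

Page.5
Using a box plot to look for extreme values (figure)

Page.6
The same method can be used to check j3 for extreme values.
You can also work directly in the command window.

Page.7
This is the command window. (figure)

Page.8
In the command window, type the following and press Enter:

    graph box j2

Page.9
summarize varname, detail
In the command window type summarize j2, detail, or use Statistics > Summaries, tables, & tests > Summary statistics > Summary statistics.

Page.10

    . summarize j2, detail

        Average hours per week you usually spend on housework?
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            0              0
     5%            0              0
    10%            0              0       Obs                1924
    25%            2              0       Sum of Wgt.        1924

    50%            7                      Mean           50.32692
                            Largest       Std. Dev.      191.1342
    75%           20            998
    90%           35            998       Variance       36532.28
    95%           70            998       Skewness       4.717707
    99%          996            999       Kurtosis       23.40378

Annotations on the slide: "They must really love housework!" (pointing at the largest values) and "unreasonably high" (pointing at the mean).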
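As a minimal sketch of why the mean on Page.10 is so far above the median, the following builds a small made-up dataset in which codes such as 998 and 999 are still mixed in with real hours. The variable name hours and every value are hypothetical and are not taken from 85q1family.dta. It relies on summarize storing its results in r(), so the mean, median, and maximum can be compared directly.

```stata
* Toy data (hypothetical values, not from 85q1family.dta)
clear
input hours
0
2
7
20
35
998
999
end

* summarize, detail stores its results in r(); the median is r(p50)
quietly summarize hours, detail
display "mean   = " r(mean)
display "median = " r(p50)
display "max    = " r(max)
```

A maximum of 995 or more, together with a mean far above the median, is a quick signal that non-response codes are still being treated as hours.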
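Page.11
Recoding extreme values
Looking in 85q1_format.txt, we find that J2 and J3 use these codes:
    996 "don't know"
    998 "not applicable"
    999 "refused"
So values of 995 and above must be set to system missing:

    recode j2 995/max=.

The period (.) here is Stata's system missing value.

Page.12

    . summarize j2, detail

        Average hours per week you usually spend on housework?
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            0              0
     5%            0              0
    10%            0              0       Obs                1849
    25%            2              0       Sum of Wgt.        1849

    50%            7                      Mean           11.96106
                            Largest       Std. Dev.      15.30762
    75%           15            105
    90%           28            112       Variance       234.3232
    95%           36            168       Skewness       3.208555
    99%           70            168       Kurtosis       20.90302

Annotation: a week has only 168 hours, so a value of 168 should be converted to something reasonable. At 16 hours a day, a week is 112 hours.

Page.13

    . inspect j2

    j2:  Average hours per week you usually spend      Number of Observations
    ------------------------------------------     ------------------------------
                                                     Total   Integers   Nonintegers
    |  #                        Negative                 -          -             -
    |  #                        Zero                   305        305             -
    |  #                        Positive              1544       1544             -
    |  #                                             -----      -----         -----
    |  #                        Total                 1849       1849             -
    |  #   .   .   .   .        Missing                 75
    +----------------------                          -----
    0                   168                           1924
      (47 unique values)

Use inspect to see the rough distribution and the number of missing cases: Data > Describe data > Inspect variables.

Page.14

    recode j2 168=112

Page.15

    . inspect j2

    j2:  Average hours per week you usually spend      Number of Observations
    ------------------------------------------     ------------------------------
                                                     Total   Integers   Nonintegers
    |  #                        Negative                 -          -             -
    |  #                        Zero                   305        305             -
    |  #                        Positive              1544       1544             -
    |  #                                             -----      -----         -----
    |  #                        Total                 1849       1849             -
    |  #   .   .   .   .        Missing                 75
    +----------------------                          -----
    0                   112                           1924
      (46 unique values)

Page.16

    . sum j2, detail

        Average hours per week you usually spend on housework?
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            0              0
     5%            0              0
    10%            0              0       Obs                1849
    25%            2              0       Sum of Wgt.        1849

    50%            7                      Mean           11.90049
                            Largest       Std. Dev.      14.79188
    75%           15            105
    90%           28            112       Variance       218.7996
    95%           36            112       Skewness       2.632377
    99%           70            112       Kurtosis       12.87359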
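The two recode steps applied to j2 above can be tried on a small made-up dataset. This is only a sketch: the variable name hours and all values are hypothetical, and the rules are written in recode's parenthesized form, which is equivalent to the unparenthesized single-rule form on the slides.

```stata
* Toy data (hypothetical values): weekly hours with non-response codes
clear
input hours
0
7
112
168
996
998
999
end

* Step 1: 995 and above become system missing (.)
recode hours (995/max = .)

* Step 2: cap the impossible 168 hours at 112 (16 hours x 7 days)
recode hours (168 = 112)

* summarize skips missing values, so Obs now counts valid answers only
summarize hours
count if missing(hours)
```

Running step 1 before step 2 keeps the two rules from interacting: once the codes are missing, max refers to the largest real answer.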
Page.17

    . inspect j3

    j3:  Average hours per week your spouse usually    Number of Observations
    ------------------------------------------     ------------------------------
                                                     Total   Integers   Nonintegers
    |  #                        Negative                 -          -             -
    |  #                        Zero                   263        263             -
    |  #                        Positive              1661       1661             -
    |  #                                             -----      -----         -----
    |  #               #        Total                 1924       1924             -
    |  #   .   .   .   #        Missing                  -
    +----------------------                          -----
    0                   999                           1924
      (54 unique values)

Page.18

    . summarize j3, detail

        Average hours per week your spouse usually spends on housework?
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            0              0
     5%            0              0
    10%            0              0       Obs                1924
    25%            4              0       Sum of Wgt.        1924

    50%           14                      Mean           278.8342
                            Largest       Std. Dev.      436.2336
    75%          996            998
    90%          998            999       Variance       190299.7
    95%          998            999       Skewness        1.03888
    99%          998            999       Kurtosis       2.085666

Page.19
Missing value & recode

    recode j3 990/max=.
    recode j3 168=112

Page.20

    . recode j3 168=112
    (j3: 4 changes made)

    . inspect j3

    j3:  Average hours per week your spouse usually    Number of Observations
    ------------------------------------------     ------------------------------
                                                     Total   Integers   Nonintegers
    |  #                        Negative                 -          -             -
    |  #                        Zero                   263        263             -
    |  #                        Positive              1144       1144             -
    |  #                                             -----      -----         -----
    |  #                        Total                 1407       1407             -
    |  #   .   .   .   .        Missing                517
    +----------------------                          -----
    0                   150                           1924
      (50 unique values)

Page.21

    . summarize j3, detail

        Average hours per week your spouse usually spends on housework?
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            0              0
     5%            0              0
    10%            0              0       Obs                1407
    25%            2              0       Sum of Wgt.        1407

    50%            7                      Mean           14.49893
                            Largest       Std. Dev.       18.2296
    75%           21            112
    90%           35            112       Variance       332.3185
    95%           49            150       Skewness       2.569526
    99%           85            150       Kurtosis       12.65059

Page.22
Values above 112 remain (the maximum is now 150), so cap everything from 112 upward at 112 and tabulate:

    recode j3 112/max=112
    tabulate j3

Page.23
Last rows of the tabulate j3 output:

             j3 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             70 |         10        0.71       98.29
             80 |          3        0.21       98.51
             84 |          6        0.43       98.93
             85 |          1        0.07       99.00
             90 |          1        0.07       99.08
             98 |          4        0.28       99.36
            100 |          1        0.07       99.43
            105 |          1        0.07       99.50
            112 |          7        0.50      100.00
    ------------+-----------------------------------
          Total |      1,407      100.00
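Page.24
Looking at the difference between men and women
Item A1 is sex: male is 1, female is 2.
Data > Data Editor > find the variable A1 > right-click > Variable > Properties > change Label to "Sex".
Value label > Define/Modify > Define > Label name: enter gender > OK
> Value: enter 1 > Text: enter Male > OK
> Value: enter 2 > Text: enter Female > OK
> Cancel > Close > Value label: choose gender > OK
Close the Data Editor window.

Page.25
Do men and women share housework differently?
Statistics > Summaries, tables, & tests > Tables > One/Two-way table of summary statistics
Callouts on the dialog: independent variable, dependent variable.

Page.26
Is the difference large?

                |  Summary of average hours per week you usually
                |  spend on housework
            Sex |        Mean   Std. Dev.       Freq.
    ------------+------------------------------------
           Male |   6.0485537    10.23684         968
         Female |   18.330306   16.287017         881
    ------------+------------------------------------
          Total |   11.900487   14.791877        1849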
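The dialog steps on Page.24 to Page.26 also have command equivalents. The sketch below runs them on a small made-up dataset. The values are hypothetical, the label name gender follows the slide, and the English label text stands in for the Chinese labels in the original.

```stata
* Toy data (hypothetical values): a1 = sex code, j2 = weekly housework hours
clear
input a1 j2
1 2
1 6
1 10
2 12
2 18
2 24
end

* Command equivalents of the Data Editor dialog steps
label variable a1 "Sex"
label define gender 1 "Male" 2 "Female"
label values a1 gender

* One-way table of summary statistics: mean, SD, and count of j2 by a1
tabulate a1, summarize(j2)
```

Typing the commands leaves a record in the Review window or a do-file, which makes the labelling step easy to repeat after reloading the raw data.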
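Page.27
Population variances unknown but known to be equal
Statistics > Summaries, tables, & tests > Classical tests of hypotheses > Group mean comparison tests
Callouts on the dialog: dependent variable, independent variable, confidence level.

Page.28

    . ttest j2, by(a1) level(99)

    Two-sample t test with equal variances
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
    ---------+--------------------------------------------------------------------
        Male |     968    6.048554    .3290245    10.23684    5.199367    6.897741
      Female |     881    18.33031    .5487235    16.28702    16.91382     19.7468
    ---------+--------------------------------------------------------------------
    combined |    1849    11.90049    .3439971    14.79188    11.01349    12.78748
    ---------+--------------------------------------------------------------------
        diff |           -12.28175    .6268771               -13.89815   -10.66535
    ------------------------------------------------------------------------------
        diff = mean(Male) - mean(Female)                              t = -19.5920
    Ho: diff = 0                                     degrees of freedom =     1847

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000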
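The equal-variance result on Page.28 can be checked without the dataset by using ttesti, the immediate form of ttest, which takes the group sizes, means, and standard deviations as arguments. This is a sketch that feeds in the rounded figures printed in the output above, so the last digits may differ slightly from the original run.

```stata
* Summary statistics copied from Page.28: n, mean, SD for each group
ttesti 968 6.048554 10.23684 881 18.33031 16.28702, level(99)

* Stored results: r(t), r(df_t), r(se), and r(p) (two-sided p-value)
display "t  = " %9.4f r(t)
display "df = " r(df_t)
display "SE = " %9.7f r(se)
```

The degrees of freedom are 968 + 881 - 2 = 1847, as in the slide output.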
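Page.29
Population variances unknown but known to be unequal
The method above assumes that the population variances are unknown but known to be equal.
Regardless of sample size, statistical software generally uses the t test.
What should be done if the population variances are unknown but known to be unequal?

Page.30
Population variances unknown but known to be unequal
Statistics > Summaries, tables, & tests > Classical tests of hypotheses > Group mean comparison tests
Callouts on the dialog: unequal variances. The degrees of freedom then need a more complicated calculation, using the method proposed by Welch.

Page.31
Difference in housework hours between men and women when the population variances are unknown but known to be unequal

    . ttest j2, by(a1) unequal welch level(99)

    Two-sample t test with unequal variances
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
    ---------+--------------------------------------------------------------------
        Male |     968    6.048554    .3290245    10.23684    5.199367    6.897741
      Female |     881    18.33031    .5487235    16.28702    16.91382     19.7468
    ---------+--------------------------------------------------------------------
    combined |    1849    11.90049    .3439971    14.79188    11.01349    12.78748
    ---------+--------------------------------------------------------------------
        diff |           -12.28175    .6398083               -13.93195   -10.63155
    ------------------------------------------------------------------------------
        diff = mean(Male) - mean(Female)                              t = -19.1960
    Ho: diff = 0                             Welch's degrees of freedom =  1456.62

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000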
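To make the "more complicated" degrees of freedom on Page.30 concrete, the sketch below computes the unequal-variance standard error and Welch's approximation by hand from the summary statistics on Page.31, then compares them with ttesti. The scalar names are hypothetical. The formula used is the one that reproduces the 1456.62 printed in the output, namely -2 plus the squared sum of s^2/n divided by the sum of (s^2/n)^2/(n+1).

```stata
* Per-group variance of the mean, s^2/n, from the Page.31 summary statistics
scalar v1 = 10.23684^2 / 968
scalar v2 = 16.28702^2 / 881

* Unequal-variance standard error of the difference
scalar se_diff = sqrt(v1 + v2)

* Welch's approximation to the degrees of freedom
scalar df_welch = -2 + (v1 + v2)^2 / (v1^2/(968 + 1) + v2^2/(881 + 1))

display "SE = " %9.7f se_diff "   Welch df = " %9.2f df_welch

* Same test from the immediate command, for comparison
ttesti 968 6.048554 10.23684 881 18.33031 16.28702, unequal welch level(99)
display "r(df_t) = " %9.2f r(df_t)
```

Without the welch option, ttest with unequal uses Satterthwaite's approximation instead, which gives a slightly different number of degrees of freedom.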
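Page.32
Levene test for equality of variances
Statistics > Summaries, tables, & tests > Classical tests of hypotheses > Group variance comparison tests
Callouts on the dialog: dependent variable, independent variable.
Note that this dialog and the sdtest command run the classical F (variance ratio) test. Levene's robust test is a separate Stata command, robvar.

Page.33
Levene test for equality of variances

    . sdtest j2, by(a1) level(99)

    Variance ratio test
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
    ---------+--------------------------------------------------------------------
        Male |     968    6.048554    .3290245    10.23684    5.199367    6.897741
      Female |     881    18.33031    .5487235    16.28702    16.91382     19.7468
    ---------+--------------------------------------------------------------------
    combined |    1849    11.90049    .3439971    14.79188    11.01349    12.78748
    ------------------------------------------------------------------------------
        ratio = sd(Male) / sd(Female)                                 f =   0.3950
    Ho: ratio = 1                                    degrees of freedom = 967, 880

        Ha: ratio < 1               Ha: ratio != 1                 Ha: ratio > 1
      Pr(F < f) = 0.0000         2*Pr(F < f) = 0.0000           Pr(F > f) = 1.0000

Annotation: sd(Male) / sd(Female) is not equal to one, and the p-value shows that the null hypothesis of equal variances can be rejected.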
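The f statistic on Page.33 is the ratio of the two sample variances, which can be verified from the printed standard deviations. The sketch below does this by hand and with sdtesti, the immediate form of sdtest. The scalar name f_hand is hypothetical, and the group means are entered as missing (.) because the variance ratio test does not use them.

```stata
* F statistic by hand: ratio of the two sample variances
scalar f_hand = (10.23684 / 16.28702)^2
display "f = " %6.4f f_hand

* Immediate form of sdtest: n, mean (may be .), SD for each group
sdtesti 968 . 10.23684 881 . 16.28702, level(99)

* Stored results: r(F), r(df_1), r(df_2), and r(p) (two-sided p-value)
display "F = " %6.4f r(F) "  df = " r(df_1) ", " r(df_2) "  p = " %6.4f r(p)

* Levene's robust test needs the raw data, so it is shown only as a comment:
*   robvar j2, by(a1)
```

The F test is sensitive to non-normality, and housework hours are strongly right-skewed here, so running robvar on the full dataset would be a reasonable cross-check.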
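Page.34
Based on the result of the variance comparison test (called the Levene test on the slides), the unequal-variances assumption is the more appropriate choice.
That is, men spend significantly fewer hours on housework than women.

Page.35
Comparing the housework load of married and unmarried respondents
A5 is the respondent's marital status: 1 is unmarried, 2 is married, 3 is other.
Do married respondents carry a heavier housework load?

Page.36
Comparing the housework load of married and unmarried respondents
Following the pattern of the male/female comparison produces this error:

    . ttest j2, by(a5) level(99)
    more than 2 groups found, only 2 allowed
    r(420);

This is because the variable a5 has three values: unmarried, married, and other.
Use an if condition to restrict the comparison to unmarried and married respondents only.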
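The error and its fix can be reproduced on a small made-up dataset. This is a sketch with hypothetical values. capture is used so that the failing command does not stop the run, and _rc then holds the return code.

```stata
* Toy data (hypothetical values): a5 = 1 unmarried, 2 married, 3 other
clear
input j2 a5
3 1
5 1
8 1
10 2
14 2
20 2
6 3
9 3
end

* With three groups, ttest stops with an error; capture traps it
capture ttest j2, by(a5)
display "return code = " _rc

* Restrict the sample to groups 1 and 2 with an if qualifier
ttest j2 if a5 != 3, by(a5)
```

An equivalent qualifier is if inlist(a5, 1, 2), which states the two groups being kept instead of the one being dropped.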
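Page.37
Statistics > Summaries, tables, & tests > Classical tests of hypotheses > Group mean comparison tests

Page.38
Equal variances

    . ttest j2 if a5!=3, by(a5) level(99)

    Two-sample t test with equal variances
    -------------------------------------------------------------------------------
        Group |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
    ----------+--------------------------------------------------------------------
    Unmarried |     306    5.598039    .5156249    9.019752    4.261516    6.934562
      Married |    1531    13.12671    .3912873    15.31029    12.11757    14.13586
    ----------+--------------------------------------------------------------------
     combined |    1837    11.87262    .3434793     14.7216    10.98695    12.75828
    ----------+--------------------------------------------------------------------
         diff |           -7.528675    .9051995               -9.862742   -5.194608
    -------------------------------------------------------------------------------
        diff = mean(Unmarried) - mean(Married)                         t =  -8.3171
    Ho: diff = 0                                      degrees of freedom =     1835

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

Page.39
Unequal variances

    . ttest j2 if a5!=3, by(a5) unequal welch level(99)

    Two-sample t test with unequal variances
    -------------------------------------------------------------------------------
        Group |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
    ----------+--------------------------------------------------------------------
    Unmarried |     306    5.598039    .5156249    9.019752    4.261516    6.934562
      Married |    1531    13.12671    .3912873    15.31029    12.11757    14.13586
    ----------+--------------------------------------------------------------------
     combined |    1837    11.87262    .3434793     14.7216    10.98695    12.75828
    ----------+--------------------------------------------------------------------
         diff |           -7.528675    .6472826                -9.20044    -5.85691
    -------------------------------------------------------------------------------
        diff = mean(Unmarried) - mean(Married)                         t = -11.6312
    Ho: diff = 0                              Welch's degrees of freedom =  712.885

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

Page.40
Levene test

    . sdtest j2 if a5!=3, by(a5) level(99)

    Variance ratio test
    -------------------------------------------------------------------------------
        Group |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
    ----------+--------------------------------------------------------------------
    Unmarried |     306    5.598039    .5156249    9.019752    4.261516    6.934562
      Married |    1531    13.12671    .3912873    15.31029    12.11757    14.13586
    ----------+--------------------------------------------------------------------
     combined |    1837    11.87262    .3434793     14.7216    10.98695    12.75828
    -------------------------------------------------------------------------------
        ratio = sd(Unmarried) / sd(Married)                            f =   0.3471
    Ho: ratio = 1                                    degrees of freedom = 305, 1530

        Ha: ratio < 1               Ha: ratio != 1                 Ha: ratio > 1
      Pr(F < f) = 0.0000         2*Pr(F < f) = 0.0000           Pr(F > f) = 1.0000

Annotation: with f = 0.3471 far below 1 and Pr(F > f) = 1.0000, the null hypothesis of equal variances is rejected, so the unequal-variances result on Page.39 is the one to read. (The note on the original slide says the null hypothesis of equal variances cannot be rejected, which does not agree with this output.)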
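Page.41
Comparing groups at two levels
Is there a difference between married men and women, and between unmarried men and women?
Is marriage disadvantageous for women (at least in terms of time spent on household labor)?

Page.42
Equal variances
Statistics > Summaries, tables, & tests > Classical tests of hypotheses > Group mean comparison tests

Page.43
Multiple comparisons, equal variances

    . by a5, sort : ttest j2 if a5!=3, by(a1) level(99)

    ------------------------------------------------------------------------------
    -> a5 = Unmarried

    Two-sample t test with equal variances
    ------------------------------------------------------------------------------
       Group |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
    ---------+--------------------------------------------------------------------
        Male |     177    5.316384    .7992975    10.63396    3.234972    7.397796
      Female |     129    5.984496    .5435252    6.173259    4.563295    7.405698
    ---------+--------------------------------------------------------------------
    combined |     306    5.598039    .5156249    9.019752    4.261516    6.934562
    ---------+--------------------------------------------------------------------
        diff |           -.6681119     1.04519               -3.377347    2.041123
    ------------------------------------------------------------------------------
        diff = mean(Male) - mean(Female)                              t =  -0.6392
    Ho: diff = 0                                     degrees of freedom =      304

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 0.2616         Pr(|T| > |t|) = 0.5232          Pr(T > t) = 0.7384

Page.44
Multiple comparisons, equal variances

    ------------------------------------------------------------------------------
    -> a5 = Married

    Two-sample t test with equal variances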
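The by prefix on Page.43 repeats the same ttest once for each value of a5. A minimal sketch on made-up data is given below. All values are hypothetical, and the toy data contain only the codes 1 and 2 for a5, so no if qualifier is needed here. bysort a5: is shorthand for by a5, sort:.

```stata
* Toy data (hypothetical values): a5 = marital status (1 unmarried,
* 2 married), a1 = sex (1 male, 2 female), j2 = weekly housework hours
clear
input a5 a1 j2
1 1 4
1 1 6
1 1 8
1 2 5
1 2 7
1 2 9
2 1 3
2 1 5
2 1 7
2 2 15
2 2 20
2 2 25
end

* One male/female t test per marital-status group
bysort a5: ttest j2, by(a1)
```

The same two tests could be requested one at a time with ttest j2 if a5 == 1, by(a1) and ttest j2 if a5 == 2, by(a1). The by prefix simply saves retyping when the grouping variable has several values.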