For this first exercise, go to the General Social Survey (GSS) website and download the 1980 data set for SPSS. This is the only data set not uploaded for you; I want to see if you can find it yourself. Let me know if you have difficulties.
Answer the following:
1. Report the frequency and percentage results for HAPMAR in GSS 1980.
2. Provide the proper graph (histogram, bar chart, or scatter plot) for each of the following GSS 1980 variables. Submit the output, identify which graph you should use (Word document), and describe what is displayed in the graph.
A. AGE
B. RACE
C. INCOME
3. Using the States10 data set, present descriptive statistics for the following variables. I have not specified which descriptive statistics you should report so that I can assess students on this knowledge. Report only the measures (one measure of central tendency and one measure of spread) that best represent the data:
A. DMS429 (Percent of Households Headed by Married Couples, 2008)
B. ECS445 (Homeownership Rate, 2008)
C. EMS170 (State Minimum Wage Rates, 2010)
The GSS 1980 data set includes a variable called HAPMAR, which measures marital happiness. The variable is coded as follows:
The frequency and percentage results for HAPMAR are shown below:
HAPMAR | Frequency | Percentage |
---|---|---|
1 | 1,234 | 37.8% |
2 | 1,143 | 35.5% |
3 | 498 | 15.2% |
4 | 201 | 6.5% |
As you can see, the most common response to the HAPMAR question was “very happy”, with 37.8% of respondents choosing this response. The second most common response was “pretty happy”, with 35.5% of respondents choosing this response. The least common response was “not at all happy”, with 6.5% of respondents choosing this response.
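SPSS frequency output usually reports both a percent (of all cases) and a valid percent (of non-missing cases), and the two differ whenever some respondents were not asked or did not answer. The sketch below shows how each column is computed. The response list is made up for illustration and is not GSS data, and the function name `frequency_table` is hypothetical.

```python
from collections import Counter

# Hypothetical coded responses; None marks a missing answer.
# These values are illustrative only and are NOT taken from the GSS.
responses = [1, 1, 2, 2, 2, 3, 1, None, 2, 4]


def frequency_table(values):
    """Return one row per code with frequency, percent, and valid percent."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows = []
    for code in sorted(counts):
        n = counts[code]
        rows.append({
            "code": code,
            "frequency": n,
            # Percent uses every case, including missing ones.
            "percent": round(100 * n / len(values), 1),
            # Valid percent uses only the non-missing cases.
            "valid_percent": round(100 * n / len(valid), 1),
        })
    return rows


table = frequency_table(responses)
for row in table:
    print(row)
```

When a percentage column does not sum to 100%, it is worth checking which of the two bases was used.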
2. Provide the proper graph (histogram, bar chart, or scatter plot) for each of the following GSS 1980 variables. Submit the output, identify which graph you should use (Word document), and describe what is displayed in the graph.
A. AGE
The most appropriate graph to display the distribution of age in the GSS 1980 data set is a histogram. A histogram is a bar graph that shows the distribution of a continuous variable by dividing the variable into a number of intervals and then counting the number of observations in each interval.
The following histogram shows the distribution of age in the GSS 1980 data set:
The histogram shows that the most common age group in the GSS 1980 data set is 30-39 years old. Many respondents are also 20-29 or 40-49 years old. Fewer respondents are 50-59, 60-69, or 70-79 years old, and very few are 80 years old or older.
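The binning step behind a histogram can be sketched in a few lines of Python. The ages below are hypothetical and are not GSS values, and `bin_ages` is an illustrative helper name. The sketch assumes 10-year intervals like those used in the description above.

```python
from collections import Counter

# Hypothetical ages for illustration only (not GSS data).
ages = [18, 23, 27, 31, 34, 35, 38, 42, 47, 55, 63, 71, 89]


def bin_ages(values, width=10):
    """Count observations per interval; keys are each interval's lower bound."""
    counts = Counter((v // width) * width for v in values)
    return {start: counts[start] for start in sorted(counts)}


bins = bin_ages(ages)
for start, n in bins.items():
    # One '#' per observation gives a rough text histogram.
    print(f"{start}-{start + 9}: {'#' * n}")
```

Changing `width` changes the shape of the picture, which is why the bin width should be reported alongside a histogram.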
B. RACE
The most appropriate graph to display the distribution of race in the GSS 1980 data set is a bar chart. A bar chart is a graph that shows the distribution of a categorical variable by displaying the number of observations in each category as a bar.
The following bar chart shows the distribution of race in the GSS 1980 data set:
The bar chart shows that the most common race in the GSS 1980 data set is white, followed by black. The GSS RACE variable has only three categories (white, black, and other), so Hispanic, Asian, and American Indian respondents do not appear as separate bars.
C. INCOME
Of the three graph types listed in the question, the most appropriate one for the distribution of income in the GSS 1980 data set is a histogram. A histogram does not require the variable to be normally distributed. Income is typically right-skewed, and the histogram shows that skew directly. A boxplot is a useful complement because it summarizes the skewed distribution compactly.
A boxplot is a graph that shows the distribution of a continuous variable by displaying the minimum, first quartile, median, third quartile, and maximum values of the variable.
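The five values a boxplot displays can be computed directly. This is a minimal sketch using Python's `statistics.quantiles`. The incomes are hypothetical and not GSS data, and `five_number_summary` is an illustrative name. Quartile conventions differ between packages, so SPSS may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical incomes in dollars, for illustration only (not GSS data).
incomes = [8000, 12000, 15000, 18000, 22000, 25000, 30000, 41000, 60000]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    data = sorted(values)
    # method="inclusive" treats the data as the whole population of interest;
    # other quartile definitions exist and can give different cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])


summary = five_number_summary(incomes)
print(summary)
```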
The following boxplot shows the distribution of income in the GSS 1980 data set:
The boxplot shows that the median income in the GSS 1980 data set is $25,000. There are a significant number of respondents who have incomes below $10,000 and a significant number of respondents who have incomes above $50,000.
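The graph choices in this question follow a common rule of thumb keyed to each variable's measurement level. A minimal Python sketch of that rule follows. The function name `choose_graph` and the level labels are hypothetical, and the rule covers only the three graph types named in the question.

```python
def choose_graph(levels):
    """Suggest a graph type from the measurement levels of the plotted variables.

    `levels` has one entry per variable, each either "categorical" or
    "continuous". This is a rule of thumb, not an SPSS feature.
    """
    if levels == ["categorical"]:
        return "bar chart"
    if levels == ["continuous"]:
        return "histogram"
    if levels == ["continuous", "continuous"]:
        return "scatter plot"
    raise ValueError(f"no rule of thumb for {levels!r}")


print(choose_graph(["continuous"]))   # e.g. AGE treated as continuous
print(choose_graph(["categorical"]))  # e.g. RACE treated as categorical
```

A scatter plot needs two continuous variables, so it does not apply when each variable is graphed on its own.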
3. Using the States10 data set, present descriptive statistics for the following variables. I have not specified which descriptive statistics you should report so that I can assess students on this knowledge. Report only the measures (one measure of central tendency and one measure of spread) that best represent the data:
A. DMS429 (Percent of Households Headed by Married Couples, 2008)
The most appropriate measure of central tendency for `DMS4