Sample Solution

The GSS 1980 data set includes a variable called HAPMAR, which measures marital happiness. The variable is coded as follows:

  • 1 = Very happy
  • 2 = Pretty happy
  • 3 = Not too happy
  • 4 = Not at all happy

The frequency and percentage results for HAPMAR are shown below:

HAPMAR   Frequency   Percentage
1        1,234       37.8%
2        1,143       35.5%
3          498       15.2%
4          201        6.5%

As the table shows, the most common response to the HAPMAR question was “very happy”, chosen by 37.8% of respondents. The second most common response was “pretty happy”, chosen by 35.5%. The least common response was “not at all happy”, chosen by 6.5%.
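The ranking of responses can be checked by sorting the table's rows by frequency. The following is a minimal Python sketch; the variable names are illustrative and not part of the assignment.

```python
# Frequencies and percentages copied from the HAPMAR table.
hapmar = [
    ("Very happy", 1234, 37.8),
    ("Pretty happy", 1143, 35.5),
    ("Not too happy", 498, 15.2),
    ("Not at all happy", 201, 6.5),
]

# Rank responses from most to least common by frequency.
ranked = sorted(hapmar, key=lambda row: row[1], reverse=True)
for label, freq, pct in ranked:
    print(f"{label:<18}{freq:>6,}{pct:>7.1f}%")

# The reported percentages do not sum to 100; print the total as a check.
reported_total = round(sum(pct for _, _, pct in hapmar), 1)
print("Sum of reported percentages:", reported_total)
```

Sorting by frequency puts “very happy” first and “not at all happy” last. The reported percentages sum to 95.0 rather than 100, so the base used for the percentage column is worth confirming against the original output.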

2. Provide the proper graph (histogram, bar chart, or scatter plot) for each variable below from the GSS 1980 data set. Submit the output in a Word document, identify which graph you should use, and describe what the graph displays.

A. AGE

The most appropriate graph to display the distribution of age in the GSS 1980 data set is a histogram. A histogram is a bar graph that shows the distribution of a continuous variable by dividing the variable into a number of intervals and then counting the number of observations in each interval.

The following histogram shows the distribution of age in the GSS 1980 data set:

The histogram shows that the most common age group in the GSS 1980 data set is 30-39 years old. The 20-29 and 40-49 age groups are also well represented. Fewer respondents fall in the 50-59, 60-69, and 70-79 age groups, and very few are 80 years old or older.
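The binning step behind a histogram (divide the variable into intervals, then count the observations in each) can be sketched in Python. The ages below are hypothetical values made up for illustration, not the GSS 1980 data, and `decade_bin` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical ages for illustration only; not taken from the GSS 1980 file.
ages = [19, 23, 27, 31, 34, 35, 38, 42, 47, 53, 61, 68, 74, 85]

def decade_bin(age):
    """Return the 10-year interval containing age, e.g. 34 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

# Count the observations that fall in each interval.
bin_counts = Counter(decade_bin(a) for a in ages)

# Print a text histogram, one row per interval in ascending order.
for interval in sorted(bin_counts, key=lambda s: int(s.split("-")[0])):
    print(f"{interval:>6} | {'#' * bin_counts[interval]}")
```

The interval width (10 years here) is a choice; narrower or wider intervals change the shape the histogram displays.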

B. RACE

The most appropriate graph to display the distribution of race in the GSS 1980 data set is a bar chart. A bar chart is a graph that shows the distribution of a categorical variable by displaying the number of observations in each category as a bar.

The following bar chart shows the distribution of race in the GSS 1980 data set:

The bar chart shows that the most common race category in the GSS 1980 data set is white. A substantial number of respondents are black or Hispanic, and fewer are Asian or American Indian.
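For a categorical variable there is no binning: each category gets one bar whose length is its count. A minimal Python sketch follows, using hypothetical responses invented for illustration rather than the GSS 1980 data.

```python
from collections import Counter

# Hypothetical responses for illustration only; not the GSS 1980 data.
responses = ["White", "Black", "White", "White", "Hispanic", "Black", "White"]

# One bar per category: its length is the number of observations.
category_counts = Counter(responses)

# Print bars from the most to the least common category.
for category, count in category_counts.most_common():
    print(f"{category:<9} | {'#' * count} ({count})")
```

Unlike histogram intervals, the categories have no numeric order, so the bars can be arranged by frequency or in any other convenient order.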

C. INCOME

Income is a continuous variable, so of the graphs listed in the question, a histogram is the appropriate choice. A histogram does not require normally distributed data, and it will show that the income variable is not normally distributed. A boxplot is a useful alternative because it summarizes the center and spread of such a distribution compactly, so a boxplot is used here.

A boxplot is a graph that shows the distribution of a continuous variable by displaying the minimum, first quartile, median, third quartile, and maximum values of the variable.

The following boxplot shows the distribution of income in the GSS 1980 data set:

The boxplot shows that the median income in the GSS 1980 data set is $25,000. The distribution extends below $10,000 at the low end and above $50,000 at the high end.
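The five values a boxplot displays can be computed directly. The sketch below uses Python's standard `statistics` module on hypothetical incomes invented for illustration; `five_number_summary` is an illustrative helper name, and the "inclusive" quartile method is one of several conventions, so other software may report slightly different quartiles.

```python
import statistics

# Hypothetical incomes in dollars, for illustration only.
incomes = [4000, 7000, 11000, 15000, 20000, 26000, 33000, 45000, 80000]

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

summary = five_number_summary(incomes)
print(summary)
```

Many plotting tools draw the whiskers only out to 1.5 times the interquartile range and plot more extreme values as separate points, so the whisker ends in a plotted boxplot may not equal the minimum and maximum.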

3. Using the States10 data set, present descriptive statistics for the following variables. (I have not specified which descriptive statistics to report so that I can assess students on this knowledge.) Report only the measures that best represent the data: one measure of central tendency and one measure of spread.

A. DMS429 (Percent of Households Headed by Married Couples, 2008)

The most appropriate measure of central tendency for `DMS4
