(solution) Benybeny, I see that you are well versed in SPSS and statistics.

Benybeny,

I see that you are well versed in SPSS and statistics. I need help with a question similar to one you have answered before, but I am focusing solely on the age variable.

Take a random sample of 100.

Calculate the 95% confidence interval for the variable.

Calculate a 90% confidence interval.

Take another random sample of 400.

Calculate the 95% confidence interval for the variable.

Calculate a 90% confidence interval.

Explain how different levels of confidence and sample size affect the width of the confidence interval. Also consider the following statement: "Confidence intervals are underutilized." Explain the implications of using or not using confidence intervals. Examples would be good as well.
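As a rough illustration of the task above, here is a minimal Python sketch that draws samples of 100 and 400 and computes 95% and 90% confidence intervals for the mean. Everything in it is an assumption made for illustration: the "population" is synthetic (uniform ages 18 to 89, not the real GSS age variable), the helper name `mean_ci` is hypothetical, and it uses normal (z) critical values, whereas SPSS reports t-based intervals, which are slightly wider at n = 100.

```python
import random
import statistics
from statistics import NormalDist


def mean_ci(sample, confidence):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(sample) / n ** 0.5
    # Two-sided critical value, e.g. about 1.96 for 95% and 1.645 for 90%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * se, mean + z * se


# Synthetic stand-in for the age variable (NOT real GSS data).
rng = random.Random(0)
population = [rng.randint(18, 89) for _ in range(40000)]

widths = {}
for n in (100, 400):
    sample = rng.sample(population, n)  # simple random sample without replacement
    for confidence in (0.95, 0.90):
        low, high = mean_ci(sample, confidence)
        widths[(n, confidence)] = high - low
        print(f"n={n}  {confidence:.0%} CI: ({low:.2f}, {high:.2f})  width={high - low:.2f}")
```

Two patterns should show up in the printed widths. For the same sample, the 95% interval is wider than the 90% interval because the critical value is larger. For the same confidence level, the interval from n = 400 is roughly half as wide as the one from n = 100, because the standard error shrinks with the square root of the sample size.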

DOCUMENT
THIS SYSTEM FILE CONTAINS DATA FOR THE 1972-2004
GENERAL SOCIAL SURVEYS, CONDUCTED BY THE NATIONAL OPINION RESEARCH
CENTER.
*** IT IS STRONGLY RECOMMENDED THAT USERS OF THE ***
*** FILE CONSULT THE CUMULATIVE G.S.S. 1972-2004 ***
*** CODEBOOK.
***
THE LABELS IN THIS FILE ARE BELIEVED TO BE ACCURATE,
BUT ARE NO SUBSTITUTE FOR THE QUESTION WORDING AND
RESPONSE CATEGORIES AS PRESENTED IN THE CODEBOOK.
USERS SHOULD PAY PARTICULAR ATTENTION TO THE
TREATMENT OF MISSING DATA. THE 'MISSING VALUES'
STATEMENTS ARE
INCLUDED IN THIS FILE-CREATION PACKAGE SIMPLY AS A
CONVENIENCE. FOR THE MOST PART, THE CATEGORIES
'DONT KNOW', 'NO ANSWER', AND 'NOT APPLICABLE' HAVE
BEEN DECLARED MISSING.
*** USERS ARE THEREFORE ENCOURAGED TO RE-DECLARE MISSING
*** VALUES WHEN ANALYZING THE DATA.
***
FOR CERTAIN VARIABLES, BLANK AND ZERO ARE BOTH VALID
CODES. TO AVOID CONFUSION, BLANKS IN THE RAW DATA
HAVE BEEN RECODED TO NEGATIVE ONE (-1) FOR THESE
VARIABLES ONLY. THE RECODE IS DONE ONLY WHERE BOTH
BLANK AND ZERO ARE VALID CODES. THEREFORE, WE
RECOMMEND THAT USERS MARK IN THEIR CODEBOOKS THE
VARIABLES FOR
WHICH THIS RECODE HAS BEEN MADE:
document Length of Interview in 2004 (LNGTHINV)
This is based on the CAPI time stamps from the start of the main
interview (excluding the HEF) to the end of the CDC risk behavior
section. Initially, we had also included the interviewer remarks and
validation section in the total time, but had to exclude this section
when we found that in a high number of cases interviewers had not
finished this section and turned off the computer immediately after
the interview, but left it on and finished their remarks at a later
time. This greatly exaggerated the length of this final section and
thus the total supposed interview time.
There were two cases with no times recorded and we recoded two more
cases to missing. One shows a total time of 10 seconds. Another has a
total time of 28786 seconds, but with its length getting shorter as
sections were completed.
That still left a number of cases in either tail of the distribution
that were highly improbable. About 1% of cases were under 30 minutes
and 1% over three hours.
We examined the section-by-section times for the long cases and
inspected the record of calls for a sample of these cases. The
section-by-section timings often indicated extra-long timing for a
single section. This may indicate an extended interruption occurring
then. However, many of these extra long times were so long (1-2
hours) that they may not represent mere interruptions, but other
occurrences such as errors in the CAPI timings or having the
interview conducted in two sessions. Unfortunately for most cases the
record of calls had no useful information to help explain the
unusually long times. However, among the extra long cases for which
there was some useful information, many were done in two or more
sessions on different days and it is likely that the CAPI program did
not correctly combine these different times. Also, the record of
calls did partially validate one interview of almost 4 hours. The
interviewer noted that it was done in between the respondent waiting on
customers.
We likewise looked at the extremely short cases. There was little
information that we could find to explain or validate these
lengths. Some may be valid since they did overrepresent shorter
ballots and respondents (e.g. those not in the labor force) who
skipped major sections. However, some also involved cases completed
over two or more different dates and we believe that they may reflect
only partial times from one of the sessions.
Overall, we believe that the extremely short and long cases include a
high, but not precisely known, number of errant times. In addition,
there are undoubtedly some timing errors among cases in the less
extreme range.
Looking at reported times (excluding the four missing cases) shows a
mean length of 87.3 minutes and a median of 83 minutes. If low times
are recoded to a minimum of 30 minutes and high times to a maximum of 180
minutes, the mean length is 86.9 and the median still 83 minutes.
(Entered 09 JUL 07)
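The recode described in the note above (low times raised to 30 minutes, high times capped at 180) can be sketched as follows. This is only an illustration: the helper name `clip_minutes` and the list of interview lengths are hypothetical, not GSS data.

```python
import statistics


def clip_minutes(minutes, low=30, high=180):
    """Recode values below `low` up to `low` and values above `high` down to `high`."""
    return [min(max(m, low), high) for m in minutes]


# Hypothetical interview lengths in minutes (illustrative only, not GSS data).
lengths = [12, 45, 70, 80, 95, 110, 240]
clipped = clip_minutes(lengths)

print("raw:     mean", round(statistics.fmean(lengths), 1), "median", statistics.median(lengths))
print("clipped: mean", round(statistics.fmean(clipped), 1), "median", statistics.median(clipped))
```

Clipping only changes values in the tails, so the median stays put while the mean moves toward it, which matches the pattern the note reports.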
document Since CAPI 2002 (Computer Assisted Personal Interview)
provides various information on the length of
interview time, there are three elapsed interview time
variables newly added to the remp7202p.sys.
The original data set of GSS2002 has three interview time
variables. First, it provides a cumulative
elapsed time for Section A to I. Second, it also gives
an elapsed time for SAQ only. Lastly, it contains a total elapsed time variable for the entire interview. In
order to utilize all information in the original data, we
created three interview time variables as follows:
LNGTHEND: TOTAL ELASPED MINUTES OF INTERVIEW for 2002
LNGTHCUM: CUMULATIVE ELASPED MINUTES FOR SECTIONS A-I for 2002
LNGTHALL: SUMMED ELASPED MINUTES OF SECTION A-I AND SAQ for 2002
(Entered 09 JUL 07)
