Comparing Two Proportions

Comparing Two Proportions

In the Module Overview, you’ll have noticed the textbook assignment for this module.
Videos are optional and are there for your reference. Watch the videos if you so choose, then complete the practice problems from your textbook.
You’ll notice I’ve only assigned odd-numbered exercises. This is because I want you to be able to check your work as you go along. Please use good judgment and academic integrity when you complete the assignments; don’t merely copy or paraphrase the answers (that will earn you a 0) but use them to guide your answers. I recommend you complete the assignment, then check the answers, and then make corrections in a different colored pencil or font. This way you really have the opportunity to learn the nuances (and there are nuances) in each problem. Please
make sure to show all your work, step by step, if applicable.
If you have any questions, especially concerning my definition of copying or paraphrasing, please feel free to email me.
Feel free to do the problems by hand and then scan them to upload. I really like Genius Scan. You can also take a picture of your work instead, but I ask you make sure everything is legible before you upload the picture. You may feel more comfortable typing your answers, which is fine; just be sure that ALL STATISTICAL DISPLAYS are included (no matter the format you choose).
Complete bookwork – pg 618; #21*, 23, 27, 29*, 33, 35*, 37*

Measures of Central Tendency Paper

Measures of Central Tendency Paper

The mean salary is often used to describe the salaries of employees of a company. However, the median salary may be a better measure than the mean. Research a career you are interested in and calculate the mean and median salaries using at least ten data points. Include the calculations and the data source(s). Which is the better measure of central tendency? Why? Review and respond to the comments posted by your peers and offer your insight on this topic. Do you agree or disagree with their selection? Why or why not?
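If it helps to see the arithmetic, here is a minimal sketch using Python's standard-library statistics module; the salary figures are made up for illustration and are not real data for any career:

```python
import statistics

# Hypothetical salaries (ten data points) for an illustrative career -- not real data.
salaries = [48000, 52000, 55000, 57000, 60000, 62000, 65000, 70000, 85000, 250000]

print(statistics.mean(salaries))    # pulled upward by the single very high salary
print(statistics.median(salaries))  # middle of the ordered data; resistant to outliers
```

With one very large salary in the list, the mean sits well above the median, which is the usual argument for preferring the median with skewed salary data.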

Check out this simple, easy-to-use resource for making box plots: http://www.shodor.org/interactivate/activities/BoxPlot/
This is a good place for an example: http://home.ubalt.edu/ntsbarsh/Business-stat/otherapplets/Descriptive.htm
If you have an even number of data points (ten, say), then you average the two middle elements (the 5th and 6th, when placed in numerical order) to obtain the median.
The median divides the original data set into equal-sized halves; repeat the procedure with these new, smaller data sets to find the first and third quartiles.

Pay attention to whether these new smaller sets have an even or odd number of elements!
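As a sketch of that even/odd bookkeeping (Python, with made-up numbers):

```python
# Sketch of the median/quartile procedure described above for an even-sized data set.
data = sorted([48, 52, 55, 57, 60, 62, 65, 70, 85, 250])

mid = len(data) // 2
median = (data[mid - 1] + data[mid]) / 2        # average the 5th and 6th values here
lower, upper = data[:mid], data[mid:]           # split into two equal-sized halves

def middle(half):
    m = len(half) // 2
    # odd-sized half: take the middle value; even-sized half: average the two middle values
    return half[m] if len(half) % 2 else (half[m - 1] + half[m]) / 2

q1, q3 = middle(lower), middle(upper)
print(median, q1, q3)
```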
This post seems to focus on the mean and median, but there is another measure of central tendency: the mode.
In what situations, and for what kinds of variables, would the mode be the best choice as the measure of central tendency?
This side question has nothing to do with salaries, since salaries are continuous quantitative variables.
(Needs to be at least 150 words)

Tools for Data Analysis

Tools for Data Analysis

To complete the Assignment, compose a cohesive document that addresses the following:
Create a table outlining practical applications for each tool discussed in “The Seven Quality Tools” (Stauffer, 2013). Include the following within your table:
Strengths: Why that tool works well for those applications
Tips for use
Cautions relevant to the tool
Analyze the effectiveness of each tool listed within your table. In your analysis, address the following:
For each tool listed, choose an online example, or an example from your experience, in which the tool was used properly. Provide a link to your example, along with a brief description of how the tool was used within the organization and your analysis of its effectiveness, or whether there was a better tool and why.

Types of Reliability and Validity

Types of Reliability and Validity

Investigate an individual, standardized cognitive or academic assessment like the WISC, WJ, KTEA or WIAT and discuss the concepts listed below that you are able to find in the technical manual of the assessment:

  • Test-Retest Reliability
  • Interrater Reliability
  • Internal Consistency
  • Confidence Intervals
  • Standard Error of Measurement
  • Face Validity
  • Construct Validity
  • Criterion-Related Validity
  • Content Validity
  • External Validity

Index Construction and Use SPSS

Homework 5: Index Construction & Use Comments
This assignment continues a series of labs and homeworks in which you utilize statistical skills for basic
research. For this assignment, you will again manipulate variables and construct a basic index, as you have
done in several earlier assignments. However, for this assignment, you will take an additional step, using the
index that you create for simple bivariate descriptions of the sample. The index will be the “dependent variable”:
Specifically, you will use ordinal measures to compare support for possible explanations of variation in the index.
Instructions
You will be using the data file hw5.sav to examine variation in respondents’ satisfaction with four areas of their
lives (family, friends, finance, and job). You will then create a summary measure of overall satisfaction, and will
explore how (and whether) that summary measure varies in two ways: across educational levels and with
frequency of sexual activity. Finally, you will briefly explore interactions among these possible influences on
satisfaction. (Note that most of the recoding has been done for you – this is not always the case.)
Requirements & Questions
You must submit your output file (complete but cleaned) and typed answers to these questions. Typed. Probably
with a computer, maybe with some other device, possibly a typewriter. But not a pen, pencil, or crayon. Typed.
1. Univariate analyses of component and independent variables:
• Perform a univariate analysis of SATFAM, SATFIN, SATJOB, and SATFRND – For each, you should
look at and briefly summarize the frequency distribution, as well as basic summary statistics for central
tendency and dispersion. Go beyond just reporting the data and say something interesting (here and
below). For example, about which issues are the respondents the most/least happy?
• Look briefly at the distributions of EDUC and SEXFREQ. (Note, in particular, the percent of the sample
who refused to answer or otherwise did not have an answer for SEXFREQ.)
2. Construct and assess index:
• Construct an index (including variable labels and value labels, at least for the extremes), called
HAPPY, as the summation of values for the four components listed above.
• Perform a univariate analysis of HAPPY – look at and briefly summarize the frequency distribution, as
well as basic summary statistics for central tendency and dispersion.
• What is this variable conceptually? What does it measure, and what does it mean? What does it tell us
that the individual components do not?
• Interpret the “alpha” for your index – is the index reliable? Is it a good one? Why or why not? (A sketch of the index, alpha, and chi-square computations, outside SPSS, appears after these questions.)
3. Bivariate analyses – what makes people happy?
• Using correlations and chi-square, what can you say about the relationship between educational
attainment and overall satisfaction (i.e. between HAPPY and EDUC)? (You will need to request a
crosstab to get chi-square, but ignore the table itself, for now.) Is it strong? Statistically significant?
• Using correlations and chi-square, what can you say about the relationship between frequency of
sexual activity and overall satisfaction (i.e. between HAPPY and SEXFREQ)? (You will need to request
a crosstab to get chi-square, but ignore the table itself, for now.) Is it strong? Statistically significant?
4. Discussion/conclusions
• What can you infer from these findings about what makes people happy? (Hint: Did either of the two
independent variables (EDUC and SEXFREQ) have a statistically significant effect on the dependent
variable (HAPPY)?)
• Bonus: Put that at a conceptual level, thinking about what broader concepts these variables might
operationalize. Of what larger concept might education be a specific instance, indicator, or aspect?
What about sexual frequency?
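For reference only, here is a rough sketch of the index, Cronbach's alpha, and chi-square steps. Python with pandas and scipy is an assumption; the assignment itself expects SPSS output, and the tiny data frame below is made up, not hw5.sav:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Made-up stand-in for a few hw5.sav cases (satisfaction items scored 1-7, EDUC in years).
df = pd.DataFrame({
    "SATFAM":  [5, 6, 4, 7, 3, 6, 5, 4],
    "SATFIN":  [4, 5, 3, 6, 2, 5, 4, 3],
    "SATJOB":  [5, 6, 4, 7, 3, 6, 4, 4],
    "SATFRND": [6, 6, 5, 7, 4, 6, 5, 5],
    "EDUC":    [12, 16, 12, 18, 10, 16, 14, 12],
})

items = df[["SATFAM", "SATFIN", "SATJOB", "SATFRND"]]
df["HAPPY"] = items.sum(axis=1)   # summative index of the four components

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the summed index)
k = items.shape[1]
alpha = k / (k - 1) * (1 - items.var(ddof=1).sum() / df["HAPPY"].var(ddof=1))
print(round(alpha, 3))

# Chi-square from a crosstab of HAPPY by EDUC (the table itself can be ignored for now)
chi2, p, dof, _ = chi2_contingency(pd.crosstab(df["HAPPY"], df["EDUC"]))
print(chi2, p)
```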

Merger and acquisition engagement of environmental innovators in the automotive industry Software: STATA

Master's level. Use of STATA, Orbis, and Zephyr; the paper has to contain patent data and merger and acquisition data, so please use all three databases. The paper also requires statistical models, such as a regression. Length: 4,000 words, 15 pages, Times New Roman 12. I also added my slides and example studies. The assessment form is also in there, which is very important. Please read them to get a picture of the required level.

Marketing Research – Quantitative Data Analysis

Topic: Marketing Research – Quantitative Data Analysis

A series of (5) separate Homework Assignments requiring the following:
  • The correct input of data into the SPSS software, with evidence of this process in the form of output in a PDF download.
  • A summary of your statistical output, organized and presented in easily understandable table(s) on one page, presenting the information asked for in the objectives above and highlighting what you think your clients need to know in the most easily readable manner.

Two-Sample t-Test (Student's t-test)

Two-Sample t-Test (Student's t-test)

Pick any set of data. Conduct a two-sample T-test.
Explain in the discussion question:
Your source of data
Your null hypothesis
Whether or not you reject the null hypothesis (and include the P value)
How this information might be relevant to a decision maker.
Attach the Excel file containing the data source (but be sure everything we need to know about
your executive summary is in the body of the discussion forum, not the attachment).
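A minimal sketch of the mechanics, assuming Python/scipy rather than Excel and using made-up numbers:

```python
from scipy import stats

# Two made-up samples standing in for whatever data set you pick (illustration only).
group_a = [12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 13.4, 12.0]
group_b = [13.2, 13.8, 12.9, 14.1, 13.5, 13.0, 14.4, 13.7]

# H0: the two population means are equal. Welch's version avoids the equal-variance assumption.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
print(t_stat, p_value)   # reject H0 if p_value < 0.05 (or your chosen alpha)
```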

COVID19 Correlation Analysis

COVID19 Correlation Analysis 

I will upload the file with the information here.
The questions below should be answered.
I will also provide links to help answer the questions.
https://www1.nyc.gov/assets/doh/downloads/pdf/imm/covid-19-cases-by-zip-04152020-1.pdf
https://state.1keydata.com/state-population-density.php
the questions are:
1. See the graphs below, with log-transformed population density on the X-axis and log-transformed CDR on the Y-axis with a least squares line fit to the data.
Is the correlation positive or negative?

2. As the X-axis (population density) increases, what happens to the Y-axis (Crude Death Rate)?
3. What does the 0.449 mean? This is a log-log regression equation, so it has an easy
interpretation: for every 1% increase in population density, there is approximately a 0.449% increase in the Crude
Death Rate for COVID19. (A sketch of the log-log fit appears after these questions.)
4. Given what we just figured out, which states should emphasize social distancing the most?
(Hint: look at the graph with state name labels).
5. We have tested only one variable…what other variables do you think we should test?
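A rough sketch of the log-log fit behind questions 1-3, assuming Python/numpy and made-up density/CDR values; the real figures come from the two links above:

```python
import numpy as np

# Made-up state-level values standing in for population density and crude death rate (CDR).
density = np.array([10.0, 40.0, 100.0, 250.0, 600.0, 1200.0])
cdr = np.array([0.5, 1.1, 2.0, 3.5, 6.0, 9.5])

x, y = np.log(density), np.log(cdr)        # log-log transformation, as in the graphs
r = np.corrcoef(x, y)[0, 1]                # sign of r answers question 1
slope, intercept = np.polyfit(x, y, 1)     # slope plays the role of the 0.449 in question 3
print(r, slope)   # a slope of 0.449 reads as: a 1% rise in density goes with ~0.449% rise in CDR
```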

Statistics Assignment

Statistics Assignment

Data:
For this assignment, please download the Homework data file.
Requirements:
1. Create a line chart and identify the time series components in the time series. Then, compute
the correlation between the time series variable and time using either the =CORREL() function or
the correlation tool in the Data Analysis ToolPak. Justify your answer. (Hint: it should be some
combination of average or base, cycle, trend, and random variation.)
2. Create as many forecasts as possible on the historical data using each of the methods below.
a. 3-period moving average.
c. Exponential smoothing forecast with alpha = 0.8.
d. Trend forecast (whether or not there is a trend). Use the =TREND() function in Excel.

IMPORTANT NOTES: When computing your moving average forecasts, do not use the Moving
Average Data Analysis tool. This tool will not give you a valid forecast because it uses the
current period in the computation. Instead, use the AVERAGE() function. Also, for both the ES
and the MA forecasts, do not include the period you are forecasting in the history you are using
to compute the forecast.
3. Starting with the fourth period, compute the MAE for each forecasting model, and choose the
best model based on this analysis.
4. Using the best model, make a new forecast for the next period.
Deliverables
Please place all of your analysis on a single spreadsheet. Clearly label your answers. When you
have completed the assignment, post your Excel file on the HW 4 assignment dropbox.
Hints: In this assignment, you are using your entire history to build good models and to test the
forecasting skill of the models. Once you have computed the forecasts and calculated the MAEs
for each model, you will choose the most accurate model on historical data to make a future
forecast. The moving average forecasts will begin at period 4, the ES forecasts will begin at
period 2 (with the naïve starting value), and the trend forecasts will begin at period 1. For
consistency, you should compute the MAEs for periods 3-98 for each. The first draft is due by
Wednesday.
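For orientation only, the same computations sketched in Python with a short made-up series; the assignment itself should be done in Excel with the AVERAGE() and TREND() functions as described above:

```python
import numpy as np
import pandas as pd

# Short made-up series standing in for the "Total Instances of Fraud" column (illustration only).
y = pd.Series([428, 314, 429, 474, 443, 462, 361, 458, 410, 595], dtype=float)

# 3-period moving-average forecast: average of the three *previous* periods,
# so the forecast for period t never uses the value observed at t.
ma3 = y.shift(1).rolling(3).mean()

# Exponential smoothing with alpha = 0.8, seeded with the naive forecast (the period-1 value).
alpha = 0.8
es = np.full(len(y), np.nan)
es[1] = y[0]                      # naive starting forecast for period 2
for t in range(2, len(y)):
    es[t] = alpha * y[t - 1] + (1 - alpha) * es[t - 1]

# Trend forecast: fit value = b0 + b1 * period on the history (like Excel's =TREND()).
period = np.arange(1, len(y) + 1)
b1, b0 = np.polyfit(period, y, 1)
trend = b0 + b1 * period

# MAE from the 4th period onward, where all three models have forecasts.
def mae(forecast):
    f = np.asarray(forecast, dtype=float)
    return np.mean(np.abs(y.values[3:] - f[3:]))

print({"MA(3)": mae(ma3), "ES(0.8)": mae(es), "Trend": mae(trend)})
```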

Date Total Instances of Fraud
10/29/2019 428
10/30/2019 314
10/31/2019 429
11/1/2019 474
11/2/2019 443
11/3/2019 462
11/4/2019 361
11/5/2019 458
11/6/2019 410
11/7/2019 595
11/8/2019 396
11/9/2019 511
11/10/2019 508
11/11/2019 447
11/12/2019 463
11/13/2019 321
11/14/2019 628
11/15/2019 340
11/16/2019 363
11/17/2019 438
11/18/2019 369
11/19/2019 430
11/20/2019 338
11/21/2019 637
11/22/2019 352
11/23/2019 468
11/24/2019 366
11/25/2019 440
11/26/2019 343
11/27/2019 504
11/28/2019 657
11/29/2019 514
11/30/2019 343
12/1/2019 458
12/2/2019 484
12/3/2019 428
12/4/2019 456
12/5/2019 609
12/6/2019 493
12/7/2019 477
12/8/2019 442
12/9/2019 457
12/10/2019 369
12/11/2019 459
12/12/2019 674
12/13/2019 378
12/14/2019 394
12/15/2019 408
12/16/2019 385
12/17/2019 511
12/18/2019 353
12/19/2019 619
12/20/2019 480
12/21/2019 529
12/22/2019 509
12/23/2019 388
12/24/2019 359
12/25/2019 430
12/26/2019 610
12/27/2019 439
12/28/2019 480
12/29/2019 378
12/30/2019 446
12/31/2019 438
1/1/2020 484
1/2/2020 625
1/3/2020 446
1/4/2020 533
1/5/2020 413
1/6/2020 469
1/7/2020 534
1/8/2020 516
1/9/2020 577
1/10/2020 493
1/11/2020 525
1/12/2020 397
1/13/2020 533
1/14/2020 420
1/15/2020 426
1/16/2020 569
1/17/2020 417
1/18/2020 453
1/19/2020 427
1/20/2020 458
1/21/2020 455
1/22/2020 559
1/23/2020 652
1/24/2020 414
1/25/2020 426
1/26/2020 426
1/27/2020 582
1/28/2020 471
1/29/2020 569
1/30/2020 631
1/31/2020 484
2/1/2020 549
2/2/2020 408

Comparing Global Values and Attitudes

PROJECT 3: Comparing Global Values and Attitudes

An independent-samples hypothesis test helps us determine if two groups (for example, cats and dogs) substantively differ with respect to a social value as measured by an interval-ratio variable (for example, feelings about lasagna measured on a scale from 1 to 10). For this project, you will be asked to prepare a report that tells us how two groups (for example, the US and Spain) differ with respect to one social value variable related to a UN SDG of your choice. Stated differently, you’ll be comparing local and global data and relating it back to the Sustainable Development Goals set forth by the United Nations.

Statistical Analysis of Data Using MINITAB

Statistical Analysis of Data Using MINITAB
Deadline: 5pm Monday 16th October 2017

Introduction and dataset
The aim of this coursework is to investigate and predict the onset of diabetes based on
various diagnostic measurements.

The dataset was originally compiled by researchers at the Johns Hopkins University
School of Medicine, from a larger database owned by the National Institute of Diabetes
and Digestive and Kidney Diseases. All patients were females at least 21 years old of
Pima Indian heritage. Note that Pima Indians have one of the highest rates of diabetes
in the world.

This dataset includes 392 observations, taken at the individual level and available from
diabetes_dataset.xlsx file in Statistical Data Analysis Coursework folder on NOW.
The key indicator of diabetes (response variable), as defined by the World Health
Organization, is a plasma glucose concentration greater than 200 mg/dl two hours
following ingestion of a 75 gm carbohydrate solution (variable Glucose).

The  explanatory variables (or predictors) are known risk factors for diabetes: number of
pregnancies, diastolic blood pressure, triceps skinfold thickness (an indicator of
bodyfat), 2 hour serum insulin, body mass index, age, and diabetes pedigree function
(see Table).

Table. Measurements recorded in the dataset.
Measurement/variable        Description
Glucose                     plasma glucose concentration 2 hours into an oral glucose tolerance test
Pregnancies                 number of times pregnant
BloodPressure               diastolic blood pressure (mm Hg)
SkinThickness               triceps skin fold thickness (mm)
Insulin                     2-hour serum insulin (mu U/ml)
BMI                         body mass index (weight in kg / (height in m)^2)
DiabetesPedigreeFunction    diabetes pedigree function*
Age                         age (years)
Outcome                     class variable (0 or 1)**
* a synthesis of diabetes history in an individual’s relatives
** negative (0) / positive (1) diabetes test
Creating your unique dataset
Copy the data from this file into MINITAB so that Glucose is recorded in column C1,
Pregnancies in C2, etc.

(1) Generate two random numbers between 2 and 7 and provide MINITAB output.
(1 mark)

(2) Using MINITAB, erase columns corresponding to your generated numbers (e.g. if
one of the generated numbers is 5 then erase column C5, etc). Describe how you did
this and provide the sequence of actions (e.g. Calc->Descriptive Stats->….)
(2 marks)

(3) Using MINITAB select a random sample of 300 observations (n = 300) from your
dataset. Provide the sequence of actions of how you did this.
(1 mark)
Your unique dataset will now consist of 300 rows and seven columns including
Glucose, Age and Outcome.
Investigating your unique dataset

(4) For your unique dataset summarise information about your observations and present
graphically the frequency distributions for all variables that are left in your unique
dataset, including Glucose but excluding the Outcome variable. Comment on unusual
observations and make your own decision on how to deal with them.
(6 marks)

(5) Using MINITAB, define a new variable, Age_Group, by combining observations
for participants younger than 30 into group 1 and all others (of age 30 and older) into
group 2. Provide either a description or a screen shot of how you did this.
(3 marks)

(6) Investigate whether there is a significant difference in mean/median Glucose
concentration between age groups. Formulate the null and alternative hypotheses;
choose, justify and perform an appropriate statistical test using MINITAB; provide all
MINITAB outputs; write your conclusions.
(10 marks)
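A rough sketch of the logic behind choosing and running this test, assuming Python/scipy and made-up glucose values; the coursework itself requires MINITAB output:

```python
import numpy as np
from scipy import stats

# Hypothetical Glucose values split by the Age_Group variable (illustration only).
group1 = np.array([95, 102, 88, 110, 99, 105, 92, 120], dtype=float)    # under 30
group2 = np.array([118, 130, 101, 142, 125, 137, 115, 150], dtype=float)  # 30 and older

# A quick normality check can guide the choice between a two-sample t-test (compares means)
# and a Mann-Whitney U test (compares distributions/medians).
print(stats.shapiro(group1), stats.shapiro(group2))
print(stats.ttest_ind(group1, group2, equal_var=False))   # Welch's t-test
print(stats.mannwhitneyu(group1, group2))
```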

(7) Show whether the proportion of participants with Glucose concentration greater
than 100 mg/dl is different between age groups that you defined previously. Formulate
the null and alternative hypotheses; choose, justify and perform an appropriate
statistical test using MINITAB; provide all MINITAB outputs; write your conclusions.
(10 marks)
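Likewise, a minimal sketch of a two-proportion comparison, assuming Python with statsmodels and made-up counts; MINITAB's 2 Proportions procedure is what the coursework expects:

```python
from statsmodels.stats.proportion import proportions_ztest

# Made-up counts of participants with Glucose > 100 mg/dl in each age group, out of the
# group sizes. Replace with the counts from your own unique dataset.
count = [60, 105]   # "successes" in group 1 (under 30) and group 2 (30 and older)
nobs = [160, 140]   # group sizes

z, p = proportions_ztest(count, nobs)   # H0: the two population proportions are equal
print(z, p)
```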

(8) Using MINITAB, produce a table of correlation coefficients. Justify the choice of
correlation coefficient, investigate the resulting table and comment on most interesting
relationships between chosen variables. Do not use Glucose and Outcome variables in
this analysis.
(4 marks)

(9) Using simple linear regression, model Glucose concentration by one of the
variables of your choice that are available in your unique dataset. Comment on
significance of intercept and slope.
(4 marks)

(10) Fit a multiple regression model with Glucose being a response variable and other
five variables excluding Outcome as predictors. Treat variable Pregnancies as an
interval scale data. Identify insignificant predictors in the model and explain why they
are insignificant.
(4 marks)

(11) Cluster your 300 observations into 10 groups using one of the linkage methods and
similarity measures from the corresponding drop-down menus. Give a brief (half a page)
description of the linkage method and similarity measure chosen. Show a dendrogram
with cases labelled by Outcome. Comment on the results obtained. Provide all
MINITAB outputs.
(6 marks)
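For orientation, a small sketch of hierarchical clustering with one possible linkage/similarity choice, assuming Python with scipy and matplotlib and random stand-in data; the coursework itself requires MINITAB's clustering dialog:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram
import matplotlib.pyplot as plt

# Random stand-in for the predictor columns and the Outcome labels (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
outcome = rng.integers(0, 2, size=30)

# Ward linkage with Euclidean distance is one common linkage/similarity choice.
Z = linkage(X, method="ward", metric="euclidean")
labels = fcluster(Z, t=10, criterion="maxclust")   # cut the tree into 10 clusters

dendrogram(Z, labels=[str(o) for o in outcome])    # leaves labelled by Outcome
plt.show()
```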

(12) It is known that the incidence of diabetes in the UK is 0.6. In a small northern
village of 100 people isolated from the mainland for six months per year the pharmacy
wants to know how many insulin shots to order. We want to know what is the
probability that between A and B people will develop the disease during this period. To
perform analysis, generate two random numbers between 0 and 100 using MINITAB
and paste the outputs into your report. Denote by A the smallest number and by B the
largest number out of these two generated numbers. Calculate the probability that
between A and B people develop the disease and how many shots should be ordered.
(9 marks)
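A minimal sketch of the probability calculation in question (12), assuming Python/scipy. A and B below are placeholders for the two random numbers you generate, and the incidence value is taken from the question as stated:

```python
from scipy.stats import binom

# Incidence and village size as stated in the question; A and B are placeholders for
# your two generated random numbers.
n, p = 100, 0.6
A, B = 55, 70

prob = binom.cdf(B, n, p) - binom.cdf(A - 1, n, p)   # P(A <= X <= B), inclusive
print(prob)
print(n * p)   # expected number of cases, one rough basis for the order size
```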

Glucose Pregnancies BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
56 2 56 28 45 24.2 0.332 22 0
68 2 62 13 15 20.1 0.257 23 0
68 2 70 32 66 25 0.187 25 0
68 10 106 23 49 35.5 0.285 47 0
71 1 48 18 76 20.4 0.323 22 0
71 1 78 50 45 33.2 0.422 21 0
74 0 52 10 36 27.8 0.269 22 0
74 3 68 28 45 29.7 0.293 23 0
74 8 70 40 49 35.3 0.705 39 0
75 2 64 24 55 29.7 0.37 33 0
77 1 56 30 56 33.3 1.251 24 0
77 5 82 41 42 35.8 0.156 35 0
78 3 50 32 88 31 0.248 26 1
78 0 88 29 40 36.9 0.434 21 0
79 1 80 25 37 25.4 0.583 22 0
79 1 60 42 48 43.5 0.678 23 0
80 1 74 11 60 30 0.527 22 0
80 3 82 31 70 34.2 1.292 27 1
81 1 72 18 40 26.6 0.283 24 0
81 3 86 16 66 27.5 0.306 22 0
81 2 72 15 76 30.1 0.547 25 0
81 1 74 41 57 46.3 1.096 32 0
81 7 78 40 48 46.7 0.261 42 0
82 1 64 13 95 21.2 0.415 23 0
82 2 52 22 115 28.5 1.699 25 0
83 7 78 26 71 29.3 0.767 36 0
83 2 66 23 50 32.2 0.497 22 0
83 3 58 31 18 34.3 0.336 25 0
83 2 65 28 66 36.8 0.629 24 0
84 2 50 23 76 30.4 0.968 21 0
84 3 68 30 106 31.9 0.591 25 0
84 0 64 22 66 35.8 0.545 21 0
84 1 64 23 115 36.9 0.471 28 0
84 0 82 31 125 38.2 0.233 23 0
84 4 90 23 56 39.5 0.159 25 0
85 4 58 22 49 27.8 0.306 28 0
86 5 68 28 71 30.2 0.364 24 0
86 1 66 52 65 41.3 0.917 29 0
87 2 58 16 52 32.7 0.166 25 0
87 1 78 27 32 34.6 0.101 22 0
87 1 60 37 75 37.2 0.509 22 0
87 1 68 34 77 37.6 0.401 24 0
88 5 66 21 23 24.4 0.342 30 0
88 3 58 11 54 24.8 0.267 22 0
88 2 58 26 16 28.4 0.766 22 0
88 2 74 19 53 29 0.229 22 0
88 1 62 24 44 29.9 0.422 23 0
88 1 78 29 76 32 0.365 29 0
88 12 74 40 54 35.3 0.378 48 0
88 1 30 42 99 55 0.496 26 1
89 1 24 19 25 27.8 0.559 21 0
89 1 66 23 94 28.1 0.167 21 0
89 3 74 16 85 30.4 0.551 38 0
89 1 76 34 37 31.2 0.192 23 0
90 2 80 14 55 24.4 0.249 24 0
90 1 62 18 59 25.1 1.268 25 0
90 1 62 12 43 27.2 0.58 24 0
90 4 88 47 54 37.7 0.362 29 0
91 1 54 25 100 25.2 0.234 23 0
91 4 70 32 88 33.1 0.446 22 0
91 0 68 32 210 39.9 0.381 25 0
92 1 62 25 41 19.5 0.482 25 0
92 12 62 7 258 27.6 0.926 44 1
92 6 62 32 126 32 0.085 46 0
93 0 60 25 92 28.7 0.532 22 0
93 6 50 30 64 28.7 0.356 23 0
93 2 64 32 160 38 0.674 23 1
93 0 100 39 72 43.4 1.021 35 0
94 2 68 18 76 26 0.561 21 0
94 2 76 18 66 31.6 0.649 23 0
94 7 64 25 79 33.3 0.738 41 0
94 0 70 27 115 43.5 0.347 21 0
95 1 66 13 38 19.6 0.334 25 0
95 1 60 18 58 23.9 0.26 22 0
95 1 74 21 73 25.9 0.673 36 0
95 2 54 14 88 26.1 0.748 22 0
95 1 82 25 180 35 0.233 43 1
95 0 80 45 92 36.5 0.33 26 0
95 0 85 25 36 37.4 0.247 24 1
95 0 64 39 105 44.6 0.366 22 0
96 4 56 17 49 20.8 0.34 26 0
96 2 68 13 49 21.1 0.647 26 0
96 3 56 34 115 24.7 0.944 39 0
96 1 64 27 87 33.2 0.289 21 0
96 5 74 18 67 33.6 0.997 43 0
97 1 64 19 82 18.2 0.299 21 0
97 1 66 15 140 23.2 0.487 22 0
97 0 64 36 100 36.8 0.6 25 0
97 7 76 32 91 40.9 0.871 32 1
98 0 82 15 84 25.2 0.299 22 0
98 6 58 33 190 34 0.43 43 0
98 2 60 17 120 34.7 0.198 22 0
99 3 80 11 64 19.3 0.284 30 0
99 2 70 16 44 20.4 0.235 27 0
99 3 62 19 74 21.8 0.279 26 0
99 4 76 15 51 23.2 0.223 21 0
99 2 52 15 94 24.6 0.637 21 0
99 3 54 19 86 25.6 0.154 24 0
99 6 60 19 54 26.9 0.497 32 0
99 5 54 28 83 34 0.499 30 0
99 2 60 17 160 36.6 0.453 21 0
99 1 72 30 18 38.6 0.412 21 0
100 1 74 12 46 19.5 0.149 28 0
100 1 66 15 56 23.6 0.666 26 0
100 1 72 12 70 25.3 0.658 28 0
100 12 84 33 105 30 0.488 46 0
100 0 70 26 50 30.8 0.597 21 0
100 3 68 23 81 31.6 0.949 28 0
100 1 66 29 196 32 0.444 42 0
100 2 66 20 90 32.9 0.867 28 1
100 14 78 25 184 36.6 0.412 46 1
100 2 54 28 105 37.8 0.498 24 0
100 2 68 25 71 38.5 0.324 26 0
100 8 74 40 215 39.4 0.661 43 1
100 2 70 52 57 40.5 0.677 25 0
100 0 88 60 110 46.8 0.962 31 0
101 2 58 35 90 21.8 0.155 22 0
101 2 58 17 265 24.2 0.614 23 0
101 1 50 15 36 24.2 0.526 26 0
101 10 76 48 180 32.9 0.171 63 0
102 0 86 17 105 29.3 0.695 27 0
102 3 44 20 94 30.8 0.4 26 0
102 0 78 40 90 34.5 0.238 24 0
102 7 74 40 105 37.2 0.204 45 0
102 0 64 46 78 40.6 0.496 21 0
102 2 86 36 120 45.5 0.127 23 1
103 1 80 11 82 19.4 0.491 22 0
103 4 60 33 192 24 0.966 33 0
103 3 72 30 152 27.6 0.73 27 0
103 6 72 32 190 37.7 0.324 55 0
103 1 30 38 83 43.3 0.183 33 0
104 0 64 23 116 27.8 0.454 23 0
104 6 74 18 156 29.9 0.722 41 1
104 0 64 37 64 33.6 0.51 22 1
105 6 70 32 68 30.8 0.122 37 0
105 2 80 45 191 33.7 0.711 29 1
105 2 58 40 94 34.9 0.225 25 0
105 5 72 29 325 36.9 0.159 28 0
105 0 64 41 142 41.5 0.173 22 0
106 2 56 27 165 29 0.426 22 0
106 2 64 35 119 30.5 1.4 34 0
106 3 54 21 158 30.9 0.292 24 0
106 1 70 28 135 34.2 0.142 22 0
106 0 70 37 148 39.4 0.605 22 0
107 3 62 13 48 22.9 0.678 23 1
107 1 72 30 82 30.8 0.821 24 0
107 2 74 30 100 33.6 0.404 23 0
107 0 62 30 74 36.6 0.757 25 1
108 6 44 20 130 24 0.813 35 0
108 2 62 32 56 25.2 0.128 21 0
108 2 62 10 278 25.3 0.881 22 0
108 2 52 26 63 32.5 0.318 22 0
108 1 60 46 178 35.5 0.415 24 0
108 5 72 43 75 36.1 0.263 33 0
109 1 38 18 120 23.1 0.407 26 0
109 1 56 21 135 25.2 0.833 23 0
109 1 60 8 182 25.4 0.947 21 0
109 8 76 39 114 27.9 0.64 31 1
109 1 58 18 116 28.5 0.219 22 0
109 4 64 44 99 34.8 0.905 26 1
109 5 62 41 129 35.8 0.514 25 1
110 4 76 20 100 28.4 0.118 27 0
110 2 74 29 125 32.4 0.698 27 0
111 1 62 13 182 24 0.138 23 0
111 3 90 12 78 28.4 0.495 29 0
111 3 58 31 44 29.5 0.43 22 0
111 4 72 47 207 37.1 1.39 56 1
112 2 68 22 94 34.1 0.315 26 0
112 9 82 32 175 34.2 0.26 36 1
112 1 72 30 176 34.4 0.528 25 0
112 1 80 45 132 34.8 0.217 24 0
112 2 86 42 160 38.4 0.246 28 0
112 2 78 50 140 39.4 0.175 24 0
113 3 50 10 85 29.5 0.626 25 0
114 7 76 17 110 23.8 0.466 31 0
114 1 66 36 200 38.1 0.289 21 0
114 0 80 34 285 44.2 0.167 27 0
115 1 70 30 96 34.6 0.529 32 1
115 3 66 39 140 38.1 0.15 28 0
116 4 72 12 87 22.1 0.463 37 0
116 3 74 15 105 26.3 0.107 24 0
116 1 78 29 180 36.1 0.496 25 0
117 2 90 19 71 25.2 0.313 21 0
117 0 66 31 188 30.8 0.493 22 0
117 4 64 27 120 33.2 0.23 24 0
117 1 60 23 106 33.8 0.466 27 0
117 1 88 24 145 34.5 0.403 40 1
117 5 86 30 105 39.1 0.251 42 0
117 0 80 31 53 45.2 0.089 24 0
118 1 58 36 94 33.3 0.261 23 0
118 0 84 47 230 45.8 0.551 31 1
119 1 54 13 50 22.3 0.205 24 0
119 6 50 22 176 27.1 1.318 33 1
119 0 64 18 92 34.9 0.725 23 0
119 1 44 47 63 35.5 0.28 25 0
119 1 88 41 170 45.3 0.507 26 0
119 1 86 39 220 45.6 0.808 29 1
120 9 72 22 56 20.8 0.733 48 0
120 0 74 18 63 30.5 0.285 26 0
120 1 80 48 200 38.9 1.162 41 0
120 2 76 37 105 39.7 0.215 29 0
120 11 80 37 150 42.3 0.785 48 1
120 3 70 30 135 42.9 0.452 30 0
121 5 72 23 112 26.2 0.245 30 0
121 0 66 30 165 34.3 0.203 33 1
121 1 78 39 74 39 0.261 28 0
121 2 70 32 95 39.1 0.886 23 0
122 2 60 18 106 29.8 0.717 22 0
122 1 64 32 156 35.1 0.692 30 1
122 2 76 27 200 35.9 0.483 26 0
122 2 52 43 158 36.2 0.816 28 0
122 1 90 51 220 49.7 0.325 31 1
123 4 80 15 176 32 0.443 34 0
123 9 70 44 94 33.1 0.374 40 0
123 6 72 45 230 33.6 0.733 34 0
123 5 74 40 77 34.1 0.269 28 0
123 2 48 32 165 42.1 0.52 26 0
123 3 100 35 240 57.3 0.88 22 0
124 0 56 13 105 21.8 0.452 21 0
124 7 70 33 215 25.5 0.161 37 0
124 8 76 24 600 28.7 0.687 52 1
124 2 68 28 205 32.9 0.875 30 1
124 3 80 33 130 33.2 0.305 26 0
124 9 70 33 402 35.4 0.282 34 0
125 1 70 24 110 24.3 0.221 25 0
125 4 70 18 122 28.9 1.144 45 1
125 6 68 30 120 30 0.464 32 0
125 10 70 26 115 31.1 0.205 41 1
125 1 50 40 167 33.3 0.962 28 1
125 2 60 20 140 33.8 0.088 31 0
126 8 74 38 75 25.9 0.162 39 0
126 0 86 27 120 27.4 0.515 21 0
126 1 56 29 152 28.7 0.801 21 0
126 5 78 27 22 29.6 0.439 40 0
126 0 84 29 215 30.7 0.52 24 0
126 8 88 36 108 38.5 0.349 49 0
126 3 88 41 235 39.3 0.704 27 0
127 2 58 24 275 27.7 1.6 25 0
127 2 46 21 335 34.4 0.176 22 0
127 4 88 11 155 34.5 0.598 28 0
127 0 80 37 210 36.3 0.804 23 0
128 1 82 17 183 27.5 0.115 22 0
128 0 68 19 180 30.5 1.391 25 1
128 1 98 41 58 32 1.321 33 1
128 3 72 25 190 32.4 0.549 27 1
128 1 88 39 110 36.5 1.057 37 1
128 1 48 45 194 40.5 0.613 24 1
128 2 78 37 182 43.3 1.224 31 1
129 6 90 7 326 19.6 0.582 60 0
129 3 64 29 115 26.4 0.219 28 1
129 4 60 12 231 27.5 0.527 31 0
129 2 74 26 205 33.2 0.591 25 0
129 4 86 20 270 35.1 0.231 23 0
129 10 76 28 122 35.9 0.28 39 0
129 3 92 49 155 36.4 0.968 32 1
129 7 68 49 125 38.5 0.439 43 1
129 0 110 46 130 67.1 0.319 26 1
130 1 70 13 105 25.9 0.472 22 0
130 3 78 23 79 28.4 0.323 34 1
130 1 60 23 170 28.6 0.692 21 0
131 1 64 14 415 23.7 0.389 21 0
131 4 68 21 166 33.1 0.16 28 0
133 7 88 15 155 32.4 0.262 37 0
133 1 102 28 140 32.8 0.234 45 1
134 9 74 33 60 25.9 0.46 81 0
134 0 58 20 291 26.4 0.352 21 0
134 6 70 23 130 35.4 0.542 29 1
134 6 80 37 370 46.2 0.238 46 1
135 0 94 46 145 40.6 0.284 26 0
135 0 68 42 250 42.3 0.365 24 1
136 7 74 26 135 26 0.647 51 0
136 11 84 35 130 28.3 0.26 42 1
136 5 84 41 88 35 0.286 35 1
136 15 70 32 110 37.1 0.153 43 1
136 1 74 50 204 37.4 0.399 24 0
137 0 68 14 148 24.8 0.143 21 0
137 0 40 35 168 43.1 2.288 33 1
138 0 60 35 167 34.6 0.534 21 1
138 11 74 26 144 36.1 0.557 50 1
139 0 62 17 210 22.1 0.207 21 0
139 5 64 35 140 28.6 0.411 26 0
139 1 46 19 83 28.7 0.654 22 0
139 5 80 35 160 31.6 0.361 25 1
139 1 62 41 480 40.7 0.536 21 0
140 1 74 26 180 24.1 0.828 23 0
140 12 82 43 325 39.2 0.528 58 1
140 0 65 26 130 42.6 0.431 24 1
141 2 58 34 128 25.4 0.699 24 0
142 2 82 18 64 24.7 0.761 21 0
142 7 60 33 190 28.8 0.687 61 0
142 7 90 24 480 30.4 0.128 43 1
143 1 74 22 61 26.2 0.256 21 0
143 1 86 30 330 30.1 0.892 23 0
143 11 94 33 146 36.6 0.254 51 1
143 1 84 23 310 42.4 1.076 22 0
144 4 58 28 140 29.5 0.287 37 0
144 2 58 33 135 31.6 0.422 25 1
144 5 82 26 285 32 0.452 58 1
144 6 72 27 228 33.9 0.255 40 0
144 1 82 46 180 46.1 0.335 46 1
145 13 82 19 110 22.2 0.245 57 0
145 9 88 34 165 30.3 0.771 53 1
145 9 80 46 130 37.9 0.637 40 1
146 2 70 38 360 28 0.337 29 1
146 4 85 27 100 28.9 0.189 27 0
146 2 76 35 194 38.2 0.329 29 0
147 4 74 25 293 34.9 0.385 30 0
148 4 60 27 318 30.9 0.15 29 1
148 10 84 48 237 37.6 1.001 51 1
149 1 68 29 127 29.3 0.349 42 1
150 7 66 42 342 34.7 0.718 42 0
150 7 78 29 126 35.2 0.692 54 1
151 6 62 31 120 35.5 0.692 28 0
151 12 70 40 271 41.8 0.742 38 1
151 8 78 32 210 42.9 0.516 36 1
152 13 90 33 29 26.8 0.731 43 1
152 9 78 34 171 34.2 0.893 33 1
152 0 82 39 272 41.5 0.27 27 0
153 1 82 42 485 40.6 0.687 23 0
153 13 88 37 140 40.6 1.174 39 0
154 6 74 32 193 29.3 0.839 39 0
154 9 78 30 100 30.9 0.164 45 0
154 4 72 29 126 31.3 0.338 37 0
154 4 62 31 284 32.8 0.237 23 0
154 6 78 41 140 46.1 0.571 27 0
155 2 74 17 96 26.6 0.433 27 1
155 11 76 28 150 33.3 1.353 51 1
155 8 62 26 495 34 0.543 46 1
155 2 52 27 540 38.7 0.24 25 1
155 5 84 44 545 38.7 0.619 34 0
156 9 86 28 155 34.3 1.189 42 1
157 1 72 21 168 25.6 0.123 24 0
157 2 74 35 440 39.4 0.134 30 0
158 3 64 13 387 31.2 0.295 24 0
158 3 76 36 245 31.6 0.851 28 1
158 3 70 30 328 35.5 0.344 35 1
158 5 84 41 210 39.4 0.395 29 1
160 7 54 32 175 30.5 0.588 39 1
161 10 68 23 132 25.5 0.326 47 1
162 0 76 56 100 53.2 0.759 25 1
163 3 70 18 105 31.6 0.268 28 1
163 17 72 41 114 40.9 0.817 47 1
164 1 82 43 67 32.8 0.341 50 0
165 6 68 26 168 33.6 0.631 49 0
165 0 76 43 255 47.9 0.259 26 0
165 0 90 33 680 52.3 0.427 23 0
166 5 72 19 175 25.8 0.587 51 1
167 1 74 17 144 23.4 0.447 33 1
167 8 106 46 231 37.6 0.165 43 1
168 7 88 42 321 38.2 0.787 40 1
169 3 74 19 125 29.9 0.268 31 1
170 3 64 37 225 34.5 0.356 30 1
171 3 72 33 135 33.3 0.199 24 1
171 9 110 24 240 45.4 0.721 54 1
172 1 68 49 579 42.4 0.702 28 1
173 4 70 14 168 29.7 0.361 33 1
173 3 78 39 185 33.8 0.97 31 1
173 3 84 33 474 35.7 0.258 22 1
173 3 82 48 465 38.4 2.137 25 1
173 0 78 32 265 46.5 1.159 58 0
174 3 58 22 194 32.9 0.593 36 1
174 2 88 37 120 44.5 0.646 24 1
176 3 86 27 156 33.3 1.154 52 1
176 8 90 34 300 33.7 0.467 58 1
177 0 60 29 478 34.6 1.072 21 1
179 8 72 42 130 32.7 0.719 36 1
179 0 50 36 159 37.8 0.455 22 1
180 3 64 25 70 34 0.271 26 0
180 0 90 26 90 36.5 0.314 35 1
180 0 78 63 14 59.4 2.42 25 1
181 8 68 36 495 30.1 0.615 60 1
181 1 64 30 180 34.1 0.328 38 1
181 7 84 21 192 35.9 0.586 51 1
181 1 78 42 293 40 1.258 22 1
181 0 88 44 510 43.3 0.222 26 1
184 4 78 39 277 37 0.264 31 1
186 8 90 35 225 34.5 0.423 37 1
187 7 50 33 392 33.9 0.826 34 1
187 3 70 22 200 36.4 0.408 36 1
187 7 68 39 304 37.7 0.254 41 1
187 5 76 27 207 43.6 1.034 53 1
188 0 82 14 185 32 0.682 22 1
189 1 60 23 846 30.1 0.398 59 1
189 5 64 33 325 31.2 0.583 29 1
191 3 68 15 130 30.9 0.299 34 0
193 1 50 16 375 25.9 0.655 24 0
195 7 70 33 145 25.1 0.163 55 1
196 1 76 36 249 36.5 0.875 29 1
196 8 76 29 280 37.5 0.605 57 1
197 2 70 45 543 30.5 0.158 53 1
197 4 70 39 744 36.7 2.329 31 0
198 0 66 32 274 41.3 0.502 28 1

Statistical Analysis of Data Using SPSS

Statistical Analysis of Data Using SPSS

Introduction and dataset
The aim of this coursework is to investigate and predict the onset of diabetes based on
various diagnostic measurements.

The dataset was originally compiled by researchers at the Johns Hopkins University
School of Medicine, from a larger database owned by the National Institute of Diabetes
and Digestive and Kidney Diseases. All patients were females at least 21 years old of
Pima Indian heritage. Note that Pima Indians have one of the highest rates of diabetes
in the world.

This dataset includes 392 observations, taken at the individual level and available from
diabetes_dataset.xlsx file in Statistical Data Analysis Coursework folder on NOW.
The key indicator of diabetes (response variable), as defined by the World Health
Organization, is a plasma glucose concentration greater than 200 mg/dl two hours
following ingestion of a 75 gm carbohydrate solution (variable Glucose).

Glucose Pregnancies BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome
56 2 56 28 45 24.2 0.332 22 0
68 2 62 13 15 20.1 0.257 23 0
68 2 70 32 66 25 0.187 25 0
68 10 106 23 49 35.5 0.285 47 0
71 1 48 18 76 20.4 0.323 22 0
71 1 78 50 45 33.2 0.422 21 0
74 0 52 10 36 27.8 0.269 22 0
74 3 68 28 45 29.7 0.293 23 0
74 8 70 40 49 35.3 0.705 39 0
75 2 64 24 55 29.7 0.37 33 0
77 1 56 30 56 33.3 1.251 24 0
77 5 82 41 42 35.8 0.156 35 0
78 3 50 32 88 31 0.248 26 1
78 0 88 29 40 36.9 0.434 21 0
79 1 80 25 37 25.4 0.583 22 0
79 1 60 42 48 43.5 0.678 23 0
80 1 74 11 60 30 0.527 22 0
80 3 82 31 70 34.2 1.292 27 1
81 1 72 18 40 26.6 0.283 24 0
81 3 86 16 66 27.5 0.306 22 0
81 2 72 15 76 30.1 0.547 25 0
81 1 74 41 57 46.3 1.096 32 0
81 7 78 40 48 46.7 0.261 42 0
82 1 64 13 95 21.2 0.415 23 0
82 2 52 22 115 28.5 1.699 25 0
83 7 78 26 71 29.3 0.767 36 0
83 2 66 23 50 32.2 0.497 22 0
83 3 58 31 18 34.3 0.336 25 0
83 2 65 28 66 36.8 0.629 24 0
84 2 50 23 76 30.4 0.968 21 0
84 3 68 30 106 31.9 0.591 25 0
84 0 64 22 66 35.8 0.545 21 0
84 1 64 23 115 36.9 0.471 28 0
84 0 82 31 125 38.2 0.233 23 0
84 4 90 23 56 39.5 0.159 25 0
85 4 58 22 49 27.8 0.306 28 0
86 5 68 28 71 30.2 0.364 24 0
86 1 66 52 65 41.3 0.917 29 0
87 2 58 16 52 32.7 0.166 25 0
87 1 78 27 32 34.6 0.101 22 0
87 1 60 37 75 37.2 0.509 22 0
87 1 68 34 77 37.6 0.401 24 0
88 5 66 21 23 24.4 0.342 30 0
88 3 58 11 54 24.8 0.267 22 0
88 2 58 26 16 28.4 0.766 22 0
88 2 74 19 53 29 0.229 22 0
88 1 62 24 44 29.9 0.422 23 0
88 1 78 29 76 32 0.365 29 0
88 12 74 40 54 35.3 0.378 48 0
88 1 30 42 99 55 0.496 26 1
89 1 24 19 25 27.8 0.559 21 0
89 1 66 23 94 28.1 0.167 21 0
89 3 74 16 85 30.4 0.551 38 0
89 1 76 34 37 31.2 0.192 23 0
90 2 80 14 55 24.4 0.249 24 0
90 1 62 18 59 25.1 1.268 25 0
90 1 62 12 43 27.2 0.58 24 0
90 4 88 47 54 37.7 0.362 29 0
91 1 54 25 100 25.2 0.234 23 0
91 4 70 32 88 33.1 0.446 22 0
91 0 68 32 210 39.9 0.381 25 0
92 1 62 25 41 19.5 0.482 25 0
92 12 62 7 258 27.6 0.926 44 1
92 6 62 32 126 32 0.085 46 0
93 0 60 25 92 28.7 0.532 22 0
93 6 50 30 64 28.7 0.356 23 0
93 2 64 32 160 38 0.674 23 1
93 0 100 39 72 43.4 1.021 35 0
94 2 68 18 76 26 0.561 21 0
94 2 76 18 66 31.6 0.649 23 0
94 7 64 25 79 33.3 0.738 41 0
94 0 70 27 115 43.5 0.347 21 0
95 1 66 13 38 19.6 0.334 25 0
95 1 60 18 58 23.9 0.26 22 0
95 1 74 21 73 25.9 0.673 36 0
95 2 54 14 88 26.1 0.748 22 0
95 1 82 25 180 35 0.233 43 1
95 0 80 45 92 36.5 0.33 26 0
95 0 85 25 36 37.4 0.247 24 1
95 0 64 39 105 44.6 0.366 22 0
96 4 56 17 49 20.8 0.34 26 0
96 2 68 13 49 21.1 0.647 26 0
96 3 56 34 115 24.7 0.944 39 0
96 1 64 27 87 33.2 0.289 21 0
96 5 74 18 67 33.6 0.997 43 0
97 1 64 19 82 18.2 0.299 21 0
97 1 66 15 140 23.2 0.487 22 0
97 0 64 36 100 36.8 0.6 25 0
97 7 76 32 91 40.9 0.871 32 1
98 0 82 15 84 25.2 0.299 22 0
98 6 58 33 190 34 0.43 43 0
98 2 60 17 120 34.7 0.198 22 0
99 3 80 11 64 19.3 0.284 30 0
99 2 70 16 44 20.4 0.235 27 0
99 3 62 19 74 21.8 0.279 26 0
99 4 76 15 51 23.2 0.223 21 0
99 2 52 15 94 24.6 0.637 21 0
99 3 54 19 86 25.6 0.154 24 0
99 6 60 19 54 26.9 0.497 32 0
99 5 54 28 83 34 0.499 30 0
99 2 60 17 160 36.6 0.453 21 0
99 1 72 30 18 38.6 0.412 21 0
100 1 74 12 46 19.5 0.149 28 0
100 1 66 15 56 23.6 0.666 26 0
100 1 72 12 70 25.3 0.658 28 0
100 12 84 33 105 30 0.488 46 0
100 0 70 26 50 30.8 0.597 21 0
100 3 68 23 81 31.6 0.949 28 0
100 1 66 29 196 32 0.444 42 0
100 2 66 20 90 32.9 0.867 28 1
100 14 78 25 184 36.6 0.412 46 1
100 2 54 28 105 37.8 0.498 24 0
100 2 68 25 71 38.5 0.324 26 0
100 8 74 40 215 39.4 0.661 43 1
100 2 70 52 57 40.5 0.677 25 0
100 0 88 60 110 46.8 0.962 31 0
101 2 58 35 90 21.8 0.155 22 0
101 2 58 17 265 24.2 0.614 23 0
101 1 50 15 36 24.2 0.526 26 0
101 10 76 48 180 32.9 0.171 63 0
102 0 86 17 105 29.3 0.695 27 0
102 3 44 20 94 30.8 0.4 26 0
102 0 78 40 90 34.5 0.238 24 0
102 7 74 40 105 37.2 0.204 45 0
102 0 64 46 78 40.6 0.496 21 0
102 2 86 36 120 45.5 0.127 23 1
103 1 80 11 82 19.4 0.491 22 0
103 4 60 33 192 24 0.966 33 0
103 3 72 30 152 27.6 0.73 27 0
103 6 72 32 190 37.7 0.324 55 0
103 1 30 38 83 43.3 0.183 33 0
104 0 64 23 116 27.8 0.454 23 0
104 6 74 18 156 29.9 0.722 41 1
104 0 64 37 64 33.6 0.51 22 1
105 6 70 32 68 30.8 0.122 37 0
105 2 80 45 191 33.7 0.711 29 1
105 2 58 40 94 34.9 0.225 25 0
105 5 72 29 325 36.9 0.159 28 0
105 0 64 41 142 41.5 0.173 22 0
106 2 56 27 165 29 0.426 22 0
106 2 64 35 119 30.5 1.4 34 0
106 3 54 21 158 30.9 0.292 24 0
106 1 70 28 135 34.2 0.142 22 0
106 0 70 37 148 39.4 0.605 22 0
107 3 62 13 48 22.9 0.678 23 1
107 1 72 30 82 30.8 0.821 24 0
107 2 74 30 100 33.6 0.404 23 0
107 0 62 30 74 36.6 0.757 25 1
108 6 44 20 130 24 0.813 35 0
108 2 62 32 56 25.2 0.128 21 0
108 2 62 10 278 25.3 0.881 22 0
108 2 52 26 63 32.5 0.318 22 0
108 1 60 46 178 35.5 0.415 24 0
108 5 72 43 75 36.1 0.263 33 0
109 1 38 18 120 23.1 0.407 26 0
109 1 56 21 135 25.2 0.833 23 0
109 1 60 8 182 25.4 0.947 21 0
109 8 76 39 114 27.9 0.64 31 1
109 1 58 18 116 28.5 0.219 22 0
109 4 64 44 99 34.8 0.905 26 1
109 5 62 41 129 35.8 0.514 25 1
110 4 76 20 100 28.4 0.118 27 0
110 2 74 29 125 32.4 0.698 27 0
111 1 62 13 182 24 0.138 23 0
111 3 90 12 78 28.4 0.495 29 0
111 3 58 31 44 29.5 0.43 22 0
111 4 72 47 207 37.1 1.39 56 1
112 2 68 22 94 34.1 0.315 26 0
112 9 82 32 175 34.2 0.26 36 1
112 1 72 30 176 34.4 0.528 25 0
112 1 80 45 132 34.8 0.217 24 0
112 2 86 42 160 38.4 0.246 28 0
112 2 78 50 140 39.4 0.175 24 0
113 3 50 10 85 29.5 0.626 25 0
114 7 76 17 110 23.8 0.466 31 0
114 1 66 36 200 38.1 0.289 21 0
114 0 80 34 285 44.2 0.167 27 0
115 1 70 30 96 34.6 0.529 32 1
115 3 66 39 140 38.1 0.15 28 0
116 4 72 12 87 22.1 0.463 37 0
116 3 74 15 105 26.3 0.107 24 0
116 1 78 29 180 36.1 0.496 25 0
117 2 90 19 71 25.2 0.313 21 0
117 0 66 31 188 30.8 0.493 22 0
117 4 64 27 120 33.2 0.23 24 0
117 1 60 23 106 33.8 0.466 27 0
117 1 88 24 145 34.5 0.403 40 1
117 5 86 30 105 39.1 0.251 42 0
117 0 80 31 53 45.2 0.089 24 0
118 1 58 36 94 33.3 0.261 23 0
118 0 84 47 230 45.8 0.551 31 1
119 1 54 13 50 22.3 0.205 24 0
119 6 50 22 176 27.1 1.318 33 1
119 0 64 18 92 34.9 0.725 23 0
119 1 44 47 63 35.5 0.28 25 0
119 1 88 41 170 45.3 0.507 26 0
119 1 86 39 220 45.6 0.808 29 1
120 9 72 22 56 20.8 0.733 48 0
120 0 74 18 63 30.5 0.285 26 0
120 1 80 48 200 38.9 1.162 41 0
120 2 76 37 105 39.7 0.215 29 0
120 11 80 37 150 42.3 0.785 48 1
120 3 70 30 135 42.9 0.452 30 0
121 5 72 23 112 26.2 0.245 30 0
121 0 66 30 165 34.3 0.203 33 1
121 1 78 39 74 39 0.261 28 0
121 2 70 32 95 39.1 0.886 23 0
122 2 60 18 106 29.8 0.717 22 0
122 1 64 32 156 35.1 0.692 30 1
122 2 76 27 200 35.9 0.483 26 0
122 2 52 43 158 36.2 0.816 28 0
122 1 90 51 220 49.7 0.325 31 1
123 4 80 15 176 32 0.443 34 0
123 9 70 44 94 33.1 0.374 40 0
123 6 72 45 230 33.6 0.733 34 0
123 5 74 40 77 34.1 0.269 28 0
123 2 48 32 165 42.1 0.52 26 0
123 3 100 35 240 57.3 0.88 22 0
124 0 56 13 105 21.8 0.452 21 0
124 7 70 33 215 25.5 0.161 37 0
124 8 76 24 600 28.7 0.687 52 1
124 2 68 28 205 32.9 0.875 30 1
124 3 80 33 130 33.2 0.305 26 0
124 9 70 33 402 35.4 0.282 34 0
125 1 70 24 110 24.3 0.221 25 0
125 4 70 18 122 28.9 1.144 45 1
125 6 68 30 120 30 0.464 32 0
125 10 70 26 115 31.1 0.205 41 1
125 1 50 40 167 33.3 0.962 28 1
125 2 60 20 140 33.8 0.088 31 0
126 8 74 38 75 25.9 0.162 39 0
126 0 86 27 120 27.4 0.515 21 0
126 1 56 29 152 28.7 0.801 21 0
126 5 78 27 22 29.6 0.439 40 0
126 0 84 29 215 30.7 0.52 24 0
126 8 88 36 108 38.5 0.349 49 0
126 3 88 41 235 39.3 0.704 27 0
127 2 58 24 275 27.7 1.6 25 0
127 2 46 21 335 34.4 0.176 22 0
127 4 88 11 155 34.5 0.598 28 0
127 0 80 37 210 36.3 0.804 23 0
128 1 82 17 183 27.5 0.115 22 0
128 0 68 19 180 30.5 1.391 25 1
128 1 98 41 58 32 1.321 33 1
128 3 72 25 190 32.4 0.549 27 1
128 1 88 39 110 36.5 1.057 37 1
128 1 48 45 194 40.5 0.613 24 1
128 2 78 37 182 43.3 1.224 31 1
129 6 90 7 326 19.6 0.582 60 0
129 3 64 29 115 26.4 0.219 28 1
129 4 60 12 231 27.5 0.527 31 0
129 2 74 26 205 33.2 0.591 25 0
129 4 86 20 270 35.1 0.231 23 0
129 10 76 28 122 35.9 0.28 39 0
129 3 92 49 155 36.4 0.968 32 1
129 7 68 49 125 38.5 0.439 43 1
129 0 110 46 130 67.1 0.319 26 1
130 1 70 13 105 25.9 0.472 22 0
130 3 78 23 79 28.4 0.323 34 1
130 1 60 23 170 28.6 0.692 21 0
131 1 64 14 415 23.7 0.389 21 0
131 4 68 21 166 33.1 0.16 28 0
133 7 88 15 155 32.4 0.262 37 0
133 1 102 28 140 32.8 0.234 45 1
134 9 74 33 60 25.9 0.46 81 0
134 0 58 20 291 26.4 0.352 21 0
134 6 70 23 130 35.4 0.542 29 1
134 6 80 37 370 46.2 0.238 46 1
135 0 94 46 145 40.6 0.284 26 0
135 0 68 42 250 42.3 0.365 24 1
136 7 74 26 135 26 0.647 51 0
136 11 84 35 130 28.3 0.26 42 1
136 5 84 41 88 35 0.286 35 1
136 15 70 32 110 37.1 0.153 43 1
136 1 74 50 204 37.4 0.399 24 0
137 0 68 14 148 24.8 0.143 21 0
137 0 40 35 168 43.1 2.288 33 1
138 0 60 35 167 34.6 0.534 21 1
138 11 74 26 144 36.1 0.557 50 1
139 0 62 17 210 22.1 0.207 21 0
139 5 64 35 140 28.6 0.411 26 0
139 1 46 19 83 28.7 0.654 22 0
139 5 80 35 160 31.6 0.361 25 1
139 1 62 41 480 40.7 0.536 21 0
140 1 74 26 180 24.1 0.828 23 0
140 12 82 43 325 39.2 0.528 58 1
140 0 65 26 130 42.6 0.431 24 1
141 2 58 34 128 25.4 0.699 24 0
142 2 82 18 64 24.7 0.761 21 0
142 7 60 33 190 28.8 0.687 61 0
142 7 90 24 480 30.4 0.128 43 1
143 1 74 22 61 26.2 0.256 21 0
143 1 86 30 330 30.1 0.892 23 0
143 11 94 33 146 36.6 0.254 51 1
143 1 84 23 310 42.4 1.076 22 0
144 4 58 28 140 29.5 0.287 37 0
144 2 58 33 135 31.6 0.422 25 1
144 5 82 26 285 32 0.452 58 1
144 6 72 27 228 33.9 0.255 40 0
144 1 82 46 180 46.1 0.335 46 1
145 13 82 19 110 22.2 0.245 57 0
145 9 88 34 165 30.3 0.771 53 1
145 9 80 46 130 37.9 0.637 40 1
146 2 70 38 360 28 0.337 29 1
146 4 85 27 100 28.9 0.189 27 0
146 2 76 35 194 38.2 0.329 29 0
147 4 74 25 293 34.9 0.385 30 0
148 4 60 27 318 30.9 0.15 29 1
148 10 84 48 237 37.6 1.001 51 1
149 1 68 29 127 29.3 0.349 42 1
150 7 66 42 342 34.7 0.718 42 0
150 7 78 29 126 35.2 0.692 54 1
151 6 62 31 120 35.5 0.692 28 0
151 12 70 40 271 41.8 0.742 38 1
151 8 78 32 210 42.9 0.516 36 1
152 13 90 33 29 26.8 0.731 43 1
152 9 78 34 171 34.2 0.893 33 1
152 0 82 39 272 41.5 0.27 27 0
153 1 82 42 485 40.6 0.687 23 0
153 13 88 37 140 40.6 1.174 39 0
154 6 74 32 193 29.3 0.839 39 0
154 9 78 30 100 30.9 0.164 45 0
154 4 72 29 126 31.3 0.338 37 0
154 4 62 31 284 32.8 0.237 23 0
154 6 78 41 140 46.1 0.571 27 0
155 2 74 17 96 26.6 0.433 27 1
155 11 76 28 150 33.3 1.353 51 1
155 8 62 26 495 34 0.543 46 1
155 2 52 27 540 38.7 0.24 25 1
155 5 84 44 545 38.7 0.619 34 0
156 9 86 28 155 34.3 1.189 42 1
157 1 72 21 168 25.6 0.123 24 0
157 2 74 35 440 39.4 0.134 30 0
158 3 64 13 387 31.2 0.295 24 0
158 3 76 36 245 31.6 0.851 28 1
158 3 70 30 328 35.5 0.344 35 1
158 5 84 41 210 39.4 0.395 29 1
160 7 54 32 175 30.5 0.588 39 1
161 10 68 23 132 25.5 0.326 47 1
162 0 76 56 100 53.2 0.759 25 1
163 3 70 18 105 31.6 0.268 28 1
163 17 72 41 114 40.9 0.817 47 1
164 1 82 43 67 32.8 0.341 50 0
165 6 68 26 168 33.6 0.631 49 0
165 0 76 43 255 47.9 0.259 26 0
165 0 90 33 680 52.3 0.427 23 0
166 5 72 19 175 25.8 0.587 51 1
167 1 74 17 144 23.4 0.447 33 1
167 8 106 46 231 37.6 0.165 43 1
168 7 88 42 321 38.2 0.787 40 1
169 3 74 19 125 29.9 0.268 31 1
170 3 64 37 225 34.5 0.356 30 1
171 3 72 33 135 33.3 0.199 24 1
171 9 110 24 240 45.4 0.721 54 1
172 1 68 49 579 42.4 0.702 28 1
173 4 70 14 168 29.7 0.361 33 1
173 3 78 39 185 33.8 0.97 31 1
173 3 84 33 474 35.7 0.258 22 1
173 3 82 48 465 38.4 2.137 25 1
173 0 78 32 265 46.5 1.159 58 0
174 3 58 22 194 32.9 0.593 36 1
174 2 88 37 120 44.5 0.646 24 1
176 3 86 27 156 33.3 1.154 52 1
176 8 90 34 300 33.7 0.467 58 1
177 0 60 29 478 34.6 1.072 21 1
179 8 72 42 130 32.7 0.719 36 1
179 0 50 36 159 37.8 0.455 22 1
180 3 64 25 70 34 0.271 26 0
180 0 90 26 90 36.5 0.314 35 1
180 0 78 63 14 59.4 2.42 25 1
181 8 68 36 495 30.1 0.615 60 1
181 1 64 30 180 34.1 0.328 38 1
181 7 84 21 192 35.9 0.586 51 1
181 1 78 42 293 40 1.258 22 1
181 0 88 44 510 43.3 0.222 26 1
184 4 78 39 277 37 0.264 31 1
186 8 90 35 225 34.5 0.423 37 1
187 7 50 33 392 33.9 0.826 34 1
187 3 70 22 200 36.4 0.408 36 1
187 7 68 39 304 37.7 0.254 41 1
187 5 76 27 207 43.6 1.034 53 1
188 0 82 14 185 32 0.682 22 1
189 1 60 23 846 30.1 0.398 59 1
189 5 64 33 325 31.2 0.583 29 1
191 3 68 15 130 30.9 0.299 34 0
193 1 50 16 375 25.9 0.655 24 0
195 7 70 33 145 25.1 0.163 55 1
196 1 76 36 249 36.5 0.875 29 1
196 8 76 29 280 37.5 0.605 57 1
197 2 70 45 543 30.5 0.158 53 1
197 4 70 39 744 36.7 2.329 31 0
198 0 66 32 274 41.3 0.502 28 1

From the tabulated figures:

(1) Generate two random numbers between 2 and 7 and provide SPSS output.
(1 mark)

(2) Using SPSS, erase columns corresponding to your generated numbers (e.g. if
one of the generated numbers is 5 then erase column C5, etc). Describe how you did
this and provide the sequence of actions (e.g. Calc->Descriptive Stats->….)
(2 marks)

(3) Using SPSS select a random sample of 300 observations (n = 300) from your
dataset. Provide the sequence of actions of how you did this.
(1 mark)
Your unique dataset will now consist of 300 rows and seven columns including
Glucose, Age and Outcome.
Investigating your unique dataset

(4) For your unique dataset summarise information about your observations and present
graphically the frequency distributions for all variables that are left in your unique
dataset, including Glucose but excluding the Outcome variable. Comment on unusual
observations and make your own decision on how to deal with them.
(6 marks)

(5) Using SPSS, define a new variable, Age_Group, by combining observations
for participants younger than 30 into group 1 and all others (of age 30 and older) into
group 2. Provide either a description or a screen shot of how you did this.
(3 marks)

(6) Investigate whether there is a significant difference in mean/median Glucose
concentration between age groups. Formulate the null and alternative hypotheses;
choose, justify and perform an appropriate statistical test using SPSS; provide all
SPSS outputs; write your conclusions.
(10 marks)

(7) Show whether the proportion of participants with Glucose concentration greater
than 100 mg/dl is different between age groups that you defined previously. Formulate
the null and alternative hypotheses; choose, justify and perform an appropriate
statistical test using SPSS; provide all SPSS outputs; write your conclusions.
(10 marks)

(8) Using SPSS, produce a table of correlation coefficients. Justify the choice of
correlation coefficient, investigate the resulting table and comment on most interesting
relationships between chosen variables. Do not use Glucose and Outcome variables in
this analysis.
(4 marks)

(9) Using simple linear regression, model Glucose concentration by one of the
variables of your choice that are available in your unique dataset. Comment on
significance of intercept and slope.
(4 marks)

(10) Fit a multiple regression model with Glucose being a response variable and other
five variables excluding Outcome as predictors. Treat variable Pregnancies as an
interval scale data. Identify insignificant predictors in the model and explain why they
are insignificant.
(4 marks)

(11) Cluster your 300 observations into 10 groups using one of the linkage methods and
similarity measures from the corresponding drop-down menus. Give a brief (half a page)
description of the linkage method and similarity measure chosen. Show a dendrogram
with cases labelled by Outcome. Comment on the results obtained. Provide all
SPSS outputs.
(6 marks)

(12) It is known that the incidence of diabetes in the UK is 0.6. In a small northern
village of 100 people isolated from the mainland for six months per year the pharmacy
wants to know how many insulin shots to order. We want to know what is the
probability that between A and B people will develop the disease during this period. To
perform analysis, generate two random numbers between 0 and 100 using SPSS
and paste the outputs into your report. Denote by A the smallest number and by B the
largest number out of these two generated numbers. Calculate the probability that
between A and B people develop the disease and how many shots should be ordered.
(9 marks)

Categorical (Nominal) Dependent Variables – Logit (Logistic Regression)

  • Here is an introductory/survey video of Logit Analysis, which allows us to analyze nominal dependent variables. Ordinary regression only allows us to work with continuous dependent variables.

Video: Introduction to Logit Analysis:
https://youtu.be/ANi_PpkTSJA

Note: This Extra Credit Assignment is a bit tougher than the other ones, so it is worth a bonus of up to 10% of the final grade  if you get everything right. The other assignments are worth 7% each.

  • After watching the video, try this extra credit assignment:
    Prompt:
    Answer Part 1, Part 2, and Part 3. Given the following coefficients from a logit analysis, and the sample data values given for two respondents, calculate the probability of a person liking  a dark-colored imported car over a light-colored imported car. Your answers are probabilities. Show your work. Use Word or PDF format for submission to Turnitin.com (link below). You may need to hand-write the formula and show your work on paper, then photograph or scan it into a file. That’s OK, but typing it into Word is preferred, if you can figure it out.

The Dependent Variable (DV) is “Prefers dark-colored imported car.” This measure is labeled “PrefDark” in the data
= 0 if preference is for a light colored car,
= 1 if preference is for a dark-colored car.

Here are the Independent Variables  (IVs):
Age in years (no intervals – labeled “Age” in the data)

Gender (measure is labeled “Gender” in the data)
= 0 if male,
= 1 if female.

Education level (measure is labeled EducLevel in the data)
= 0 if completed high school only
= 1 if completed Associate’s degree (Community College)
= 2 if completed Undergraduate degree (BA or BS)
= 3 if completed a Graduate degree

Income per year (in Euros; measure is labeled Income)

Consider, also, these coefficients for each measure (data point), calculated by running a Logit analysis on the data sample for the DV, PrefDark:

Coefficients and Constant
Age             0.101
Gender        0.34
EducLevel  –5.1
Income        0.000142
Constant      3.22

Assume all coefficients and the constant are statistically significant (you can’t ignore them).

Part 1 (4 points):
Now consider this person, Respondent 1:
Age = 24
Gender = 1 (female)
EducLevel = 2 (Undergraduate degree)
Income/year =  Euros 38000
What is the probability this person prefers a dark-colored imported car?

Part 2 (4 points):
Additionally, consider this other person, Respondent 2:
54 year old male, with a graduate degree, earning Euros 58000 per year.
What is the probability this person prefers a dark-colored imported car?

Hint: Use the formula given in the video for calculating P(Yi=yi).

Show your work, please.
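A minimal sketch of the plug-in arithmetic, using the coefficients above but a hypothetical third respondent so it does not give away Parts 1 or 2; Python is an assumption here, and hand calculation is equally fine:

```python
import math

# Coefficients and constant from the prompt above.
b = {"Age": 0.101, "Gender": 0.34, "EducLevel": -5.1, "Income": 0.000142}
constant = 3.22

def prob_pref_dark(age, gender, educ, income):
    # Logit model: z = constant + sum(coefficient * value); P(PrefDark = 1) = 1 / (1 + e^(-z))
    z = (constant + b["Age"] * age + b["Gender"] * gender
         + b["EducLevel"] * educ + b["Income"] * income)
    return 1 / (1 + math.exp(-z))

# Hypothetical respondent (NOT Respondent 1 or 2): 30-year-old male, Associate's degree,
# earning 40,000 Euros -- just to show the mechanics of plugging values into the formula.
print(prob_pref_dark(age=30, gender=0, educ=1, income=40000))
```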

Part 3 (2 points)
Which Respondent has a higher probability of preferring a dark-colored car?
This is quite straightforward if you have Parts 1 and 2 correct.

 

Regression Modeling

Assignment Content

  1. Purpose 
    This assignment provides an opportunity to develop, evaluate, and apply bivariate and multivariate linear regression models.

    Resources: Microsoft Excel®, DAT565_v3_Wk5_Data_File

    Instructions:
    The Excel file for this assignment contains a database with information about the tax assessment value assigned to medical office buildings in a city. The following is a list of the variables in the database:

    • FloorArea: square feet of floor space
    • Offices: number of offices in the building
    • Entrances: number of customer entrances
    • Age: age of the building (years)
    • AssessedValue: tax assessment value (thousands of dollars)
    • Use the data to construct a model that predicts the tax assessment value assigned to medical office buildings with specific characteristics.
    • Construct a scatter plot in Excel with FloorArea as the independent variable and AssessmentValue as the dependent variable. Insert the bivariate linear regression equation and r^2 in your graph. Do you observe a linear relationship between the 2 variables?
    • Use Excel’s Analysis ToolPak to conduct a regression analysis of FloorArea and AssessmentValue. Is FloorArea a significant predictor of AssessmentValue?
    • Construct a scatter plot in Excel with Age as the independent variable and AssessmentValue as the dependent variable. Insert the bivariate linear regression equation and r^2 in your graph. Do you observe a linear relationship between the 2 variables?
    • Use Excel’s Analysis ToolPak to conduct a regression analysis of Age and Assessment Value. Is Age a significant predictor of AssessmentValue?
    • Construct a multiple regression model.
    • Use Excel’s Analysis ToolPak to conduct a regression analysis with AssessmentValue as the dependent variable and FloorArea, Offices, Entrances, and Age as independent variables. What is the overall fit r^2? What is the adjusted r^2?
    • Which predictors are considered significant if we work with α=0.05? Which predictors can be eliminated?
    • What is the final model if we only use FloorArea and Offices as predictors?
    • Suppose our final model is:
    • AssessedValue = 115.9 + 0.26 x FloorArea + 78.34 x Offices
    • What would be the assessed value of a medical office building with a floor area of 3500 sq. ft., 2 offices, that was built 15 years ago? Is this assessed value consistent with what appears in the database? (See the sketch after this list.)
    • Submit your assignment.
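For the prediction bullet above, a minimal plug-in check of the supposed final model; Python is used here only for illustration, and the same arithmetic works in a single Excel cell:

```python
# Plug-in check of the supposed final model from the bullet above
# (AssessedValue in thousands of dollars; Age drops out because it is not in this model).
floor_area, offices = 3500, 2
assessed = 115.9 + 0.26 * floor_area + 78.34 * offices
print(assessed)   # 115.9 + 910.0 + 156.68 = 1182.58, i.e. about $1.18 million
```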

      Resources

    • Center for Writing Excellence
    • Reference and Citation Generator
    • Grammar and Writing Guides