For this exercise, we use the Hepatitis Disease dataset from the UCI data repository. This data consists of 156 instances with 20 attributes.

Attribute information:

1. Class: DIE, LIVE

2. AGE: 10, 20, 30, 40, 50, 60, 70, 80

3. SEX: male, female

4. STEROID: no, yes

5. ANTIVIRALS: no, yes

6. FATIGUE: no, yes

7. MALAISE: no, yes

8. ANOREXIA: no, yes

9. LIVER BIG: no, yes

10. LIVER FIRM: no, yes

11. SPLEEN PALPABLE: no, yes

12. SPIDERS: no, yes

13. ASCITES: no, yes

14. VARICES: no, yes

15. BILIRUBIN: Continuous

16. ALK PHOSPHATE: Continuous

17. SGOT: Continuous

18. ALBUMIN: Continuous

19. PROTIME: Continuous

20. HISTOLOGY: no, yes

Question no 1: Use Excel and the hepatitis dataset. Answer the following questions:
(1+1+1+1+1+2=7)

a. What is the probability that a male patient is dead?

b. One patient has the value "?" for the ANOREXIA attribute. What is the most likely value of this attribute for that patient?

c. What is the probability that a patient in the age range [10,50] uses steroids? (Replace "?" with "Yes".)

d. Which is more likely: a person with no ANTIVIRALS being alive, or a person with MALAISE being dead?

e. Which age group is most likely to be dead? What are the probabilities? (Group the ages into 3 groups: 20-40, 40-60, 60-80.)

f. Is the AGE attribute normally distributed? Explain why or why not.

For Question no 1, you are allowed to use built-in Excel functions. As an example, for the probability of a male being dead, I would like to see something as follows:

"This question could be answered by finding XX and doing XXXX."

Show how you find XX.

Show how you do XXXX.

Therefore, the answer is:
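The "find XX, then do XXXX" pattern for part (a) can be sketched in Python as a relative frequency: count the male patients, count how many of them have Class = DIE, and divide. This assumes the question means P(DIE | male). The rows below are made-up toy records, not the hepatitis data, and the helper name `conditional_probability` is hypothetical. In Excel, the same ratio can be obtained by dividing a COUNTIFS over the SEX and Class columns by a COUNTIF over the SEX column.

```python
def conditional_probability(rows, given, target):
    """Estimate P(target | given) as a relative frequency.

    rows   : list of dicts, one per patient
    given  : (column, value) pair defining the conditioning subset
    target : (column, value) pair counted inside that subset
    """
    given_col, given_val = given
    target_col, target_val = target
    subset = [r for r in rows if r[given_col] == given_val]
    if not subset:
        raise ValueError("no rows match the conditioning value")
    hits = sum(1 for r in subset if r[target_col] == target_val)
    return hits / len(subset)


# Toy records for illustration only (NOT taken from the hepatitis dataset).
toy_rows = [
    {"SEX": "male", "Class": "DIE"},
    {"SEX": "male", "Class": "LIVE"},
    {"SEX": "male", "Class": "LIVE"},
    {"SEX": "female", "Class": "LIVE"},
    {"SEX": "female", "Class": "DIE"},
]

# Step 1 (find XX): the subset of male patients.
# Step 2 (do XXXX): the share of that subset with Class == "DIE".
p_die_given_male = conditional_probability(
    toy_rows, given=("SEX", "male"), target=("Class", "DIE")
)
print(f"P(DIE | male) on the toy rows: {p_die_given_male:.3f}")
```

The same helper covers parts (c) and (d) by changing the `given` and `target` pairs, once the real rows are loaded and any "?" values are handled as the question specifies.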

Question no 2: Use Excel/Python and the Hepatitis dataset: (3+2=5)

Create 3 different visualizations showing the mean and standard deviation (or standard error, as it is referred to in this context) of the sampling distributions of the sample mean age for sample sizes 2, 5, and 10.

What happens to the mean of the sample means of age as the sample size is increased? What happens to the standard error?

Description: In addition to doing the work in Python or Excel, you need to write a descriptive answer that summarizes your findings.
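One possible way to set up Question no 2 in Python is sketched below, using only the standard library for the numbers and matplotlib for the plots. The `ages` list is a synthetic stand-in generated for illustration, not the real AGE column; the function names, the 1000 repetitions, and sampling with replacement are all assumptions of this sketch. For sampling with replacement, the mean of the sample means should stay near the population mean, while the standard error should shrink roughly as sigma / sqrt(n).

```python
import math
import random
import statistics


def sample_mean_distribution(population, sample_size, n_samples=1000, seed=0):
    """Means of n_samples random samples of the given size.

    Samples are drawn with replacement (an assumption of this sketch).
    """
    rng = random.Random(seed)
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# Synthetic stand-in for the AGE column: NOT the real hepatitis data.
_rng = random.Random(42)
ages = [_rng.randint(10, 80) for _ in range(156)]

summary = {}
for n in (2, 5, 10):
    means = sample_mean_distribution(ages, n)
    summary[n] = {
        "mean_of_means": statistics.mean(means),
        "standard_error": statistics.stdev(means),  # spread of the sample means
        "sigma_over_sqrt_n": statistics.pstdev(ages) / math.sqrt(n),
    }
    print(n, summary[n])


def plot_sampling_distributions(population, sizes=(2, 5, 10)):
    """One histogram per sample size, marking the mean and +/- 1 standard error."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, axes = plt.subplots(1, len(sizes), figsize=(4 * len(sizes), 3), sharex=True)
    for ax, n in zip(axes, sizes):
        means = sample_mean_distribution(population, n)
        m, se = statistics.mean(means), statistics.stdev(means)
        ax.hist(means, bins=20)
        ax.axvline(m, color="black")                         # mean of the sample means
        ax.axvspan(m - se, m + se, alpha=0.2, color="gray")  # +/- 1 standard error
        ax.set_title(f"n = {n}: mean = {m:.1f}, SE = {se:.1f}")
        ax.set_xlabel("sample mean of AGE")
    fig.tight_layout()
    plt.show()
```

Calling `plot_sampling_distributions(ages)` with the real AGE values in place of the synthetic list would produce the three required visualizations side by side.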

Question no 3: Use Python (1+2+2=5)

a. Generate a discrete uniform distribution of population size 100 on the interval (1,10).

b. Consider a sample size of N=10. Simulate the sampling distribution of the sample mean (repeat 100 times). Draw the visualization.

c. Consider a sample size of N=30. What are the sample mean and sample standard deviation? (Repeat 100 times.) Draw the visualization.

[Code+ graphs]
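A minimal sketch of Question no 3 follows. It assumes the interval (1,10) means the integers 1 through 10 inclusive, and that each sample is drawn from the 100-value population without replacement; neither choice is stated in the question. The function names and the fixed seed are hypothetical, chosen only to make the run repeatable.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# a. Population of 100 values, discrete uniform on the integers 1..10 (inclusive).
population = [rng.randint(1, 10) for _ in range(100)]


def repeated_samples(population, sample_size, repeats=100):
    """Draw `repeats` samples without replacement; return (mean, stdev) per sample."""
    results = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)
        results.append((statistics.mean(sample), statistics.stdev(sample)))
    return results


# b. Sampling distribution of the sample mean for N = 10.
means_10 = [m for m, _ in repeated_samples(population, 10)]

# c. Sample mean and sample standard deviation for N = 30, repeated 100 times.
stats_30 = repeated_samples(population, 30)
means_30 = [m for m, _ in stats_30]

print("N=10: mean of sample means =", statistics.mean(means_10))
print("N=30: mean of sample means =", statistics.mean(means_30))
print("N=30: average sample stdev =", statistics.mean(s for _, s in stats_30))


def plot_means(means, sample_size):
    """Histogram of the simulated sample means for one sample size."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    plt.hist(means, bins=15)
    plt.axvline(statistics.mean(means), color="black")
    plt.title(f"Sampling distribution of the mean, N = {sample_size}")
    plt.xlabel("sample mean")
    plt.ylabel("count")
    plt.show()
```

Calling `plot_means(means_10, 10)` and `plot_means(means_30, 30)` draws the two visualizations; the N=30 histogram should be visibly narrower than the N=10 one.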

hepatitis-dux3c0en.numbers

Answered Same Day: Oct 13, 2022

Baljit answered on Oct 14 2022
