**NPTEL Data Science Using Python – Week 2: Assignment 2 Answers**

1. **What type of data is represented by a nominal scale?**

- Data that can be ordered or ranked.
- **Data that represents categories without a natural order.**
- Numerical data that can take infinitely many values.
- Numerical data that can only take fixed values.

2. **You are given a dataset containing information on a collection of books in a library, including the number of pages, genre, and publication year. Identify which of these attributes are qualitative and which are quantitative. Further, specify whether the qualitative data are nominal or ordinal.**

- Number of pages, genre, and publication year are quantitative.
- Number of pages, genre, and publication year are qualitative.
- **Number of pages and publication year are quantitative while genre is qualitative.**
- Number of pages and publication year are qualitative while genre is quantitative.

3. **Given the following dataset representing the ages of participants in a study: [22, 25, 19, 24, 22, 26, 28, 22, 21]. The median of these ages is ____________.**

**22**
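The median can be checked with a minimal sketch like the one below. It sorts the ages from the question, picks the middle element (the list has an odd length), and compares that with the standard library's `statistics.median`. The variable names are illustrative only.

```python
from statistics import median

# Ages from the question
ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]

# Sort the values; with 9 observations the median is the 5th sorted value
sorted_ages = sorted(ages)
middle_index = len(sorted_ages) // 2
manual_median = sorted_ages[middle_index]

print(sorted_ages)      # [19, 21, 22, 22, 22, 24, 25, 26, 28]
print(manual_median)    # 22
print(median(ages))     # 22
```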

4. **If a data point is below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR, it is considered (where Q1 is the first quartile, Q3 the third quartile, and IQR the interquartile range):**

- A median
- An average
- **An outlier**
- The mode
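The 1.5 × IQR rule can be sketched as follows. The data set is hypothetical (the ages from question 3 plus one made-up extreme value of 60), and the quartiles use the "inclusive" interpolation method of `statistics.quantiles`; other quartile conventions can give slightly different fences.

```python
from statistics import quantiles

# Hypothetical data: the study ages plus one invented extreme value (60)
data = [19, 21, 22, 22, 22, 24, 25, 26, 28, 60]

# quantiles(..., n=4) returns the three cut points [Q1, median, Q3]
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences defined by the 1.5 * IQR rule
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any point outside the fences is flagged as an outlier
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)               # 22.0 25.75 3.75
print(lower_fence, upper_fence)  # 16.375 31.375
print(outliers)                  # [60]
```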

5. **Which of the following statements are true regarding box plots?**

- **They display the five-number summary.**
- They cannot show outliers.
- The whiskers represent the range of the data.
- **The box represents the interquartile range.**
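As a rough illustration of what a box plot encodes, the sketch below computes a five-number summary (minimum, Q1, median, Q3, maximum) for the ages from question 3. It assumes the "inclusive" quartile method of `statistics.quantiles`; plotting libraries may use a different convention.

```python
from statistics import quantiles

# Ages from question 3, reused here purely as example data
ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]

# Q1, median, and Q3 are the three quartile cut points
q1, med, q3 = quantiles(ages, n=4, method="inclusive")

# Five-number summary: the box spans Q1 to Q3, with the median drawn inside it
summary = (min(ages), q1, med, q3, max(ages))
box_height = q3 - q1  # the interquartile range

print(summary)     # (19, 22.0, 22.0, 25.0, 28)
print(box_height)  # 3.0
```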

6. **Which of the following is not a goal of inferential statistics?**

- **To describe the sample data.**
- To make conclusions from a sample about the population.
- To predict future economic conditions.
- To understand consumer behavior.

7. **Suppose 3% of all emails are spam, the word “free” appears in 10% of all emails, and “free” appears in 60% of spam emails. The probability that an email is spam, given that it contains the word “free,” is _____.**

**0.18**
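This follows from Bayes' theorem, P(spam | free) = P(free | spam) × P(spam) / P(free). A minimal sketch of the arithmetic, using the three probabilities stated in the question (variable names are illustrative):

```python
# Probabilities given in the question
p_spam = 0.03             # P(spam)
p_free = 0.10             # P("free" appears)
p_free_given_spam = 0.60  # P("free" appears | spam)

# Bayes' theorem: P(spam | free) = P(free | spam) * P(spam) / P(free)
p_spam_given_free = p_free_given_spam * p_spam / p_free

print(round(p_spam_given_free, 2))  # 0.18
```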

8. **In hypothesis testing, if the observed p-value is less than the chosen significance level, we _______ the null hypothesis in favor of the alternative hypothesis.**

**reject**

9. **Which conditions necessitate the use of a two-tailed test?**

- **When testing if a new teaching method differs from the traditional method.**
- **When the direction of the effect is not specified.**
- When the alternative hypothesis specifies a direction.
- **When assessing if a drug has any effect, regardless of the direction.**
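To illustrate the difference, the sketch below computes one-tailed and two-tailed p-values for a z statistic under a standard normal distribution. The statistic `z = 2.1` and the significance level `alpha = 0.05` are hypothetical values chosen only for the example.

```python
from statistics import NormalDist

z = 2.1       # hypothetical test statistic, not taken from the assignment
alpha = 0.05  # hypothetical significance level

# Probability of a value at least as extreme in one direction
upper_tail = 1 - NormalDist().cdf(abs(z))

p_one_tailed = upper_tail      # alternative specifies a direction
p_two_tailed = 2 * upper_tail  # alternative is "differs", either direction

print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")

# Decision rule: reject H0 when the p-value is below alpha
decision = "reject H0" if p_two_tailed < alpha else "fail to reject H0"
print(decision)
```

Because the two-tailed p-value is twice the one-tailed value, a result can be significant in a one-tailed test yet not in a two-tailed test at the same significance level.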

10. **Which of the following steps are parts of a hypothesis test’s decision-making process?**

- **Stating the hypotheses.**
- **Selecting a significance level.**
- **Choosing the appropriate test and performing calculations.**
- Consulting subject matter experts.

11. **Given a dataset with a sample mean of 100, a population mean of 105, a standard deviation of 15, and a sample size of 30, the one-sample t-test statistic is ________.**

**-1.8257**
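The statistic comes from t = (x̄ − μ) / (s / √n). A minimal sketch of the calculation with the numbers from the question, assuming the stated standard deviation is the sample standard deviation s:

```python
import math

# Values given in the question
sample_mean = 100
population_mean = 105
std_dev = 15      # assumed to be the sample standard deviation s
sample_size = 30

# Standard error of the mean: s / sqrt(n)
standard_error = std_dev / math.sqrt(sample_size)

# One-sample t statistic: (sample mean - hypothesized mean) / standard error
t_stat = (sample_mean - population_mean) / standard_error

print(round(standard_error, 4))  # 2.7386
print(round(t_stat, 4))          # -1.8257
```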

12. **Conditional probability is used when:**

- **Two events are independent.**
- Two events are dependent.
- **One event has already occurred.**
- **Updating probabilities with new information.**

13. **Which measure of centrality indicates that the most common energy level score reported by participants was?**

- 6.5
- **8**
- 7
- Cannot be determined.

14. **The measure of spread indicating the average distance of each data point from the mean is the:**

- Range
- Correlation
- **Standard Deviation**
- Median
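As a small sketch of how the standard deviation is obtained, the code below computes it by hand for the ages from question 3 and compares the result with `statistics.pstdev` (population form) and `statistics.stdev` (sample form). Strictly speaking, it is the square root of the average squared distance from the mean, which is why it is described loosely as an average distance.

```python
import math
from statistics import mean, pstdev, stdev

# Ages from question 3, reused here purely as example data
ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]

mu = mean(ages)

# Squared distance of each point from the mean
squared_deviations = [(x - mu) ** 2 for x in ages]

# Population standard deviation divides by n
manual_population_sd = math.sqrt(sum(squared_deviations) / len(ages))

# Sample standard deviation divides by n - 1
manual_sample_sd = math.sqrt(sum(squared_deviations) / (len(ages) - 1))

print(manual_population_sd, pstdev(ages))
print(manual_sample_sd, stdev(ages))
```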

15. **Based on the box and whisker plot, if several data points were identified outside the range of typical values, these points are considered ____________.**

**Outliers**