NPTEL Data Science Using Python – Week 2: Assignment 2 Answers
1. What type of data is represented by a nominal scale?
- Data that can be ordered or ranked.
- Data that represents categories without a natural order.
- Numerical data that can take infinitely many values.
- Numerical data that can only take fixed values.
2. You are given a dataset containing information on a collection of books in a library, including the number of pages, genre, and publication year. Identify which of these attributes are qualitative and which are quantitative. Further, specify whether the qualitative data are nominal or ordinal.
- Number of pages, genre, and publication year are quantitative.
- Number of pages, genre, and publication year are qualitative.
- Number of pages and publication year are quantitative while genre is qualitative.
- Number of pages and publication year are qualitative while genre is quantitative.
3. The following dataset represents the ages of participants in a study: [22, 25, 19, 24, 22, 26, 28, 22, 21]. The median of these ages is ____________.
- 22
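The median in question 3 can be checked with a short Python sketch. It sorts the ages, picks the middle element (the list has an odd length, so there is a single middle value), and compares that with the standard library's `statistics.median`. The variable names are illustrative and not part of the assignment.

```python
from statistics import median

ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]

# Sort the values; with 9 observations the median is the 5th sorted value.
sorted_ages = sorted(ages)
middle_index = len(sorted_ages) // 2
manual_median = sorted_ages[middle_index]

# The standard library gives the same result.
median_age = median(ages)

print(sorted_ages)    # [19, 21, 22, 22, 22, 24, 25, 26, 28]
print(manual_median)  # 22
print(median_age)     # 22
```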
4. If a data point is below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR (where Q1 is the first quartile, Q3 is the third quartile, and IQR is the interquartile range), it is considered:
- A median
- An average
- An outlier
- The mode
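The 1.5 × IQR rule from question 4 can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names `iqr_fences` and `find_outliers` are hypothetical, the value 60 is made up to demonstrate the rule, and quartiles are computed with `statistics.quantiles` using its default method (other quartile conventions can give slightly different fences).

```python
from statistics import quantiles

def iqr_fences(data, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(data, k=1.5):
    """Return the values that fall outside the fences."""
    lower, upper = iqr_fences(data, k)
    return [x for x in data if x < lower or x > upper]

# Ages from question 3, reused here purely as sample data.
ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]

# Assumption: 60 is an invented extreme value added only for illustration.
ages_with_extreme = ages + [60]

print(find_outliers(ages))               # []
print(find_outliers(ages_with_extreme))  # [60]
```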
5. Which of the following statements are true regarding box plots?
- They display the five-number summary.
- They cannot show outliers.
- The whiskers represent the range of the data.
- The box represents the interquartile range.
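The five-number summary that a box plot is built from (minimum, Q1, median, Q3, maximum) can be computed with a short sketch. The function name `five_number_summary` is hypothetical, the ages from question 3 are reused only as sample data, and the quartiles follow the default method of `statistics.quantiles`, so plotting libraries that use a different convention may report slightly different Q1 and Q3 values.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = quantiles(data, n=4)  # quartile cut points
    return min(data), q1, q2, q3, max(data)

ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]
summary = five_number_summary(ages)

minimum, q1, med, q3, maximum = summary
print(summary)
print("IQR (height of the box):", q3 - q1)
```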
6. Which of the following is not a goal of inferential statistics?
- To describe the sample data.
- To make conclusions from a sample about the population.
- To predict future economic conditions.
- To understand consumer behavior.
7. Suppose 3% of all emails are spam, the word “free” appears in 10% of all emails, and “free” appears in 60% of spam emails. The probability that an email is spam, given that it contains the word “free,” is _____.
- 0.18
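The value in question 7 follows from Bayes' theorem, P(spam | free) = P(free | spam) × P(spam) / P(free). A minimal sketch of the arithmetic, with illustrative variable names:

```python
# Probabilities stated in the question.
p_spam = 0.03             # P(spam)
p_free = 0.10             # P("free" appears)
p_free_given_spam = 0.60  # P("free" appears | spam)

# Bayes' theorem: P(spam | free) = P(free | spam) * P(spam) / P(free)
p_spam_given_free = p_free_given_spam * p_spam / p_free

print(round(p_spam_given_free, 2))  # 0.18
```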
8. In hypothesis testing, if the observed p-value is less than the chosen significance level, we _______ the null hypothesis in favor of the alternative hypothesis.
- reject
9. Which conditions necessitate the use of a two-tailed test?
- When testing if a new teaching method differs from the traditional method.
- When the direction of the effect is not specified.
- When the alternative hypothesis specifies a direction.
- When assessing if a drug has any effect, regardless of the direction.
10. Which of the following steps are parts of a hypothesis test’s decision-making process?
- Stating the hypotheses.
- Selecting a significance level.
- Choosing the appropriate test and performing calculations.
- Consulting subject matter experts.
11. A dataset has a sample mean of 100, a population mean of 105, a standard deviation of 15, and a sample size of 30. The one-sample t-test value is ________.
- -1.8257
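The one-sample t statistic in question 11 is t = (sample mean − population mean) / (s / √n). The sketch below evaluates it with the numbers from the question; the function name `one_sample_t` is hypothetical, and the stated standard deviation is assumed to be the sample standard deviation.

```python
import math

def one_sample_t(sample_mean, population_mean, sample_std, n):
    """t = (sample_mean - population_mean) / (sample_std / sqrt(n))."""
    standard_error = sample_std / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error

t_value = one_sample_t(sample_mean=100, population_mean=105, sample_std=15, n=30)

print(round(t_value, 4))  # -1.8257
```

The statistic has n − 1 = 29 degrees of freedom, which is what a t-table or a library routine would use to turn it into a p-value.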
12. Conditional probability is used when:
- Two events are independent.
- Two events are dependent.
- One event has already occurred.
- Updating probabilities with new information.
13. According to the measure of centrality that identifies the most common value (the mode), the most common energy level score reported by participants was:
- 6.5
- 8
- 7
- Cannot be determined.
14. The measure of spread indicating the average distance of each data point from the mean is the:
- Range
- Correlation
- Standard Deviation
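The measure of spread described in question 14 can be sketched as below. Strictly speaking, the standard deviation is the square root of the average squared distance from the mean rather than a plain average of distances. The ages from question 3 are reused only as sample data, and the population form (dividing by n) is an assumption made here for illustration; `statistics.stdev` would divide by n − 1 instead.

```python
import math
from statistics import mean, pstdev

ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]

# Manual computation: square root of the mean squared deviation from the mean.
center = mean(ages)
squared_deviations = [(x - center) ** 2 for x in ages]
manual_std = math.sqrt(sum(squared_deviations) / len(ages))

# Population standard deviation from the standard library for comparison.
library_std = pstdev(ages)

print(manual_std)
print(library_std)
```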
- Median
15. Based on the box and whisker plot, if several data points were identified outside the range of typical values, these points are considered ____________.
- Outliers