NPTEL Data Science Using Python – Week 2: Assignment 2 Answers

1. What type of data is represented by a nominal scale?
  • Data that can be ordered or ranked.
  • Data that represents categories without a natural order.
  • Numerical data that can take infinitely many values.
  • Numerical data that can only take fixed values.

2. You are given a dataset containing information on a collection of books in a library, including the number of pages, genre, and publication year. Identify which of these attributes are qualitative and which are quantitative. Further, specify whether the qualitative data are nominal or ordinal.

  • Number of pages, genre, and publication year are quantitative.
  • Number of pages, genre, and publication year are qualitative.
  • Number of pages and publication year are quantitative while genre is qualitative.
  • Number of pages and publication year are qualitative while genre is quantitative.

3. The following dataset represents the ages of participants in a study: [22, 25, 19, 24, 22, 26, 28, 22, 21]. The median of these ages is ____________.

  • 22
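The median in question 3 can be checked with a minimal sketch using Python's standard `statistics` module. The variable names `ages` and `median_age` are illustrative only and are not part of the assignment.

```python
import statistics

# Ages of the study participants, as given in question 3.
ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]

# statistics.median sorts the data and returns the middle value
# (for 9 values, the 5th value in sorted order).
median_age = statistics.median(ages)
print(median_age)  # 22
```

Sorted, the data read 19, 21, 22, 22, 22, 24, 25, 26, 28, so the middle (fifth) value is 22.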

4. If a data point is below Q1 – 1.5 × IQR or above Q3 + 1.5 × IQR (where Q1 is the first quartile, Q3 is the third quartile, and IQR is the interquartile range), it is considered:

  • A median
  • An average
  • An outlier
  • The mode
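The 1.5 × IQR rule from question 4 can be sketched as follows. The dataset `values` is a made-up example, not data from the assignment, and the quartiles here use the "inclusive" method of `statistics.quantiles`; other quartile conventions can give slightly different fences.

```python
import statistics

# Hypothetical sample data (an assumption for illustration only).
values = [10, 12, 13, 14, 15, 16, 18, 45]

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Fences defined by the 1.5 * IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any point outside the fences is flagged as an outlier.
outliers = [v for v in values if v < lower_fence or v > upper_fence]
print(outliers)  # [45]
```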

5. Which of the following statements are true regarding box plots?

  • They display the five-number summary.
  • They cannot show outliers.
  • The whiskers represent the range of the data.
  • The box represents the interquartile range.

6. Which of the following is not a goal of inferential statistics?

  • To describe the sample data.
  • To make conclusions from a sample about the population.
  • To predict future economic conditions.
  • To understand consumer behavior.

7. Suppose 3% of all emails are spam, the word “free” appears in 10% of all emails, and “free” appears in 60% of spam emails. The probability that an email is spam, given that it contains the word “free,” is _____.

  • 0.18
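Question 7 is an application of Bayes' theorem, P(spam | free) = P(free | spam) × P(spam) / P(free). A minimal sketch of the arithmetic, with variable names chosen here for illustration:

```python
# Probabilities stated in question 7.
p_spam = 0.03             # P(spam)
p_free = 0.10             # P("free" appears)
p_free_given_spam = 0.60  # P("free" appears | spam)

# Bayes' theorem: P(spam | free) = P(free | spam) * P(spam) / P(free)
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(round(p_spam_given_free, 2))  # 0.18
```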

8. In hypothesis testing, if the observed p-value is less than the chosen significance level, we _______ the null hypothesis in favor of the alternative hypothesis.

  • reject

9. Which conditions necessitate the use of a two-tailed test?

  • When testing if a new teaching method differs from the traditional method.
  • When the direction of the effect is not specified.
  • When the alternative hypothesis specifies a direction.
  • When assessing if a drug has any effect, regardless of the direction.

10. Which of the following steps are parts of a hypothesis test’s decision-making process?

  • Stating the hypotheses.
  • Selecting a significance level.
  • Choosing the appropriate test and performing calculations.
  • Consulting subject matter experts.

11. A dataset has a sample mean of 100, a hypothesized population mean of 105, a standard deviation of 15, and a sample size of 30. The one-sample t-test statistic is ________.

  • -1.8257
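The statistic in question 11 follows from the one-sample t formula, t = (sample mean − population mean) / (s / √n). A short sketch of the calculation, assuming the stated standard deviation is the sample standard deviation; the variable names are illustrative:

```python
import math

# Values stated in question 11.
sample_mean = 100
population_mean = 105
std_dev = 15
n = 30

# Standard error of the mean: s / sqrt(n)
standard_error = std_dev / math.sqrt(n)

# One-sample t statistic.
t_stat = (sample_mean - population_mean) / standard_error
print(round(t_stat, 4))  # -1.8257
```

Since 15 / √30 ≈ 2.7386, the statistic is −5 / 2.7386 ≈ −1.8257.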

12. Conditional probability is used when:

  • Two events are independent.
  • Two events are dependent.
  • One event has already occurred.
  • Updating probabilities with new information.

13. The most common energy level score reported by participants (the mode) was:

  • 6.5
  • 8
  • 7
  • Cannot be determined.

14. The measure of spread indicating the average distance of each data point from the mean is the:

  • Range
  • Correlation
  • Standard Deviation
  • Median
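As an illustration of the measure of spread in question 14, the sketch below computes a standard deviation by hand and compares it with `statistics.pstdev`. It reuses the ages from question 3 purely as sample data, and it uses the population form (dividing by n); that choice is an assumption made here, since the question does not specify one.

```python
import math
import statistics

# Sample data borrowed from question 3 (for illustration only).
ages = [22, 25, 19, 24, 22, 26, 28, 22, 21]

mean_age = sum(ages) / len(ages)

# Population standard deviation: square root of the mean squared
# deviation of each data point from the mean.
squared_deviations = [(x - mean_age) ** 2 for x in ages]
manual_std = math.sqrt(sum(squared_deviations) / len(ages))

# The standard library gives the same result.
library_std = statistics.pstdev(ages)
print(manual_std, library_std)
```

Strictly speaking, the standard deviation is the root of the mean squared deviation rather than a plain average of distances, which is why the squares and the square root appear above.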

15. Based on the box and whisker plot, if several data points were identified outside the range of typical values, these points are considered ____________.

  • Outliers
