Descriptive statistics is a branch of statistics that summarizes and describes the main features of a dataset.
- It does not involve prediction or inference.
- It focuses on understanding data characteristics through numerical summaries and visualizations.
- It is a key step in Exploratory Data Analysis (EDA).
EDA = Descriptive Statistics + Data Visualization
Dataset and Data Types
A dataset is a collection of data objects (also called records, patterns, samples, or observations).
Each data object is described by attributes (also called features or properties).
Example: Sample Patient Table (Table 2.2)
Patient ID | Name | Age | Blood Test | Fever | Disease |
---|---|---|---|---|---|
1 | John | 21 | Negative | Low | No |
2 | Andre | 36 | Positive | High | Yes |
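As a minimal sketch, the two rows of the sample patient table can be held as a list of data objects, one dictionary per patient, with the attributes as keys. The lowercase key names (`patient_id`, `blood_test`, and so on) are an illustrative choice and do not come from the table itself.

```python
# Each dict is one data object (row); each key is one attribute (column).
# Key names are illustrative; the values are the two rows of the sample table.
patients = [
    {"patient_id": 1, "name": "John", "age": 21,
     "blood_test": "Negative", "fever": "Low", "disease": "No"},
    {"patient_id": 2, "name": "Andre", "age": 36,
     "blood_test": "Positive", "fever": "High", "disease": "Yes"},
]

# The attributes are the keys shared by every data object.
attributes = list(patients[0].keys())

print(len(patients))  # number of data objects
print(attributes)     # attribute names
```

A list of dictionaries keeps the example dependency-free; a dataframe library would serve the same purpose for larger tables.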
Types of Data (Figure 2.1)

Data is broadly classified into:
1. Categorical (Qualitative) Data
- Describes qualities or labels rather than measurable quantities.
- Further classified into:
- Nominal Data: No inherent order, e.g., Patient ID, Blood Group, Gender. Only equality comparisons (=, ≠) are valid.
- Ordinal Data: Has a meaningful order, e.g., Fever = {Low, Medium, High}. Values can be ranked, but the exact differences between them are unknown.
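The difference between nominal and ordinal data can be sketched in a few lines. The blood group values and the `FEVER_RANK` mapping below are assumptions made for illustration; the rank numbers only encode order, so the gaps between them carry no meaning.

```python
# Nominal: labels can only be tested for equality.
blood_group_a = "O+"   # hypothetical value
blood_group_b = "AB-"  # hypothetical value
same_group = blood_group_a == blood_group_b

# Ordinal: an explicit rank makes ordering comparisons meaningful.
# The integers are arbitrary codes for position, not measurements.
FEVER_RANK = {"Low": 0, "Medium": 1, "High": 2}


def fever_is_higher(a: str, b: str) -> bool:
    """Return True if fever level `a` ranks above fever level `b`."""
    return FEVER_RANK[a] > FEVER_RANK[b]


readings = ["High", "Low", "Medium"]
ranked = sorted(readings, key=FEVER_RANK.get)  # order by rank, lowest first

print(same_group)
print(fever_is_higher("High", "Low"))
print(ranked)
```

Sorting the fever labels alphabetically would give the wrong order, which is why the rank is stated explicitly.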
2. Numerical (Quantitative) Data
- Represents measurable quantities.
- Subtypes:
- Interval Data: Numeric values with meaningful differences, but no true zero, e.g., Temperature in Celsius or Fahrenheit. Operations allowed: + and −.
- Ratio Data: Has a meaningful zero and allows all mathematical operations, e.g., Age, Height, Weight.
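A short sketch of why ratios are not meaningful for interval data: the two Celsius readings below are hypothetical, and the ages are taken from the sample patient table. Dividing Celsius values gives a number, but it changes once the same temperatures are expressed in kelvin, a scale that does have a true zero.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert Celsius (interval scale) to kelvin (ratio scale)."""
    return c + 273.15


t1_c, t2_c = 10.0, 20.0  # hypothetical temperature readings in Celsius

# Differences are meaningful on an interval scale.
diff_c = t2_c - t1_c

# The Celsius ratio is 2.0, but 20 °C is not "twice as hot" as 10 °C:
# the same two temperatures in kelvin give a ratio close to 1.
naive_ratio = t2_c / t1_c
kelvin_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Age has a true zero, so its ratio can be interpreted directly.
age_ratio = 36 / 21

print(diff_c, naive_ratio, round(kelvin_ratio, 3), round(age_ratio, 2))
```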
Based on Values
Type | Description | Example |
---|---|---|
Discrete | Countable integers | Employee ID, Survey scores |
Continuous | Values with decimal precision, measurable | Age (e.g., 12.5), Height, Weight |
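One practical consequence, sketched under stated assumptions: discrete values can be tallied directly, while continuous values rarely repeat and are usually grouped into intervals before counting. The survey scores, heights, and the 10 cm bin width below are all hypothetical.

```python
from collections import Counter

survey_scores = [3, 5, 4, 5, 3, 5]          # discrete, hypothetical
heights_cm = [158.2, 171.5, 164.9, 176.0]   # continuous, hypothetical

# Discrete values can be counted as they are.
score_counts = Counter(survey_scores)


def bin_start(value: float, width: float = 10.0) -> int:
    """Return the lower edge of the fixed-width bin that contains `value`."""
    return int(value // width * width)


# Continuous values are grouped into bins first, then counted.
height_bins = Counter(bin_start(h) for h in heights_cm)

print(score_counts)
print(height_bins)
```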
Based on Number of Variables (Figure 2.2)

Type | Description |
---|---|
Univariate | One variable per record |
Bivariate | Two variables |
Multivariate | Three or more variables |
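The classification in the table depends only on how many variables are analyzed together. A minimal sketch, where `analysis_type` is a hypothetical helper name introduced for illustration:

```python
def analysis_type(variables: list[str]) -> str:
    """Classify an analysis by the number of variables it considers."""
    n = len(variables)
    if n == 0:
        raise ValueError("at least one variable is required")
    if n == 1:
        return "Univariate"
    if n == 2:
        return "Bivariate"
    return "Multivariate"


print(analysis_type(["Age"]))
print(analysis_type(["Age", "Fever"]))
print(analysis_type(["Age", "Fever", "Disease"]))
```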