Data consists of raw, unprocessed facts that a computer system can store, transmit, and process. It may take the form of text, numbers, images, audio, video, and so on. Data becomes meaningful only once it has been processed into information.
6 V’s of Big Data:
Big Data is defined by six main characteristics, also known as the 6 V’s. These elements differentiate Big Data from traditional small-scale data.
1️⃣ Volume
- Refers to the huge size of data generated and collected every second.
- While traditional data is measured in GB (Gigabytes) or TB (Terabytes), Big Data is measured in PB (Petabytes) or EB (Exabytes).
- Example: Large platforms such as Facebook and YouTube are commonly reported to generate petabytes of data per day.
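To make the unit scale concrete, here is a minimal sketch that converts a byte count into the units mentioned above. It assumes decimal (SI) units, where each step is a factor of 1,000; the function name `human_size` and the upload figures are hypothetical and chosen only for illustration.

```python
# Decimal (SI) storage units: each step up is a factor of 1,000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def human_size(num_bytes: int) -> str:
    """Format a byte count in the first unit that brings the value below 1,000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000

# Hypothetical workload: 500 million uploads per day at 4 MB each.
daily_bytes = 500_000_000 * 4_000_000
print(human_size(daily_bytes))  # 2.0 PB
```

Under these assumed numbers, a single day of uploads already reaches the petabyte range, which is why Big Data is not measured in GB or TB.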
2️⃣ Velocity
- Describes the speed at which data is generated, transmitted, and processed.
- With the rise of IoT, social media, and sensors, data is created in real-time or near real-time.
- Data that is generated quickly must be processed equally quickly so that decisions can be made in time.
- Example: Online stock market trading or live traffic updates.
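One common way to keep up with fast-arriving data is to process each event as it arrives and keep only a recent window in memory. The sketch below counts events seen in the last 10 seconds; the class name `SlidingWindowCounter` and the arrival times are hypothetical, and it assumes timestamps arrive in non-decreasing order.

```python
from collections import deque

class SlidingWindowCounter:
    """Counts events seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def add(self, timestamp: float) -> int:
        """Record one event and return how many events are still in the window."""
        self.timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        while timestamp - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps)

counter = SlidingWindowCounter(window_seconds=10)
arrivals = [0.0, 2.0, 3.5, 9.0, 11.0, 12.5, 25.0]  # seconds, illustrative only
counts = [counter.add(t) for t in arrivals]
print(counts)  # [1, 2, 3, 4, 4, 4, 1]
```

Because old events are discarded as new ones arrive, memory use depends on the window size rather than on the total amount of data seen.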
3️⃣ Variety
- Refers to the different types, forms, and sources of data.
- Forms: Structured (tables), semi-structured (JSON, XML), and unstructured (text, images, video).
- Functions: Human conversations, archived records, sensor logs, etc.
- Sources: Social media, public datasets, multimedia content.
- Example: A single YouTube video can have text (title), audio, video, and viewer comments.
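The three forms listed above can be illustrated with a short sketch that reads one structured, one semi-structured, and one unstructured sample using only the Python standard library. All field names and values here are hypothetical.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
csv_text = "video_id,title,views\nv1,Intro to Big Data,1200\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing keys; nesting and optional fields are allowed.
json_text = '{"video_id": "v1", "tags": ["data", "tutorial"], "stats": {"likes": 85}}'
doc = json.loads(json_text)

# Unstructured: free text with no schema; meaning has to be extracted.
comment = "Great explanation, the part about velocity finally made sense!"
words = comment.lower().replace(",", "").replace("!", "").split()

print(rows[0]["title"])       # Intro to Big Data
print(doc["stats"]["likes"])  # 85
print(len(words))             # 9
```

Note that the structured and semi-structured samples can be queried by field name directly, while the unstructured comment needs extra processing before anything can be asked of it.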
4️⃣ Veracity
- Indicates the accuracy, reliability, and trustworthiness of the data.
- Data may contain errors, noise, duplicates, or inconsistencies due to human or technical issues.
- Veracity affects the confidence level of any analysis.
- Example: Inaccurate data in a medical report can lead to wrong treatment.
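The kinds of problems listed above (missing values, implausible values, duplicates) can be counted before any analysis is run. This is a minimal sketch; the records, the function name `quality_report`, and the plausible range of 20 to 250 are assumptions made for illustration, not clinical guidance.

```python
# Hypothetical patient readings; field names and ranges are illustrative only.
records = [
    {"id": 1, "heart_rate": 72},
    {"id": 2, "heart_rate": None},  # missing value
    {"id": 3, "heart_rate": 720},   # implausible value (likely a typo)
    {"id": 1, "heart_rate": 72},    # exact duplicate of the first record
]

def quality_report(rows, low=20, high=250):
    """Count missing, out-of-range, and duplicate rows."""
    seen = set()
    report = {"missing": 0, "out_of_range": 0, "duplicates": 0}
    for row in rows:
        key = (row["id"], row["heart_rate"])
        if key in seen:
            report["duplicates"] += 1
            continue
        seen.add(key)
        if row["heart_rate"] is None:
            report["missing"] += 1
        elif not low <= row["heart_rate"] <= high:
            report["out_of_range"] += 1
    return report

report = quality_report(records)
print(report)  # {'missing': 1, 'out_of_range': 1, 'duplicates': 1}
```

A report like this gives a rough measure of how much confidence the data deserves before conclusions are drawn from it.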
5️⃣ Validity
- Refers to whether the data is correct, relevant, and appropriate for the specific purpose it is being used for.
- Invalid data can lead to misleading results or poor decisions.
- Validity ensures that data is usable and aligned with business goals.
- Example: Using outdated customer data for a new marketing campaign may reduce the campaign's effectiveness.
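One simple validity check for the example above is to keep only records that were updated recently enough for the purpose at hand. The sketch below assumes a one-year cutoff; the customer records, the function name `is_fresh`, and the cutoff itself are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical customer records with the date each was last updated.
customers = [
    {"name": "A", "last_updated": date(2024, 5, 1)},
    {"name": "B", "last_updated": date(2019, 3, 15)},
    {"name": "C", "last_updated": date(2023, 11, 20)},
]

def is_fresh(record, today, max_age_days=365):
    """Return True if the record was updated within the allowed age."""
    return today - record["last_updated"] <= timedelta(days=max_age_days)

today = date(2024, 6, 1)  # fixed date so the result is reproducible
usable = [c["name"] for c in customers if is_fresh(c, today)]
print(usable)  # ['A', 'C']
```

The right cutoff depends on the purpose: data that is too old for a marketing campaign may still be perfectly valid for a long-term trend analysis.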
6️⃣ Value
- Represents the usefulness and importance of the data.
- The real power of Big Data lies in the value it provides through insights and improved decisions.
- Not all data is valuable; extracting the right insights is key.
- Example: Analyzing customer behavior can increase sales and customer satisfaction.