History

The practice of collecting and using large amounts of data has been around for a long time, but the concept gained traction in the early 2000s, when industry analyst Doug Laney articulated the definition that made big data mainstream.

Notable moments in the history of big data include:

1881: Herman Hollerith devises a punched-card tabulating machine to speed the processing of US Census data, an early milestone in mechanized data processing.

1928: German engineer Fritz Pfleumer develops a method of storing data magnetically on tape.

1948: Claude Shannon publishes "A Mathematical Theory of Communication", laying the foundation of information theory.

1970: Edgar F. Codd presents the relational model for databases.

1976: Material Requirements Planning (MRP) systems come into widespread commercial use.

1989: Tim Berners-Lee proposes the World Wide Web.

2001: Doug Laney publishes the "3 V's" (volume, velocity, variety) that become the standard framing of big data.

2005: Hadoop, an open-source framework for storing and processing large data sets, is created.

2007: The term big data is introduced to a mass audience.

2014: Usage of ERP systems shifts toward the cloud, and the Internet of Things sees widespread adoption.

2017: The rate at which data is created rises sharply worldwide.

The term big data refers to volumes of data so large, and growing so quickly, that they are hard to manage with traditional tools. It encompasses large, complex, structured, and unstructured data types that businesses analyze for insights to support sound decisions and strategic moves. Big data is generated and transferred from a wide variety of sources.

Types of Big Data

There are three types of big data:

  1. Structured: Structured data is data whose format is known in advance, so it can be stored, accessed, and processed in a fixed format. For instance, employee records stored in a database table.
  2. Unstructured: Unstructured data has no predefined structure for storing and managing it. It spans heterogeneous sources and file types such as text, images, and videos. For instance, the mixed output of a Google Search.
  3. Semi-Structured: Semi-structured data combines aspects of the structured and unstructured forms: it carries organizational markers such as tags or keys, but does not conform to a fixed schema. For instance, data in an XML file (illustrated in the sketch after this list).
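
To make the distinction concrete, here is a minimal Python sketch that represents the same hypothetical employee record in all three forms, using only the standard library; the table, field, and record values are illustrative assumptions rather than data from any real system.

```python
# A minimal sketch contrasting the three forms of big data using one
# hypothetical employee record. All names and values are illustrative.
import json
import sqlite3
import xml.etree.ElementTree as ET

# Structured: the schema (columns and types) is fixed in advance,
# so the data can be stored, queried, and processed in a known format.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employees VALUES (?, ?, ?)", (1, "A. Rivera", "Sales"))
row = conn.execute("SELECT name, dept FROM employees WHERE id = 1").fetchone()
print("structured:", row)

# Semi-structured: self-describing tags or keys impose some organization,
# but no rigid schema is enforced on every record.
xml_doc = ET.fromstring(
    "<employee><name>A. Rivera</name><dept>Sales</dept></employee>"
)
print("semi-structured (XML):", xml_doc.findtext("name"), xml_doc.findtext("dept"))

record = json.loads('{"name": "A. Rivera", "dept": "Sales", "skills": ["CRM"]}')
print("semi-structured (JSON):", record["skills"])

# Unstructured: free text (or images, video) with no inherent fields;
# extracting values requires parsing or further analysis.
note = "A. Rivera from Sales closed three new accounts this quarter."
print("unstructured:", note)
```

The practical difference shows up at query time: the structured row can be filtered with SQL, the semi-structured documents are navigated by tag or key, and the unstructured note must be parsed or analyzed before any field can be extracted.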

5 V’s of Big Data

These V's are often described as the defining characteristics of big data. The original definition listed three V's; today, five are commonly cited.

  1. Volume: As the name suggests, volume refers to the sheer size of the data. Businesses collect data from a variety of sources, such as business transactions, email lists, images, and videos, and the scale of that data plays a major role in determining the insights it can yield.
  2. Velocity: Velocity refers to the speed at which data is generated by different sources, such as business transactions, networks, logs, and social media connections.
  3. Value: However much data an organization collects and stores, it counts for little unless it is worth something. Data is useful to a business only when it yields valuable insights; no data comes with insights built in, so it must be converted into something valuable.
  4. Veracity: Because data arrives from many different sources, its quality varies. Veracity refers to the inconsistencies and uncertainties in the data a business stores and manages.
  5. Variety: Variety refers to the nature of the data a business stores and manages, namely structured, unstructured, and semi-structured, drawn from heterogeneous sources.