Big Data is a term for data sets so large that traditional software isn’t able to process or manage them.
Where does so much information come from?
The internet and social networks produce an increasing amount of data in the form of images, videos, comments, messages, likes, etc. The simple act of browsing the internet produces data. The Internet of Things will increase the quantity of information even further.
Data categories in Big Data
Structured
Data that is standardized and organized in lists, tables, or other rigid structures. It consists of text and numbers that are easy to interpret.
Unstructured
This category makes up the majority of Big Data. The data has no defined structure, and the items aren’t related to each other. Examples are social media posts, videos, sound files, geolocations, and documents.

Semistructured
This class of data sits between the two previous ones. It has a heterogeneous, irregular organization, and its structure is embedded in the data itself. Some examples are emails, BibTeX, XML, and JSON files.
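The three categories can be illustrated with a minimal Python sketch that uses only the standard library. The records, field names, and sample text below are made up for illustration. The CSV rows all share the same columns, the JSON records carry their own field names and may differ from one another, and the free text has no schema at all.

```python
import csv
import io
import json

# Structured: every row follows the same fixed columns (hypothetical sample data).
csv_text = "id,name,age\n1,Ana,34\n2,Luis,28\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semistructured: the structure is embedded in the data, and records
# do not need to share the same fields.
json_text = (
    '[{"id": 1, "name": "Ana", "tags": ["vip"]},'
    ' {"id": 2, "email": "luis@example.com"}]'
)
records = json.loads(json_text)

# Unstructured: free text with no schema; any structure has to be inferred.
post = "Loved the new phone, but the battery drains too fast!"
words = post.lower().split()

structured_columns = sorted(rows[0].keys())
semistructured_fields = [sorted(record.keys()) for record in records]

print(structured_columns)
print(semistructured_fields)
print(len(words))
```

Note that the two JSON records expose different fields, which a rigid table could not represent without empty columns.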
The Vs of Big Data
The Vs are parameters that characterize Big Data. At the beginning of the current century, only the first three Vs were considered. Over time, the last two were added.
- Volume: The quantity of information.
- Velocity: Processing and interpretation speed.
- Variety: Types of data.
- Veracity: Authenticity of the data and when it was collected. Data related to past events usually has little importance.
- Value: The information’s utility and importance.
Big Data Analytics
It’s the process of extracting, storing, and analyzing data to discover hidden patterns and relations in the information.

Data is collected from many sources. Unstructured data is deposited in data lakes in raw format, without processing. Big Data Analytics can find patterns, describe current situations, make diagnoses, predict future scenarios, and offer solutions based on a large quantity of information. Many Big Data platforms provide solutions with appropriate tools.
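Two of these steps, describing the current situation and predicting a future scenario, can be sketched on a very small scale in Python. This is only an illustration under stated assumptions: the daily order counts are invented, the function names are hypothetical, and a least-squares straight line stands in for whatever model a real platform would use.

```python
from statistics import mean

# Hypothetical daily order counts gathered from some source.
daily_orders = [120, 132, 128, 141, 150, 158, 163]


def describe(values):
    """Descriptive step: summarize what has happened so far."""
    return {
        "count": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


def linear_trend(values):
    """Fit y = slope * x + intercept by least squares, with x = 0, 1, 2, ..."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


def predict(values, steps_ahead=1):
    """Predictive step: extend the fitted line past the last observation."""
    slope, intercept = linear_trend(values)
    return slope * (len(values) - 1 + steps_ahead) + intercept


print(describe(daily_orders))
print(predict(daily_orders))
```

At real Big Data scale the same idea would run on a distributed platform rather than on an in-memory list, but the descriptive and predictive roles stay the same.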
Some applications
- With data analysis, corporations can better understand their customers’ profiles to improve marketing and customer service, develop products, and offer better services. They can also offer recommendations based on a customer’s history.
- Combining structured information (year, brand, and model of equipment) with unstructured information (sensor readings, temperature) makes it possible to warn about possible failures and schedule preventive maintenance.
- Machine learning: The huge quantity of data allows better training of neural networks, which yields faster and more accurate results.
- Data analysis can detect patterns that indicate fraud.
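As a rough illustration of the fraud and maintenance cases, the sketch below flags values that sit far from the rest of a series. It assumes a simple technique chosen for this example, the modified z-score based on the median absolute deviation, with the commonly used constant 0.6745 and threshold 3.5. The function name and the transaction amounts are hypothetical, and real systems rely on far richer models.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that one extreme value barely moves.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No measurable spread; this simple sketch then flags nothing.
        return []
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical transaction amounts; the same idea applies to sensor readings.
amounts = [20, 25, 22, 19, 24, 21, 950, 23]
print(flag_outliers(amounts))
```

The median is used instead of the mean so that a single extreme value does not distort the baseline it is compared against.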