Unstructured data: The smart person’s guide
- by 7wData
By 2025, IDC projects that there will be 163 zettabytes of data in the world, and estimates indicate that 80% of this data is unstructured.
With structured data, data fields are aligned side-by-side in fixed record lengths, with specific data fields appearing at static locations within each record. Unstructured data has no set record format; it can come in any shape or form. It comes from documents, social media feeds, digital pictures and videos, audio transmissions, sensors used to gather climate information, and free-form content from the web.
Learn more about unstructured data by reading this smart person's guide. We will update this resource periodically with the latest information and tips about unstructured data.
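Unstructured data is any data that isn't stored in a fixed record length format. Data that is stored in that fixed format is known as structured, or transactional, data. Examples of unstructured data include documents, social media feeds, digital pictures and videos, audio transmissions, and sensor readings.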
The records in your accounting, inventory management, or order systems are not unstructured data because those records are all structured in the same, uniform way. Every record consists of a series of contiguous data fields, and one of these fields is the access point, or key, into the records (e.g., the key to an order record might be the order number).
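As a rough illustration of what "fixed record length" and "key" mean in practice, here is a minimal Python sketch. The field layout (an 8-byte order number, a 20-byte customer name, and a 4-byte quantity) and the function names are assumptions made up for this example, not a format used by any particular order system.

```python
import struct

# Hypothetical layout: 8-byte order number (the key), 20-byte customer
# name, 4-byte unsigned quantity. "<" means no padding between fields.
ORDER_FORMAT = "<8s20sI"
RECORD_SIZE = struct.calcsize(ORDER_FORMAT)  # 8 + 20 + 4 bytes


def pack_order(order_no, customer, quantity):
    """Encode one order as a fixed-length record (short text is null-padded)."""
    return struct.pack(
        ORDER_FORMAT,
        order_no.encode("ascii"),
        customer.encode("ascii"),
        quantity,
    )


def unpack_order(raw):
    """Decode one fixed-length record back into named fields."""
    order_no, customer, quantity = struct.unpack(ORDER_FORMAT, raw)
    return {
        "order_no": order_no.rstrip(b"\x00").decode("ascii"),
        "customer": customer.rstrip(b"\x00").decode("ascii"),
        "quantity": quantity,
    }


def build_index(blob):
    """Walk the records at fixed offsets and index them by order number."""
    index = {}
    for offset in range(0, len(blob), RECORD_SIZE):
        record = unpack_order(blob[offset:offset + RECORD_SIZE])
        index[record["order_no"]] = record
    return index
```

Because every record has the same length, the system can find record number n at byte offset n times the record size, and every field sits at the same position inside it. That predictability is what makes the data structured.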
To process unstructured data, your systems and databases need to be able to read this data by looking at a key or a reference point; for example, the key to a stored photo of Jim Smith would likely be Jim Smith. After that, the system must access the entire data object (i.e., the photo of Jim Smith).
The difference between an unstructured object of Jim Smith and a structured record of him is that the unstructured data object would be Jim's photo, which would require quite a bit of storage, while the structured record would be information about him, such as his address or his phone number. There is much more data in a full photo of Jim than in the small, fixed string of data elements (address, phone number, and so on) that you would find in a structured data record. For this reason, unstructured data, with the large objects that are assigned to its keys, requires more processing and more storage.
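The Jim Smith example can be sketched in a few lines of Python. Everything here is hypothetical: the `ContactRecord` class, the in-memory `photo_store` dictionary, and the helper names stand in for whatever database and object storage a real system would use.

```python
from dataclasses import dataclass


@dataclass
class ContactRecord:
    """Structured record: a few small, predictable fields."""
    name: str
    address: str
    phone: str


# Hypothetical in-memory object store: key -> the whole binary object.
photo_store = {}


def store_photo(key, photo_bytes):
    """Save an unstructured object under its key."""
    photo_store[key] = photo_bytes


def fetch_photo(key):
    """Look up the key, then return the entire object."""
    return photo_store[key]


def record_size(record):
    """Approximate size in bytes of the structured record's fields."""
    joined = "|".join([record.name, record.address, record.phone])
    return len(joined.encode("utf-8"))
```

For instance, calling `store_photo("Jim Smith", photo_bytes)` with a photo of a couple of megabytes stores millions of bytes under one key, while `record_size` for Jim's name, address, and phone number comes to a few dozen bytes. The lookup step is the same in both cases; the difference is how much data sits behind the key.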
Because 80% of the data in companies is unstructured, organizations need to understand the types of unstructured data they are accumulating and the best ways to process and store this data for business advantage. Without data management strategies and guidance in these areas, companies run the risk of not capitalizing on unstructured data, failing to keep up with competitors, or storing more unstructured data than they really need, thereby running up data center costs.
In a majority of cases, unstructured data is ultimately related back to the company's structured data records. As an example, every x-ray or MRI image for a patient is related back to the patient's record in the hospital's record system. The patient record in the record system is enriched with unstructured data that is linked to it, and the doctor gets a more complete picture of the patient.
This is the value of unstructured data: It enriches corporate data and enables leaders to work smarter.
Unstructured data can affect everyone at the company, from the entry-level staffer to the CEO.