What is Big Data Architecture?
- by 7wData
Big data architecture is the foundation for big data analytics. Think of big data architecture as an architectural blueprint of a large campus or office building. Architects begin by understanding the goals and objectives of the building project, and the advantages and limitations of different approaches. It’s not an easy task, but it’s perfectly doable with the right planning and tools.
System architects go through a similar process to plan big data architecture. They meet with stakeholders to understand company objectives for its big data, and plan the computing framework with appropriate hardware and software, data sources and formats, analytics tools, data storage decisions, and results consumption.
Do I Need Big Data Architecture?
Not everyone does need to leverage big data architecture. Single computing tasks rarely top more than 100GB of data, which does not require a big data architecture. Unless you are analyzing terabytes and petabytes of data – and doing it consistently -- look to a scalable server instead of a massively scale-out architecture like Hadoop. If you need analytics, then consider a scalable array that offers native analytics for stored data.
You probably do need big data architecture if any of the following applies to you:
- You want to extract information from extensive networking or web logs.
- You process massive datasets over 100GB in size. Some of these computing tasks run 8 hours or longer.
- You are willing to invest in a big data project, including third-party products to optimize your environment.
- You store large amounts of unstructured data that you need to summarize or transform into a structured format for better analytics.
- You have multiple large data sources to analyze, including structured and unstructured.
- You want to proactively analyze big data for business needs, such as analyzing store sales by season and advertising, applying sentiment analysis to social media posts, or investigating email for suspicious communication patterns – or all the above.
With use cases like these, chances are that your organization will benefit from a big data architecture expressly built for these challenging tasks. Plan for an environment that will capture, store, transform, and communicate this valuable intelligence.
Planning the Big Data Architecture
Big data architecture includes mechanisms for ingesting, protecting, processing, and transforming data into filesystems or database structures. Analytics tools and analyst queries run in the environment to mine intelligence from data, which outputs to a variety of different vehicles.
The architecture has multiple layers. Let’s start by discussing the Big Four logical layers that exist in any big data architecture.
Big data sources layer: Data sources for big data architecture are all over the map. Data can come through from company servers and sensors, or from third-party data providers. The big data environment can ingest data in batch mode or real-time. A few data source examples include enterprise applications like ERP or CRM, MS Office docs, data warehouses and relational database management systems (RDBMS), databases, mobile devices, sensors, social media, and email.
Data massaging and storage layer: This layer receives data from the sources. If necessary, it converts unstructured data to a format that analytic tools can understand and stores the data according to its format. The big data architecture might store structured data in a RDBMS, and unstructured data in a specialized file System like Hadoop Distributed File System (HDFS), or a NoSQL database.
Analysis layer: The analytics layer interacts with stored data to extract business intelligence. Multiple analytics tools operate in the big data environment. Structured data supports mature technologies like sampling, while unstructured data needs more advanced (and newer) specialized analytics toolsets.
[Social9_Share class=”s9-widget-wrapper”]
Upcoming Events
From Text to Value: Pairing Text Analytics and Generative AI
21 May 2024
5 PM CET – 6 PM CET
Read More