What is Big Data Architecture?

Big data architecture is the foundation for big data analytics. Think of big data architecture as an architectural blueprint of a large campus or office building. Architects begin by understanding the goals and objectives of the building project, and the advantages and limitations of different approaches. It’s not an easy task, but it’s perfectly doable with the right planning and tools.

System architects go through a similar process to plan big data architecture. They meet with stakeholders to understand the company's objectives for its big data, then plan the computing framework: appropriate hardware and software, data sources and formats, analytics tools, data storage decisions, and how results will be consumed.

Do I Need Big Data Architecture?

Not everyone needs big data architecture. Single computing tasks rarely exceed 100GB of data, and that volume does not require one. Unless you are analyzing terabytes or petabytes of data, and doing so consistently, look to a scalable server instead of a massively scale-out architecture like Hadoop. If you need analytics, consider a scalable array that offers native analytics for stored data.

You probably do need big data architecture if any of the following applies to you:

  • You want to extract information from extensive networking or web logs.
  • You process massive datasets over 100GB in size. Some of these computing tasks run 8 hours or longer.
  • You are willing to invest in a big data project, including third-party products to optimize your environment.
  • You store large amounts of unstructured data that you need to summarize or transform into a structured format for better analytics.
  • You have multiple large data sources to analyze, including structured and unstructured.
  • You want to proactively analyze big data for business needs, such as analyzing store sales by season and advertising, applying sentiment analysis to social media posts, or investigating email for suspicious communication patterns, or all of the above.

With use cases like these, chances are that your organization will benefit from a big data architecture expressly built for these challenging tasks. Plan for an environment that will capture, store, transform, and communicate this valuable intelligence.
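As a small illustration of the log-extraction and unstructured-to-structured use cases in the list above, the following Python sketch parses raw web server log lines into structured records and summarizes them. The log format (Common Log Format style), the sample lines, and the function names are assumptions made for this example, not part of any specific product; a real big data environment would run this kind of transformation across many machines rather than in a single script.

```python
import re
from collections import Counter

# Hypothetical access-log layout (Common Log Format style).
# Adjust the pattern to match the logs you actually collect.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A "-" size means no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def summarize_status_codes(lines):
    """Count responses per HTTP status code, skipping unparseable lines."""
    records = (parse_log_line(line) for line in lines)
    return Counter(r["status"] for r in records if r is not None)


# Made-up sample data for illustration only.
sample_lines = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 -',
    'not a log line',
]

if __name__ == "__main__":
    print(summarize_status_codes(sample_lines))
```

The same parse-then-aggregate shape carries over to distributed frameworks, where the parsing step runs per record and the counting step runs as a grouped aggregation.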

Planning the Big Data Architecture

Big data architecture includes mechanisms for ingesting, protecting, processing, and transforming data into filesystems or database structures. Analytics tools and analyst queries run in the environment to mine intelligence from the data, and the results are delivered through a variety of output channels.

The architecture has multiple layers. Let’s start by discussing the Big Four logical layers that exist in any big data architecture.

Big data sources layer: Data sources for big data architecture are all over the map. Data can come from company servers and sensors, or from third-party data providers. The big data environment can ingest data in batch mode or in real time. A few data source examples include enterprise applications like ERP or CRM, MS Office docs, data warehouses and relational database management systems (RDBMS), mobile devices, sensors, social media, and email.

Data massaging and storage layer: This layer receives data from the sources. If necessary, it converts unstructured data to a format that analytic tools can understand, and it stores the data according to its format. The big data architecture might store structured data in an RDBMS, and unstructured data in a specialized file system like the Hadoop Distributed File System (HDFS) or in a NoSQL database.
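The idea of storing data according to its format can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: a payload counts as "structured" if it parses as a JSON object, and the two in-memory lists are hypothetical stand-ins for an RDBMS and for HDFS or a NoSQL store. A real storage layer would use far richer format detection and real storage clients.

```python
import json


def classify(payload):
    """Return ("structured", dict) if payload is a JSON object, else ("unstructured", payload)."""
    try:
        parsed = json.loads(payload)
    except (TypeError, ValueError):
        # Not valid JSON (or not a string at all): treat as unstructured.
        return "unstructured", payload
    if isinstance(parsed, dict):
        return "structured", parsed
    # Valid JSON but not an object (for example a bare number): keep the raw payload.
    return "unstructured", payload


def route(payloads):
    """Send each payload to the store that matches its format."""
    # Hypothetical stand-ins: "structured" for an RDBMS table,
    # "unstructured" for HDFS or a NoSQL database.
    stores = {"structured": [], "unstructured": []}
    for payload in payloads:
        kind, value = classify(payload)
        stores[kind].append(value)
    return stores


if __name__ == "__main__":
    incoming = [
        '{"order_id": 1, "total": 9.5}',
        "Customer called about a late delivery.",
    ]
    print(route(incoming))
```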

Analysis layer: The analytics layer interacts with stored data to extract business intelligence. Multiple analytics tools operate in the big data environment. Structured data supports mature technologies like sampling, while unstructured data needs more advanced (and newer) specialized analytics toolsets.
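To make the mention of sampling concrete, here is a minimal Python sketch that estimates the mean of a structured numeric column from a simple random sample instead of scanning every row. Treating "sampling" as simple random sampling is an assumption of this example, and the function name and the generated data are hypothetical.

```python
import random
import statistics


def estimate_mean_by_sampling(values, sample_size, seed=0):
    """Estimate the mean of values from a simple random sample drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    size = min(sample_size, len(values))
    sample = rng.sample(values, size)
    return statistics.mean(sample)


if __name__ == "__main__":
    # Made-up "sales amounts" column for illustration only.
    sales = list(range(1, 10001))
    print("full mean:     ", statistics.mean(sales))
    print("sampled (n=500):", estimate_mean_by_sampling(sales, 500))
```

The trade-off is precision for speed: a larger sample gives a tighter estimate but costs more to read and process.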
