What’s the difference between a data engineer, a data analyst and a data scientist?
- by 7wData
If you want to make sure you don't lose your job in finance (or anywhere else) in the next five years, you probably want to work in 'big data'. But what do big data jobs entail?
Speaking at last week's Women of Silicon Roundabout conference in London, Dr. Rebecca Pope, the head of data science and engineering at KPMG, said you don't need to be an excellent statistician or a high-class mathematician to work in big data. Nor do you need a lot of prior programming knowledge.
However, you do need an interest in statistics, you do need to be willing to learn how to code, and you do need to know how to do some high-level mathematical operations.
Pope herself didn't study pure statistics (she's a neuroscientist). Nor did she study programming. Instead, she learned how to program after graduating, and she attended "endless hackathons."
"I started learning R. But my advice would be that if you are launching a career in data science you should specialize in Python – make Python the first language you learn," said Pope.
Data scientists are not just statisticians, said Pope. "A statistician is interested in building a model that builds a relationship between a variable and an outcome." A data scientist wants to do something more: predict. Data scientists train models on data so that models can predict the future as accurately as possible.
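To make that distinction concrete, here is a minimal sketch in Python, the language Pope recommends. It is not taken from her talk: the data points and the helper names (`fit_line`, `predict`, `mean_absolute_error`) are invented for illustration. The same straight-line fit is used twice, first to describe a relationship across all the data, then to predict points the model has never seen.

```python
# Hypothetical toy data: hours of practice (x) and test score (y).
# These numbers are invented purely for illustration.
DATA = [(1, 52), (2, 55), (3, 61), (4, 64), (5, 70), (6, 73), (7, 79), (8, 82)]


def fit_line(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Return the fitted line's value at x."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, points):
    """Average size of the prediction error over the given points."""
    return sum(abs(predict(model, x) - y) for x, y in points) / len(points)


# Descriptive view: fit on all the data and read off the relationship.
slope, intercept = fit_line(DATA)

# Predictive view: fit on part of the data, then score on unseen points.
train, test = DATA[:6], DATA[6:]
model = fit_line(train)
test_error = mean_absolute_error(model, test)

print(f"slope={slope:.2f}, held-out error={test_error:.2f}")
```

The held-out error is the number the predictive view cares about: it estimates how well the model will do on data it was not trained on.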
Big data jobs come in stages. A business use has to be established and raw data has to be made fit for purpose (so-called 'data wrangling'). Then the algorithms that analyze the data are written and tested on the data available, and, if they're machine learning algorithms, they learn from the data in order to predict the future. Finally, visualisations and APIs have to be created so that the business can engage with the resulting product.
Different sorts of data professionals are engaged at different stages. Or, you can be a generalist data scientist operating across the spectrum.
Pope put together the following chart showing the skills data engineers need and the tasks they perform. Basically it's a lot of software engineering and preparing of data.
The data engineer's job is "the representation and movement of data so that it is consumable and usable," said Pope. If you're a data engineer you need to take the raw data, clean it, move it into a database, tag it, and generally make sure it's ready for the next stage of the process...
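As a rough illustration of those steps (clean, move into a database, tag), here is a small Python sketch using the standard library's `sqlite3` module. It is an assumption-laden toy rather than anything Pope presented: the records, the `payments` table and the `crm_export` tag are all hypothetical, and a production pipeline would more likely use the tools listed below.

```python
import sqlite3

# Hypothetical raw records, invented for illustration: stray whitespace,
# inconsistent casing, a missing amount and an exact duplicate.
RAW_ROWS = [
    {"customer": "  Alice ", "amount": "120.50"},
    {"customer": "BOB", "amount": ""},
    {"customer": "  Alice ", "amount": "120.50"},
    {"customer": "carol", "amount": "75"},
]


def clean(rows):
    """Normalize names, drop rows without an amount, remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        amount_text = row["amount"].strip()
        if not amount_text:
            continue  # unusable without an amount
        record = (name, float(amount_text))
        if record in seen:
            continue  # skip exact duplicates
        seen.add(record)
        cleaned.append(record)
    return cleaned


def load(records, source_tag):
    """Move cleaned records into a database, tagging each with its source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (customer TEXT, amount REAL, source TEXT)")
    conn.executemany(
        "INSERT INTO payments VALUES (?, ?, ?)",
        [(name, amount, source_tag) for name, amount in records],
    )
    conn.commit()
    return conn


conn = load(clean(RAW_ROWS), source_tag="crm_export")
rows = conn.execute(
    "SELECT customer, amount, source FROM payments ORDER BY customer"
).fetchall()
print(rows)
```

Keeping `clean` and `load` as separate functions mirrors the stages described above: each can be checked on its own before the data moves on to the analyst.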
Pope said the programming languages and platforms you'll need for data engineering jobs are: Apache Spark, Scala, Docker, Java, Hadoop, Kubernetes and NiFi.
After the data engineer comes the data analyst. The chart below shows where data analysts operate.