So You Want To Be a Data Scientist: A Guide for College Grads
- by 7wData
Congratulations, recent college graduate, and welcome to the workforce! Of all the jobs that you’ll apply for, the one with the sexy title “data scientist” may be the toughest to get–and potentially the most rewarding too.
But never fear: Datanami is here with advice from actual data scientists on how to become one of them. The first piece of advice for budding data scientists is not to get frustrated by the job requirements.
No recent college grad is simultaneously a math/statistics genius, an expert in marketing/derivatives/cybersecurity, and a pro Python/Java/R coder.
(Hint: That’s why data scientists are called unicorns—because they don’t exist!) “There are many skills under the umbrella of data science, and we should not expect any one single person to be a master of them all,” says Kirk Borne, a data scientist with Booz Allen Hamilton.
“The best solution to the data science talent shortage is a team of data scientists.
So I suggest that you become expert in two or more skill areas, but also have a working knowledge of the others.”
According to Borne, you’ll do well to bone up on core data science skills such as machine learning, information retrieval, statistics, and data and information visualization.
You’ll also want to know your way around databases and data structures and have at least some programming languages under your belt, such as Python, R, SAS, or Spark.
Familiarity with graph analysis, natural language processing, and optimization also looks good on your data science resume, as do data modeling and simulation.
“The good news for physics, biology, astronomy, chemistry, and other science students is that they can easily translate their science skills into a data science profession,” he says.
Should You Go Back to School?
While a number of doctorate programs in data science have popped up recently to help stem the unicorn shortage, you won’t want to stay in school for too long.
A master’s degree is ideal, according to Borne.
“These days more and more organizations are willing to hire data scientists with little course work and with some experience, without an advanced degree,” he tells Datanami.
“The degree will eventually be very important for career advancement (perhaps most importantly an MBA, which now includes business analytics), so don’t avoid getting your degree. It just doesn’t have to come before your first data science job.”
That assessment is echoed by Ashish Thusoo, the CEO of Qubole, a hosted Hadoop service provider.
While a solid background in math, data mining, statistics, probability theory, and SQL is required, data scientists will eventually need to venture forth from the ivory tower into industry to get their hands on the most important element: interesting data.
“Learning these skills in industry is very important,” says Thusoo, who is also the co-creator of Apache Hive.
“You have strong fundamentals.
But in order to apply those skills, you need to get access to data.
A lot of interesting data sets are tied up in industry.
This was not true 20 or 30 years ago, when a lot of interesting data sets would be in academia.”
Today’s top data scientists didn’t go to school to become data scientists.
Instead, they went to school to learn to be computer scientists, astrophysicists (like Borne), chemical engineers, or theoretical physicists.
As the world evolved, those hard science and math skills proved invaluable in manipulating the ever-growing wave of data.
“More important than anything else is being able to think around data,” Thusoo says.
“I think the tools and languages, those things you can pick up.
A random forest algorithm is a random forest algorithm, whether it’s implemented in Python or Scala or Java or any other language.
You need to understand where to use that particular technique, rather than how to code that technique.”
Statistics also plays a critical role in big data, says Dr.