Source: standard.co.uk
Data, data, data. As our lives become ever more entwined with the digital world, the importance of data skyrockets: the importance of acquiring it, storing it and – crucially – interpreting it. In fact, reports abounded last year that data is now so important to us that it has become a more valuable resource than oil…
And while that may remain open to debate, it highlights the inescapable truth that data is crucial to everyday life.
So, if you are thinking about a career change – or looking to start your first – what’s stopping you from considering data science?
“I didn’t decide that I wanted to be a data scientist until a few months before I started working as one,” says Thordis Thorsteins, security data scientist at cyber company Panaseer. “[I] found that skills that I’d gathered through other experiences were very applicable.”
Then there’s the money – it can be a highly lucrative field to work in. Glassdoor, for instance, lists ‘Data Scientist’ as one of its best jobs for 2020, citing the median base salary as £45,188.
Interest sufficiently piqued? Here, Thorsteins explains what data science is and how you can break into the field.
What is data science?
The core of data science is using theory from maths and power from computer science to answer questions. These questions can range from ‘what events this weekend might I be most interested in?’ to ‘which security challenge should my company first focus on?’.
It’s a collection of techniques that can be applied to any domain (that has sufficient data) to see trends and patterns that may not be obvious at all.
When coupled with domain-specific knowledge, it brings enormous value: the human input helps model the problem correctly, and the data science techniques give us a view of the problem that we wouldn’t be able to get without them.
Why is data science an exciting career?
The possibilities are endless and the potential benefits are clear. From helping diagnose diseases correctly, to helping avoid cyber-attacks – the field is not bound to any single industry, so the variety of options is far-reaching.
Something that may not be clear is that there are different types of data scientists – for example ‘product data scientists’ and ‘research scientists’. These roles involve different levels of design, of working with people who are, and are not, data scientists, and of studying state-of-the-art techniques.
Lastly, the field is constantly evolving, so the possibility to learn is never-ending. It’s an exciting field that people with different backgrounds can bring a lot to.
What skills or qualifications do you need?
In my opinion, the key ingredients when starting out are a problem-solving mindset and practicality. With time and experience you will pick up coding skills (mainly what open source projects to use) and learn practical things about how to work with data, but these skills are easier to pick up as you go.
Other skills that come with experience are learning to balance what can be done with what brings value, learning to ask the right questions so that a computer can answer them, and avoiding pitfalls like overlooking bias in data or ways your solution could be misused.
Many people in the field come from a STEM background, but this is by no means the only way to get into data science. The online resources are plentiful and there is a big community of meet-up groups that will help anyone interested get up to speed.
Is it too late to get started at 30, 40, 50 plus – if not, how might you go about it?
It’s never too late to get started in data science. The resources available online are of high quality and very extensive, which means that upskilling (regardless of age or previous experience) is a lot more manageable than it might seem.
It is worth reaching out to someone you look up to in the field for specific advice if you feel like you’ll be out of place, but I can assure you that additional experience in other areas proves surprisingly useful.
You are likely to have many things to offer that people who started earlier in the field do not and you should embrace that.
Is it true women are under-represented in data science?
There is no reason why they should be, but this is sadly still the case. It’s worth noting that it’s not only women that are under-represented in data science, and efforts should be made to increase representation from all groups.
I think the under-representation is a bit of a chicken and egg situation. The lack of women in the field in some cases discourages women who are interested, and it means that people who are in or hiring into the field don’t realise how to make the workplace a good one for people of all genders. This in turn means that the lack of women persists.
What can be done to change this?
I think there are a variety of things. As someone involved in hiring, you can advertise open posts on sites that encourage diversity and eliminate discrimination.
You should make sure that the language in your job ads isn’t biased and that you don’t exaggerate the experience and qualifications needed for a job (it’s rare to truly need 10 years of A.I. experience, a PhD and publications, and research shows that women are less likely than men to apply for positions when they don’t think they meet all the requirements).
As someone in data science, you can offer to mentor individuals from underrepresented groups that want help getting started in the field and share achievements of, for example, women around you to help other women find role models. As a member of society, you should be mindful of your language and assumptions, and encourage individuals around you.
A good thing to do is to follow communities that encourage diversity to learn how you can help drive positive change.
What trends do you see moving into 2020?
I think the data science community is going to put more focus on data ethics in 2020. This was a hot topic in 2019, and I think we’re getting better at spotting and pointing out pitfalls that make data science solutions unfair to some groups.
I believe people are going to be more mindful of this when developing solutions in 2020 because of the raised visibility of the issue. I also believe that the data science community will become better at clearly communicating necessary details about data science and how it works to data science consumers.
For instance, while not all the technical details are relevant for the average person who wants to know the quickest way to get to work, they may benefit from knowing that the solution performs best in the morning or doesn’t take into account that they may not want to walk down all streets at night when it’s dark.
Sharing the necessary information in a way that’s easy to understand is not an easy task, but I believe that we can do better, and this in turn will help us improve the solutions when users can point out issues.