Source – eyefortravel.com/mobile-and-technology
As head of data science at Trainline, Fergus Weldon, who will be speaking at EyeforTravel’s Smart Travel Data Summit 2017, is at the coalface of innovation. In an exclusive interview he shares ten things he knows about data science today.
1. Being a data scientist is being the voice of the customer
In the past 12 months, the data science team at Trainline has doubled to eight people and it’s growing fast. “At Trainline, being a data scientist means being the voice of the customer through data,” says Fergus Weldon, the company’s head of data science.
That growth is little wonder. “We have a lot of customers!” says Weldon. “Every day we help people across Europe make more than 125,000 smarter journeys.”
The idea is to give customers a more “seamless journey through the data they share”. One feature in the app that illustrates this is BusyBot, which crowdsources data from app users on how busy their train is. “Each week 150,000 pieces of user feedback are shared with us, which are fed into our algorithms that then inform other users where to board their train to have the best chance of getting a seat,” explains Weldon. “We also regularly listen to customer calls, and meet with customers as often as possible, so that we can experience their feedback first hand, and use this to innovate further.”
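To make the idea of turning crowdsourced feedback into a boarding recommendation concrete, here is a minimal sketch in Python. It is not Trainline's algorithm, which the interview does not describe. The report format, the 1 to 5 busyness scale, the train identifiers and the `quietest_carriage` function are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical crowdsourced reports: (train_id, carriage, busyness).
# Busyness uses an assumed scale from 1 (empty) to 5 (packed).
reports = [
    ("T1", "A", 5),
    ("T1", "A", 4),
    ("T1", "B", 2),
    ("T1", "B", 1),
    ("T1", "C", 1),  # only one report for carriage C
]


def quietest_carriage(reports, train_id, min_reports=2):
    """Return the carriage with the lowest mean busyness for one train.

    Carriages with fewer than `min_reports` reports are ignored so that a
    single report cannot decide the recommendation. Returns None when no
    carriage has enough reports.
    """
    by_carriage = defaultdict(list)
    for tid, carriage, busyness in reports:
        if tid == train_id:
            by_carriage[carriage].append(busyness)

    scored = {
        carriage: mean(values)
        for carriage, values in by_carriage.items()
        if len(values) >= min_reports
    }
    if not scored:
        return None
    # Ties are broken alphabetically by carriage label.
    return min(scored, key=lambda carriage: (scored[carriage], carriage))


if __name__ == "__main__":
    print(quietest_carriage(reports, "T1"))
```

Requiring a minimum number of reports per carriage is one simple way to keep a single outlying report from steering passengers. A production system would presumably also weigh how recent each report is and the time of day.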
2. No data scientist has it all
There are four broad skillsets in data science and each data scientist usually excels in one or more of these areas. In Weldon’s experience so far, there aren’t any scientists that excel in all four!
So what are those four?
- Mathematics and statistics
- The ability to program or ‘hack’
- Skills to tell a story through data and communicate effectively
- Commercial acumen to put the other skills in action
In a data science team, you need to ensure that all these bases are covered: everybody should have different strengths that work in harmony. “It also helps to have a natural curiosity and thirst to solve complex data conundrums,” he says.
3. Finding the right skills doesn’t come easy, but humans will always be relevant
To address this challenge, Trainline has been liaising closely with universities that specialise in data science and machine learning to find talent at a more junior level. It also regularly hosts meetups for the data science community (as well as attending meetups elsewhere) to tell the story and get potential candidates excited. A robust hiring process to help identify the strongest talent is also important. Candidates are interviewed at least once over the phone and twice in person and are also given technical challenges to assess their practical application of data science.
On the subject of the role of humans in data science, Weldon believes that there will always be a human element, as people are needed to enable machine learning. For example, there has been lots of discussion recently about Google’s DeepMind AI ‘teaching itself’ to walk. It’s an amazing achievement, argues Weldon, but he is quick to point out that it has not been widely reported that data engineers and scientists had to optimise the AI (essentially tell it what to look for) for it to happen.
According to Weldon, humans are still very much integral to all technological advancements, and will continue to be for the foreseeable future. What’s more, for data science to be successful in a commercial environment, much more than ‘off-the-shelf’ algorithms is needed: brainpower and human curiosity are absolutely crucial for success.
4. A data-driven company should have data on tap
A company could call itself data-driven (and many companies do!), yet if you ask the CEO what yesterday’s sales figures are, for example, and they have to ask someone else for this information, then the company, argues Weldon, can’t really say it’s driven by data.
On the subject of what it means to have ‘smart data’, Weldon says this is data that is readily available and easily understood by everyone in a company. This may sound simple but it’s not. In fact, a huge amount of data engineering and science has to be implemented in order for data to be made accessible and digestible in this way, he says.
5. The most innovative companies are organised around their data assets
Data should be central to every decision made or action taken. According to Weldon, at Trainline every business decision is linked back to customer data and the lessons it provides. People in every team across the business access this data every day through dedicated dashboards and tools that provide insights into customer behaviour, and it is this steady pipeline of information that enables teams to take action.
6. Multi-tasking is multi-failing!
This year Weldon will be investing in managing time efficiently. “We have huge workloads and with data becoming more and more accessible the need for innovation is continuous – but also ever distracting,” Weldon says. “In our world, multi-tasking is multi-failing!”
For data scientists to perform well they must be able to completely focus on the task at hand. So, Weldon’s focus for the next few months will be to put the right structure and working practices in place that will help the team flourish and achieve as much as possible. “Factoring in time for our data scientists to really put their curiosity into action and delve deep into our data will also ensure the team’s hunger to learn is met, and there is passion in everything we do,” he says.
7. ‘Big Data’ is just marketing and communications buzz
In Weldon’s world, no matter the size, it’s simply ‘data’. Having said that, the amount of data being dealt with does affect the skillset needed by the scientists handling it. The data available to major international corporations is a very different ball game from that available to small start-ups, and each comes with a different set of challenges. The former will require more scientists completely focused on the statistical, analytical side of things, while those working in a start-up environment will be required to use business acumen more frequently.
8. Transparency and a clear narrative for how you are using data should be a priority
Trainline is completely embracing the General Data Protection Regulation (GDPR). “It’s a great new piece of legislation and we’re very much in favour of it,” says Weldon, who says the company has been operating in what would be a GDPR-compliant environment for some time. A priority right now is ensuring that every feature contains a clear narrative for users so they can see when Trainline would like to store data, why it is gathering it and how it will improve the journey. “It’s very important to have a completely transparent approach,” stresses Weldon.
9. Data science is not for the fainthearted
“It’s a really tough job and definitely not for the fainthearted!” stresses Weldon, who says the company is seeing more and more job applicants who have completed an intensive data science programme lasting a few weeks and think this is enough. “What they fail to understand is lots – and I mean lots – of practical experience is needed to be ready to work in the field.”
It’s also not as glamorous as many people envisage: over 90% of a data scientist’s time is spent preparing data, ready for the ‘science’ to take place. “It can be very laborious,” he says, adding: “It’s also high pressure being at the centre of everything. But when this results in seeing the data magic happen, it’s completely worth it.”
10. Being informed counts
Weldon’s top tip for people entering the field is to make sure they understand the tools and packages they are using. They should also understand why they’re using these specific tools and what makes them the best option. “Make a point of being informed about the resources on offer and the ones that will deliver the strongest results for you,” he says.