Sonny Laskar, an IIM Indore EPGP alum (2013-14 batch), is currently ranked among the top 250 Data Scientists on Kaggle, the world’s largest Data Science platform. In this interview, he shares insights into his chosen field of work as a data scientist and how his stint at IIMI helped his professional growth.
Q: Can you tell us about your professional experience?
A: I have been in the IT Services industry for close to nine years. I started my career at TCS as a System Analyst and then moved into Data Centre and Data Warehousing Management Services.
After the EPGP, I joined Microland in the Automation vertical, where I head the IT Operations Analytics wing. We are building a Big Data Artificial Intelligence platform.
Most of my work revolves around building Predictive Analytics solutions for massive datasets.
Q: Tell us something about Kaggle and your journey in Data Science Competitions?
A: I came to know about Kaggle while I was at IIM Indore. IIM Indore’s EPGP programme is strongly oriented towards Analytics, hence the link.
Today, Kaggle is the world’s largest Data Science platform for competitions, where companies of all sizes pose their business problems as Data Science competitions with prize money of up to a million dollars.
There are around half a million registered members on Kaggle. I am currently ranked in the top 250 Data Scientists on Kaggle.
Q: Does this recognition on Kaggle help in your professional world?
A: Yes, of course. Anyone who works in Analytics knows about Kaggle, and most recruiters consider a Kaggle rank a very good proxy for competency.
Q: Tell us about a few of the competitions you have won.
A: The most memorable win was the Avito Duplicate Ad Detection contest sponsored by Avito. Avito is the largest classifieds portal in Russia, very similar to companies like OLX and Quikr.
All of us who have used these portals know that as time passes, our ads move down the listing page, which means a lower probability of someone viewing the item and hence buying it. To game this, sellers post the same ad through multiple accounts with different descriptions and photos.
Such activity creates duplicate data in the system, which adds no value; at the same time, it hurts customer experience and leads to loss of revenue.
Hence, Avito wanted a Machine Learning solution to this problem, in which an algorithm would automatically detect the duplicate ads and remove them. Our model was 95.6% accurate, compared with their in-house solution, which was 90% accurate. We finished second in the contest. Three of my friends from the UK and Germany were on my team in this competition.
In addition to this, another interesting competition that I won was sponsored by Honeywell, who wanted to predict the fuel efficiency of airplanes based on the various data that an airplane generates. Since fuel accounts for about 30% of total operating costs, it is extremely important to find ways to make airplanes more fuel-efficient.
A similar study was done in the past, and it found that the total weight of the airplane is an important factor. That is why, as you may have noticed, most airlines offer an additional discount if you have no check-in luggage.
Last week, I was also ranked 7th in the Women’s Health Risk Assessment competition sponsored by Microsoft. This competition was about predicting which individuals are at high risk of HIV infection.
Q: What skillsets are needed to be a strong Data Scientist?
A: To be good in this space, it is not enough to be very good at Statistics, or to be a programming expert, or to be a domain expert. In fact, you need to be all of these and have a little bit of common sense! A deep understanding of the domain is not mandatory, but it helps a lot.
Q: What types of problems do you see the industry solving with Data and Analytics?
A: Almost everything. Today, I think there is hardly any company that is not using data and analytics to create additional value. From the ‘Buy-also-this’ feature on Ecommerce websites to personalised marketing offers to decisions on whether you should get a loan, these are all such applications.
Q: Which industries do you see at the forefront in the use of Data and Analytics?
A: Healthcare, FMCG, Ecommerce and BFSI are at the forefront, since they generate huge amounts of data, which helps them create new ways to sell their products. With the emergence of IoT (Internet of Things), we can expect a tremendous focus on this space.
Q: How much did the IIM Indore programme help you in your journey?
A: Quite a lot, I should say. It gave me a high-level view of the overall breadth of the scope and applications of Analytics. Since the programme is very tightly packed, it cannot teach you the nitty-gritty of everything. Since I already came with a lot of exposure to Data and Big Data solutions, it was very easy for me to put all the learnings into practice.
Q: What would you suggest to anyone who is currently planning to enter this space?
A: Work hard! There are loads of free MOOCs available. You can master this space without investing a single penny. All you need is time, effort and patience.