Andrew Eichenbaum is the Head of Research and Analytics at Yummly, a semantic web search engine for food, cooking and recipes. Previously he was a Senior Analytics Engineer at Yoono and an Analytics Scientist at MyBuys. He specializes in search, personalization, NLP, reporting, product design/road-map

Hi Andrew! What is your educational background and how has it prepared you for your role in data analytics and science? What skills did you not develop in school that you find important in your work?

My highest degree is a Ph.D. in Physics from the University of Wisconsin, where I specialized in experimental particle physics. This area of research inherently requires working with a very large number of data points, so statistical analysis and programming were basic to daily research. Along with that, we used and developed higher math and numerical methods to overcome the problems we came across.

The biggest problem with this background was that programming was a tool I learned as I went. Thus the quality and cleanliness (e.g. coding style, commenting, etc.) of my code were deficient when I first entered the workforce. The second big problem that was not addressed is communication with experts from other fields. In graduate studies you learn to talk the lingo with your colleagues and at conferences. But people doing similar analytical work in another field can have a whole different way of talking about a similar problem.

Both problems were surmountable in time, but I did lose out on some of my first interviews by just not knowing how to communicate or how to write nice, elegant code.

What are the biggest challenges in data science and analytics? What are the most important things to ‘get right’? What are the best technologies available to solve these problems?

There are a lot of problems in data science and analytics. They range from better ad targeting, to transportation safety, to defining social interactions. But the biggest challenge for the scientist is finding a problem that they are passionate about, and a place where they can work on it. When the data geek just works on another data problem, they will get results that may be useful. But when they are immersed in a problem that they cannot stop thinking about, those challenges will be in the back of their head at all times. This leads to 2 AM coding sessions to check out a new bit of data, or the suggestion of a new idea that might be even more valuable than the original project.

As for the most important thing to ‘get right’, two words: data cleanliness. If you have dirty data, or worse yet, data you do not understand, luck is the only thing that will save your project. The first thing I do whenever starting a new project is look at the data to understand what is going on. There is no reason not to plot out the original distributions and compare them to your base assumptions. Sometimes the problem you are trying to solve is not a problem with the data, but a problem with the way people understand the data. I find these solutions have the highest ROI for any business, since they take so little time and can have such a big impact on the bottom line. Finally, use whatever technology works for you and the situation. Remember there is no one correct technology, just as there is no one correct solution to any given problem.
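As an illustration of that first look at the data, here is a minimal sketch in Python, using only the standard library. It is not Andrew's or Yummly's actual code. The cooking times, the assumed valid range of 0 to 600 minutes, and the helper names `summarize` and `text_histogram` are all hypothetical, chosen only to show the idea of comparing a raw distribution against a base assumption.

```python
import statistics


def summarize(values, expected_min, expected_max):
    """Summarize a numeric column and list values outside an assumed range."""
    present = [v for v in values if v is not None]
    out_of_range = [v for v in present if not (expected_min <= v <= expected_max)]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "median": statistics.median(present),
        "out_of_range": out_of_range,
    }


def text_histogram(values, bins=5, width=30):
    """Return a crude text histogram, one line per equal-width bin."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    counts = [0] * bins
    for v in present:
        # The maximum value would land one past the last bin, so clamp it.
        idx = min(int((v - lo) / span * bins), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left_edge = lo + span * i / bins
        bar = "#" * round(c / peak * width)
        lines.append(f"{left_edge:10.1f} | {bar} {c}")
    return lines


# Hypothetical raw data: recipe cooking times in minutes, straight from a log.
cook_minutes = [10, 15, 20, 25, 30, None, 45, 60, -5, 1440]

# Base assumption (hypothetical): a cooking time lies between 0 and 600 minutes.
report = summarize(cook_minutes, expected_min=0, expected_max=600)
print(report)
for line in text_histogram(cook_minutes):
    print(line)
```

In this made-up sample, the summary surfaces one missing value, one negative time, and one value of 1440 minutes (a full day), and the histogram shows nearly everything piled into the first bin. Those are the kinds of surprises worth asking about before any modeling starts.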

What’s your definition of data analytics and science?

Data Science/Analytics is the process of taking a question, and then answering that question using a data-driven approach. The finer points of data science are formulating the question into something that can be answered with data, and getting the data into a format that you can use to answer the reformulated question.

What advice can you give someone with little experience in analytics to pursue a career in the field?

I believe that data people are born with a certain mindset which cannot be learned by taking classes or trying to solve problems. If you are wondering whether you could be a data person, ask yourself this question: Have you ever read an article that states, “There is conclusive evidence,” or “The results were statistically significant,” and wondered how they came to these conclusions? If the answer is yes, then you might have what it takes to be a data scientist.

