Scott is the Chief Data Scientist at Accretive Health, working on uncovering insights that will help doctors increase the quality of care while decreasing cost. Previously he was a Product Manager of Optimization/Analytics at Adara Media and, more recently, a Data Scientist at LinkedIn.

What is your educational background and how has it prepared you for your role in data analytics and science? What skills did you not develop in school that you find important in your work?

I have a PhD in Economics and have spent a lot of time with Stata. The most valuable skills I learned were econometrics/statistics and a deep analytical intuition. But I later had to pick up much more relevant skills, such as R, Python, and SQL.

What are the biggest challenges in data science and analytics? What are the most important things to ‘get right’? What are the best technologies available to solve these problems?

Asking the right questions. Cleaning/extracting/munging/preparing data. You’ve got to know where the value is with your data, and to trust it before you put it in your model. The best tech out there for cleaning data is to hammer through it yourself or push the issues upstream to fix them at the source (i.e., log data appropriately).

What’s your definition of data analytics and science?

Owning the end-to-end process. Start with asking the right questions, and do whatever you need to do to deploy insights, have an impact, and iterate.

What advice can you give someone with little experience in analytics to pursue a career in the field?

Find an area that you are passionate about and figure out how to get some relevant data now. What are some interesting questions to ask? How can you answer them with data? Another more structured method is to get the O’Reilly books Programming Collective Intelligence, Mining the Social Web, and Machine Learning for Hackers. Those give a good overview of techniques.

Antonio Piccolboni

Antonio began his career in bioinformatics, spending 10 years split between academia and industry. He then worked for a web [...]

Kate Matsudaira

Kate Matsudaira was most recently CTO at Decide where she managed a team of people doing data mining and machine learning. [...]

Christyn Perras

Christyn is currently a quantitative analyst at YouTube. Previously she worked at Slide, a social gaming startup where she also [...]

John Cook

John Cook started out in applied math and worked for University of Texas and Vanderbilt University. He then left [...]

Hadley Wickham

Hadley Wickham has recently joined RStudio as Chief Scientist. Previously, he spent over four years as a statistics professor at Rice [...]

Andrew Eichenbaum

Andrew Eichenbaum is the Head of Research and Analytics at Yummly, a semantic web search engine for food, cooking and [...]