Ding Zhou

Ding is currently Chief Scientist at Klout, overseeing its data product, engineering, and infrastructure teams. At Klout, Ding developed and helped establish the Klout score as the standard for measuring influence on the social web. Ding was an early engineering lead at Facebook, where his team developed the social network giant’s revenue engine. Prior to that, Ding worked on various big-data products at Google, Yahoo, and NEC Labs.

Hi Ding! What are the most important skills for a data scientist or analyst to have? What are the most in-demand skills relating to big data and analytics?

Data science is a very broad topic that draws on an extremely diverse set of techniques and theories from many fields. Generally speaking, the most essential skills for a data scientist or analyst are math and statistics. In the era of the Web and social networking, with big data everywhere, software engineering skills and data infrastructure knowledge have also become important to a data scientist's success. Last but not least, data visualization and the ability to present insights are a welcome part of the job description. In terms of the most in-demand skills, I would say that it’s a rare find to have big data infrastructure engineering and data science skills all in one person. However, given that most big data science problems these days require a deep understanding of science as well as infrastructure, a good mix of engineering and science skills will be in high demand.

What are the biggest challenges to utilizing data well within a business setting? What are the best technologies to use?

The biggest challenge in utilizing data well nowadays is managing it well. A poorly managed data store is essentially a black hole, and thus a waste of storage space and engineering hours. Very often, companies with big data turn to big data infrastructure solutions so they can make sense of the data they have without much custom engineering work. At Klout, we rely heavily on free open source solutions like Hive/HBase to manage our data.

What’s your definition of data analytics and science?

Well, as a data scientist, I’d like to consider data analytics and science as the process of harnessing information to increase the chance of success for every decision.

When it is used to serve business decisions, sometimes it is referred to as “business intelligence”. When it is practiced to help computer programs make decisions, such as which search results to rank first or what your Klout score should be, some people refer to it as “artificial intelligence” or “machine learning”.

There is a subtle difference between analytics and science. I’d like to consider data analytics as the effort to collect, aggregate, slice and dice the data where the goal is to monitor and compare things that we know are important or explore the data to draw certain insights. Data science seeks to model the insights or extract theories from observations with a goal to generalize and apply the model or theory in future actions. Data science starts with data analytics.

What is the future of data science? What types of industries will use data best to build successful businesses?

Data science has been an essential part of everyone’s daily life since before people started calling it data science. One can argue that wherever there is data, there is value in having data science, as its purpose is to help people make smart decisions. So if you look at where data is highly concentrated today, it’s not hard to understand why data science skills are in high demand in the Internet, finance, and retail industries. And if you believe that more and more industries will be digitized, data science will reach many more industries in the future.

