Jonathan Hsu

Jonathan is currently an Analytics and Data Science Manager at Facebook. In this role, he manages a team of analysts and data scientists who work on many aspects of using data to make the product better. His analytics work began on the Metrics team at Slide, which he joined after his company, which had launched a successful Facebook app, was acquired. At Slide, he led a small team of analysts covering all analytics responsibilities for the company.

Hi Jonathan! What is your educational background and how has it prepared you for your role in data analytics and science? What skills did you not develop in school that you find important in your work?

My undergraduate degree was in physics from UC Berkeley. I then went to Stanford where I did my PhD in theoretical physics. My work was on black holes and cosmological inflation in string theory. The biggest things that I missed in my PhD were programming work and thinking about industry. All of my work in my PhD was rather formal pencil and paper math. I didn’t write any meaningful code until I got to Slide and I certainly didn’t spend any time thinking about what technology companies were trying to accomplish.

What are the biggest challenges in data science and analytics? What are the most important things to “get right”? What are the best technologies available to solve these problems?

The biggest challenge for any company is to be successful and the judicious use of data to guide strategy can be a major factor contributing to that success. I would say that the most important thing to “get right” is to never lose the forest for the trees. There are many start-ups that have failed because they put all their faith in a single data mining approach or a single quantitative view of the world. The best analysts in the consumer web startup space understand that the success of a company rarely rests solely upon the level of sophistication or the amount of statistical significance with which some quantitative question can be addressed. It’s important to have a flexible approach that can incorporate both formal statistics and back-of-the-envelope estimations to understand a given problem with varying levels of urgency.

Regarding technologies, I am partial towards Python, SQL and R.

What’s your definition of data analytics and science?

In the current parlance of Silicon Valley, the terms “data science” and “analytics” span a wide variety of functions. I think of analytics as an extension of what is traditionally thought of as Business Analytics with significantly stronger technical capabilities that include some level of proficiency in modern tools and technologies of quantitative analysis. This end of the spectrum is often closely aligned with Product Management and/or “inbound” marketing at companies in Silicon Valley. In this case, the role is about using data to help the company make decisions.

The term Data Scientist is also used to refer to software engineers who build features that are based on sophisticated handling of large data. This typically involves applying some machine learning technique to a recommendation, targeting, or signal detection problem. This end is more closely aligned with traditional software engineering in that the goal is to implement some particular feature with the usual software engineering concerns (scalability, performance, etc.).

Finally, Data Science sometimes refers to research oriented roles that are involved in long-term academic style research projects. This description is often useful for attracting PhDs to use their sophisticated tool sets to tackle important business problems.

At Facebook, we tend to use the term “data scientist” for people who are doing a fair amount of programming work and “analyst” for everything else. Across the Valley, I’ve seen all sorts of titles for these roles: data scientist, analyst, data analyst, analytics engineer, analytics scientist, etc. In the real world, roles tend not to be so cleanly differentiated. The majority of data people I have known in the Valley identify with all three profiles above to varying degrees. As with most things in the business world, the key trait is flexibility to do whatever you can to make the company successful given your particular mix of capabilities and interests.

What advice can you give someone with little experience in analytics to pursue a career in the field?

There are two pieces of advice that are probably relevant. The obvious one is that it pays to have proficiency in the tools of the trade. A little bit of working knowledge in programming languages (both statistical and scripting languages) goes a long way. The more overlooked piece of advice regarding this space is that you should really be fundamentally interested in what the company is trying to achieve. While it’s definitely important to have technical depth, you will probably get bored if you are not truly interested in the core mission of a company. On the hiring side of these roles, it’s generally not too hard to find people with either very high levels of technical skill or genuine passion about what we do. However, it turns out to be very hard to find people with both of these qualities in excess. When people get very proficient in the technical aspects there is a tendency to fall in love with the techniques and to lose interest in business objectives. So while it’s definitely a good investment for you to learn how to write code and learn how to use R, it’s also a good idea for you to read the industry press and explore the strategic landscape that you’re considering to be sure that you really care about what the industry is trying to achieve.

