Kate Matsudaira was most recently CTO at Decide, where she managed a team doing data mining and machine learning. Before that she was CTO of SEOmoz, where she dealt with massive amounts of web crawler data. She has recently founded a new startup, of which Kate says “…we’re not working with a lot of data yet, but I expect to in the future.”

Hi Kate! What is your educational background and how has it prepared you for your role in this field?  What skills did you not develop in school that you find important in your work?

I studied computer science in both undergrad and in my graduate work. To be honest, there’s a lot I had to learn after school to be successful in my field. First, technology has changed so much since I graduated. When I was in school, I learned lots of fundamentals and programming in languages like C and Matlab; now, applications are evolving so quickly that everyone in this field is in a constant state of learning. In addition, I was super focused on the technical side in school, but pretty quickly found myself in a leadership role after I graduated. This required a whole different set of soft skills that I didn’t learn in a Computer Science program. And even if you are not interested in management, getting promoted or working on the projects you are more interested in often involves a bit of persuasion and communication. So at some point you have to develop those soft skills.

What are the biggest challenges in data science and/or analytics? What are the most important things to ‘get right’? What are the best technologies available to solve these problems?

In my experience, the biggest challenge with data is knowing how to make good use of it. You’ve got to ask the right questions in order to do thoughtful analysis and draw meaningful conclusions. And in many cases there are challenges collecting the right data and getting it into a usable state. When it comes to analysis, a lot of people confuse correlation with causation. Correlation is not causation (just because things seem correlated does not mean one caused the other), and people are tempted to draw conclusions based on data relationships that aren’t really there.

It’s important to be able to use data to impact your business smartly. Understand and use the tools you’ve got, react to the data, and then make smart decisions based on that data.

What’s your definition of data analytics and data science?

My definition is collecting data (through monitoring & metrics) and analyzing that data to reach some sort of conclusion or result. It’s really about the act of using data to make decisions or drive products. And it doesn’t have to be a complex system; it can be as simple as just taking in data and using it to make smarter decisions.

What advice can you give someone with little experience in analytics to pursue a career in the field?

The best way to break into the field is to just keep learning, and to get as much real-world experience as you can. Start a job, internship, or personal project, because nothing trumps real-world experience. If you can, try to figure out ways to work data analysis into your current job. If you can’t, find ways to do it outside of work. Kaggle and other sites allow people anywhere to work on data science experiments, which is a great way to start building up a portfolio or resume you can point to in a job interview.

How do you think the field will be different in 5-10 years?

I think there will be a lot more data in the next 5-10 years. People are going to get better at collecting it and analyzing it, and a lot more tools will be available and will run faster. We’ve seen so many tools evolve so much just in the last 5 years, but they can still be really slow. I think in the near future they will advance and speed up, and open up even more new applications for data. I expect we’ll see lots more companies that are built on and driven by data, and that use data science to run their business.

