UX Researchers — You’ve Probably Been Doing Statistical Analysis All Along!
For years, I felt inadequate when it came to statistical analysis for user research. Terms like descriptive analytics, chi-square tests, non-parametric tests, and correlation tests made me feel like I was missing core data analysis skills.
To learn these skills, I started small. I watched a few how-to videos, read support pages, imported a survey with 800+ responses into an open-source analysis tool, and fiddled with the tool’s interactions.
It was then I had a delightful realization — I already knew how to do this. I’d been telling stories with data and executing statistical analysis for over 10 years! I just didn’t know the lingo, the formal names for the techniques.
Here’s a breakdown of the lingo I learned:
Descriptive Analytics
This is really just summarizing and describing the key features of your data. As researchers, we do this all the time when reporting on survey results. We look at the frequencies: how many respondents selected each multiple-choice option for questions on demographics, behaviors, attitudes, etc. We calculate percentages and visualize them using bar charts and pie charts. This is descriptive analytics! It is often built into survey software as a summary or filter option.
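If you ever want to do the same thing outside a survey tool, here is a minimal Python sketch of a frequency table. The answers are made up, and `frequency_table` is just an illustrative helper name, not something from a particular tool.

```python
from collections import Counter

# Made-up answers to one multiple-choice question ("How often do you use the app?").
responses = ["Daily", "Weekly", "Daily", "Monthly", "Daily", "Weekly", "Never", "Daily"]

def frequency_table(answers):
    """Return (option, count, percent) rows, most common option first."""
    counts = Counter(answers)
    total = len(answers)
    return [(option, n, round(100 * n / total, 1)) for option, n in counts.most_common()]

table = frequency_table(responses)
for option, n, pct in table:
    print(f"{option:<8} {n:>3} {pct:>5}%")
```

The counts and percentages in `table` are exactly what feeds a bar chart or pie chart.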
Chi-Square Tests
These tests check if there is a relationship or association between two categorical variables. For example, if we looked at gender and movie preferences, a chi-square test could tell us whether the difference in tastes between men and women is statistically significant or could plausibly be due to random chance.
Stakeholders sometimes need to understand if user characteristics like age group, gender, or region are related to product usage patterns or feature adoption. The chi-square test formalizes this.
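To make the mechanics concrete, here is a small sketch of the chi-square calculation for the simplest case, a 2x2 table. The counts are invented, `chi_square_2x2` is a hypothetical helper, and no continuity correction is applied. For larger tables you would normally lean on a statistics package rather than hand-rolled code.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom (a 2x2 table), the p-value has a closed form.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Invented counts: rows = men, women; columns = prefers action, prefers comedy.
observed_counts = [[30, 20], [20, 30]]
stat, p = chi_square_2x2(observed_counts)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

A small p-value (many teams use 0.05 as the cutoff) suggests the two variables are related rather than independent.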
Non-Parametric Tests
These are tests that don’t require the data to follow a normal distribution. They still come with assumptions of their own, just fewer of them, and they are useful for ordinal data like rating scales.
For example, we often ask users to rate feature satisfaction on a 1–5 scale. If we want to see if there are differences in ratings between customer segments, we can use a non-parametric test such as the Mann-Whitney U test for two segments or the Kruskal-Wallis test for three or more.
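As one illustration, the sketch below computes the Mann-Whitney U statistic for two segments by ranking all ratings together. The ratings are invented and the helper names are hypothetical. It stops at the statistic and does not compute a p-value, which a statistics package would normally do for you.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # average of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, share of pairs in which group_a rates higher)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    # Ties count as half a "win" in the share below.
    return u_a, u_a / (n_a * n_b)

# Invented 1-5 satisfaction ratings from two customer segments.
segment_a = [4, 5, 5, 3]
segment_b = [2, 3, 3, 1]
u, share = mann_whitney_u(segment_a, segment_b)
print(f"U = {u}, share of pairs where segment A rates higher = {share}")
```

A share near 0.5 means the two segments rate about the same; values near 0 or 1 mean one segment consistently rates higher.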
Correlation Tests
These measure the strength and direction of the relationship between two numeric variables. For example, in an unmoderated prototype test or a live A/B test we may have one score measuring user success and another measuring task completion time. A correlation test could tell us if participants with higher success tend to complete tasks faster — or not.
Any time we look at relationships between two scaled variables like behavioral metrics, usability scores, or attitudinal ratings, we are exploring correlations.
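Here is a minimal sketch of the most common version, the Pearson correlation coefficient, written out by hand so the arithmetic is visible. The scores and times are made up, and `pearson_r` is an illustrative name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: -1 (perfectly opposite) to +1 (perfectly aligned)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(spread_x * spread_y)

# Made-up data: one row per participant.
success_score = [1, 2, 3, 4, 5]
completion_seconds = [50, 45, 40, 30, 20]

r = pearson_r(success_score, completion_seconds)
print(f"r = {r:.2f}")
```

A value close to -1 here would mean that participants with higher success scores tended to finish faster. For ordinal ratings, a rank-based variant such as Spearman's correlation is often a better fit.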
T-Tests
Imagine you want to know if boys and girls have different opinions about a new superhero movie. A t-test can help you compare the average rating given by boys to the average rating given by girls.
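If the difference between those two averages is big relative to how much the ratings vary within each group, the t-test will tell you that boys and girls probably have different tastes for that movie. But if the difference is small, or the groups are small and their ratings are all over the place, it might just be by chance.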
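The sketch below shows the idea with Welch's version of the t-test, which does not assume both groups have the same spread. The ratings are invented and `welch_t` is a hypothetical helper. It returns the t statistic and degrees of freedom only; turning those into a p-value is a job for a statistics package.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group's mean.
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Invented movie ratings. Both pairs of groups differ by 2 points on average.
tight_a, tight_b = [7, 8, 9, 8, 8], [5, 6, 7, 6, 6]    # ratings cluster closely
noisy_a, noisy_b = [4, 12, 8, 6, 10], [2, 10, 6, 4, 8]  # ratings vary widely

t_tight, df_tight = welch_t(tight_a, tight_b)
t_noisy, df_noisy = welch_t(noisy_a, noisy_b)
print(f"tight groups: t = {t_tight:.2f}")
print(f"noisy groups: t = {t_noisy:.2f}")
```

The same 2-point gap produces a much larger t when ratings within each group agree with each other, which is why the spread matters as much as the gap.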
Regression Analysis
This is like trying to predict how much money a movie will make based on its budget. You look at past movies and see if there’s a pattern — do movies with bigger budgets tend to make more money at the box office?
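Regression finds the line that best fits that relationship on a graph. An upward-sloping line means bigger budgets tend to go with bigger box office numbers, and the steeper the line, the more each extra dollar of budget is associated with. How tightly the points cluster around the line tells you how good a predictor budget is. A flat line means budget doesn’t really matter.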
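For the curious, here is a minimal sketch of fitting that line with ordinary least squares. The budgets and box office numbers are invented, and `fit_line` is an illustrative name. Alongside the slope and intercept, it reports R-squared, a 0 to 1 measure of how closely the points follow the line.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in ys that the line accounts for.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_residual / ss_total

# Invented figures, in millions of dollars.
budget = [10, 20, 30, 40, 50]
box_office = [25, 45, 70, 85, 110]

slope, intercept, r_squared = fit_line(budget, box_office)
predicted_for_60 = intercept + slope * 60
print(f"box office = {intercept:.1f} + {slope:.1f} * budget (R-squared {r_squared:.3f})")
```

Once you have the slope and intercept, predicting for a new budget is just plugging it into the line, as `predicted_for_60` does.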
Factor Analysis
Let’s say you asked a bunch of movie fans to rate different aspects of movies like the acting, special effects, story, etc. Factor analysis looks for which of those aspects tend to get similar ratings across many fans.
Factor analysis might group “acting” and “story” together into one “factor” because fans who liked the acting also tended to like the story. This helps organize all the different ratings into just a few broader categories, or “factors,” that capture how fans perceive movies.
When To Use Which Test
You can use these tests to find:
- Summaries and overviews: Descriptive analytics when you want to summarize survey results or give a quick snapshot of your data.
- Connections between categories: Chi-square tests when you want to see if there’s a relationship between two categorical things, like gender and movie preferences.
- Comparisons with unusual data: Non-parametric tests when your data doesn’t follow a normal distribution or when you’re working with ordinal ratings (like 1–5 scales).
- Relationships between numbers: Correlation tests when you want to see if two number-based things are related, like task success and completion time.
- Differences between groups: T-tests when you want to compare averages between two groups, like movie ratings between males versus females.
- Predictive relationships: Regression analysis when you want to predict one thing (like box office success) from another thing (like movie budget).
- Grouped categories and patterns: Factor analysis when you want to find patterns to group related items (like acting and story) together.
You can combine statistical analysis methods, too. For example, use descriptive analytics first to understand your data, then choose other tests based on your research question, the type of data you have, and what you need to learn from it!
Looking back, I’ve been using all of these descriptive and associative techniques for years in tools like Excel, Google Sheets, and survey platforms. I just didn’t know the formal names for these processes. Perhaps the same is true for you.
If you are as intimidated by these terms as I was, take heart. Now that you know the names for analysis you might already be doing, you can use the ‘fancy labels’ to communicate findings with intention and credibility.