In late 2019, I invited 16 hiring design managers to rate 240 designer resumes, and analyzed the dataset to provide both parties with stats and recommendations based on actions instead of opinions.
To conduct this research, I designed an app and an algorithm that distributed resumes evenly among hiring managers.
The team built an initial MVP in a couple of weekends. The project lasted five months.
The problem that I can relate to
As a design manager, I used to interview and hire a lot of designers while at Alfa-bank. It occupied a lot of my time, and I was always looking for a way to optimize the process for recruiters, my team, and myself.
After I left the bank, it felt like a good time to collaborate with other hiring managers from the industry to see if there were commonalities or differences in what they were looking for in a product designer's resume.
I reached out to them, pitched the idea, and all of them agreed to participate and supported me (which I am grateful for) since they were also interested in the outcome.
Working on weekends is tough
I worked on this project on weekends while living in New York and working full-time at Scentbird. My friends helped me build the back end and front end. We were a team of three :-)
Working on weekends can be frustrating, mostly because the team's schedules did not always align, and it took us approximately four to five months to complete the experiment.
Candidates & hiring managers: user problem
Candidates suffer from the lack of feedback after they apply. The lack of proper feedback is frustrating. People need some sort of feedback to improve their resumes, portfolios, or maybe even adjust career paths.
Giving personal feedback to everyone is impossible. Hiring managers often receive direct messages on social networks. They don't have time to reply to everyone, but if the candidate is interesting, they are more likely to respond and participate in a conversation.
Solution: reply probability score
How likely are you to receive an invitation to a screening interview?
I looked for a way to give candidates a tool that would at least estimate how likely their resume is to interest hiring managers.
Examples of good resumes can be misleading or irrelevant to one's situation. Common patterns in good and bad resumes, derived from stats and real actions instead of expert opinions, are more trustworthy.
To reduce biases (or even eliminate them), I invited 16 hiring design managers of different genders and from different companies to label a set of resumes from real people (not all of them were candidates), so that I could find the patterns I was looking for.
Scope of work
Candidates and participants: advertising, information about experiment
Hiring managers: invitation, interviews
Website: landing pages, descriptions, editorial
Experiment app: candidate registration, managers voting feed, control panel
Algorithms: optimize resumes feed, resumes distribution
Back-end: saving data properly for the future analysis
Data analysis (Python, pandas)
Writing and publishing research, notifying participants about the results
Hiring managers interviews
Before designing an experiment, I interviewed managers on their hiring processes, growth reasons, and expectations. These interviews were published as a part of a project before we started collecting resumes.
How many product designer job openings do you usually have?
What is the reason for this growth or for a permanent number of openings?
What products are you working on?
Who is the end-user of the results of a designer's work? How are they using it?
To what extent does a designer affect your team/product/company?
What does a designer have to do to start working with you/get a job in your company?
Product designer job descriptions compilation
I scraped 12 publicly available sources with product designer job descriptions. Even though they varied in minor details, the commonalities dominated. I compiled an average description for the landing page, hoping that participants would read it.
Landing page, experiment description
The original landing page contained a research description, goals, rules, a list of invited hiring managers, a compiled job description, and FAQ (I gathered them while talking to hiring managers).
The research was conducted in Russian, and all design managers were from the Russian design community.
All of them:
worked on digital products within a company,
had a design team of 10+ designers,
hired designers (or made final hiring decisions) in the last six months.
The hardest part for me was finding female leaders in the Russian design community who met these criteria. For some reason, men were better represented at that moment.
To eliminate possible bias, the algorithm distributed votes evenly between men and women (see details below).
Minimum viable UI
Registration and user anonymity
The only required field was an email address. Portfolio and resume were optional: some people had more than one link and included them in the text, so it was up to users how they filled in the information. In my previous experience, some people shortened their message to a single link to their portfolio.
Users could check a box if they wanted managers to contact them later (that is, if they were actual candidates). This flag was necessary for a later plan to connect matching pairs of managers and candidates while leaving out those who were not interested in a job. The option was hidden from managers so that it could not affect their decisions (they were told to treat all resumes as if they came from real candidates).
Voting process: resumes feed
The feed was designed mobile-first and then scaled to larger devices.
The feed was straightforward and intentionally emulated an email inbox or messenger, with a few lines of preview text.
Two interesting findings came from the mobile version (even though I think it is normal and essential to have a mobile version nowadays, for a lot of designers it was still something of a gimmick back in 2019!):
Four managers emphasized that they could fit this experiment into their schedules only because the mobile version was so simple to use and appealing
Most of the designer portfolios lack mobile versions (e.g., Behance is tough to view from a mobile device)
Reduce rating time
Make sure that each resume receives a guaranteed number of ratings from women and men
How it works
Each resume had to have 7 (seven) ratings. That is enough to compensate for individual human errors, and an odd number of ratings means the score can never be an exact tie.
Each resume was first displayed to two women and two men.
The remaining three ratings were randomized among the other managers to emulate a real-life situation.
This approach helped us dramatically reduce the amount of time managers spent compared to what I had experienced before.
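The distribution rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code the experiment ran: the function name `assign_raters`, the manager ids, and the 5/11 gender split of the panel are all hypothetical, and the sketch does not try to balance the total workload across managers.

```python
import random

RATINGS_PER_RESUME = 7  # from the text: seven ratings per resume
WOMEN_FIRST = 2         # from the text: shown to two women first
MEN_FIRST = 2           # from the text: and to two men

# Hypothetical panel: the real gender split of the 16 managers is not stated.
MANAGERS = {f"w{i}": "f" for i in range(5)}
MANAGERS.update({f"m{i}": "m" for i in range(11)})


def assign_raters(managers, rng):
    """Pick the managers who will rate one resume.

    managers maps a manager id to "f" or "m". The first four picks are
    two women and two men; the remaining three are drawn at random from
    everyone not yet picked.
    """
    women = [m for m, g in managers.items() if g == "f"]
    men = [m for m, g in managers.items() if g == "m"]
    picked = rng.sample(women, WOMEN_FIRST) + rng.sample(men, MEN_FIRST)
    others = [m for m in managers if m not in picked]
    picked += rng.sample(others, RATINGS_PER_RESUME - len(picked))
    return picked
```

Calling `assign_raters(MANAGERS, random.Random(42))` returns seven distinct manager ids, the first four of which are always two women and two men.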
First, "Spam" votes had to be normalized. E.g., if one manager labeled a resume as spam, but the others didn't or even labeled it as a "yes," then that vote counted as a "no" with a weight of -1 instead of -10.
The sum of weights was divided by the number of votes for each resume, which gave us the score. For example, 4/7 ≈ 0.57.
Seven votes per resume is an odd number, which makes it possible to classify each resume as a "yes" or a "no" (the score will always be either more or less than 50%).
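Here is one way to read the scoring step, as a hedged Python sketch. The text mentions weights of -1 and -10 but does not spell out the full weighting scheme, so this sketch assumes the score is simply the share of "yes" votes, which agrees with the 4/7 example and with the percentage thresholds used later. The function names are hypothetical.

```python
def is_unanimous_spam(votes):
    """True when every manager labeled the resume as spam."""
    return bool(votes) and all(v == "spam" for v in votes)


def resume_score(votes):
    """Share of "yes" votes among all votes for one resume.

    Assumption: a spam vote on a resume that is not unanimous spam is
    counted as a plain "no", so it adds nothing to the "yes" share.
    """
    if not votes:
        raise ValueError("a resume needs at least one vote")
    return sum(1 for v in votes if v == "yes") / len(votes)


example = ["yes", "yes", "yes", "yes", "no", "no", "spam"]
print(round(resume_score(example), 2))  # 0.57, matching the 4/7 example
```

With seven votes the only possible scores are multiples of 1/7, so a score can never land exactly on 50%.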
Resumes grouped by their performance (score)
Extra: 8 resumes (3.3%).
Scored 100% (all votes are positive).
Median review time: 63.9 sec.
Min. time: 22.2; max. time: 158.8 sec.
Median word count: 276
Shortest resume: 107 words; longest: 538 words.
Definitely yes: 20 resumes (8.2%)
The score is more than 80%.
Median review time: 54.6 sec.
Min. time: 26.1; max. time: 100.6 sec.
Median word count: 262
Shortest resume: 74 words; longest: 795 words.
Likely yes: 62 resumes (25.5%)
The score is between 50% and 80%.
Median review time: 59.8 sec.
Min. time: 18.3; max. time: 134.1 sec.
Median word count: 197
Shortest resume: 44 words; longest: 800 words.
Likely no: 52 resumes (21.4%)
The score is between 30% and 50%.
Median review time: 54.2 sec.
Min. time: 19.4; max. time: 141.7 sec.
Median word count: 220
Shortest resume: 2 words; longest: 671 words.
Definitely no: 97 resumes (39.9%)
The score is less than 30%.
Median review time: 59.6 sec.
Min. time: 18.8; max. time: 153.1 sec.
Median word count: 163
Shortest resume: 4 words; longest: 713 words.
Spam: 4 resumes (1.6%)
Everyone identified this response as spam.
Median review time: 45.3 sec.
Min. time: 31.1; max. time: 59.4 sec.
Median word count: 184
Shortest resume: one word; longest: 715 words.
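The grouping above maps naturally onto a small lookup function. The sketch below is an assumption-laden illustration: the function name `group_for` is hypothetical, and the handling of exact boundaries (for example, whether 30% falls in "Likely no") is a guess, although with seven votes no score actually lands on 30%, 50%, or 80%.

```python
def group_for(score, unanimous_spam=False):
    """Map a resume score in [0, 1] to one of the performance groups.

    unanimous_spam should be True only when every vote was "spam".
    Boundary handling is an assumption; the text gives only the ranges.
    """
    if unanimous_spam:
        return "Spam"
    if score == 1.0:
        return "Extra"
    if score > 0.8:
        return "Definitely yes"
    if score > 0.5:
        return "Likely yes"
    if score >= 0.3:
        return "Likely no"
    return "Definitely no"
```

For seven votes this means 7/7 is "Extra", 6/7 is "Definitely yes", 4/7 and 5/7 are "Likely yes", 3/7 is "Likely no", and 2/7 or fewer is "Definitely no".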
Hiring managers productivity
I thought of visualizing each manager's accuracy as a sphere: a target or a planet. With this idea in mind, I wrote a small script that automatically generated an SVG file from my data for every participant.
Hiring managers accuracy
The last exercise in this research was to learn how accurate each manager was and to calculate the average accuracy. I wanted this information to compare their accuracy to future experiments I had in mind: to see how designers would rate other designers and to build my own classifier.
Given each resume's rating, I could quickly tell if it was spam, a "no" (<=50%), or a "yes" (>50%).
Given each manager's votes, I compared them with the final scores to count matches (yes:yes and no:no were matches; no:yes and yes:no were mismatches).
I removed managers who rated fewer than 30 resumes from the calculations.
Given each manager's accuracy, I calculated the median accuracy, circa 75% (which means a typical manager agreed with the collective verdict roughly three times out of four).
The best score was circa 85%. Accuracy did not correlate with the total number of votes.
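The accuracy calculation can be sketched as follows. This is an illustration under stated assumptions rather than the original analysis code: the function names and data shapes are hypothetical, spam votes are assumed to have been converted to "no" beforehand, and "fewer than 30" is implemented as keeping managers with at least 30 rated resumes.

```python
from statistics import median

MIN_RATED = 30  # from the text: managers with fewer ratings were excluded


def manager_accuracy(manager_votes, final_scores):
    """Share of one manager's votes that match the collective verdict.

    manager_votes maps resume id to "yes" or "no"; final_scores maps
    resume id to the resume's score in [0, 1]. A score above 0.5 is a
    collective "yes", anything else a collective "no".
    """
    matches = 0
    for resume_id, vote in manager_votes.items():
        verdict = "yes" if final_scores[resume_id] > 0.5 else "no"
        if vote == verdict:
            matches += 1
    return matches / len(manager_votes)


def median_accuracy(all_votes, final_scores, min_rated=MIN_RATED):
    """Median accuracy over managers who rated at least min_rated resumes."""
    accuracies = [
        manager_accuracy(votes, final_scores)
        for votes in all_votes.values()
        if len(votes) >= min_rated
    ]
    return median(accuracies)
```

Note that each manager's own vote is part of the final score it is compared against, so this measures agreement with the group rather than accuracy against an external ground truth.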
More than 60% of responses will end up in the trash, and some will even end up in spam.
The main factors that affect the success of a cover letter are confirmed facts about professional achievements, an understanding of the daily routine of a designer in a product team, proven knowledge and application of development processes and technologies, and the accuracy of the presentation/portfolio.
The most important findings
Authors of resumes with a negative score (<50%) are writing about their skills instead of accomplishments. They omit facts.
Years of experience without facts negatively affect the score. The best performing group had candidates that described themselves both as recent graduates and veterans with 15+ years of experience.
For some reason, managers preferred simple portfolio websites to platforms like Behance. Behance and Dribbble profiles are better than nothing, though.
At least half of the portfolio websites lacked a mobile version, while half of the hiring managers used their mobile devices to prescreen candidates.
It takes a hiring manager approximately 50 seconds to review both the resume and the portfolio combined.
Now I had an adequately labeled data set and the specimens I needed to start building a classifier and getting automated predictions. Read more interesting stuff in part two!