Computer says ‘yes’
Can intelligent algorithms really identify the perfect job applicant?
I’m applying for a job in the US as a customer-care agent at HireVue, a software company in Utah. It’s full-time and they need somebody to start right away. I don’t have any relevant qualifications or experience, and English isn’t my first language – I was born in Germany, where I still live – but why not give it a shot? The starting salary is $40,000 (£29,700). So I log in using their app, click my way through the terms and conditions, and start my interview. I’m shown a film of somebody I could end up working with (brown hair, denim jacket); she chats a bit and then asks: “What kind of roles have you had to date?”
Immediately, my phone screen shows a countdown: I’ve got 30 seconds to think of an answer. I can see myself on screen and I start to talk. I’m given three minutes for my response. Ten more video clips follow, and I’m asked questions by other members of staff at the end of each one. Thirty-three minutes in total. Then an algorithm will decide my fate. HireVue is one of the largest providers in what potentially amounts to a billion-dollar segment of the HR services market: candidate selection by algorithm.
HireVue has been offering companies its web video interviewing platform for 13 years, saving candidates from having to travel across the US for job interviews. In the process, it has collected more than 5m video answers to date. Why should all this valuable data just sit around on servers doing nothing? That was the question the firm’s founders asked themselves about four years ago. Since then, they’ve been using algorithms and machine learning as evaluation tools – with some success. In 2014, Forbes named HireVue one of “America’s most promising companies” and it is used by big multinationals such as Unilever and Vodafone. When it comes to selecting salespeople and customer-care staff, these companies are already convinced that the algorithms can make the right choice.
Supporters of hiring by algorithm say that computer-based selection works faster and more effectively than a human if used correctly. They also claim it is free of bias. During my interview, the HireVue software recorded 25,000 pieces of data relating to my facial expressions, body language and speech. These are supposed to be able to decode my personality, compare me with existing company workers, and calculate to what extent I would make a good customer-care agent in Utah. In other words, it’s a machine to identify the perfect candidate.
People see what they want to see
Finding the right applicant for a position means minimising the weakest link in the process: human judgment. Prejudices impede objective decision-making, and there is no shortage of them in HR: maybe the interviewer doesn’t like the colour of the applicant’s hair or how they smell; maybe he or she is just hungry or tired. Claus-Christian Carbon, professor of psychology at Bamberg University in central Germany, says: “At every stage, humans are deeply marked by prejudices, distortions and distractions of which they are completely unaware.” Subjective impressions can be helpful when putting together small working parties, but on a larger scale not being objective can be destructive. Carbon’s work demonstrates that humans are poorly equipped to make a neutral assessment of another person.
Take passport control at airports. A study found participants scored just 74% when it came to identifying the person in front of them by their photograph; airport staff only scored 75%. “And that’s just a simple matter of facial recognition,” says Carbon. “We have a far harder time evaluating emotions.” While humans can recognise a smile quite reliably, other expressions of emotion present greater difficulties. If the person in front of them is from Asia or the Middle East, most Europeans lack the cultural experience required. “This makes us highly unreliable in these situations – dramatically so,” says Carbon. “Generally, we can only really make an accurate assessment of people we know well, ie our family or very good friends. And yet we remain convinced that we can make judgments about how diligent or responsible a perfect stranger will be. That’s crazy, isn’t it?”
Algorithms have the edge on humans in a variety of ways. They don’t get tired, don’t have preferences, and can process huge quantities of information much faster. They’ve also been HR’s little helper for some time: 72% of all applications made in the US are pre-processed by computers that weed out candidates who do not fulfil the basic criteria; they’re taken out of the running without a single human having read their application. Machines also send out personalised automatic responses to candidates who write in by email.
Computers promise neutrality
Now, however, algorithms are storming the last redoubt of human resources departments: interviews. Computer programs are being applied to videos and personality tests, evaluating candidates’ suitability. In the US, companies such as Applied Predictive Technologies, Koru Careers and Pymetrics already offer these kinds of services. Pymetrics gets applicants to play a range of games and runs checks on more than 90 different character traits.
At HireVue, the search for the best applicant starts inside the company. Members of staff who are considered the best in their roles are video-interviewed and the program focuses on their facial expressions, body language and speech patterns. This produces a data profile, which the algorithm can then overlay on the videos recorded by the candidates; they answer between eight and 12 questions asked by existing members of staff – in my case Kayla, Quinton, Kim and James. They want to know whether I am a team player, how I deal with feedback and what hours I would be prepared to work.
Someone in HR is then shown a percentage score describing how well I fit the job; that person can then opt to view individual answers and mark their favourite candidates. “The days of marathon back-to-back interviews are long past,” says Loren Larsen, chief technology officer at HireVue.
Garbage in, garbage out
Yet can someone’s suitability for a particular job be reduced to a number? Dorothea Alewell, professor of human resource management at Hamburg University, has her doubts. One of her lecture courses is on “identifying aptitude” or the analogue version of the selection process. “Overall,” says Alewell, “there are three questions we need to answer here: how do we measure success in this position? What does it take to reach success? And how can we measure if someone has what it takes?” She adds: “That may sound simple, but it’s actually a minefield.”
You enter the minefield even before the position is advertised. Staff are generally assessed on the basis of short-term successes, even though their true quality only really shows itself over a longer period, once projects have been completed and evaluated. But that kind of data is often not available, so companies opt for other criteria reputed to be measurable markers of promise: dominant personality, leadership qualities, and probably openness and friendliness in the role I applied for. Yet which characteristics and combinations are defining factors is still anyone’s guess, and “personality” is a hazy term. What is more, people may respond differently depending on other circumstances. As such, a video interview is only ever going to be a limited indicator of how someone will perform in an everyday work situation.
Nevertheless, these programs are popular in personnel departments because they are considered efficient and neutral. “Artificial intelligence encourages diversity and reduces discrimination,” says Larsen. “Algorithm selection doesn’t exclude people based on the colour of their skin or because of the school they went to. What they look at are facts.” By “facts”, he means data.
Algorithms can discriminate
So much for the theory. In practice, new problems arise. Algorithms are trained, rather like children imitating their parents: they collect a lot of data, analyse it and then apply the patterns they identify. What no one really knows is whether the criteria used, or the conclusions drawn, are ones we would want to base job selection on. “AI and neural networks are a black box,” explains Martin Spindler, professor of statistics at Hamburg University. “The companies throw data in and get a result out.” What happens inside the box, however, is almost impossible to find out.
An example from the US reveals the potential effects of unwanted distortion. Xerox Services commissioned an algorithm to find new staff who would remain with the company longer; Xerox later noticed that fewer people were being employed from the outer edges of the city. The program had noted that a long journey to work was often a reason why staff moved on. The problem was that lower earners often lived further away from the city centre, and this had led to inadvertent but systematic discrimination.
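The Xerox case can be pictured in a few lines of code: a screening model never sees income or any protected attribute, yet a correlated feature – commute distance, which here tracks earnings – reproduces the bias anyway. Everything below is invented for illustration; it is not Xerox’s or HireVue’s actual model.

```python
# Illustrative only: invented candidate data, not any vendor's real model.
# Each past employee: (commute_km, stayed_two_years). The "learned" rule:
# reject applicants whose commute resembles a typical leaver's.
past_staff = [
    (5, True), (8, True), (12, True), (30, False), (35, False), (40, False),
]

leaver_commutes = [km for km, stayed in past_staff if not stayed]
threshold = sum(leaver_commutes) / len(leaver_commutes)  # average leaver: 35 km

def screen(commute_km):
    """True = invited to interview; reject long commutes."""
    return commute_km < threshold

# In this toy data, lower earners live on the city's outer edges, so the
# commute rule quietly filters them out - discrimination via a proxy feature.
low_income = [45, 38, 36]   # outer edges of the city
high_income = [6, 10, 14]   # central districts

low_pass = sum(screen(km) for km in low_income)
high_pass = sum(screen(km) for km in high_income)
print(low_pass, high_pass)  # 0 of 3 low earners pass; 3 of 3 high earners do
```

The point of the sketch is that removing sensitive attributes from the input is not enough: any feature correlated with them can carry the bias through.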
HireVue is aware of issues like this, and is applying statistical processes to reduce their effect. After each round in the hiring process, they carry out spot checks on the algorithm to find out whether it has made discriminatory decisions. “What we find out is whether the algorithm has learned patterns that distort the results. One example is in companies where there is an imbalance between the number of men and women on the staff,” explains Larsen. If so, the programmers feed in new data. “Machines can only become better and fairer if we check up on them consistently,” he concludes.
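HireVue does not publish the details of its spot checks, but one standard way to test a hiring tool for discriminatory outcomes is the “four-fifths rule” from US employment guidelines: a group’s selection rate should be at least 80% of the rate for the most-selected group. A minimal sketch, with invented numbers:

```python
# Adverse-impact spot check (the US "four-fifths rule"); figures invented.
def adverse_impact(rates):
    """Ratio of each group's selection rate to the best-off group's rate."""
    best = max(rates.values())
    return {group: rate / best for group, rate in rates.items()}

selection_rates = {
    "men": 45 / 100,    # 45 of 100 male applicants selected
    "women": 30 / 100,  # 30 of 100 female applicants selected
}

ratios = adverse_impact(selection_rates)
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
print(flagged)  # 0.30/0.45 is roughly 0.67, below 0.8, so "women" is flagged
```

A flagged result would be the trigger, in Larsen’s terms, to feed the algorithm new data and retrain.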
So, computers aren’t anywhere near perfect yet. The very idea that they will ever be able to decode the personality of a human based on data remains pie in the sky. The only thing that can be stated with any certainty is that computers can find out that I am statistically similar to someone else they have interviewed and that we both use the same terminology. How successful this will make me as a customer-care agent is something they can only estimate – even if the numbers they produce suggest precision.
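That kind of “statistical similarity” can be pictured with a generic technique – this is not HireVue’s disclosed method: turn each interview transcript into word counts and measure the angle between the resulting vectors (cosine similarity). Two candidates who use the same terminology score close to 1.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Similarity of two transcripts based purely on shared terminology."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Invented example transcripts.
me = "I enjoy interacting with customers and like to discuss video feedback"
top_agent = "I like interacting with customers and discuss feedback openly"

print(round(cosine_similarity(me, top_agent), 2))  # prints 0.8
```

As the article notes, a high score only says the candidates talk alike; whether that predicts success as a customer-care agent is an estimate dressed up as a number.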
These statistics, however, are what explain the spread of aptitude testing and algorithms. “People love numbers,” says Carbon. “Not because they really know what to do with them, but because they take on some of the responsibility.” If a new appointment does not work out the way the company expects, the HR department can use the data to justify itself. It’s a very tempting way of dealing with issues, and people are ever happier to put their trust in computers. In jobs with clear targets, this has some advantages. “Where indicators of success are well known, computers can focus on them,” explains Carbon. If these factors are harder to discern, however, humans are better.
As helpers, then, computer programs are popular in HR departments – and the people who work there are not worried that the machines will take their jobs. In a survey by Bamberg University, only 3.8% of HR professionals who responded said they thought a computer could replace them. “The benefit is simple: personnel departments can now focus fully on company staff again. That was always their core task,” says Alewell.
When I ask Larsen about my test results, he is able to explain that I used the words “interacting”, “discuss” and “video” frequently, which gained me marks, but that I also knitted my brows and repeated “excitement” a lot, which cost me a few points. But he can only speculate why that was the case. He suggests the algorithm may have docked marks because what I was saying did not match the expression on my face.
In the end, a number pops up on his screen: 67%. That is the extent to which I would be suitable for the customer-care position HireVue has advertised. Compared with the other applicants, that puts me in second place and therefore means I’m sure to go through to the next round – despite the fact that English is a foreign language for me and that I don’t have any experience in customer service. All thanks to the algorithm.