The topic of ‘bias’ in decision making has received considerable attention in recent years. Bias can manifest in various forms, including confirmation bias, affinity bias and loss aversion. In contrast to noise, decisions heavily influenced by bias can appear highly consistent, and invite causal explanations as a result. The goal of Kahneman et al. (2021) is to draw attention to the distinct and equally potent issues arising from noise; the implications of failing to address it are immense.

Noise, argue the authors, is everywhere. It plagues professions including criminal justice, insurance underwriting, forensic science, forecasting, medicine and human resources. Kahneman et al. (2021) document in detail the large and growing body of literature demonstrating how people working side-by-side in the same job can make widely varied judgements about similar cases, with errors that often prove costly, life-changing and, in some industries, fatal.

Minimising ‘noise’ has long been a key objective in the application of psychometrics. The challenges in evaluating individual differences in abilities, competencies and personalities, and their significance for personal, team and organisational success, are considerable. Subjectivity, inconsistency and bias are major contributors to the noise in personnel decision making. Psychometric assessment’s aim for consistency in process, methodology and measurement criteria parallels the aim of ‘noise’ reduction. In the context of psychometrics, the parallel term is ‘reliability’: “an indicator of the consistency which a test or procedure provides. It is possible to quantify reliability to indicate the extent to which a measure is free from error” (Arnold et al., 2020).

Noise and bias are inevitable, but efforts should still be made to reduce these distortions by maximising reliability and validity. There are several sub-categories of reliability that psychometrics must consider, including test-retest, internal consistency and split-half reliability. The noisier a test is, the more unwanted variance its results will generate. In test-retest terms, for example, the aim is to minimise differences in results between two administrations of the same test to the same participant. A more thorough breakdown of these reliability sub-categories can be found in a PCL blog piece here.
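To make these sub-categories concrete, here is a minimal Python sketch of how two of them are commonly estimated. The data are invented for illustration, and the functions are generic textbook formulas rather than any particular instrument’s method:

```python
# Illustrative sketch of two reliability estimates, using invented data;
# not drawn from Kahneman et al. (2021) or any PCL instrument.
import numpy as np

def test_retest_reliability(totals_t1, totals_t2):
    """Pearson correlation between total scores from two administrations."""
    return np.corrcoef(totals_t1, totals_t2)[0, 1]

def split_half_reliability(item_scores):
    """Correlate odd- and even-item totals, then apply the Spearman-Brown
    correction to estimate reliability at full test length."""
    items = np.asarray(item_scores)              # shape: (respondents, items)
    odd, even = items[:, 0::2].sum(axis=1), items[:, 1::2].sum(axis=1)
    r_half = np.corrcoef(odd, even)[0, 1]
    return 2 * r_half / (1 + r_half)             # Spearman-Brown formula

# Invented scores: 6 respondents answering an 8-item test twice.
rng = np.random.default_rng(42)
true_ability = rng.normal(size=(6, 1))
admin1 = true_ability + rng.normal(scale=0.4, size=(6, 8))   # noisy items
admin2 = true_ability + rng.normal(scale=0.4, size=(6, 8))

print(f"test-retest: {test_retest_reliability(admin1.sum(axis=1), admin2.sum(axis=1)):.2f}")
print(f"split-half:  {split_half_reliability(admin1):.2f}")
```

A perfectly noise-free test would yield coefficients of 1; the lower the coefficient, the more unwanted variance the test is generating.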


The area of Kahneman et al.’s (2021) focus most relevant to our psychometric assessment industry is selection. Unsurprisingly, unstructured interviews come in for heavy criticism. Using competency-based interviews with multiple assessors can significantly boost quality, but a degree of noise and bias will always be present. A meta-analysis of employment interview reliability found an average inter-assessor correlation of .74, suggesting that even when two interviewers observe the same two candidates, they will still disagree about which was the stronger about one-quarter of the time (Huffcutt et al., 2013).
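The step from a correlation of .74 to disagreement ‘about one-quarter of the time’ can be reproduced with a standard conversion: under a bivariate-normal assumption, a correlation maps to a ‘percent concordant’ figure, the probability that two raters agree on which of two randomly chosen candidates is stronger. The sketch below applies that textbook formula; the calculation is ours, not one reported by Huffcutt et al. (2013):

```python
# Convert an inter-rater correlation into a 'percent concordant' figure:
# the probability, under a bivariate-normal model, that two raters agree
# on which of two randomly chosen candidates is the stronger.
import math

def percent_concordant(r: float) -> float:
    return 0.5 + math.asin(r) / math.pi

pc = percent_concordant(0.74)                      # Huffcutt et al.'s average
print(f"agree: {pc:.1%}, disagree: {1 - pc:.1%}")  # agree: 76.5%, disagree: 23.5%
```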

One specific example involved a candidate who was asked, in two separate interviews by two separate assessors, why he had left his short-lived former CFO role. His response was that he had a “strategic disagreement with the CEO”. When the two assessors met to compare notes, the first had perceived the candidate’s decision as a sign of integrity and courage, yet the second had construed the same response as a sign of inflexibility and potential immaturity. Even when stimuli are consistent, prior attitudes influence our interpretation of the facts.

Kahneman et al. (2021) also discuss how prior attitudes can influence whether salient facts emerge in the first place. Research indicates that first impressions have a striking impact on interviewers’ perceptions of candidates and may dictate whether certain evidence is sought at all. Even if competencies around team working are identified as predictors of role performance, interviewers may be less inclined to ask tougher questions in pursuit of this information if they perceive the candidate to be cheerful and gregarious in the opening informal exchanges.

These examples illustrate that, whilst structured competency-based interviews improve the accuracy of selection decisions, it is advisable to apply ‘decision hygiene’ by supplementing them with additional methods. Psychometrics are used to improve selection processes, not least because they can reduce noise. If two or more candidates submit the same responses to a set of psychometric items, they are guaranteed to receive identical outputs regardless of the gender, age, or ethnicity of the candidate or the assessor.
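That guarantee holds because mechanical scoring is a pure function of the responses. A minimal sketch, using an invented three-item key rather than any real instrument’s:

```python
# Mechanical scoring is deterministic: identical response patterns always
# produce identical scores, whoever the candidate or assessor may be.
SCORING_KEY = {1: "a", 2: "c", 3: "b"}    # invented item key for illustration

def score(responses: dict[int, str]) -> int:
    """Count of responses matching the keyed answers."""
    return sum(responses.get(item) == key for item, key in SCORING_KEY.items())

candidate_a = {1: "a", 2: "c", 3: "a"}
candidate_b = dict(candidate_a)                  # identical response pattern
assert score(candidate_a) == score(candidate_b)  # outputs guaranteed equal
print(score(candidate_a))                        # 2
```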


So far, we have established that selection judgements will be more accurate when several sources of relevant information are considered and aggregated, preferably by multiple assessors. But when considering multiple metrics, how they are aggregated is key.

Kahneman et al. (2021) differentiate between ‘clinical’ and ‘mechanical’ aggregation, and demonstrate the distinction by presenting the reader with the ratings received by two subsequently hired candidates (see Table 1 below). The reader is asked to predict which candidate will be the stronger performer two years later.
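To illustrate the mechanical side of the distinction (Table 1 itself is not reproduced here, so the ratings below are invented placeholders): a mechanical rule, such as an unweighted mean, is applied identically to every candidate, whereas clinical aggregation rests on an assessor’s holistic impression.

```python
# Mechanical aggregation: the same pre-agreed rule (here, an unweighted
# mean of competency ratings) is applied to every candidate, leaving no
# room for assessor-to-assessor variability in how scores are combined.
ratings = {
    "Candidate A": [7, 5, 6, 8],   # invented competency ratings
    "Candidate B": [8, 4, 7, 6],
}

for name, scores in ratings.items():
    print(f"{name}: {sum(scores) / len(scores):.2f}")
# Candidate A: 6.50
# Candidate B: 6.25
```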


