People Analytics

Which of the variables listed above (second paragraph) would you use to define success of sales employees in order to develop your model? (10 points)
The choice of independent variables for your model has to be based on their power to predict the dependent variable AND their availability in job candidates’ resumes that you intend to screen. Which of the variables listed above (second paragraph) would you test as predictors of success in your model? (10 points)
Which parameter of the model, i.e., sensitivity, specificity, precision, or accuracy, describes the ability of your model to identify candidates who are actually qualified? Which parameter of the model describes the ability of your model to identify candidates who are actually unqualified? Which parameter describes the ability of your model to correctly predict unknown candidates as being qualified? (5 points)
What are the false positive rate and the false negative rate? Describe what false positive and false negative mean. (10 points)
If your goal is to develop a model to screen resumes and identify candidates to be invited for an interview, which type of error is worse – false positive or false negative? Explain the rationale for your answer. (10 points)
If you want to improve the performance of your model to identify candidates to be invited for an interview, which parameter (sensitivity, specificity, precision, accuracy) would you use to guide the selection of candidates? Explain your rationale based on what you are trying to achieve with your predictions. (10 points)
If your goal is to develop a model to identify candidates who will receive a job offer, which type of error is worse – false positive or false negative? Explain the rationale for your answer. (10 points)
If you want to improve the performance of your model to identify candidates to receive a job offer, which parameter would you use? Explain your rationale based on what you are trying to achieve with your predictions. (10 points)
Fast forward one year. The company deployed the model that you developed with the 3 years of data on current employees, and people were hired based on your predictions. The Head of HR has now come back to you with a concern that not all of the new hires were “good”. Twelve of the 100 people hired were not qualified and did not work out. What parameter in the confusion matrix would you use to understand if your model worked better than you expected, as well as you expected, or worse than you expected? How well did the model work? Do you agree that accuracy is not the best parameter to use? If so, why not? Explain the rationale for your answers. (15 points)
Describe any limitations and/or concerns associated with your approach for this new business opportunity. (10 points)

 

Sample Solution

Selecting Independent Variables

The choice of independent variables for the model should be based on their ability to predict the dependent variable (sales success) and their availability in job candidates’ resumes. Based on the variables listed in the second paragraph, the following factors could be considered as potential predictors of sales success:

  • Education: Level of education and relevant certifications can indicate a candidate’s knowledge and understanding of the industry or products they will be selling.

  • Experience: Prior sales experience, particularly in the industry or with a similar product line, can provide insights into a candidate’s ability to build relationships, negotiate deals, and close sales.

  • Skills: Technical skills, such as proficiency in sales software or industry-specific tools, can demonstrate a candidate’s ability to effectively manage their sales process and track their progress.

  • Soft skills: Communication, interpersonal skills, and problem-solving abilities are crucial for building rapport with clients, understanding their needs, and tailoring sales strategies.

Evaluating Model Parameters

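The performance of a predictive model is summarized by several parameters derived from the confusion matrix, where TP, FP, TN, and FN are the counts of true positives, false positives, true negatives, and false negatives. The parameters relevant to this question are:

  • Sensitivity: The proportion of truly qualified candidates correctly identified by the model as qualified, TP / (TP + FN). This describes the model’s ability to identify candidates who are actually qualified.

  • Specificity: The proportion of truly unqualified candidates correctly identified by the model as unqualified, TN / (TN + FP). This describes the model’s ability to identify candidates who are actually unqualified.

  • Precision: The proportion of candidates predicted to be qualified who are actually qualified, TP / (TP + FP). This describes how reliably the model predicts an unknown candidate as being qualified.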
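
As a rough illustration of how these parameters relate to the confusion matrix, the sketch below computes them from the four cell counts. The function name `screening_metrics` and the counts are hypothetical and chosen only for the example; they do not come from the assignment data.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute confusion-matrix parameters from raw counts.

    tp: qualified candidates predicted qualified
    fp: unqualified candidates predicted qualified
    tn: unqualified candidates predicted unqualified
    fn: qualified candidates predicted unqualified
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of qualified candidates found
        "specificity": tn / (tn + fp),  # share of unqualified candidates rejected
        "precision": tp / (tp + fp),    # share of "qualified" predictions that are right
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for 200 screened candidates (illustration only).
metrics = screening_metrics(tp=40, fp=10, tn=130, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, sensitivity and precision differ noticeably even though both describe "qualified" candidates, because they use different denominators.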

False Positive and False Negative Rates

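A false positive is an unqualified candidate whom the model predicts to be qualified. The false positive rate (FPR) is the proportion of unqualified candidates incorrectly identified as qualified by the model, FP / (FP + TN), which equals 1 minus specificity. False positives can lead to wasted resources on interviewing and onboarding unsuitable candidates.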

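A false negative is a qualified candidate whom the model predicts to be unqualified. The false negative rate (FNR) is the proportion of qualified candidates incorrectly identified as unqualified by the model, FN / (FN + TP), which equals 1 minus sensitivity. False negatives can result in missing out on potentially successful sales representatives.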
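
The two error rates can be sketched in a few lines. The helper `error_rates` and the counts below are hypothetical, used only to show that each rate is the complement of specificity or sensitivity.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate) from raw counts."""
    fpr = fp / (fp + tn)  # unqualified candidates wrongly passed by the model
    fnr = fn / (fn + tp)  # qualified candidates wrongly rejected by the model
    return fpr, fnr


# Hypothetical counts (illustration only).
fpr, fnr = error_rates(tp=40, fp=10, tn=130, fn=20)

specificity = 130 / (130 + 10)
sensitivity = 40 / (40 + 20)
print(f"FPR = {fpr:.3f}, 1 - specificity = {1 - specificity:.3f}")
print(f"FNR = {fnr:.3f}, 1 - sensitivity = {1 - sensitivity:.3f}")
```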

Choosing Error Types

In the context of screening resumes for interview invitations, false positives are generally considered less detrimental than false negatives. This is because it is more cost-effective to interview a few unqualified candidates than to miss out on potentially successful ones.

However, if the goal is to identify candidates for job offers, false positives become more concerning. Hiring unqualified individuals can lead to performance issues, increased turnover costs, and reputational damage. Therefore, minimizing false positives is crucial in this context.

Optimizing Model Performance

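To improve the model’s performance in identifying candidates for an interview, focusing on sensitivity would be appropriate. Higher sensitivity means that fewer truly qualified candidates are screened out at the resume stage. The cost is a larger number of interviews with unqualified candidates, which is acceptable here because the interview itself acts as a second filter.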
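
One way to act on this, assuming the model outputs a numeric score per resume, is to lower the score cutoff until a target sensitivity is reached on labeled validation data. The sketch below is illustrative only: the function name, the scores, the labels, and the 80% target are all hypothetical.

```python
def threshold_for_sensitivity(scores, labels, target):
    """Return the highest score cutoff whose sensitivity is at least `target`.

    scores: model scores, higher meaning "more likely qualified"
    labels: 1 for truly qualified, 0 for truly unqualified
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one qualified candidate")
    # Try cutoffs from strictest to most lenient and stop at the first
    # one that recovers enough of the qualified candidates.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        if tp / positives >= target:
            return cutoff
    return min(scores)


# Hypothetical validation data (illustration only).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 1, 0]

cutoff = threshold_for_sensitivity(scores, labels, target=0.8)
invited = [s for s in scores if s >= cutoff]
print(f"cutoff = {cutoff}, candidates invited = {len(invited)}")
```

Picking the highest cutoff that meets the target keeps the number of extra interviews as small as possible for the chosen sensitivity.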

Evaluating Model Performance for Job Offers

When predicting job offer outcomes, accuracy may not be the best parameter to consider. Accuracy measures the overall proportion of correct predictions, but it doesn’t distinguish between false positives and false negatives.

In this scenario, precision is a more relevant parameter. Precision indicates the proportion of candidates identified as qualified who are actually qualified. A high precision score ensures that candidates receiving job offers are likely to be successful.
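
A small numerical sketch shows why accuracy alone can mislead. The two models and their counts below are hypothetical; they are constructed to have identical accuracy on the same pool of 200 candidates but very different precision.

```python
def accuracy(tp, fp, tn, fn):
    """Overall share of correct predictions."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp, tn, fn):
    """Share of candidates predicted qualified who really are qualified."""
    return tp / (tp + fp)


# Two hypothetical models scored on the same 200 candidates
# (40 truly qualified, 160 truly unqualified).
model_a = dict(tp=30, fp=20, tn=140, fn=10)  # makes many offers, many mistakes
model_b = dict(tp=12, fp=2, tn=158, fn=28)   # makes few offers, few mistakes

for name, counts in [("A", model_a), ("B", model_b)]:
    print(f"model {name}: accuracy = {accuracy(**counts):.2f}, "
          f"precision = {precision(**counts):.2f}")
```

Both models are correct on 170 of 200 candidates, yet under these assumed counts a job offer based on model A would go to an unqualified candidate far more often than one based on model B.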

Analyzing Model Performance After Deployment

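The parameter to evaluate the model’s performance after deployment is precision. Only candidates the model predicted to be qualified were hired, so the 100 new hires are the model’s predicted positives: the 88 who worked out are true positives and the 12 who did not are false positives. The observed precision is therefore 88 / (88 + 12) = 88%. Sensitivity, specificity, the FNR, and accuracy cannot be computed from this group, because the candidates the model rejected were never hired and their outcomes (true and false negatives) were never observed.

Whether the model worked better than, as well as, or worse than expected depends on the precision estimated when the model was validated on the 3 years of employee data. If the validation precision was below 88%, the model performed better than expected; if it was above 88%, the model performed worse than expected. Accuracy is not the best parameter here, both because it requires the unobserved outcomes of rejected candidates and because it does not distinguish false positives from false negatives.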
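
The figures in the question (12 of 100 hires did not work out) describe only the people who were hired, so they allow the observed precision to be computed directly. The sketch below does that and compares it with an expected precision. The expected value of 0.90 is a hypothetical placeholder for whatever the validation data showed, and the normal-approximation interval is just one simple way to allow for sampling noise with 100 hires.

```python
import math


def observed_precision(hired, unsuccessful):
    """Share of hires (predicted positives) who turned out to be qualified."""
    return (hired - unsuccessful) / hired


def compare_to_expected(observed, expected, n, z=1.96):
    """Compare observed precision with an expected value.

    Uses a normal-approximation 95% interval around the observed
    precision; `expected` is assumed to come from model validation.
    """
    se = math.sqrt(observed * (1 - observed) / n)
    if expected < observed - z * se:
        return "better than expected"
    if expected > observed + z * se:
        return "worse than expected"
    return "about as expected"


precision = observed_precision(hired=100, unsuccessful=12)
# 0.90 is a hypothetical validation precision, not a figure from the question.
verdict = compare_to_expected(precision, expected=0.90, n=100)
print(f"observed precision = {precision:.2f}: {verdict}")
```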

Limitations and Concerns

The approach of using a predictive model for sales employee selection has several limitations and concerns:

  • Data Bias: The model’s predictive power may be biased by the data used to train it. If the data is not representative of the overall candidate pool, the model may make inaccurate predictions for candidates from different backgrounds or experiences.

  • Lack of Human Judgment: While the model can provide insights, it cannot replace the expertise and judgment of human recruiters in assessing a candidate’s overall suitability for the role.

  • Ethical Considerations: The use of personal data and algorithms for candidate selection raises ethical concerns regarding privacy, fairness, and discrimination.

  • Continuous Monitoring and Improvement: The model’s performance needs to be continuously monitored and updated as the company’s hiring needs and market conditions evolve.
