
TalentMap AI: Ethical Talent Matching with DistilBERT

By Santiago Botero García (@LePeanutButter) · Updated December 2025


TalentMap AI is an AI application, built with ethical safeguards, that matches job candidates with vacancies through semantic analysis. It uses Transformer-based embeddings (DistilBERT) to evaluate compatibility between résumés and job descriptions beyond keyword matching.

The project emphasizes fairness, transparency, and interpretability, aiming to reduce algorithmic bias while improving employment outcomes.

Developed for the Principles of Artificial Intelligence Technologies (PTIA) course at Escuela Colombiana de Ingeniería Julio Garavito.

Live Demo, Class Diagrams & Paper Overview

This video presents TalentMap AI, including a live frontend demo of the semantic recruitment system, along with an overview of the project paper and class diagrams.

It shows how candidate CVs are processed and matched with job descriptions using semantic similarity, and walks through the architecture and design behind the implementation.

Source code

Explore the repositories that implement this project:

Full-stack AI MVP
TalentMap AI Monorepo
Django and DistilBERT application for semantic resume/job matching with ethical AI controls.
LePeanutButter/talent-map-ai →

Background

Modern recruitment systems often rely on keyword-based matching, which fails to capture the deeper semantic relationships between a candidate's skills and a job's requirements. TalentMap AI addresses this limitation by combining machine learning, semantic embeddings, and ethical AI design to improve job-candidate compatibility.
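To make the limitation of keyword matching concrete, here is a minimal sketch of a word-overlap score. It is an illustration only, not code from the TalentMap AI repository; the function name `keyword_overlap` and the two sample texts are invented for this example.

```python
import re

def keyword_overlap(resume: str, job: str) -> float:
    """Jaccard overlap between the lowercase word sets of two texts."""
    tokens_a = set(re.findall(r"[a-z]+", resume.lower()))
    tokens_b = set(re.findall(r"[a-z]+", job.lower()))
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

# Hypothetical texts: closely related in meaning, but sharing no words.
resume = "Built neural networks for image classification"
job = "Seeking deep learning engineer with computer vision experience"

print(keyword_overlap(resume, job))  # prints 0.0
```

The two texts describe closely related skills, yet the overlap is zero because no word appears in both. An embedding-based score is intended to close this kind of gap.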

This project aims to:

Develop a semantic model using DistilBERT for candidate-job matching.
Integrate fairness and bias auditing tools to ensure responsible AI behavior.
Provide a web-based MVP demonstrating real-time compatibility scoring.
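As an illustration of what a bias audit over compatibility scores could look like, the sketch below compares selection rates between two groups. It is an assumption about one possible check, not the project's actual auditing tool: the function names, the 0.70 threshold, and the records are all made up for the example.

```python
from collections import defaultdict

def selection_rates(records, threshold):
    """Fraction of candidates per group whose score reaches the threshold."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, score in records:
        totals[group] += 1
        if score >= threshold:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}

def disparate_impact_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected in any group
    return min(rates.values()) / highest

# Made-up audit records: (group label, compatibility score).
records = [
    ("A", 0.82), ("A", 0.71), ("A", 0.55), ("A", 0.90),
    ("B", 0.78), ("B", 0.52), ("B", 0.49), ("B", 0.66),
]

rates = selection_rates(records, threshold=0.70)
ratio = disparate_impact_ratio(rates)
print(rates, ratio)
```

A ratio well below 1.0 signals that one group is shortlisted far less often than another. A common rule of thumb flags ratios under 0.8 for review. The group labels used in such an audit would need to be stored separately from the anonymized text given to the model.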

In [1]

# Install baseline environment dependencies
!pip install django transformers jquery

Out [1]

Installing collected packages: Django, transformers, jQuery Successfully installed Django-5.0 transformers-4.38 jQuery-3.7.1

Architecture

The TalentMap AI system follows a modular architecture:

Frontend (SPA): HTML, CSS, JS for visualization of recommendations, using jQuery for interactivity.
Backend (Django REST): API for résumé and job description processing.
AI Engine: DistilBERT embeddings for semantic similarity.
Ethics Layer: Bias detection, anonymization, and explainability mechanisms.
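The similarity step of the AI Engine can be sketched without any model download. The example below assumes that token vectors are mean-pooled into one vector per text and compared by cosine similarity. Mean pooling is a common choice but an assumption here, and the 3-dimensional vectors are toy stand-ins for DistilBERT's 768-dimensional token outputs.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask is 1, ignoring padding tokens."""
    dim = len(token_vectors[0])
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        return [0.0] * dim
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy token embeddings; the last résumé token is padding (mask 0).
resume_tokens = [[1.0, 0.0, 1.0], [3.0, 0.0, 1.0], [9.0, 9.0, 9.0]]
resume_mask = [1, 1, 0]
job_tokens = [[2.0, 0.0, 1.0], [2.0, 0.0, 1.0]]
job_mask = [1, 1]

resume_vec = mean_pool(resume_tokens, resume_mask)
job_vec = mean_pool(job_tokens, job_mask)
score = cosine_similarity(resume_vec, job_vec)
print(resume_vec, job_vec, score)
```

Because the padded token is masked out, both texts pool to the same vector and the score is approximately 1. In a real pipeline the token vectors would come from the DistilBERT encoder instead of hand-written lists.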

Note: The full system analysis and underlying principles are available in the comprehensive report at docs/talentmap-ai.pdf. The report is written in Spanish and covers the state of the art, details on the training data, architecture selection, and a task-environment analysis using the PEAS model.

Ethical Framework

TalentMap AI follows UNESCO's Recommendation on the Ethics of Artificial Intelligence (2021), applying principles of:

Fairness: Avoiding bias by anonymizing and auditing datasets.
Transparency: Explaining how recommendations are generated.
Accountability: Ensuring human oversight and responsible AI design.
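The anonymization part of the fairness principle can be illustrated with a small redaction pass that runs before text reaches the model. This is a minimal sketch under stated assumptions, not the project's anonymizer: the patterns, the placeholder tokens, and the sample text are hypothetical, and a production system would also need to handle names, addresses, and other identifying details.

```python
import re

# Simplified patterns for illustration; they do not cover every format.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def anonymize(text: str) -> str:
    """Replace email addresses and phone numbers with neutral placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

sample = "Contact: ana.perez@example.com, +57 300 123 4567. Python developer."
print(anonymize(sample))
# prints: Contact: [EMAIL], [PHONE]. Python developer.
```

Redacting contact details before embedding keeps the similarity score tied to skills and experience instead of identifying information.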

In [2]

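# Trainer class from the project's own engine package
from talentmap_ai.engine import DistilBertTrainer

# Short smoke-test run: cosine mode, 2 epochs, small batches
trainer = DistilBertTrainer(model_id="test_model", mode="cosine")
trainer.fit(epochs=2, batch_size=4, lr=0.0002, freeze_bert=True)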

Out [2]

Training Configuration:
Model ID: test_model | Mode: cosine | Device: cpu
Training samples: 8 | Validation samples: 2

Epoch 1/2 -> Train Loss: 0.3157 | Val Loss: 0.3781
Epoch 2/2 -> Train Loss: 0.2045 | Val Loss: 0.3811

Model training took 173.68 seconds.
Compressed model saved to: test_model/test_model_cosine_20251125_173520.pt.xz

In [3]

test_results = trainer.evaluate_batch_prediction()

Out [3]

Score	Job Title	Resume Description
0.8458	Python ML engineer...	Expert in Python and machine learning
0.6147	Marketing manager...	Software engineer with 5 years...
0.6766	Data scientist...	PhD in statistics, ML experience...
0.6620	Sales representative...	Frontend developer...

SUCCESS: Loaded model parameters match the saved model. All tests completed successfully in 190.272s.
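One way to read the scores above is to check how far apart the plausible matches and the mismatches fall. The sketch below reuses the four scores from the table. The `expected_match` labels are an assumption made by reading each job title against its résumé text; they are not part of the original output.

```python
# Scores copied from the evaluation table. The boolean labels are an
# assumption based on reading each job title against its résumé text.
pairs = [
    ("Python ML engineer", 0.8458, True),
    ("Marketing manager", 0.6147, False),
    ("Data scientist", 0.6766, True),
    ("Sales representative", 0.6620, False),
]

lowest_match = min(score for _, score, expected_match in pairs if expected_match)
highest_mismatch = max(score for _, score, expected_match in pairs if not expected_match)
margin = round(lowest_match - highest_mismatch, 4)

print(lowest_match, highest_mismatch, margin)  # prints 0.6766 0.662 0.0146
```

Under these assumed labels the two groups are separated by only 0.0146, which is consistent with a two-epoch test run on 8 training samples. Any decision threshold would need to be chosen on a much larger validation set.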

Maintainers & Contributors

This repository and the model run shown above exist thanks to the academic research and implementation work of:

Santiago Botero García

Maintainer · @LePeanutButter

Andrés Felipe Calderón Ramírez

Contributor · @andrescalderonr

License
