Model | Fine-tuned bert-base-uncased | Fine-tuned bert-base-uncased | Fine-tuned bert-base-uncased |
Source embedding | Concatenate O*NET title and description | Separate embeddings for O*NET title and O*NET title + description | Separate embeddings for O*NET title and O*NET title + description |
Target embedding | Average of concatenated permutations of ESCO (non-)preferred term(s) | Separate embeddings for ESCO (non-)preferred term(s) and description | Separate embeddings for ESCO (non-)preferred term(s) and description |
Scoring | Cosine similarity (cs) | 0.4 * max(cs(ESCO pt/npts, O*NET title)) + 0.3 * median(cs(ESCO pt/npts, O*NET title)) + 0.3 * (cs(ESCO desc, O*NET title+desc)) | 0.3 * max(cs(ESCO pt/npts, O*NET title)) + 0.15 * median(cs(ESCO pt/npts, O*NET title)) + 0.55 * (cs(ESCO desc, O*NET title+desc)) |
Batch size | 24 | 24 | 24 |
Model updates | 2,395,282 | 2,395,282 | 2,826,735 |
Training size | 25,376,400 | 25,376,400 | 29,756,109 |
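
The weighted scoring rules in the Scoring row can be illustrated with a minimal sketch. It assumes the embeddings are already available as plain lists of floats. The names `cosine`, `combined_score`, and all parameter names are hypothetical and chosen only for illustration; the default weights are the 0.4 / 0.3 / 0.3 values from the table, and the 0.3 / 0.15 / 0.55 variant can be passed explicitly.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity (cs) between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def combined_score(esco_term_vecs, esco_desc_vec,
                   onet_title_vec, onet_title_desc_vec,
                   w_max=0.4, w_median=0.3, w_desc=0.3):
    """Weighted score for one O*NET occupation against one ESCO occupation.

    esco_term_vecs: embeddings of the ESCO preferred and non-preferred terms
    esco_desc_vec: embedding of the ESCO description
    onet_title_vec: embedding of the O*NET title
    onet_title_desc_vec: embedding of the O*NET title + description
    (All argument names are illustrative assumptions.)
    """
    # cs(ESCO pt/npts, O*NET title), one value per ESCO term
    term_sims = [cosine(t, onet_title_vec) for t in esco_term_vecs]
    # cs(ESCO desc, O*NET title+desc)
    desc_sim = cosine(esco_desc_vec, onet_title_desc_vec)
    return (w_max * max(term_sims)
            + w_median * median(term_sims)
            + w_desc * desc_sim)


# Toy 2-dimensional vectors, purely for illustration.
terms = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
score_a = combined_score(terms, [1.0, 0.0], [1.0, 0.0], [1.0, 0.0])
score_b = combined_score(terms, [1.0, 0.0], [1.0, 0.0], [1.0, 0.0],
                         w_max=0.3, w_median=0.15, w_desc=0.55)
```

Taking the maximum rewards a single strongly matching ESCO term, while the median dampens the effect of one outlier among many non-preferred terms; the third weight controls how much the description-level similarity contributes.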