SGPT-1.3B-weightedmean-msmarco-specb-bitfit
Usage
For usage instructions, refer to our codebase: https://github.com/Muennighoff/sgpt
Evaluation Results
For eval results, refer to the eval folder or our paper: https://arxiv.org/abs/2202.08904
Training
The model was trained with the parameters:
DataLoader:
torch.utils.data.dataloader.DataLoader of length 62398 with parameters:
{'batch_size': 8, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
Loss:
sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss with parameters:
{'scale': 20.0, 'similarity_fct': 'cos_sim'}
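As a rough illustration of what this loss computes, here is a minimal pure-Python sketch, assuming each query's positive passage sits at the same batch index and every other passage in the batch serves as a negative. The function names and toy vectors are illustrative only; this is not the sentence_transformers implementation, which works on batched tensors.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(queries, passages, scale=20.0):
    """Mean cross-entropy over scaled cosine similarities.

    Assumption: passages[i] is the positive for queries[i]; all other
    passages in the batch act as in-batch negatives.
    """
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, p) for p in passages]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        # Negative log-probability of the correct (diagonal) passage.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (hypothetical 2-d embeddings).
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # positives match their queries
swapped = [[0.0, 1.0], [1.0, 0.0]]   # positives are orthogonal to their queries

loss_aligned = multiple_negatives_ranking_loss(queries, aligned)
loss_swapped = multiple_negatives_ranking_loss(queries, swapped)
```

With scale 20.0, a well-aligned batch drives the loss toward zero, while a batch whose positives look like negatives is penalized heavily. With a batch size of 8, each query in this sketch would see 7 in-batch negatives.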
Parameters of the fit()-Method:
{
    "epochs": 10,
    "evaluation_steps": 0,
    "evaluator": "NoneType",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'transformers.optimization.AdamW'>",
    "optimizer_params": {
        "lr": 0.0002
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 1000,
    "weight_decay": 0.01
}
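To make the learning-rate settings concrete, the sketch below reproduces the usual shape of a linear-warmup, linear-decay schedule. It assumes that a null "steps_per_epoch" means one epoch spans the whole DataLoader (62398 steps), giving 623980 steps over 10 epochs. The helper names are hypothetical, not library functions.

```python
def warmup_linear_factor(step, warmup_steps, total_steps):
    """Multiplier on the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp from 0 to 1 during warmup.
        return step / max(1, warmup_steps)
    # Linear decay from 1 to 0 over the remaining steps.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


BASE_LR = 2e-4            # "lr": 0.0002
WARMUP_STEPS = 1000       # "warmup_steps": 1000
STEPS_PER_EPOCH = 62398   # DataLoader length (assumed, since steps_per_epoch is null)
EPOCHS = 10               # "epochs": 10
TOTAL_STEPS = STEPS_PER_EPOCH * EPOCHS


def lr_at(step):
    """Learning rate at a given step under the assumptions above."""
    return BASE_LR * warmup_linear_factor(step, WARMUP_STEPS, TOTAL_STEPS)


for step in (0, 500, 1000, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at 0.0002 after step 1000 and falls linearly to zero at the final step.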
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 300, 'do_lower_case': False}) with Transformer model: GPTNeoModel
  (1): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': True, 'pooling_mode_lasttoken': False})
)
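The only pooling mode enabled above is 'pooling_mode_weightedmean_tokens'. The following is a minimal sketch of position-weighted mean pooling, assuming token i (1-indexed) gets weight i, so later tokens count more, and that padded positions are masked out. The function name and toy inputs are illustrative, and real embeddings would have 2048 dimensions rather than 2.

```python
def weighted_mean_pool(token_embeddings, attention_mask):
    """Position-weighted mean of token vectors.

    Assumption: the token at 1-indexed position i has weight i, and
    positions with attention_mask == 0 (padding) have weight 0.
    """
    dim = len(token_embeddings[0])
    pooled = [0.0] * dim
    weight_sum = 0.0
    pairs = zip(token_embeddings, attention_mask)
    for position, (vector, keep) in enumerate(pairs, start=1):
        weight = position * keep
        weight_sum += weight
        for d in range(dim):
            pooled[d] += weight * vector[d]
    # Normalize by the total weight of the unmasked tokens.
    return [value / weight_sum for value in pooled]


# Toy sequence: three real tokens followed by one padding token.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
pooled = weighted_mean_pool(tokens, mask)
```

Here the weights are 1, 2 and 3 (sum 6), so the pooled vector is [4/6, 5/6] and the padding token has no effect.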
Citing & Authors
@article{muennighoff2022sgpt,
title={SGPT: GPT Sentence Embeddings for Semantic Search},
author={Muennighoff, Niklas},
journal={arXiv preprint arXiv:2202.08904},
year={2022}
}
Evaluation results
All scores are self-reported on the MTEB test sets.
- MTEB AmazonCounterfactualClassification (en): accuracy 65.209, ap 29.592, f1 59.971
- MTEB AmazonPolarityClassification: accuracy 73.206, ap 67.367, f1 72.904
- MTEB AmazonReviewsClassification (en): accuracy 34.956, f1 34.719
- MTEB ArguAna: map_at_1 26.102, map_at_10 40.958, map_at_100 42.033, map_at_1000 42.042, map_at_3 36.332, map_at_5 38.608
- MTEB ArguAna: mrr_at_1 26.387, mrr_at_10 41.051, mrr_at_100 42.118, mrr_at_1000 42.127, mrr_at_3 36.415, mrr_at_5 38.720
- MTEB ArguAna: ndcg_at_1 26.102, ndcg_at_10 49.680, ndcg_at_100 54.258, ndcg_at_1000 54.486, ndcg_at_3 39.864, ndcg_at_5 43.980
- MTEB ArguAna: precision_at_1 26.102, precision_at_10 7.781, precision_at_100 0.979, precision_at_1000 0.100, precision_at_3 16.714, precision_at_5 12.034
- MTEB ArguAna: recall_at_1 26.102, recall_at_10 77.809, recall_at_100 97.866, recall_at_1000 99.644, recall_at_3 50.142, recall_at_5 60.171
- MTEB ArxivClusteringP2P: v_measure 43.384
- MTEB ArxivClusteringS2S: v_measure 33.710
- MTEB AskUbuntuDupQuestions: map 58.133, mrr 72.109
- MTEB BIOSSES: cos_sim_pearson 86.622, cos_sim_spearman 83.015, euclidean_pearson 86.004, euclidean_spearman 83.856, manhattan_pearson 85.830, manhattan_spearman 83.866
- MTEB Banking77Classification: accuracy 82.058, f1 82.019
- MTEB BiorxivClusteringP2P: v_measure 35.059