Inference speed is much longer than reported
#25 opened about 3 hours ago
		by
		 jeff-gao
Use DirectML for microsoft/phi-1_5
#24 opened 1 day ago
		by
		 shiqi1031
raise error when `use_cache = True`
#23 opened 1 day ago
		by
		 wjfwzzc
Adding _set_gradient_checkpointing for compatibility
#22 opened 1 day ago
		by
		 vriveras
Any plan to release phi-1.5 web mentioned in your paper?
#21 opened 1 day ago
		by
		 sanqiang
Adding `safetensors` variant of this model
#18 opened 5 days ago
		by
		 SFconvertbot
Creating a RetNet (Retentive Network) version is planned?
#16 opened 6 days ago
		by
		 guyko81
Adding tf lite variant
#13 opened 7 days ago
		by
		 0xrk
Could one potentially train a mini-model based on this concept on synthetic structural data?
						4
#11 opened 7 days ago
		by
		 Mr8BitHK
tokenizer.model file
						4
#10 opened 8 days ago
		by
		 hanisaf
Attention mask for generation function in the future?
						3
#7 opened 8 days ago
		by
		 rchan26
Unofficial dataset
						3
#2 opened 8 days ago
		by
		 SinanAkkoyun
