This repository was archived by the owner on Oct 31, 2023. It is now read-only.

Error in model, scaling only q matrix not qK.T dot product (qk.T/sqrt(dim_per_head)) #357

Open
BenoitDalFerro wants to merge 1 commit into facebookresearch:main from BenoitDalFerro:patch-1

Commits

Commits on Feb 14, 2023