This repository was archived by the owner on Oct 31, 2023. It is now read-only.
Error in model, scaling only q matrix not qK.T dot product (qk.T/sqrt(dim_per_head)) #357
Open
BenoitDalFerro wants to merge 1 commit into facebookresearch:main from
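
For reference, the two scaling placements contrasted in the title can be written side by side. This is a minimal pure-Python sketch, not the repository's code: the function names, shapes, and random inputs are hypothetical, and only `q`, `k`, and `dim_per_head` come from the title. Because the scaling is a linear operation, the two forms agree in exact arithmetic, so any difference printed below is floating-point rounding.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def scores_scale_q(q, k, dim_per_head):
    """Scale q first, then take q @ k.T (hypothetical helper)."""
    scale = math.sqrt(dim_per_head)
    q_scaled = [[x / scale for x in row] for row in q]
    return [[dot(qi, kj) for kj in k] for qi in q_scaled]


def scores_scale_product(q, k, dim_per_head):
    """Take q @ k.T first, then scale the scores (hypothetical helper)."""
    scale = math.sqrt(dim_per_head)
    return [[dot(qi, kj) / scale for kj in k] for qi in q]


# Illustrative inputs only: 3 query rows, 4 key rows, 8 dimensions per head.
random.seed(0)
dim_per_head = 8
q = [[random.gauss(0.0, 1.0) for _ in range(dim_per_head)] for _ in range(3)]
k = [[random.gauss(0.0, 1.0) for _ in range(dim_per_head)] for _ in range(4)]

a = scores_scale_q(q, k, dim_per_head)
b = scores_scale_product(q, k, dim_per_head)

# Largest elementwise gap between the two score matrices.
max_abs_diff = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
print(max_abs_diff)
```

The comparison covers only the raw attention scores; softmax, masking, and dropout are left out of the sketch.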