Philip's blog

https://blog.philip-huang.tech/?page=reason-to-reject 

 



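This study is the first to systematically explore aligning LLMs with language feedback (**judgments**), and it proposes the Contrastive Unlikelihood Training (CUT) framework.

Experiments show that with only 1317 training examples, CUT outperforms the 175B DaVinci003. Further analysis suggests that **judgments** hold greater potential for LLM alignment than RL rewards.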

   Problem setting

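Suppose we have a set of **instruction**-**response**-**judgment** triplets $(x, y, j)$, where the instruction $x = [x_1, \ldots, x_M]$, the **response** $y = [y_1, \ldots, y_N]$, and the **judgment** $j = [j_1, \ldots, j_Q]$ are token sequences of length $M$, $N$, and $Q$, respectively. The **response** may be flawed or may be considered fully satisfactory. The **judgment** analyzes the strengths and weaknesses of the **response**, and it can be drafted by humans or by AI models. The goal of aligning LLMs with **judgments** is for the LLM to retain the appropriate behaviors mentioned among the strengths and, more importantly, to address the weaknesses so as to prevent future misbehavior.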
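To make the notation concrete, here is a minimal Python sketch of how one $(x, y, j)$ triplet could be represented. The names `JudgmentTriplet` and `make_triplet` are hypothetical, the sample text is invented for illustration, and whitespace splitting only stands in for a real tokenizer; none of this comes from the paper itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class JudgmentTriplet:
    """One (x, y, j) triplet; each field is a token sequence."""
    instruction: List[str]  # x = [x_1, ..., x_M]
    response: List[str]     # y = [y_1, ..., y_N]
    judgment: List[str]     # j = [j_1, ..., j_Q]

    def lengths(self) -> Tuple[int, int, int]:
        """Return the sequence lengths (M, N, Q)."""
        return len(self.instruction), len(self.response), len(self.judgment)


def make_triplet(instruction: str, response: str, judgment: str) -> JudgmentTriplet:
    # Assumption: whitespace splitting is a placeholder for a real tokenizer.
    return JudgmentTriplet(instruction.split(), response.split(), judgment.split())


# Invented example: a flawed response paired with a judgment that
# names one strength (concise) and one weakness (wrong translation).
example = make_triplet(
    "Translate 'bonjour' to English.",
    "Goodbye.",
    "The response is concise, but the translation is wrong: 'bonjour' means 'hello'.",
)
M, N, Q = example.lengths()
```

The judgment in this example covers both sides described above: a behavior worth keeping and a weakness to fix.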

    Possible solutions
     Forwar


Philip's blog #42
