• Work on the stem presentation (need to visualize it first…) ✅ 2026-03-14

  • Polish the presentation ✅ 2026-03-16

  • The word “bias” still needs some workshopping. Ask Haeun about this on Sunday morning. Maybe we can use the word “preference” instead?

  • Technically it is not really bias, because “bias” always implies something is wrong. In this case, we don’t judge whether the behavior is wrong or not. We just measure preference towards certain context characteristics.

  • For example, if claim-evidence overlap increases… is that a bad thing? Not necessarily! It could just as well be a good thing.

  • We are not looking into LLM judges specifically, but LLM bias in general. Need to somehow emphasize this.

  • Think about what we wanted to do too: seeing what is being inherited, reinforced, and forgotten via post-training. How can we fit that in here?

    • Tough. Let’s just not, lmao.
  • Probably need to check the Tulu SFT dataset. What is it actually teaching?