idea

Things To Do

Thoughts

  1. How do we measure reasoning coherence?

    • An easy way out is to have an LLM judge it, then cross-reference its judgments with human evaluation.
    • Is there a more concrete measure, perhaps from the knowledge-graph domain, to quantify reasoning correctness?
  2. Should we instead use different tasks to measure reasoning accuracy and context utilization?

    • Use a more straightforward “what is the conclusion” task for reasoning accuracy.
    • Keep the same method for measuring context utilization.
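
Rough sketch for question 1, not a settled design. It assumes the LLM judge and the human annotators each assign one categorical label per example (the labels "coherent" / "incoherent" are placeholders), and that reasoning steps can be written as (subject, relation, object) triples for the knowledge-graph idea. The names `cohen_kappa` and `supported_fraction` are hypothetical helpers invented for this note. Cohen's kappa is one common agreement statistic, picked here as an assumption; exact triple lookup is a simplification of whatever graph matching we would really need.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same examples."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of examples where both raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the raters labelled independently at their own base rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (observed - expected) / (1 - expected)


def supported_fraction(steps, graph_edges):
    """Fraction of reasoning steps (triples) that appear as edges in the graph."""
    if not steps:
        return 0.0
    edges = set(graph_edges)
    return sum(step in edges for step in steps) / len(steps)


# Hypothetical labels: LLM judge vs. human annotator on four examples.
llm_labels = ["coherent", "coherent", "incoherent", "incoherent"]
human_labels = ["coherent", "incoherent", "incoherent", "incoherent"]
kappa = cohen_kappa(llm_labels, human_labels)

# Hypothetical reasoning chain checked against a toy knowledge graph.
chain = [("A", "causes", "B"), ("B", "causes", "C")]
graph = [("A", "causes", "B")]
support = supported_fraction(chain, graph)
```

A low kappa would suggest the LLM judge is not a reliable stand-in for human evaluation. The supported-fraction score only says which steps are backed by the graph, so it measures step correctness more than coherence of the overall chain.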