• TheButtonJustSpins@infosec.pub
    20 hours ago

    To tell it to alter its responses, they would need to recognize that what they expect is incorrect. (Assuming we trust LLMs to tell us the truth, which we shouldn’t, but I think that’s orthogonal to this.)

    • tal
      17 hours ago

      To tell it to alter its responses, they would need to recognize that what they expect is incorrect.

      Nah, I structured that input specifically to avoid that. All they need to believe is that Marjorie Taylor Greene has a correct view of things.