• tourist@lemmy.world
    5 months ago

    The participants judged GPT-4 to be human a shocking 54 percent of the time.

    ELIZA, which was pre-programmed with responses and didn’t have an LLM to power it, was judged to be human just 22 percent of the time.

    Okay, 22% is ridiculously high for ELIZA. I feel like any half-sober adult could clock it as a bot by the third response, if not immediately.

    Try talking to the thing: https://web.njit.edu/~ronkowit/eliza.html

    I refuse to believe that 22% didn’t misunderstand the task or something.

    • webghost0101@sopuli.xyz
      5 months ago

      The public versions of the AIs used in Turing tests usually have less computing power. The test itself is often also highly specific about what questions can be asked and how.

      This is hardly news: models have passed the test before, and the test was made more difficult as a result. It says nothing about intelligence, only about the ability to convincingly simulate a human conversation.

    • Downcount@lemmy.world
      5 months ago

      Okay, 22% is ridiculously high for ELIZA. I feel like any half sober adult could clock it as a bot by the third response, if not immediately.

      I did some stuff with Eliza back then. One time I set up an Eliza database full of insults and hooked it up to my AIM account.

      It went so well, I had to apologize to a lot of people who thought I was drunk or had gone crazy.

      Eliza wasn’t thaaaaat bad.

      • catastrophicblues@lemmy.ca
        5 months ago

        Forget psychotherapy, it can barely understand anything:

        > Hello, I am Eliza. I'll be your therapist today. 
        * Hello! How are you?
        > Would you prefer if I were not ? 
        * Not what?
        > Have you asked such questions before? 
        * No
        > You are being a bit negative and that won't help us move forward. 
        * What are you smoking?
        > Perhaps in your fantasies I am smoking?
        
    • technocrit@lemmy.dbzer0.com
      5 months ago

      It was a 5-minute test. People probably spent 4 of those minutes typing their questions.

      This is pure pseudo-science.

    • catastrophicblues@lemmy.ca
      5 months ago

      > Hello, I am Eliza. I'll be your therapist today. 
      * Hello! How are you?
      > Would you prefer if I were not ? 
      * Not what?
      > Have you asked such questions before? 
      * No
      > You are being a bit negative and that won't help us move forward. 
      * What are you smoking?
      > Perhaps in your fantasies I am smoking?
      

      Yeah, it took me one message lol
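
For anyone wondering why the transcript above falls apart so fast: a bot that is "pre-programmed with responses" can be little more than a list of keyword patterns with canned reply templates. Here is a minimal Python sketch of that idea. The rules, the fallback line, and the names `RULES`, `FALLBACK`, and `make_bot` are all made up for illustration, written only to mimic the four replies quoted in this thread; they are not taken from the linked NJIT page or from the original ELIZA script.

```python
import itertools
import re

# Hypothetical rules: (keyword pattern, reply templates used in rotation).
# They are guesses written to mimic the quoted transcript, nothing more.
RULES = [
    (r"\bare you\b\s*(.*)", ["Would you prefer if I were not {0}?",
                              "Perhaps in your fantasies I am {0}?"]),
    (r"\bwhat\b", ["Have you asked such questions before?"]),
    (r"^no\b", ["You are being a bit negative and that won't help us move forward."]),
]
FALLBACK = "Please go on."  # assumed default reply when no rule matches


def make_bot(rules=RULES):
    """Build a respond() function with its own per-conversation state."""
    compiled = [(re.compile(pattern, re.IGNORECASE), itertools.cycle(templates))
                for pattern, templates in rules]

    def respond(message):
        # Drop trailing punctuation so it does not end up inside the echo.
        text = message.strip().rstrip("?!. ")
        for pattern, templates in compiled:
            match = pattern.search(text)
            if match:
                # Paste whatever followed the keyword into the next template.
                return next(templates).format(*match.groups())
        return FALLBACK

    return respond


if __name__ == "__main__":
    bot = make_bot()
    for line in ["Hello! How are you?", "Not what?", "No", "What are you smoking?"]:
        print("*", line)
        print(">", bot(line))
```

With these made-up rules, "How are you?" matches the `are you` pattern with nothing after it, which is where the dangling "Would you prefer if I were not ?" comes from. There is no understanding anywhere in the loop, just the first keyword that matches and an echo of the rest of the sentence.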