Small rant: basically, the title. If it said it doesn’t know the answer instead of answering every question, it would be trustworthy.

  • folkrav@lemmy.ca
    3 days ago

    Out of the box, LLMs don’t know whether what they’re telling you is true. An LLM gives you an answer that statistically looks like the sequence of words that should come in response to the sequence of words (the prompt) you gave it. It doesn’t know what the words you said mean, and it doesn’t know what its own answer means either. One of my favorite interactions I’ve had with Claude shows this limitation quite well:

    How many r’s are there in the word strawberry?

    There are 2 r’s in the word “strawberry”.

    Are you sure?

    You’re right to ask for confirmation. I apologize for my mistake in the previous response. Let me correct that:

    There is actually only 1 ‘r’ in the word “strawberry”.

    Thank you for prompting me to double-check. It’s important to provide accurate information.
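
    To make the “statistically looks like a sequence of words” point concrete, here is a toy sketch. It is nowhere near how a real LLM works internally: it is a tiny bigram model that only looks at the last word, and the corpus and the function names (`train_bigrams`, `continue_text`) are made up for illustration. It does show the core issue, though: the most frequent continuation wins, whether or not it is true.

```python
from collections import Counter, defaultdict

# Made-up training text (an assumption for illustration only).
CORPUS = [
    "berry has 2 r's",
    "cherry has 2 r's",
    "sherry has 2 r's",
    "strawberry has 3 r's",
]


def train_bigrams(corpus):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def continue_text(follows, prompt, max_words=8):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        options = follows.get(words[-1])
        if not options:
            break  # nothing ever followed this word in the corpus
        # Only the previous word matters here; truth never enters into it.
        words.append(options.most_common(1)[0][0])
    return " ".join(words)


if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    print(continue_text(model, "strawberry has"))  # prints: strawberry has 2 r's
```

    The corpus even contains the right answer for “strawberry”, but “2” follows “has” more often than “3” does, so “2” is what comes out.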

    • AceBonobo@lemmy.world
      3 days ago

      What’s absolutely crazy about that is:

      Prompt: write code that checks how many r’s are in the word strawberry

      Response:

      using System;
      
      class Program
      {
          static void Main()
          {
              string word = "strawberry";
              char targetChar = 'r';
              int count = 0;
      
              foreach (char c in word)
              {
                  if (c == targetChar)
                      count++;
              }
      
              Console.WriteLine($"The letter '{targetChar}' appears {count} times in the word \"{word}\".");
          }
      }
      
      • drislands@lemmy.world
        3 days ago

        My first thought is that you could write a program that does something like this:

        • Receive the prompt “how many times does R appear in the word ‘strawberry’”
        • Run the prompt through the LLM, saying “reword the following into a command to generate code that will answer the question” or something like that
        • Run the reworded command through the LLM to get the code
        • Compile and run the generated code
        • Provide the output to the user

        Of course, the biggest problem with this system is that a person could fool it into generating malicious code.
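
        The steps above could be sketched roughly like this. It is a minimal sketch under some loud assumptions: `llm` is a hypothetical callable (prompt in, text out) standing in for whatever model you would actually call, `answer_with_code` and `fake_llm` are made-up names, and Python stands in for the “compile and run” step. The timeout is not a sandbox, so the malicious-code problem is still fully there.

```python
import subprocess
import sys
from typing import Callable

# Hypothetical stand-in for a real model call: prompt in, text out.
LLM = Callable[[str], str]


def answer_with_code(question: str, llm: LLM, timeout_s: float = 5.0) -> str:
    # Step 2: have the model reword the question into a code-generation command.
    command = llm(
        "Reword the following into a command to generate Python code "
        "that prints the answer to the question:\n" + question
    )
    # Step 3: run that command through the model to get the code itself.
    code = llm(command)
    # Step 4: run the generated code in a separate interpreter process.
    # WARNING: this executes untrusted code. The timeout only limits how long
    # it runs; it does not restrict what the code is allowed to do.
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    # Step 5: hand the program's output back to the user.
    return result.stdout.strip()


def fake_llm(prompt: str) -> str:
    """Canned responses so the sketch runs without any real model."""
    if prompt.startswith("Reword"):
        return "Write Python code that prints how many times 'r' appears in 'strawberry'."
    return "print('strawberry'.count('r'))"


if __name__ == "__main__":
    question = "how many times does R appear in the word 'strawberry'"
    print(answer_with_code(question, fake_llm))  # prints: 3
```

        With a real model in place of `fake_llm`, the answer is only as good as the code the model writes, and anything it writes gets executed.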

        • AeroLemming@lemm.ee
          3 days ago

          That could work in that specific case, but telling the LLM to write code to answer random questions probably wouldn’t work very well in general.

    • turbodrooler@lemmy.world
      2 days ago

      Wow, GPT-4o gave me this after the same questions:

      “Yes, I am sure. The word ‘strawberry’ has two ‘r’s: one after the ‘t’ and another near the end of the word.”

      But GPT-4 got it right from the start.