If you use humans to fine-tune the model and judge the quality of its output, then in some sense, that’s pretty much all the AI can possibly do.
Everyone can see the output “I don’t know.” and mark it zero (dude). But the meat bags will definitely end up rewarding the model, at least some of the time, if it instead generates some plausible nonsense.
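
To put toy numbers on that incentive, here is a minimal sketch. Everything in it is an assumption for illustration: the function names, the reward values, and the probabilities are made up, not taken from any real training setup. It just compares the expected human rating of a confident guess against the flat zero that "I don't know." gets.

```python
# Toy model of the incentive described above.
# All names and numbers are hypothetical, chosen only for illustration.

IDK_REWARD = 0.0  # assumption: raters mark "I don't know." as zero


def expected_reward(p_correct, p_fooled, reward_good=1.0, reward_caught=0.0):
    """Expected human rating for a confident guess.

    p_correct: chance the guess is actually right.
    p_fooled:  chance a rater rewards a wrong but plausible answer.
    """
    p_wrong = 1.0 - p_correct
    # A wrong answer still earns the full reward whenever the rater is fooled.
    wrong_payoff = p_fooled * reward_good + (1.0 - p_fooled) * reward_caught
    return p_correct * reward_good + p_wrong * wrong_payoff


def best_policy(p_correct, p_fooled):
    """Pick whichever action the ratings favor on average."""
    guess = expected_reward(p_correct, p_fooled)
    return "guess" if guess > IDK_REWARD else "say_idk"


# Even a model that is never right prefers guessing
# as long as raters are fooled some of the time.
print(best_policy(p_correct=0.0, p_fooled=0.3))
# Only raters who are never fooled make "I don't know." competitive.
print(best_policy(p_correct=0.0, p_fooled=0.0))
```

Under these assumptions, any nonzero chance of fooling a rater makes plausible nonsense score higher than admitting ignorance, which is the whole problem.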