Saturday, February 4, 2023

ChatGPT Fastest to 100 Million Users Ever - Too Bad it Sucks So Bad

Word got around this week that OpenAI's chatbot ChatGPT is the fastest-growing application in Internet history, reaching 100 million users in its first two months of existence. In that respect, it has been an online sensation, potentially influencing millions of users.

In a way, it's too bad the software is such a piece of crap when it comes to figuring things out. That ends up making all of us old codgers who have been warning that there's too much fuss over AI come across as wise old codgers. All the talk about AI coming to make everything better and usher in a new era of progress is just over-the-top marketing hype.

The story comes from a piece of reporting done by National Public Radio (NPR), of all places, and referenced in Ars Technica's weekly Rocket Report. In short, AI might be able to transform some industries; just don't ask it anything technical, where being right is everything. NPR put ChatGPT to the test with the help of actual, real, human rocket scientists to judge the AI application. It failed spectacularly.

[Tiera] Fletcher is a professional rocket scientist and co-founder of Rocket With The Fletchers, an outreach organization. She agreed to review text and images about rocketry generated by the latest AI technology, to see whether the computer programs could provide people with the basic concepts behind what makes rockets fly.

The results were far from stellar. In virtually every case, ChatGPT – the recently released chatbot from the company OpenAI – failed to accurately reproduce even the most basic equations of rocketry. Its written descriptions of some equations also contained errors. And it wasn't the only AI program to flunk the assignment. Others that generate images could turn out designs for rocket engines that looked impressive, but would fail catastrophically if anyone actually attempted to build them.

Here we come to one of the key, fundamental problems with the way these AI systems are programmed. They're given tons of input to read, but no ability to decide which statements are high-priority, unchangeable constraints and which can be bent at will. They can't determine facts.

"There are some people that have a fantasy that we will solve the truth problem of these systems by just giving them more data," says Gary Marcus, an AI scientist and author of the book Rebooting AI.

But, Marcus says, "They're missing something more fundamental."

They're missing knowledge of which things they read are rock-hard physical law and which are more like suggestions that can be treated as secondary. A simple example: in addition to flubbing "The Rocket Equation," whose very name implies it comes close to a complete description of how a rocket works, ChatGPT flubbed simple things like the thrust-to-weight ratio.
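For the record, the things it flubbed are bone-stock textbook material. The correct forms (these are standard results, not anything ChatGPT produced) are:

$$\Delta v \;=\; v_e \ln\frac{m_0}{m_f} \;=\; I_{sp}\, g_0 \ln\frac{m_0}{m_f}, \qquad \mathrm{TWR} \;=\; \frac{T}{m\, g_0}$$

where $\Delta v$ is the velocity change the vehicle can deliver, $v_e$ the effective exhaust velocity, $I_{sp}$ the specific impulse, $g_0$ standard gravity (9.80665 m/s²), $m_0$ and $m_f$ the initial and final masses, and $T$ the thrust. A vehicle needs a TWR greater than 1 just to get off the pad. Those are the kinds of facts that can't be treated as suggestions.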

The strange results reveal how the programming behind the new AI is a radical departure from the sorts of programs that have been used to aid rocketry for decades, according to Sasha Luccioni, a research scientist for the AI company Hugging Face. "The actual way that the computer works is very, very different," she says.

A traditional computer used to design or fly rockets comes loaded with all the requisite equations. Programmers explicitly tell it how to respond to different situations, and carefully test the computer programs to make sure they behave exactly as expected.
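To picture what "loaded with all the requisite equations" means, here's a minimal sketch in Python. The function names and the numbers plugged in at the bottom are mine, purely for illustration, and don't describe any particular vehicle:

```python
import math

G0 = 9.80665  # standard gravity, m/s^2

def delta_v(isp_s, m0_kg, mf_kg):
    """Tsiolkovsky rocket equation, hard-coded rather than learned."""
    if isp_s <= 0 or mf_kg <= 0 or m0_kg <= mf_kg:
        raise ValueError("unphysical inputs")  # the program refuses nonsense
    return isp_s * G0 * math.log(m0_kg / mf_kg)

def thrust_to_weight(thrust_n, mass_kg):
    """Thrust-to-weight ratio; must exceed 1.0 to lift off Earth."""
    return thrust_n / (mass_kg * G0)

# Round illustrative numbers, not any particular rocket:
print(round(delta_v(300.0, 550_000, 130_000)))         # ~4243 m/s
print(round(thrust_to_weight(7_600_000, 550_000), 2))  # ~1.41
```

The rule is in the code, it was put there by someone who knew it, and the program can't drift away from it.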

By contrast, these new systems develop rules of their own. They study a database filled with millions, or perhaps billions, of pages of text or images and pull out patterns. Then they turn those patterns into rules, and use the rules to produce new writing or images they think the viewer wants to see.
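At toy scale, that "study the text, pull out patterns, generate" loop can be sketched in a few lines. Real systems use enormous neural networks rather than word lists, but this little bigram model (my illustration, not how OpenAI builds anything) shows the principle:

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """'Study' the text: record which word tends to follow which."""
    follows = defaultdict(list)
    words = text.split()
    for a, b in zip(words, words[1:]):
        follows[a].append(b)
    return follows

def generate(follows, start, n=10):
    """Produce new text from the learned patterns; no notion of truth."""
    out = [start]
    for _ in range(n):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

model = train_bigrams("thrust must exceed weight so the rocket can climb and thrust costs fuel")
print(generate(model, "thrust"))
```

Nothing in that loop knows or cares whether the output is true; it only knows what tends to come next.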

ChatGPT gets good reviews for what it has done in areas where there are no right or wrong answers, things more like poems or songs, rather than areas where a wrong answer might kill a bunch of people and destroy a mission costing many millions of dollars. The AI is "mimicking the pile of physics textbooks that it's been exposed to," she says. It can't tell if the mashed-up text it's produced is factually correct, and that means anything it generates can contain an error. Until producing the right answer matters more to these systems than just producing an answer, they can't be trusted.

AI researcher Gary Marcus worries that the public may be radically overestimating these new programs. "We're very easily pulled in by things that look a little bit human, into thinking that they're actually human," he says. But these systems, he adds, "are just autocomplete on steroids."

"We need an entirely different architecture that reasons over facts," he says. "That doesn't have to be the whole thing, but that has to be in there."

If you do a search for "ChatBot" you generally come up with overly anthropomorphic images, like this one from Gludy.com. It's really absurd, but people seem to prefer the anthropomorphic art to a picture of a text interface on a computer, or simply headphones and a microphone. A real AI chatbot doesn't need a body designed to look like anything in particular, human, dog, or otherwise.



7 comments:

  1. This does not surprise me one bit. Not at all.

    Can the AI 'write' better stories than an average journalist? Probably. Especially compared to the atrocious writing of the standard journo.

    But hard writing, like scientific and professional papers where facts and figures matter more than feelings, yeah, that needs to be written by hoomans. Preferably hoomans who actually know what they are writing about (which wipes out about 3/4 of the 'scientific' papers written every year).

  2. I remember when "Expert Systems" were the next Big Thing. Sometimes they produced very humorous results. These systems generally can't figure out when it's OK to fudge the rules a bit, something only experience can teach.

  3. Current AI tech is nothing more than a fancy set of algorithms for mixing paint. Remember the craze for making images out of hundreds of thousands of smaller images, like pictures of people, based on their overall color content? AI can do this with general information, but the big "images" are as much nonsense as DALL-E drawing cars with wheels sticking out in all directions.

  4. All us old codgers are just getting in the way of progress. :)

    My personal gripe is hanging the tag "AI" on every damn shitty automated system out there. Customer service chatbot? AI! Unfortunately, in some cases, it's no worse than the live people in the call center. But hey, it's new and cool, and thus, somehow, automatically "good". Or more than good!

    When thinking about the future created by reliance on these things, somehow I'm getting a mental image of the movie "Brazil". Yeah, it's not really applicable except in its surreality: over-reliance on supposed machine intelligence produces bizarre results, and the populace somehow thinks this is normal.

    I'm impressed by how quickly the "AI" moniker has entered the lexicon, and now gets tossed around with no discrimination whatsoever. Yes, I know, the concept has been around for a long while, but it used to be confined to techies and people who had some notion of the difficulties.

  5. I agree. These are nowhere near any real definition of AI, just like everything but nuclear and hydro gets called Renewable Energy, regardless of any real meaning of the word.
    These are really just semi-expert systems. Anything close to real AI wouldn't have the shortcomings described here.
    The best Expert Systems I've read about have been specific to a facility or company and required lots of review and double-checking.

    Replies
    1. The fundamental problem they have is applying a completely wrong model of intelligence. They talk about this AI "rocket science" application as developing rules. What it needs to do is be given the set of rules and Never Do Anything that violates a rule. Laws of Physics are called that because they can't be broken.

      ChatGPT is more suited to things like writing fiction or essays on why blue is the absolute best color for something. It isn't for trying to derive ways around the rocket equation.
