“Apple calls Bullshit on the AI Revolution”

Apple has earned its reputation for innovation in the technology world. That innovation is what solidified its status as a tech giant, and its iPod, MacBook, iPhone, iPad, and Siri have spawned a variety of copycats. So, why isn’t Apple leading the AI revolution? It is actually the only big tech company that is not fully embracing the current AI frenzy. Surely it would want to be at the forefront of this generation-defining technology revolution? Well, we now might have a clue as to why. Apple AI scientists have published a paper showing that even our most advanced Large Language Model (LLM) AIs still lack basic reasoning skills and, therefore, are not as useful as their creators claim. So, how did they figure this out? And what does this mean for Apple and the AI revolution?

These scientists tested several cutting-edge LLMs from Meta and OpenAI, including OpenAI’s latest o1 model, which automates a kind of prompting known as chain-of-thought to give it state-of-the-art reasoning ability (read more here). These tests were designed to probe how well the AI “understood” simple mathematical questions by adding tangential information.

You might think this involves difficult mathematical questions, but no. Instead, they resemble simple elementary/primary school math questions that even those who struggle with numbers would find easy. Yet the results were worrying.

https://archive.md/gUc4S#selection-329.0-329.41

Comment: This article is by Will Lockett. It was published in Predict on Medium. So you can read the full article, I’ve linked to a screenshot on archive.md. The point of this article is something I’ve been aware of for some time. Below is a presentation by John Sowa and Arun Majumdar, two AI researchers I’ve known for twenty years, since they first presented their project to me. Take it from this knuckle dragger, their stuff is far better than any of these well-known and publicized LLM AI models. What impressed me most about Arun and John’s AI was that it wasn’t a black box. Even twenty years ago, their AI would present its reasoning process and evidence along with its answer to your question. Another point was that it worked in the wild, not in a data center or on a high-end computer. It ran off a run-of-the-mill laptop during the first demonstration I saw. During that first demonstration, I noticed some Prolog III code as Arun was scrolling through. That I noticed that impressed him, but that was the limit of my deep understanding. Much of this is mentioned in this close to two-hour “nerdathon” presentation by John and Arun from last year. If you can sit through it, you’ll learn a lot about AI, human cognition, and the “weirdness” of these high-level AI creators. The idea of emotional intelligence in AI is pretty interesting.

As I was writing this, another AI breakthrough is being touted in the industry press. The Chinese AI company DeepSeek released an open version of its reasoning model, R1. It sounded pretty wild, since it can run on a standalone laptop just like Arun’s machine did twenty years ago. That is, until I read this in a TechCrunch article.

Venture capitalist Marc Andreessen, for example, posted that DeepSeek is “one of the most amazing and impressive breakthroughs I’ve ever seen.” R1 seemingly matches or beats OpenAI’s o1 model on certain AI benchmarks. And the company claims one of its models only cost $5.6 million to train, compared to the hundreds of millions of dollars that leading American companies pay to train theirs.

Training costs $5.6 million? What the hell is involved in that? Perhaps someone here can enlighten me. Arun’s machine sent out hundreds or thousands of autonomous agents that taught themselves what was necessary to solve the assigned task in the wild. As a young contractor still in search of a contract, he certainly didn’t have $5.6 million to spend on training his AI.

TTG


25 Responses to “Apple calls Bullshit on the AI Revolution”

  1. jld says:

    Disagree.

    BTW $5.6 million is darn cheap for training an LLM, by nearly 2 orders of magnitude.

  2. A Portuguese Man says:

    I can tell you why it costs so much. Machines and the power to run them.

    The algorithm(s) used to produce the matrix, i.e. the weights that are the model, need a lot of Graphics Processing Units (GPUs), aka graphics-card chips. By a lot I mean thousands and thousands of them. Take Nvidia’s supposed latest and greatest H200: it draws 700 W per chip.

    The algorithms take weeks, possibly well over a month, to run. That is 24 hours a day at full draw, moderated only by some smart scheduling for temperature control. Right, and you need to add in the power costs to cool all of this, which will not be insignificant, far from it.
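
    To put rough numbers on that, here is a back-of-envelope sketch in Python. Every figure in it is my own illustrative assumption, not data from any real training run:

        # Back-of-envelope GPU training energy cost. All numbers are
        # illustrative assumptions, not figures from any real training run.
        NUM_GPUS = 2048            # "thousands and thousands" of chips
        WATTS_PER_GPU = 700        # H200-class chip at full draw
        DAYS = 40                  # "weeks, possibly well over a month"
        PUE = 1.4                  # assumed overhead multiplier for cooling etc.
        PRICE_PER_KWH = 0.08       # assumed industrial electricity price, USD

        hours = DAYS * 24
        kwh = NUM_GPUS * WATTS_PER_GPU / 1000 * hours * PUE
        print(f"Energy: {kwh:,.0f} kWh, power bill: ${kwh * PRICE_PER_KWH:,.0f}")
        # Roughly 1.9 million kWh and about $150,000 in electricity alone;
        # the bulk of a multi-million-dollar figure is hardware purchase or
        # rental, amortization, and the people involved.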

    There are also human costs around curating and labelling the training data. We are talking about maybe the whole text content of the open Internet, together with libgen and other pirate ebook libraries. Multimodal models will also include images, audio, and video.

    So, that’s where the money goes. What I have read superficially about the Chinese model is that it is possibly distilled; distillation is a technique in which one model is used to “teach” another. Maybe they are also using different chips and/or can get cheaper energy.
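
    For the curious, a crude sketch of what distillation means, assuming PyTorch; the sizes, temperature, and data are toy placeholders, not anyone’s actual recipe:

        # Minimal sketch of knowledge distillation: a small "student" is
        # trained to match the output distribution of a larger "teacher".
        # Sizes, temperature, and data are toy assumptions.
        import torch
        import torch.nn.functional as F

        teacher = torch.nn.Sequential(       # stand-in for a big trained model
            torch.nn.Linear(16, 256), torch.nn.ReLU(), torch.nn.Linear(256, 100))
        student = torch.nn.Linear(16, 100)   # much smaller model we want to train
        opt = torch.optim.Adam(student.parameters(), lr=1e-3)
        T = 2.0                              # temperature softens the targets

        x = torch.randn(32, 16)              # a batch of toy inputs
        with torch.no_grad():
            soft_targets = F.softmax(teacher(x) / T, dim=-1)
        log_probs = F.log_softmax(student(x) / T, dim=-1)

        # Train the student to minimize KL divergence from the teacher's outputs.
        loss = F.kl_div(log_probs, soft_targets, reduction="batchmean") * T * T
        opt.zero_grad()
        loss.backward()
        opt.step()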

    I work in ML so I know more or less what I am talking about. But I am not involved in the operations of any of these LLMs.

  3. A Portuguese Man says:

    Regarding the “reasoning” of these models, it’s all bollocks and also hubris.

    LLMs, like any other machine, do not do any reasoning in any classical meaning of the word.

    There is a difference between your researchers’ work and the LLMs, and it could be said that it is “closer” to reasoning, but that really is meaningless. It would be analogous to saying a robot arm holding a brush is “closer” to painting than an LLM generating an image based on Raphael, which it can do quite easily. It’s still not painting as art.

    Reasoning requires reason. And machines, by definition, do not have it.

    What leads people to error and confusion is that very intelligent people come up with ways by which machines are able to perform tasks that, for humans, inherently require intelligence to solve.

    The distinction lies in the fact that humans always need intelligence to accomplish such tasks, while machines do not.

    So the machines aren’t getting intelligent. We’re simply finding mechanisms (electronic ones) to perform tasks that we would otherwise use intelligence to accomplish.

    But people trip on the fallacy that if a machine is able to perform a task for which a person needs intelligence, then it has become intelligent. The straight reasoning is actually that if we have such a machine, then we have only proved that the task doesn’t need intelligence to be performed.

  4. A Portuguese Man says:

    Now, technically speaking, your researchers are trying to model human reasoning, using logic and so forth. That is not how LLMs work at all.

    LLMs are based on probabilities. At the very core of it is Bayes’ Theorem, which you will surely know. This is not new; neural networks from before the AI Winter already worked like that.
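
    For reference, the two textbook formulas at play here, in LaTeX; these are standard identities, not anything specific to one model:

        P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

        P(w_1, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1})

    The first is Bayes’ theorem; the second is the chain-rule factorization a language model estimates, one conditional next-word probability at a time.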

    What is new is the so-called transformer architecture and the context embeddings that they use.

    These embeddings are the “meat” of the system, so to speak. I will try to describe the general idea in crude terms. One proceeds as follows:

    Take any word in English, go through every text in your training data – which is now the whole Internet and all ebooks – and every single word in them, and compute a number meaning how many times every pair of words was seen together across the entire set of documents.

    This will produce a matrix, i.e. a table containing a number for every pair of words in the English language.

    You can start to see why so many machines are needed and why they have to run for so long.

    This is a very crude approximation of what is going on. The real frequency calculation is far more complex, taking into account all the other words in the document, all the other documents, and all the words in the immediate vicinity of each occurrence of each word. So you get not a single number but a vector, i.e. a sequence of numbers.

    In the end, you have captured a model of the probabilistic relationship between words in English insofar as they were ever written. These are called context word embeddings. There is a lot more to it than this, of course, but that’s irrelevant for the discussion.
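
    A toy version of that counting in Python, with a made-up three-line corpus and window size, just to make the idea concrete; real embedding methods are far more refined:

        # Crude sketch of the co-occurrence counting described above: for every
        # word, count how often every other word appears within a small window.
        from collections import defaultdict

        corpus = [
            "the cat sat on the mat",
            "the dog sat on the rug",
            "the cat chased the dog",
        ]
        WINDOW = 2
        counts = defaultdict(lambda: defaultdict(int))

        for doc in corpus:
            words = doc.split()
            for i, w in enumerate(words):
                for j in range(max(0, i - WINDOW), min(len(words), i + WINDOW + 1)):
                    if j != i:
                        counts[w][words[j]] += 1

        # Each row of this table is a crude "vector" for one word.
        print(dict(counts["cat"]))   # {'the': 3, 'sat': 1, 'on': 1, 'chased': 1}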

    So when you demand something of the model, what is happening is that the system is being requested to “complete” or fill in the blanks of the text of your request. So it will go look in the matrix and start drawing words, evaluating at each point the whole conditional probability of what it has and what it is about to add.

    Show it enough text, and it will generate more text with an uncanny resemblance to a human.
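
    And a toy version of the word-drawing step; a crude bigram model stands in here for the real thing, which conditions on the whole context with a transformer:

        # Crude sketch of "filling in the blanks": sample each next word from
        # the conditional distribution of what followed the previous word in
        # the training text. A real LLM conditions on the entire context.
        import random
        from collections import defaultdict

        text = ("the cat sat on the mat and the dog sat on the rug and "
                "the cat chased the dog").split()

        followers = defaultdict(list)
        for a, b in zip(text, text[1:]):
            followers[a].append(b)        # empirical P(next word | current word)

        word, out = "the", ["the"]
        for _ in range(8):
            word = random.choice(followers[word])   # draw from the conditional
            out.append(word)
        print(" ".join(out))   # something like: "the cat sat on the rug and the dog"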

    And since we humans typically require intelligence to speak and write, people are caught in the fallacy that this demonstrates the machine’s intelligence. The truth is, intelligence is not needed to write, and good writers have often written about it. LOL.

    Orwell’s essay “Politics and the English Language” comes to mind.

  5. A Portuguese Man says:

    Now, here’s the thing.

    The people making these machines know all this, and they *know* the machine is not thinking.

    But.

    Everyone else thinks the machine is thinking and is in awe of it. So they start drinking the kool-aid:

    – what if the probabilistic relationship of all the words in English were the same as thinking and reasoning, and so forth?
    – what if we are actually creating intelligence?
    – etc, etc

    Pure hubris and deficient or non-existent philosophical education.

    Of course, if you are a mass-produced materialist of modern days, then you will be able to brush enough crap under the rug to fit this wherever you want. That’s what one already does anyway…

    If you are a serious materialist, you will begin to see serious problems.

    If you are not a materialist, then this is just ridiculous and proof that one can be technically proficient in logic but still dumb as a rock, depending on the circumstances.

  6. A Portuguese Man says:

    In the end, do not underestimate these mechanisms.

    They are extremely capable and can thus be extremely useful and extremely dangerous. Especially when we start talking about images, audio and video.

    We will have to revise the admissibility of such elements as evidence in court, for instance. It is just inevitable now. These mechanisms will be able to generate the likeness of anyone doing anything, easily. And, with a determined and skilful operator, do it in a way impossible to distinguish from an actual likeness captured by any kind of digital sensor.

  7. A Portuguese Man says:

    Another analogy for the fallacy of AI would be to consider illusionism to be actual magic only because the illusions are extremely compelling.

    • frankie p says:

      A Portuguese Man,

      Thank you for writing out your insights on this topic. It was very valuable to the layman.

      Frankie P

  8. jld says:

    DeepSeek AI can be pretty “philosophical”, venturing into Buddhism.

    https://x.com/mpshanahan/status/1883189053497184728

  9. voislav says:

    The cost of training is the cost of renting the data center to run the model while it’s fed training data. Most of these LLMs are fed a generalized training data set, not targeted at any particular application, to allow them to run their solvers in real time afterwards. These data sets are very large, covering a range of applications. It sounds like Arun’s AI machine was collecting task-specific training data after being issued a task; this is more efficient, but getting the answer takes longer.

    In my experience, AI is quite useful for specific applications where quality training data is available. I am seeing a lot of use of these in my field (chemistry) and it works well, short-circuiting a lot of development work that would’ve otherwise taken months or years. We run systems similar to Arun’s model: training data sets are small, tailored for specific tasks and not generalized. These kinds of AI models have been around for more than 10 years, but there is no big money in them because they are highly tailored.
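
    For what it’s worth, the shape of that kind of small, task-specific model in Python; the data and features below are random placeholders, not real chemistry:

        # Sketch of a small, task-specific model: a few hundred labeled
        # examples, one narrow prediction task. Data here is synthetic.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        X = rng.normal(size=(300, 8))              # e.g. 8 measured descriptors
        y = X[:, 0] * 2 - X[:, 3] + rng.normal(scale=0.1, size=300)  # toy target

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
        print(f"R^2 on held-out data: {model.score(X_te, y_te):.2f}")
        # Trains in seconds on a laptop; no data center required.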

  10. Lesly says:

    I don’t remember the name, but an economist floated the idea a year ago that throwing billions at AI research would force China to do more with less. DeepSeek seems poised to pop the bubble. I wonder how Trump will impose tariffs on open-source tech. I wouldn’t be surprised if China is keeping additional AI tech under wraps to avoid admitting the true dollar amount spent on government research using black-market sources and Nvidia’s “weell akshully, this slightly modified GPU is technically not in violation of U.S. sanctions” footsie play.

    When I first heard our jobs were in danger I thought, good luck sorting out what the client wants vs. what they ask for on paper and God bless.

    • TTG says:

      Lesly,

      More than 20 years ago, the Chinese were scouring the academic world for experts in geometric algebra, offering them lucrative jobs back in China. That’s probably paying dividends now. I obviously don’t understand the stuff, but Arun’s and John Sowa’s machine uses things like geometric algebra and conceptual graphs in its more visual approach to knowledge representation and pattern spotting. Their stuff relied on software rather than on high-end hardware, unlike ChatGPT and other similar AIs.

  11. elkern says:

    I agree with Apple and with A Portuguese Man: AI is BS, at least so far.

    One of my favorite anecdata points on AI is that the “ads” for AI on NPR *brag* that their product is “hallucination-free”, which – even if true – does not inspire confidence. (Would you hire a Human who put that on their resume???)

    But Wall Street needs Bubbles to chase, and those Techbois def give good bubble. Opaque technobabble is great bait; some Whales actually believe it, some just buy in because they believe others will.

    AI is particularly attractive to Wall Street now, because it holds out the promise of freeing Corporations from dependence on expensive (and unreliable) Humans. Engineers probably don’t have to worry about the competition *yet*, but Hollywood has already been using AI to avoid paying screenwriters (that was a big issue in the strike a couple of years ago).

    OTOH, I also agree with voislav, that computers can be very useful when “tailored for specific tasks and not generalized”.

    OTOOH, I look forward to watching our Mechlizard Masters eventually replace all the overpaid CEOs with “more efficient” [read: “cheaper”] AIs.

    • Lesly says:

      “AI is particularly attractive to Wall Street now, because it holds out the promise of freeing Corporations from dependence on expensive (and unreliable) Humans.”

      This is the real answer: AI is the solution to the problem of squandering profit on payroll. FAANG mopped up too much talent during COVID. They’ve shed a lot of jobs since and these guys are having a hard time finding gainful employment. Some are turning to work overseas in places like… China.

      As for our techbro overlords I don’t get the fandom. For example Musk likes to promote himself as the most efficient techbro around, but when you look at his random firing fits, he seems to excel at dumping human capital.

    • LeaNder says:

      that their product is “hallucination-free”, which – even if true – does not inspire confidence.

      elkern, I stumbled across that subject, I think, in an article in the NZZ (the Zurich paper, Neue Zürcher Zeitung).

      ‘Deep’ apparently avoids certain political questions, starting to answer and then suddenly stopping, while some AI occasionally wanders off the path of facts and starts “hallucinating”, which I assume means making up answers. 😉

      Relevant passages via machine translation:
      Deepseek: Questions about facts that are affected by censorship in China show a conspicuous feature: first the tool visibly starts to write an answer, but a few seconds later Deepseek deletes the answer and writes a one-liner: “Sorry, this is beyond my scope. Let’s talk about something else.” (See video.)

      Reports are also circulating on programming forums that R1 produces better “thought processes” when it provides answers. o1 uses the same “chain of thought” technique to list the individual steps in solving a task and correct itself if necessary. OpenAI hopes to base the success of o1 and its successor o3 on this approach. But apparently Deepseek can do this pretty well too.

      Other users are impressed by the quality of the results when they ask R1 to write creative texts. Others note that R1 is less prone to making up “facts” – the “hallucinating” known from other chatbots is therefore less frequent.
      Translated with DeepL.com (free version)

    • James says:

      elkern,

      My software development team and our DevOps team are using ChatGPT and Microsoft Copilot every day to do our jobs. It’s become a must-have tool for software developers.

  12. scott s. says:

    I actually had an emphasis in AI in my CS master’s program at the Naval Postgraduate School at the end of the ’70s. AI was in a bubble at the time. The big idea was “expert systems”. The methodology was to distill domain knowledge into “rules”, then develop efficient search mechanisms to traverse the knowledge space. This was back in the days of Marvin Minsky and the like. “Understanding natural language” was one great goal, but the problem was creating rule sets that could incorporate everything from complex ideas to “everyday / common sense”.
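
    A toy illustration of that rules-plus-search methodology in Python; the rules are made up, and real expert systems carried hundreds or thousands of them:

        # Minimal sketch of an expert system: domain knowledge distilled into
        # if-then rules, plus a simple mechanism that keeps applying them
        # until nothing new can be derived (forward chaining).
        rules = [
            ({"has_fever", "has_cough"}, "possible_flu"),
            ({"possible_flu", "short_of_breath"}, "see_doctor"),
        ]

        def forward_chain(facts):
            derived = set(facts)
            changed = True
            while changed:                   # keep firing rules to a fixpoint
                changed = False
                for conditions, conclusion in rules:
                    if conditions <= derived and conclusion not in derived:
                        derived.add(conclusion)
                        changed = True
            return derived

        print(forward_chain({"has_fever", "has_cough", "short_of_breath"}))
        # -> includes 'possible_flu' and 'see_doctor'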

    • TTG says:

      scott s,

      Expert systems were what I first encountered in the 1990s as a case officer. One of my Polish sources was an AI researcher. In addition to reading everything I could find on the subject, I picked up a copy of Borland’s Turbo Prolog to learn a little background in the field. I actually programmed a rudimentary expert system with it. It’s why I was able to recognize Prolog III code in Arun’s machine years later.

  13. Lars says:

    When I was a computer programmer 50+ years ago, one of the first things we had to deal with was GIGO (Garbage In, Garbage Out). Comparatively, it was rather primitive then. Essentially, if a data point was supposed to be a number, you made sure it was. What they are dealing with now has to be much harder, but if open sources are used, how do they deal with all the fake crap that is sloshing around? I have used some AI, mainly to do searches on the Net, and it seems to work fine, even if limited. But I agree that many of the concerns are based on hype, and as far as the stock markets are concerned, they have seldom based much on real data. After all, once there was a thriving market for pet rocks.
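
    A modern version of that GIGO check might look like this in Python; the field names are invented for illustration:

        # The kind of input check Lars describes: if a data point is supposed
        # to be a number, make sure it is before it enters the pipeline.
        def validate_record(record):
            cleaned = {}
            for field in ("age", "income"):      # hypothetical numeric fields
                value = record.get(field)
                try:
                    cleaned[field] = float(value)
                except (TypeError, ValueError):
                    raise ValueError(f"{field!r} is not numeric: {value!r}")
            return cleaned

        print(validate_record({"age": "42", "income": "51000"}))   # passes
        # validate_record({"age": "forty-two"}) would raise: garbage stays out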

  14. John Merryman says:

    Efficiency is to do more with less.
    Until we reach peak efficiency and can do everything with nothing.
    That’s why the feedback loops need circuit breakers.
    Not like this is new. Debt jubilees were circuit breakers to the feedback loop of compound interest.
    The problem over the long term is that when it all crashes and burns, it will be the warlords in control, not the bankers. Then after a while they declare themselves kings and we start over again.
    History repeats, because we are too thickheaded to learn the first time, or the second…
    The nodes are synchronization, the networks are harmonization.
    Black holes and black body radiation.
    We are the resonances and reverberations in the middle.

    • Tidewater says:

      John Merryman,
      “…and we start over again.”
      Not with a Double Blue Ocean event. I am afraid we don’t.

      • John Merryman says:

        Given there won’t be much of that two billion years of stored sunlight left, it will be pretty far down the hole, but people evolved to adapt, even if it’s just the Amish.
        Stephen Jay Gould’s punctuated equilibrium: the equilibrium stage selects for complexity and specialization, as every niche is filled and every resource is used. Then the punctuation stage selects for adaptability and resilience.

  15. drifter says:

    Imagine a synthesis of all human knowledge.

    Garbage in, garbage out.

    Food for thought.

  16. Jim. says:

    Interesting Discussion…EIXO
    I Know Nothing About AI…Except that Its Gone Through Infancy..Childhood Development…And Like the Big Foot We Saw..Made it to Young Adulthood…

    I’m Sure..Its Become The New Action Figure..Beanie Baby…Thing to Do…

    The Desire is to Be the First to Teach Artificial (Machine) Intelligence..More Data Than Any one..Great or Small….Look at Our Modern World..How it Advances As Directed…
    Just Consider The Period from 1900 AD to 2025 AD..All Industrial..Social..Economic..Technological (Broad Tech Evolutions) And National…Revolutions.
    .Creating Confusion..Masking….Exploitation…Advancing Nation Master Minding..
    by Agents…Known and Unknown..like a Powerful Alien..Enemy of Mankind..

    .By All The Next Twentieth Century..Tools For Money and Power….THE INTERNET…WWW..Electronic…Mind Control…Right into The
    Minds…Brains..Cross Fires..Thinking..Behaving Minds of Any One or Every One…The World…The Battle… A..Living Intelligence….VS….Artificial Intelligence..
    The Advancements and Revelations Will Obviously..Double..Capabilities..Perhaps More.. We Are…only particles…inside a nuclear Reactor..Inside….Looking Out.
    See..Are They Out There..Cloaked?..I Think its possible..Bigfoot..Is the Smartest Guy.
