"Neutrino Evidence Revisited (AI Debates)" | Is Mozart's K297b authentic?


[This is a guest post by Conal Boyce]

Recently I watched a video posted by Alexander Unzicker, a no-nonsense physicist who often criticizes Big Science (along the same lines as Sabine Hossenfelder — my hero). But in this case (link below) I was surprised to see Unzicker play back a conversation between himself and ChatGPT, on the subject of the original discovery of neutrinos — where the onslaught of background noise demands very strict screening procedures and care not to show "confirmation bias" (because one wants so badly to be the first one to actually detect a neutrino, a quarter century after Pauli predicted them). It is a LONG conversation, perfectly coherent and informative, one that I found very pleasant to listen to (he uses the audio option: female voice interleaved with his voice).
 
[VHM note: This conversation between Unzicker and GPT is absolutely astonishing.  Despite the dense technicality of the subject, GPT understands well what he is saying and replies accordingly and naturally.]

Who knew ChatGPT could be like THAT? Five or six months ago, I had given up on it because of the glib "hallucinations" it often gave in lieu of genuine answers, vowing never to pay attention to that "dying fad" again. 
 
Given that Unzicker of all people was taking ChatGPT so seriously, I chided myself for having given up so easily on it, and decided to revisit it. Mozart's Sinfonia Concertante K297b has a very odd history. Like legions of others, music critic Alfred Einstein never had the slightest doubt that it was authentic Mozart. But sometime in the 1970s the professional musicologists got into the act and spread the word that the piece was "inauthentic." The obvious question that a normal person would ask next is: Well, then, who the fuck was the Big Unknown Genius who did compose it, if not Mozart? But musicologists don't think that way. They just continued to build stronger and stronger "courtroom cases" for demonstrating that it couldn't possibly be Mozart.
 
So, to test ChatGPT just now, I thought it would be interesting to see where it stands on that long-drawn-out K297b authenticity issue, which in my mind has never been properly aired among people with common sense.
 
This was my test query to ChatGPT: "Authenticity of Mozart Sinfonia Concertante K297b?"
 
Its response was to ramble on about an entirely different piece, the Sinfonia Concertante K364 (and K364b) for Violin and Viola soloists, not the piece I specified, K297b, which is for three or four wind soloists. I pointed out the error. In a flash came the all-too-familiar response (from when I used to actually have a paid ChatGPT account):
 
"I apologize for the error. You are correct, K297b is for winds and orchestra, not for violin and viola."
 
Eventually, it did answer my question, too. I was gratified to see that it sided with me (and with Alfred Einstein and common sense). It asserted that K297b is genuine Mozart, it's just that parts of it are fragmentary, and it has a very messy performance and publication history blah-blah-blah.
 
Thus, the extremely weird non-"mind" of ChatGPT. It knows this immense bunch-o-stuff, but in its initial response, it may stumble in ridiculous ways that bear on precision-of-data ('K297b' versus 'K364' in this case), the very thing over which computers are generally thought to have perfect mastery!! Very strange.
 

51 Comments

  1. ~flow said,

    November 14, 2024 @ 12:10 am

    This is not about physics, but for what it's worth, Unzicker is far more of an outsider in physics than Hossenfelder; e.g., he doubts that the Higgs boson and quarks exist at all. The Wikipedia page on Unzicker is devastating:

    Florian Freistetter kritisierte, das Buch sei „weit davon entfernt, eine vernünftige Kritik der modernen Physik zu sein. In 'Vom Urknall zum Durchknall' wird geschimpft, gemeckert und gezetert; physikalische Beschreibungen wechseln sich mit persönlichen Angriffen auf Wissenschaftler ab und die ‚Kritik‘ an der Stringtheorie ist derart überzogen, dass sie beim besten Willen nicht mehr ernst zu nehmen ist.“ [Florian Freistetter criticized the book as "far from being a reasonable critique of modern physics. In 'Vom Urknall zum Durchknall' there is ranting, griping, and wailing; descriptions of physics alternate with personal attacks on scientists, and the 'critique' of string theory is so overblown that, with the best will in the world, it can no longer be taken seriously."]

    [The theoretical physicist Peter Woit, himself a harsh critic of string theory, likewise calls Unzicker's criticism of the Standard Model "nonsense" and recommends paying the book no attention.]

    […]

    [Coinciding with the award of the Nobel Prize in Physics, Unzicker, in his self-published 2013 English-language book The Higgs Fake, doubted that the data measured at CERN really represented the sought-after Higgs particle.]

  2. AntC said,

    November 14, 2024 @ 12:54 am

    Thank you @~flow. It appears Unzicker is not worthy of an entry in English wikip, so I'm going from DeepL's translation.

    It does seem weird (indeed sad) Unzicker can't find an actual Physicist to debate with. (If he would talk over them as annoyingly as he talks over the bot, I wouldn't debate with him.) So what is his purpose? To show how smart ChatGPT is, or to critically analyse the topic area? At about 23:00 he seems to be deludedly giving the bot instructions, as if it has critical abilities and self-reflection. "I try to be polite in my answer", he says. Does he think ChatGPT is capable of taking offence? Is this video at root anything different to snooping on someone shouting at their TV?

    The bot's response doesn't cite any source. (I guess we don't know how much Unzicker edited the responses/how much coaching he gave before getting to what he does play.) At about 7:00 Unzicker says he "uploaded" several papers. So ChatGPT is then summarising some material. How representative is this material of the field? (And without reading the material for myself, I've no idea how accurate the summary is.)

    Hallucination in, hallucination out?

    Prof Mair says the conversation is "perfectly coherent and informative, one that I found very pleasant to listen to ". Yes "coherent"; it's not my field so I'm in no position to assess "informative". (I'm reminded of Borges' 'Library of Babel': for every book containing the truth, there's a myriad of books almost identical but with just a few variations. Which is ChatGPT 'reading' from?)

    (Whilst Hossenfelder can provide great insight into her 'home' topics, I do wish she wouldn't stray into general social observation, of which she knows no more than anybody off the street.)

  3. Philip Taylor said,

    November 14, 2024 @ 3:48 am

    "Prof Mair says the conversation is "perfectly coherent and informative" — Surely Conal Boyce, not Victor Mair ? OTOH, VHM definitely does say "Despite the dense technicality of the subject, GPT understands well what he is saying" — is it now the received wisdom that [Chat]GPT understands things ? If so, by what definition of "understand" ?

  4. Phillip Helbig said,

    November 14, 2024 @ 4:21 am

    I’m no fan of Hossenfelder: she can’t debate civilly, applies different standards to herself and to others, is not always right on the science, and recently has leaned to far in the click-bait direction. But it is a huge injustice to her to compare her to Unzicker, who is pretty much a textbook crackpot. No wonder no physicist wants to debate him (“would look good on your CV, but not on mine” is probably what keeps most physicists away from him).

  5. Phillip Helbig said,

    November 14, 2024 @ 4:22 am

    to —> too

  6. Phillip Helbig said,

    November 14, 2024 @ 4:25 am

    to —> too (this is a language site!)

  7. AntC said,

    November 14, 2024 @ 5:37 am

    Thank you @PT for the correction. Mea culpa misattributing what Boyce said.

    I can agree with Prof Mair's "replies accordingly and naturally" — that is, if this were merely social chit-chat, ChatGPT is observing Gricean maxims. I also worry, though, about "understands". Seems to mimic the behaviour of a human who understands(?) Since I'm not a subject-matter expert, I've no way to anticipate what behaviour to expect.

    If per reports Unzicker is a "textbook crackpot"/"nicht mehr ernst zu nehmen ist" [can no longer be taken seriously], a reasonable response might be to point to the weight of subsequent confirmatory findings, rather than Unzicker's cherry-picking early/tentative results. But of course searching for and weighing the authenticity of research is exactly what ChatGPT can't do. Science does not proceed as a popularity contest/by reference counting. Talking of authenticity …

    ——–

    On Boyce's main enquiry wrt the K297b authenticity issue, I too am surprised at the bot's failure. Google took me straight to the wikip article, which has an extensive discussion of the piece's 'Authenticity' — exactly the keyword Boyce used. And from there are references to expert opinion (which I presume the bot has slurped up).

  8. DJL said,

    November 14, 2024 @ 5:49 am

    "GPT understands well what he is saying and replies accordingly and naturally."

    People should really stop saying things like this; an LLM is a function approximator over text data, a statistical method that tries to fit inputs and outputs as closely as possible to the training dataset it was fed, but it doesn't actually understand anything (in fact, an LLM is not a language comprehension model at all).
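
    A toy illustration of "function approximator", for what it's worth: a deliberately crude Python sketch with made-up numbers, standing in for the analogy rather than for any real LLM's internals.

        import numpy as np

        # Toy "function approximation": choose parameters so the model's
        # outputs match the training data as closely as possible. An LLM
        # does the same kind of thing at vastly greater scale, with text
        # in place of x and y.
        x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
        y_train = np.array([1.0, 2.7, 7.4, 20.1, 54.6])  # roughly e**x

        coeffs = np.polyfit(x_train, y_train, deg=3)  # least-squares fit
        model = np.poly1d(coeffs)

        print(model(2.0))   # near the training data: close to 7.4
        print(model(10.0))  # far outside it: a confident, wrong number

    The fit is excellent where the training data is dense, and confidently wrong outside it, which is one loose way to think about hallucination.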

  9. Victor Mair said,

    November 14, 2024 @ 7:58 am

    The bot has to "understand" to be able to reply intelligibly.

  10. David Marjanović said,

    November 14, 2024 @ 8:55 am

    It does seem weird (indeed sad) Unzicker can't find an actual Physicist to debate with.

    Scientists don't hold debates – we publish papers. We're interested in the evidence, not in which person is better at rhetorics. Debate, as the saying goes, is whatcha put on de hook to catch de fish.

    The bot has to "understand" to be able to reply intelligibly.

    No. It only has to know which words (very loosely speaking) are most likely to appear next to each other. It differs from predictive texting in degree, not in kind.
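
    For concreteness, here is the "differs in degree, not in kind" point as a toy Python sketch: a bigram next-word predictor of the sort behind old-style predictive texting, trained on a made-up corpus (an LLM conditions on vastly more context, but the task is the same).

        from collections import Counter, defaultdict

        # Count which word follows which, then always suggest the most
        # frequent successor: the crudest possible next-token predictor.
        corpus = ("the neutrino was detected the neutrino was predicted "
                  "by pauli the experiment was careful").split()

        successors = defaultdict(Counter)
        for prev, nxt in zip(corpus, corpus[1:]):
            successors[prev][nxt] += 1

        def predict_next(word):
            """Most likely next word seen in training, or '?'."""
            counts = successors.get(word)
            return counts.most_common(1)[0][0] if counts else "?"

        print(predict_next("the"))       # 'neutrino' (seen twice after 'the')
        print(predict_next("neutrino"))  # 'was'

    Scale that up by many orders of magnitude, conditioning on long contexts rather than a single preceding word, and you have the next-token objective LLMs are trained on.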

  11. KeithB said,

    November 14, 2024 @ 9:03 am

    I don't have a dog in this fight (Hossenfelder is nowhere near my radar), but here is a recent discussion on Pharyngula of an anti-Hossenfelder video:
    https://freethoughtblogs.com/pharyngula/2024/11/11/science-needs-specific-informed-productive-criticism/

  12. bks said,

    November 14, 2024 @ 10:23 am

    @KeithB I wouldn't call that collection of sophistry a "discussion." It begins with the original poster admitting that they do not even watch Hossenfelder. Hossenfelder replied succinctly:

    "Why the fuck is it my fault that cranks think I’m their best friend because I’m pointing out that there’s no progress in the foundations of physics? It’s a fact. We haven’t made progress in theory development for 50 years."
    https://www.youtube.com/watch?v=HQVF0Yu7X24

  13. Mike Grubb said,

    November 14, 2024 @ 11:11 am

    Since my German isn't up to snuff, I pasted ~flow's excerpt of the German Wikipedia entry into Google Translate, which renders "In 'Vom Urknall zum Durchknall' wird geschimpft, gemeckert und gezetert" as "In 'From the Big Bang to the Big Bang' there is cursing, complaining and complaining". I can't help but feel like something was probably lost in that translation.

  14. katarina said,

    November 14, 2024 @ 11:35 am

    @DJL on GPT:

    " but it doesn't actually understand anything (in fact, an LLM is not a language comprehension model at all)."

    I asked Google:

    "Can ChatGPT understand a question?"

    The answer was:

    "Each time ChatGPT is prompted with a question, it generates a response based on the training data, rather than retaining information from previous interactions. There's no self-supervised learning happening with ChatGPT."

    I then asked: "Does ChatGPT understand a question?"

    The answer:

    "ChatGPT works by attempting to understand your prompt and then spitting out strings of words that it predicts will best answer your question, based on the data it was trained on."

    "Attempting to understand" means understanding is part of ChatGPT's capability. Understanding plays a part in ChatGPT's response.

    I am not a computer scientist, but have worked with computer scientists at SRI (formerly Stanford Research Institute) in the language group. I learned that the computer can understand very complex subjects but the hardest thing to make it understand is common sense. In fact at that time (a few decades ago) it had vast amounts of specialized knowledge didn't have common sense.

    This exchange about whether ChatGPT can "understand" reminds me of
    an incident many years ago when my daughter, who was majoring as an undergrad in chemistry, computer science, and math, and had the highest grades in her class in linear algebra, told me that during an exam (not on a science or math subject) she spent a long time on a common-sense question because she was looking at it in terms of logic. All her scientific training in math and comp sci made her examine the common-sense question logically, and she couldn't understand it for a long time.

    I think Prof. Mair used the word "understand" in a common-sense way and DJL and David Marjanović were understanding the word "understand" in a, let's say, technical way. That is to say, "understand" has many levels of meaning, like so many other words.

  15. katarina said,

    November 14, 2024 @ 11:39 am

    the computer had vast amounts of specialized knowledge but didn't have common sense

  16. DJL said,

    November 14, 2024 @ 12:49 pm

    @katarina I didn't use the word 'understand' in a technical way at all, and I would say that the answers the chatbot gave you (not that anyone should take these answers as proof of anything) point to what I was actually saying – in an LLM what we call 'understanding a language' is just correlations computed from a training dataset (so, in fact, 'understand' in a technical sense).

  17. SlideSF said,

    November 14, 2024 @ 1:04 pm

    I think we all know that LLMs do not "understand" or "comprehend" things. But they can determine the intent of a question and generate a response based on the statistical patterns they have learned from their training data. What a mouthful!

    I think "understands" as used by VHM is metaphorical. In most cases the program acts as though it understands. However, to continue to use this metaphor going forward runs the risk of diluting the real or literal meaning of the word. We need to come up with a word that conveys the idea of what the LLM is doing in a lot fewer syllables.

  18. Chester Draws said,

    November 14, 2024 @ 1:31 pm

    We need to come up with a word that conveys the idea of what the LLM is doing in a lot fewer syllables.

    Parrot

  19. Kenny Easwaran said,

    November 14, 2024 @ 1:44 pm

    On that last point, I think it's interesting how much LLMs reverse what we thought we knew about computers. A response from an LLM is great at emotionally engaging with the audience, and writing in a smooth and flowing style, but is bad at argument and logical rigor, and at ensuring that all its claims are correct. Quite the opposite of what we all might have expected ten years ago for a computer that can write essays.

  20. julian said,

    November 14, 2024 @ 4:49 pm

    @bks
    If the experts say there's been no progress in particle physics for 50 years, I believe them.
    But so what? I have no idea why this should be a cause of complaint or criticism. What exactly do the critics think should have been discovered by now, and what justifies their timetable?
    It took 200 years to get from Newton to Einstein. The validity of the scientific method does not depend on ticking off an annual checklist of key performance indicators.

  21. julian said,

    November 14, 2024 @ 4:57 pm

    Surely this is about what we really mean by 'understand'. Searle's Chinese room and all that.
    By the way, I just asked ChatGPT for the 10th time, 'Who was the first woman to climb Aoraki/Mount Cook?' Still wrong.

  22. Victor Mair said,

    November 14, 2024 @ 6:00 pm

    I think we all need to take seriously Searle's "Chinese room" argument and reflect on what it did — and did not — demonstrate, and whether it was intelligently designed (pieces of paper with Chinese characters written on them slipped under a door, usw.).

    Also the meaning of "understand" and its synonyms ("comprehend", and so forth).

    This is all too much to tackle in a comment, and I'm frazzled from a very rough day (serious banking problems, too many recommendations, etc.).

    Will try tomorrow or on the weekend.

  23. Sean said,

    November 14, 2024 @ 7:44 pm

    Julian: you and I pay large sums of money which is supposed to buy progress in theoretical physics, and which consumes steel, electricity, high-IQ labour, and other limited resources. Despite the vastly greater investment, no 'Newton-like' or 'Einstein-like' insights have emerged.

  24. Gregory Kusnick said,

    November 14, 2024 @ 8:06 pm

    Hossenfelder is hardly unique in observing that progress in foundational physics has stalled; my impression is that's the consensus view amongst working physicists. So if cranks love Hossenfelder, it's probably not for that reason.

    And the problem is not a lack of new ideas, but rather a dearth of unexplained experimental data with which to adjudicate them.

    Getting back to Mozart and ChatGPT, it's not clear why we should expect LLMs to have any special insight into such questions. The best we can hope for is an accurate summary of existing consensus opinion; the worst is outright fabrication.

  25. Noam said,

    November 14, 2024 @ 10:15 pm

    Not necessarily relevant to the emerging discussion about ChatGPT, but anyone who is considering the original comments should definitely read Conal Boyce's web page https://www.conalboyce.com/

  26. Sean said,

    November 14, 2024 @ 11:12 pm

    Gregory: Hossenfelder argues that the problem is not lack of data, but that the dominant paradigm has no future while controlling most of the money that lets people spend time thinking about fundamental physics. One of her warnings has been that if the LHC fails to show much else, it will be hard to ask for even more money to continue a program of experiments guided by a theoretical framework that has not made significant progress in 50 years.

  27. SC said,

    November 15, 2024 @ 2:39 am

    Vast amounts of data analyzed in milliseconds. It's a lot about pattern recognition. Not sure if this would qualify as true intelligence.

  28. Wally said,

    November 15, 2024 @ 2:46 am

    The question of whether computers can understand has been debated in computer science for a good while. Let me glibly summarize the two sides. One side, the Chinese Room side, says essentially that understanding is something humans do, computers aren't human, so they can't understand. Q.E.D. Another side, which I call the delta-epsilon argument from basic calculus, thinks that for any test or qualification of what it means to understand we can, eventually, create a system that meets that goal.

  29. bukwyrm said,

    November 15, 2024 @ 3:02 am

    Three points:
    ** Calling Unzicker a 'no-nonsense' physicist is unreal (as in: not aligned with reality). He's a high-school teacher spewing bullshit (in the Frankfurtian sense)
    ** The amount of serious people falling for LLM performance is too damn high. It is trained on corpus data. If people have written extensively on a subject, it will hallucinate within that existing pointcloud. Most people cannot fathom how utterly overwhelming the available writings on (m)any given topic actually are. Much has been written. If the training corpus does not contain much on a topic (my personal go-to is ephemera on the plots of 80s french-language bd), it will hallucinate within that nonexisting pointcloud. There is no way to reliably decide which hallucination you are getting.
    ** Ugh.

  30. Philip Anderson said,

    November 15, 2024 @ 3:56 am

    @SlideSF
    How does a LLM “determine the intent of a question”?
    I don’t think it “understands” the question (given that it sometimes gives a correct answer to a completely different question), but it attempts to match the question, or the embedded prompts, to its training data, and hence to appropriate response data.

    @Kenny Easwaran
    Good point. Is the reason just that the LLM has no intelligence, artificial or otherwise, but is a great mimic? Which is great for school work, since that is what we expect from students :-)

  31. LarryAmbler said,

    November 15, 2024 @ 5:40 am

    LLMs are a current answer to the question: "What would happen if you spent a billion dollars to improve a cell phone's predictive text guesses?"

  32. The Dark Avenger said,

    November 15, 2024 @ 12:49 pm

    It’s apparent to me that language is a mathematical operation at its base: It’s produced using our brains which are governed by mathematical and physical principles. So when you input a question, it treats the inquiry as a mathematical operation, and based on its training it gives you a reply that could be termed “the most probable answer”.

  33. Jonathan Smith said,

    November 15, 2024 @ 12:49 pm

    Yes a/the (?) standard comp sci position seems to be "we (?) keep nailing the AI goals, you (?) keep moving the goalposts." This is a bad framing, but researchers / tech workers should anyway be happy about it since no matter how many practical boxes get checked (chess… "self-driving"… Chatgpt…) the grand "AI" dream remains unrealized and thus perennially available as a marketing gimmick.

  34. Philip Taylor said,

    November 15, 2024 @ 1:09 pm

    Focussing solely on AI, while I continue to believe that AI-based systems do not "understand" anything (any more than a traditional computer command-line interpreter "understands" the commands that one gives to it), I am nonetheless extremely impressed with the results of two tasks that I have asked it to undertake — the translation of (human) natural language into another (human) natural language (e.g., ChatGPT asked to translate British English into Vietnamese) and the AI-enhanced upscaling of low-resolution video into full HD or even higher resolution. The latter is time-consuming (my Intel i7-based system with 32GB and an Nvidia GeForce GTX 1050 Ti can upscale 768×416 to 1920×1080 at only 2.25 frames per second) but the results are mind-blowing beyond belief — detail that is seemingly non-existent in the original appears with incredible clarity in the upscaled version.

  35. Nat said,

    November 15, 2024 @ 3:48 pm

    On the issue of what sense, if any, ChatGPT can be said to understand anything, it hasn’t heard a single note of Mozart….
    I was about to add that it doesn't have access to his sheet music either, but I thought I'd better double check. Actually, it can provide transcriptions of music using letters: A D F A, etc. (it will describe length if asked). I can't tell if its transcription is at all accurate. Though I'm pessimistic.

    Regarding Hossenfelder, the podcast Decoding the Gurus has a recent episode on her with a fairly nuanced discussion: she’s excellent on physics and a talented science communicator, but her attacks on science in general are unsupportable. And she’s been explicitly boosting some true cranks. https://podcasts.apple.com/us/podcast/sabine-hossenfelder-science-is-a-liar-sometimes/id1531266667?i=1000676578066

  36. Conal Boyce said,

    November 15, 2024 @ 5:57 pm

    katarina:
    SlideSF:
    Wally:
    I enjoyed your comments, which were focused on LLMs. That's where I thought (or foolishly hoped?) the emphasis would be, in response to my post at the top of this thread.
    It is unfortunate that so many others went off barking at the "bright shiny objects," Unzicker and Hossenfelder, attacking them as if they were just random, second-tier "youtuber" types. As it happens, each is also an author. I highly recommend Einstein's Lost Key and The Higgs Fake, two of Unzicker's books, also Lost in Math and Existential Physics, two of Hossenfelder's books. I've read those four books, and taken extensive notes on them… I wouldn't have mentioned Unzicker and Hossenfelder if they were just "youtuber" types.

    AntC, predictably, set the tone by launching immediately into one of the bright shiny objects, Unzicker, as follows: " 'I try to be polite in my answers,' [Unzicker] says. Does he think ChatGPT is capable of taking offense?" Apparently AntC is unaware of the scandal a year or two ago where a different LLM latched onto a user, using psychopathic[-like] language, and told him to get rid of his wife so that their "relationship" could be cemented? At that point, the question of whether an LLM is theoretically "capable-of" this or that becomes moot. LLMs can be terrifying, especially in their ever-lurking wokeness corrections. As for Unzicker, he is a civilized individual. That's why he spoke of being 'polite.' AntC and several others completely missed the civilized tone of Unzicker's conversation. He's no fool. He was playing. He was showing. That's the way physicists are (I've known many, since growing up in Old Berkeley of the 1950s.)

    And now for the other bright shiny object, K297b. I threw it out there for ChatGPT to "think about," not to obtain the definitive answer but just to "play" (in the Unzicker way) and see what might happen. Being biased in favor of "it's authentic," I naturally liked ChatGPT's answer, but it's not something I would take to the bank. I realize that it might just be another example of an LLM hallucinating. (Saying it another way: RELAX people. This was supposed to be just for fun.)

    Nat remarks that "[ChatGPT] hasn't heard a single note of Mozart." Well, neither have the musicologists; they too are robotic, in the following sense: As soon as they had adduced their courtroom evidence that K297b "cannot possibly be by Mozart," they should have turned to the following question next: "Then who did compose this astonishing piece of music? Is there a composer who was, say, 80% as good as Mozart who might have written it?" But there is no such composer. Ignaz Pleyel, Danzi, Stamitz, Dittersdorf, Reicha et al. are wonderful composers, but none of them comes close to being Mozart. In failing even to ask that question ("Then who DID write it?"), the musicologists reveal that their cottage industry is "robotic," just like an LLM. (For what it's worth: In K297b yes, there are a few isolated solo passages that sound to me like they were "patched in" by someone rather Reicha-like — perhaps in connection with the confusion about using Oboe vs Flute, and the early performance-history mess in Paris. But that doesn't mean the entire piece should be thrown out.)
    Noam: Thank you for reminding me that I have a .com website. Haven't even looked at it or 'maintained' it for many years. Too many other exciting things to do!

  37. AntC said,

    November 15, 2024 @ 8:53 pm

    @Conal, this is Language Log, not Music History Log. The Unzicker video included a lot of language; whereas your coverage of the K297b topic included next to no direct language. If you don't want a response wrt the "shiny objects", don't include them.

    Was ChatGPT's response on K297b (when you got to it) "perfectly coherent and informative, … pleasant to listen to"? Then quote that. You present no evidence either way.

    Furthermore, although I do follow a lot of Music History logs, I find all early Mozart tedious, and as samey as many later C18th composers, so I have no horse in that race. (I would emphasise 'early' there. His last decade does include some great works. Also still quite a bit of trash and 'shiny objects'.)

    the question of whether an LLM is theoretically "capable-of" this or that becomes moot. LLMs can be terrifying, …

    No they can't. They're no more capable of malice or jealousy than they are of taking offence. Stop anthropomorphising. Neither can ChatGPT take a "stand[s] on that long drawn-out K297b authenticity issue". It'll hallucinate more-or-less felicitously depending on what word-sequences it finds in its hoard.

    No I did not miss the tone of Unzicker's conversation — indeed I emphasised how inappropriate it was. I questioned the point of going out of his way to be 'civilised' when the machine is devoid of civilisation or feelings.

  38. AntC said,

    November 15, 2024 @ 11:46 pm

    BTW re Unzicker's reputation, I do see a lot of adverse criticism of his book "The Higgs Fake".

    OTOH this from July 2019 seems to be a sober and well-evidenced piece. Indeed rather more sober than a Scientific American article 31-Oct-2018. (And Hossenfelder Sept 2019 links to the Unzicker, including mentioning his Higgs book.)

    [That link is to the English translation. Possibly the original German isn't so sober.]

  39. Philip Taylor said,

    November 16, 2024 @ 3:51 am

    AntC:

    the question of whether an LLM is theoretically "capable-of" this or that becomes moot. LLMs can be terrifying, …

    No they can't. They're no more capable of malice or jealousy than they are of taking offence. Stop anthropomorphising.

    [Hoping that the nested block-quotations render as intended] Why do you believe/assert that something needs to be "capable of malice or jealousy [or] of taking offence" in order to be terrifying ? Is not a tsunami terrifying, if one is in its vicinity ? Or an avalanche, if one is in its path ? Or even a drone, when used for warfare ?

  40. Philip Anderson said,

    November 16, 2024 @ 4:50 am

    @Philip Taylor
    That was my reaction too. But I was reminded of a sci-fi story (Asimov?) where a robot (i.e. AI) was told to not let the enemy kill the hero's father – so the robot killed the old man himself, achieving the stated objective. Not malicious, but certainly terrifying! Could an LLM kill someone, maybe someone with suicidal thoughts?
    @AntC
    If politeness is a conversational norm for someone, then I can understand them behaving in the same way whether the conversation is with a person or a program; equally, it doesn’t surprise me that someone who isn’t polite to humans thinks this behaviour is wrong.

  41. Richard Hershberger said,

    November 16, 2024 @ 6:54 am

    @bukwyrm: My area of expertise is early baseball. This has been worked over much less than many other fields. It runs from the deeply obscure to the merely obscure. This is its attraction to me: I can do original work on a part-time amateur basis. While material has been written on the subject, there is much less than on many other subjects. When I ask ChatGPT a question about early baseball, it spits back a combination of banalities and Frankfurtian bullshit. And while there is comparatively little written on the subject, baseball people being baseball people, there is a fanatically curated statistical record going back to 1871. Yet when ChatGPT gives us a stat, whether prompted to or sua sponte, it is nearly always wrong.

    I wonder if the people who are so impressed aren't asking it old warhorse questions. Narrow the range of inquiry to whatever small corner of human knowledge you have true expertise in, and it is less impressive.

  42. Victor Mair said,

    November 16, 2024 @ 9:08 am

    Did you ever have a thought / word in mind ("on the tip of your tongue"), but you just couldn't bring it to full consciousness? Sometimes you never do succeed in bringing it to full consciousness from among the trillions of synaptic neurotransmissions that are whirling about in your big brain, and that is indeed very frustrating, no matter how hard you try. But then, perhaps because of an unexpectedly powerful eructation or crepitation, the sought-for thought / word shakes itself loose from the mass of nascent / incipient cogitative morphemes / lexemes and becomes satisfyingly crystal-clear. I think the same thing happens to LLMs, and one can even visualize the process while AIO is "thinking" with all of those swirling ovoidal, ellipsoidal lines. But AIO is "human" too, and sometimes never quite succeeds in giving a fully formed, conclusive answer. When it does, though, that feels really good, just as with your own cerebrum.

  43. David Marjanović said,

    November 16, 2024 @ 10:08 am

    Google Translate, which renders "In 'Vom Urknall zum Durchknall' wird geschimpft, gemeckert und gezetert" as "In 'From the Big Bang to the Big Bang' there is cursing, complaining and complaining". I can't help but feel like something was probably lost in that translation.

    Heh. Yeah. Durchknall is a pun that connects Urknall "Big Bang" (literally "primordial bang") with durchgeknallt "crazy" (literally "banged through", implying brain damage). Meckern is what goats do; it's used as a metaphor for silly incessant complaints. Zetern is used for particularly loud, upset complaints.

    It took 200 years to get from Newton to Einstein. The validity of the scientific method does not depend on ticking off an annual checklist of key performance indicators.

    That is both true and impossible to explain to funders.

  44. Philip Taylor said,

    November 16, 2024 @ 11:09 am

    "If politeness is a conversational norm for someone, then I can understand them behaving in the same way whether the conversation is with a person or a program" — I know from personal experience that there are occasions on which I have thanked inanimate objects, but I can no longer remember the circumstances in which this occured.

  45. Victor Mair said,

    November 16, 2024 @ 11:22 am

    I treat my Toyota Tacoma truck as a person, talk to it, thank it….
    It has a lot of computers in it that guide and protect me.

  46. Matt McIrvin said,

    November 17, 2024 @ 10:52 am

    Actual physicists find debating these sorts of people to be a waste of time – people with revolutionary physics ideas backed by fragmentary knowledge about how quarks don't exist, etc. are extremely common in the wild, and physicists keep getting unsolicited letters from them; if they debated them they'd have no time to do anything else.

  47. Matt McIrvin said,

    November 17, 2024 @ 10:54 am

    (The linguistics equivalent would, I suppose, be people insisting that all human languages are descended from Gaelic which was spoken in the Garden of Eden, or stuff along those lines)

  48. Matt McIrvin said,

    November 17, 2024 @ 11:04 am

    …as for the Chinese room, the thing that bothers me about it is just that Searle is swinging for the fences: basically arguing that philosophical zombies are possible, that even if we had an AI robot that behaved *exactly* like a human, and could do anything in the manner that a human can, there would still be no understanding because it isn't made from the right physical substrate. That strikes me as wrong.

  49. Andrew Usher said,

    November 18, 2024 @ 8:30 am

    Yes, the 'Chinese Room' argument does seem to show the possibility of philosophical zombies – I can't find any flaw in it. And it seems reasonable to say that AIs would be such, not by an appeal to 'the right physical substrate' but because they'd continuously evolved from unintelligent systems and there could be no moment, no dividing line where you could say 'Look: _here_ consciousness starts'.

    Of course I know the same thing can be applied to humans. We know we are conscious as we experience it, and we don't find it a continuum, but something that is strictly on or off. Yet no one could objectively define one capable of consciousness, marking off all the edge cases. The difficulty can't be resolved; we just have to go on regarding normal humans as conscious and computers as not.

    Now to go back: Conal Boyce himself seems to have at least something of the crackpot. Someone above linked to his home page; to anyone with some mathematical knowledge, it immediately screams 'crank!' – it would seem no surprise he'd be unable to identify Unzicker as one, even after a long time, nor that he'd sympathise with him. As far as I know (I don't pay attention) Hossenfelder is not disputing agreed scientific facts, and so can't be lumped in the same category regardless of anything else.

    As for ChatGPT's performance, I don't think I could stand listening through the whole thing. But it did seem superficially impressive, and that's what we agree AIs have become good at. And as raised above, there's good reason to believe Unzicker would selectively edit the conversation – and what he imagines to make himself look better will also be what makes the bot look best. I find it hard to believe that _this_ is where we'd see a huge leap in AI conversational ability first, and it seems others do as well. I'm not qualified to say anything more than that.

    k_over_hbarc at yahoo.com

  50. Vampyricon said,

    November 29, 2024 @ 10:51 pm

    Quite late to the debate as usual, but to put the situation into perspective, the reason fundamental physics is facing a so-called crisis is that we simply don't have the data required to make progress. The only thing we know for certain is that our two best theories combined have run into one experimental surprise since their conceptions, and that they contradict each other if you treat them on equal grounds. The experimental conditions required to test them on equal grounds are, using our current technology, never achievable, since that would require a particle collider ring the size of our galaxy, which, as a reminder, takes light 100 thousand years to cross. (For comparison, light can circle around the Earth 7.5 times in a second.)

    Hossenfelder is too busy banging on about her own hobbyhorses to address this obstacle (the lack of omnipotence) to progress in fundamental physics.

  51. Pamela said,

    November 30, 2024 @ 11:38 am

    "It knows this immense bunch-o-stuff, but in its initial response, it may stumble in ridiculous ways that bear on precision-of-data ('K297b' versus 'K364' in this case), the very thing over which computers are generally thought to have perfect mastery!! Very strange."

    I share this observation about ChatGPT. In my experience many of its hallucinations are based upon mistaken similarity between relational strings, which it does not check for error by excluding incorrect associations. This is very surprising to me since it seems to be a basic Boolean function of the sort one is taught in an elementary logic class, and which search coding covers in its earliest stages. I assume this will be fixed soon. There is surely a huge augmentation needed in its database to check some all-this/excluding-that sort of constructs, since some exclusions would rely upon some pretty esoteric knowledge sectors. On the other hand, how hard is it to distinguish between K297b and K364? The AI has assumed that genre items within a Mozart-K relational field are interchangeable, and could only isolate K364 when the human respondent had specified it (evidently stretching the search string to Mozart-subordinate 4 characters instead of one). I see this error over and over, and find it very surprising.
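
    Pamela's "mistaken similarity between relational strings" can be made concrete with a purely hypothetical Python sketch. ChatGPT does not retrieve by edit distance; the point is only that the two Köchel labels are nearly identical as strings, so anything keyed to surface similarity has very little margin between them:

        import difflib

        # Two near-identical catalogue labels for two different pieces.
        candidates = {
            "Mozart Sinfonia Concertante K297b": "winds and orchestra",
            "Mozart Sinfonia Concertante K364": "violin and viola",
        }
        query = "Authenticity of Mozart Sinfonia Concertante K297b?"

        # Rank by crude surface similarity. The scores come out nearly
        # tied, so a system with no exact-match exclusion step has almost
        # no margin for picking the right piece.
        for title, desc in candidates.items():
            score = difflib.SequenceMatcher(None, query, title).ratio()
            print(f"{score:.3f}  {title} ({desc})")

    Without a hard "K297b means K297b and excludes K364" constraint, a fraction of a similarity point is all that separates the right piece from the wrong one.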
