The word reliably is doing a lot of work here. I was using one of the bigger LLMs (honestly I can't remember which one) after they started putting citations into their responses. I thought, this is great, now I can look up the actual source if I need a more in-depth understanding...
Well, a couple of prompts later, after I asked it some details about a signal processing algorithm, it tells me "for a more in-depth discussion of the algorithm, see citation A (a very general DSP book that likely did not cover the specific topic in depth) or the special issue on [topic of my question] in IEEE Journal of X".
So I think, "great, there's a special issue on this topic, that's just what I need." A quick Google turns up nothing, so I prompt the AI: "Can you provide a more specific reference to the special issue in...". The answer: "There is no special issue on [topic]...". So LLMs make up citations just as they make up everything else.
I asked Claude to translate a book title from Hebrew (well, not translate exactly, but locate the original English title of the same book).
That's not a language I speak or generally have anything else to do with.
I then asked it an unrelated question about a science topic and it returned something with a citation. When I clicked on the citation, not only was it not relevant to the science question it was supposedly supporting, it was basically a 1970s conspiracy theory about Jews controlling the media.
Which somehow seems even worse than my usual experience of the link being a totally made-up dead end.
Seems apt, because people's relationship with journalists and facts seems to be about the same: most people take it at face value, and SMEs decry the poor reporting.
That's not the type of citation they're talking about. Gemini uses a tool call to the Google search engine and thus can cite and read proper links. You're talking about an LLM that just hallucinates citations which don't exist.
Is Gemini the same thing that shows up in the Google Search AI box? Because that thing is wrong all the time.
Just the other day I was searching for some details about the Metal graphics API, and something weird caught my eye as I scrolled past the AI stuff. Curious, I engaged, asking some more basic questions, and the answers were just... wrong.
Even right now, “what is the default vertex winding order in Metal?” comes back wrong. Or how about “does Metal use a left- or right-handed coordinate system for normalized device coordinates?”. I mean, this is day-one, intro-level stuff, and easily found on Apple’s dev site.
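For what it's worth, and assuming I'm reading Apple's documentation correctly: Metal's default front-facing winding is clockwise, and its NDC space is left-handed, with x and y in [-1, 1] and z in [0, 1]. A minimal Swift sketch that just makes those choices explicit on a render command encoder (the encoder itself is assumed to exist elsewhere):

    import Metal

    // Sketch only: `encoder` is assumed to be an existing MTLRenderCommandEncoder
    // obtained from a command buffer elsewhere.
    func makeWindingExplicit(on encoder: MTLRenderCommandEncoder) {
        // Apple's docs list .clockwise as the default front-facing winding,
        // so meshes exported with counter-clockwise front faces need this override:
        encoder.setFrontFacing(.counterClockwise)
        encoder.setCullMode(.back)

        // Metal's normalized device coordinates are left-handed:
        // x and y run from -1 to 1, but z runs from 0 to 1 (not -1 to 1 as in OpenGL),
        // so projection matrices borrowed from GL need their z range remapped.
    }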
And the “citations” are ridiculous. It references some Stack Overflow commentary or a Reddit thread where someone asks a similar question, but the answer there is “I don’t know about Metal, but Vulkan/D3D use (something different)”. Seriously, wtf.
GPT-4 gives the same wrong answers, with almost the same citations. GPT-5 gets it right, at least for the examples above.
Either way, it’s hard to trust it for things you don’t know, when you can’t for things you do.
Maybe it's Gemini, maybe it's another one of their models, but I'm specifically talking about LLMs like Gemini, or, if you want a better example, Perplexity, which crawls web pages first and then cites them, so that there aren't bogus citations.