Google Bard says the Earth is flat (LLM bunk)

I'd be interested to know what you folks find if you put a similar request to ChatGPT or Google Gemini about some other area where you have enough interest and knowledge to spot what they get right and what they get wrong.
I asked ChatGPT about the Schwimmwagen. It knew it featured a VW engine, but failed to mention that the whole car was produced by Volkswagen. It also claimed that one of its roles was to tow small artillery pieces, which it absolutely cannot do, and missed its role as command transport for tank units. "Open top for easy access"? Yeah, no, doors would provide easy access, if it had them.

Generally speaking, its response was mostly unspecific and rife with generalities, like a high schooler bullshitting their essay the night before it is due. The Wikipedia article on the subject is detailed, informative, and correct.
 
Google has more festivals, and given the search terms, I feel that's a valid result, including the videos it delivers.

Maybe Google is better at it. I was being lazy and using Bing, as that's what my browser defaults to. This is a low-budget PC running a locked version of Windows, so I can't install Chrome, though I can open a Google page and search from there. I wonder if you being in Germany makes a difference?

I get called 3-4 times a day by various telemarketers wanting to help me with my Google business listing and try to boost my search results. That is, I'm guessing, move my company's name up in searches for contractors in my area. I don't need it, but it'd be interesting to see what they can do and for how much. And is it localized? I would think a search optimization scheme for smaller local businesses would have to localize. Amazon wants to be a hit on every search across the country, but my name popping up in a search for contractors by @Ann K in Ohio would be useless to both of us.

Might be a fun experiment thread where we all type in the exact same search term to Google or Bing or Duck Duck Goose if someone has it, and see what different results we get around the world.
 
Article:
Amazon conducted the test earlier this year for Australia's corporate regulator, the Securities and Investments Commission (ASIC), using submissions made to an inquiry.

The test involved testing generative AI models before selecting one to ingest five submissions from a parliamentary inquiry into audit and consultancy firms. The most promising model, Meta's open source model Llama2-70B, was prompted to summarise the submissions with a focus on ASIC mentions, recommendations, references to more regulation, and to include the page references and context.

Ten ASIC staff, of varying levels of seniority, were also given the same task with similar prompts. Then, a group of reviewers blindly assessed the summaries produced by both humans and AI for coherency, length, ASIC references, regulation references and for identifying recommendations. They were unaware that this exercise involved AI at all.

These reviewers overwhelmingly found that the human summaries beat out their AI competitors on every criterion and on every submission, scoring an 81% on an internal rubric compared with the machine's 47%.

Reviewers told the report's authors that AI summaries often missed emphasis, nuance and context; included incorrect information or missed relevant information; and sometimes focused on auxiliary points or introduced irrelevant information. Three of the five reviewers said they guessed that they were reviewing AI content.

The reviewers' overall feedback was that they felt AI summaries may be counterproductive and create further work because of the need to fact-check and refer to original submissions which communicated the message better and more concisely.
 