Google Gemini is Over a Year Behind OpenAI

Google Gemini is inferior to OpenAI GPT-4 in independent testing. GPT-4 was released over a year ago.

Anthropic just released Claude 3. It is competitive with OpenAI GPT-4 in independent testing.

Some testers are finding that Claude hallucinates less and does more to improve coding productivity.

Here is the Hugging Face arena leaderboard for AI chatbots. Claude 3 is not yet ranked, but it is claimed to score 200 points higher than Claude 2, which would put it ahead of GPT-4. GPT-5 should be released in a few months.
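To give a feel for what a 200-point gap would mean, here is a minimal sketch. It assumes the leaderboard scores are Elo-style ratings on the usual 400-point logistic scale, which the post does not state. The function name `elo_expected_score` is made up for illustration.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score for the higher-rated side under the standard Elo curve.

    Assumption: ratings follow the conventional 400-point logistic scale.
    The expected score counts a win as 1 and a tie as 0.5.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


# Compare a few rating gaps, including the claimed 200-point improvement.
for gap in (0, 100, 200):
    print(f"gap {gap:>3}: expected score {elo_expected_score(gap):.2f}")
```

Under that assumption, a model rated 200 points higher would be expected to win roughly three out of four head-to-head comparisons against the lower-rated one.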

Mistral AI is a French company selling artificial intelligence (AI) products. It was founded in April 2023 by former employees of Meta Platforms and Google DeepMind. The company raised 385 million euros in October 2023, and in December 2023 it was valued at more than $2 billion.

It produces open-source large language models, citing the foundational importance of open-source software and positioning its models as a response to proprietary ones.

As of March 2024, two models, Mistral 7B and Mixtral 8x7B, have been published and are available as open weights. Three more models, Small, Medium and Large, are closed and available via API only. Mixtral 8x7B is currently ranked as the best open model.

  1. Zvi Mowshowitz’s review on Substack is pretty positive, but Claude does still have some significant bias issues. Nothing as gross as Gemini, but clear bias in favor of racial/ethnic/sexual minorities and left-wing politics.

    Supposedly this is despite an internal prompt reminding it not to be biased… Perhaps its training resulted in a warped notion of what “bias” means. I could see that happening.

    The other issue is, of course, refusals. Despite that internal prompt explicitly directing it to not refuse requests that embody views held by significant numbers of people, Claude is still much more likely to refuse tasks premised on ‘right wing’ views.

    Zvi is rather concerned about the whole notion of ‘harm’-based refusals being way too broad. So am I.

    • I used to use ChatGPT, but after using Gemini Advanced I think that Gemini has better answers in all aspects. This is like iPhone vs. Android phones: it’s all about marketing.

  2. You got that correct: the more capable a model is, the harder it is to make it general-public-proof. GPT-4.5 could be a useful stopgap for OpenAI.

