Cohere vs. OpenAI in the Enterprise: Which Will CIOs Choose?
- by 7wData
OpenAI has just announced an enterprise version of its popular generative AI product, ChatGPT. But in this case OpenAI is a fast follower, not first to market. Cohere, a Toronto-based company with close ties to Google, is already bringing generative AI to businesses.
I spoke with Cohere’s President and COO, Martin Kon, about how its machine learning models are being used within enterprises.
Cohere is only a few years old, but it has an impressive pedigree. Two of Cohere’s founders worked in the recent past for Google Brain, which kickstarted the current craze around generative AI. In 2017, Google Brain introduced the “transformer” model for Natural Language Processing (NLP) — the ‘T’ in ChatGPT. Aidan Gomez and Nick Frosst, the CEO and CTO respectively of Cohere, then teamed up with Ivan Zhang to commercialize this form of NLP at Cohere.
Martin Kon is brand new to the company, having started just last month. But like the founders, he also has Google ties, having worked for YouTube for six years prior to joining Cohere. He was brought on board to run the business operations side of Cohere — and business, it seems, is booming.
According to Kon, Cohere has experienced a “65% month-on-month growth over the past year in API calls [and] similar in number of developers.”
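To put that figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the 65% rate held steadily in each of the 12 months, which the quote does not strictly say, so treat the result as illustrative only. Under that assumption, usage would have multiplied roughly 400-fold over the year.

```python
# Assumption: the quoted 65% month-on-month rate applies evenly to all 12 months.
MONTHLY_GROWTH = 0.65
MONTHS = 12


def compound_factor(rate, periods):
    """Return the total growth multiple after compounding `rate` for `periods`."""
    return (1 + rate) ** periods


factor = compound_factor(MONTHLY_GROWTH, MONTHS)
print(f"Growth over {MONTHS} months: about {factor:.0f}x")
```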
Now that it has traction, Cohere has switched focus to bringing its large language models and associated tooling to the enterprise.
“We’re working with developers in organizations, the AI/ML teams, to bring these capabilities into their organizations,” said Kon. He claims that its approach is fundamentally different to OpenAI’s.
“OpenAI wants you to bring your data to their models, exclusive to Azure. Cohere wants to bring our models to your data, in whatever environment you feel comfortable in.”
Cohere has two types of LLM (large language model): generation and representation. The former is what ChatGPT does, the latter is for understanding language (for example, to do sentiment analysis). Each type comes in different sizes: small, medium, large, and xlarge. There are various tradeoffs between the size of the model and the speed it can work at.
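To make the “representation” idea concrete, here is a minimal sketch of sentiment analysis built on text vectors. It does not use Cohere’s API: the `embed` function is a toy bag-of-words stand-in for a real representation model, and the labeled examples are invented for illustration. The point is only the shape of the workflow: turn text into a vector, then compare it with vectors of known examples.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a representation model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)  # Counter yields 0 for missing words
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical labeled examples, invented for this sketch.
LABELED = [
    ("i love this product", "positive"),
    ("great service and fast delivery", "positive"),
    ("i hate this product", "negative"),
    ("terrible service and slow delivery", "negative"),
]


def classify(text):
    """Return the label of the most similar labeled example."""
    vec = embed(text)
    best = max(LABELED, key=lambda ex: cosine(vec, embed(ex[0])))
    return best[1]


print(classify("love the fast delivery"))
print(classify("i hate slow service"))
```

A real representation model would replace `embed` with learned vectors that capture meaning rather than exact word overlap; a generation model, by contrast, would produce new text instead of a vector.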
Cohere’s base model has 52 billion parameters, according to Stanford’s HELM (Holistic Evaluation of Language Models) rankings. The HELM website notes that this figure applies to the “xlarge” version, the largest of Cohere’s models. OpenAI’s GPT-3 davinci model, its largest, is listed by Stanford as having 175B parameters.
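As a quick check on the size gap, the two parameter counts listed by HELM can be compared directly. The sketch below uses only those two figures; the variable names are just for illustration.

```python
# Parameter counts as listed by Stanford's HELM, per the figures above.
COHERE_XLARGE_PARAMS = 52e9
GPT3_DAVINCI_PARAMS = 175e9

ratio = GPT3_DAVINCI_PARAMS / COHERE_XLARGE_PARAMS
print(f"GPT-3 davinci is about {ratio:.1f}x the size of Cohere xlarge")
```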
During our conversation, Kon said that Cohere’s models had been shown to outperform GPT-3 in testing. I asked the company to verify this, and it responded by pointing me to Stanford’s accuracy measurements. According to Cohere, “the study shows that the Cohere xlarge model achieves higher accuracy than a number of well-known models which are 3x larger, including GPT-3, Jurassic-1 Jumbo, and BLOOM (each of which has about 175B parameters).”
However, it should be noted that Cohere’s model is ahead only of the original GPT-3 models. OpenAI’s more recent GPT-3.5 models, text-davinci-002 and text-davinci-003, are both rated higher than Cohere’s model in accuracy. Indeed, these currently rank highest of all models by the HELM accuracy measure (see below).
Kon told me that Cohere’s latest model, Command (currently in beta), gets re-tuned every week. “This means that every week, you can expect the performance of command to improve,” he said.
According to the documentation, Command is “a generative model that responds well with instruction-like prompts.” Davinci, as a point of comparison, is described by OpenAI as being good at “complex intent, cause and effect, summarization for audience.”