Until this year, it was possible to learn a lot about artificial intelligence technology simply by reading research documentation published by Google and other AI leaders with each new program they released. Open disclosure was the norm for the AI world.
All that changed in March of this year, when OpenAI elected to announce its latest program, GPT-4, with very little technical detail. The research paper provided by the company obscured just about every important detail of GPT-4 that would allow researchers to understand its structure and to attempt to replicate its effects.
Last week, Google continued that new obfuscation approach, announcing the formal release of its newest generative AI program, Gemini, developed in conjunction with its DeepMind unit, which was first unveiled in May. The Google and DeepMind researchers offered a blog post devoid of technical specifications, and an accompanying technical report almost completely devoid of any relevant technical details.
Much of the blog post and the technical report cite a raft of benchmark scores, with Google boasting of beating out OpenAI's GPT-4 on most measures and beating Google's former top neural network, PaLM.
Neither the blog nor the technical paper include key details customary in years past, such as how many neural net "parameters," or, "weights," the program has, a key aspect of its design and function. Instead, Google refers to three versions of Gemini, with three different sizes, "Ultra," "Pro," and "Nano." The paper does disclose that Nano is trained with two different weight counts, 1.8 billion and 3.25 billion, while failing to disclose the weights of the other two sizes.
Numerous other technical details are absent, just as with the GPT-4 technical paper from OpenAI. In the absence of technical details, online debate has focused on whether the boasting of benchmarks means anything.
OpenAI researcher Rowan Zellers wrote on X (formerly Twitter) that Gemini is "super impressive," and added, "I also don't have a good sense on how much to trust the dozen or so text benchmarks that all the LLM papers report on these days."
Tech news site TechCrunch's Kyle Wiggers reports anecdotes of poor performance by Google's Bard search engine, enhanced by Gemini. He cites posts on X by people asking Bard questions such as movie trivia or vocabulary suggestions and reporting the failures.
The sudden swing to secrecy by Google and OpenAI is becoming a major ethical issue for the tech industry because no one knows, outside the vendors -- OpenAI and its partner Microsoft, or, in this case, Google's Google Cloud unit -- what is going on in the black box in their computing clouds.
Google's lack of disclosure, while not surprising given its commercial battle with OpenAI, and partner Microsoft, for market share, is made more striking by one very large omission: model cards.
Model cards are a form of standard disclosure used in AI to report on the details of neural networks, including potential harms of the program (hate speech, etc.) While the GPT-4 report from OpenAI omitted most details, it at least made a nod to model cards with a "GPT-4 System Card" section in the paper, which it said was inspired by model cards.
Google doesn't even go that far, omitting anything resembling model cards. The omission is particularly strange given that model cards were invented at Google by a team that included Margaret Mitchell, formerly co-lead of Ethical AI at Google, and former co-lead Timnit Gebru.
Instead of model cards, the report offers a brief, rather bizarre passage about the deployment of the program with vague language about having model cards at some point.
If Google puts question marks next to model cards in its own technical disclosure, one has to wonder what the future of oversight and safety is for neural networks.
https://www.zdnet.com/