Using AI in science can add to reproducibility woes

Using AI in science threatens to worsen problems in reproducing important results, the UK’s highly prestigious Royal Society has warned.

In its report “Science in the age of AI,” the 364-year-old institution argues that introducing AI into scientific research has created barriers to reproducibility – the idea that a particular result can be replicated by a different research team in another part of the world – through a lack of documentation, limited access to essential computing infrastructure and resources, and a struggle to understand how AI tools reach their conclusions.

The tech industry has enthusiastically promoted the idea that AI can help in science. In December last year, researchers claimed to have made the world’s first scientific discovery using large language models – a breakthrough suggesting LLMs like ChatGPT could push science forward faster than humans alone.

However, Professor Alison Noble, chair of the Royal Society’s Science in the Age of AI Working Group, is worried that the rapid uptake of AI in science presents challenges for its safe and rigorous use.

“A growing body of irreproducible studies are raising concerns regarding the robustness of AI-based discoveries,” she declared.

The proprietary nature of many AI tools, in particular, limits the ability to reproduce results.

“Barriers such as insufficient documentation, limited access to essential infrastructures (eg code, data, and computing power) and a lack of understanding of how AI tools reach their conclusions (explainability) make it difficult for independent researchers to scrutinise, verify and replicate experiments,” the paper explains.

The paper warns that reliance on AI in research could lead to “inflated expectations, exaggerated claims of accuracy, or research outputs based on spurious correlations.”

“In the case of AI-based research, being able to reproduce a study not only involves replicating the method, but also being able to reproduce the code, data, and environmental conditions under which the experiment was conducted (eg computing, hardware, software).”
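
By way of illustration, here is a minimal sketch in Python of what recording those environmental conditions might look like. The capture_environment helper, its fields, and the package list are assumptions made for this sketch, not anything the report specifies.

```python
# Illustrative sketch only: the helper name, fields, and package list are
# assumptions, not prescribed by the Royal Society report.
import json
import platform
import sys
from importlib import metadata

def capture_environment(packages=("numpy", "torch", "scikit-learn")):
    """Record the software and hardware details a replication would need."""
    env = {
        "python": sys.version,
        "os": platform.platform(),
        "machine": platform.machine(),
        "packages": {},
    }
    for name in packages:
        try:
            env["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            env["packages"][name] = "not installed"
    return env

# Publish this record alongside the code and data.
with open("environment.json", "w") as f:
    json.dump(capture_environment(), f, indent=2)
```

Publishing a record like this alongside code and data gives a replicating team a concrete description of the software stack to rebuild.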

The Royal Society was founded in 1660 and scientists including Isaac Newton, chemist Humphry Davy, and Ernest Rutherford, discoverer of the atomic nucleus, are among its past presidents.

In its AI paper, the Society warns reproducibility failures not only risk the validity of the individual study, but could skew subsequent research.

One study the report cites, led by the Center for Statistics and Machine Learning at Princeton University, shows how “data leakage” – a leading cause of errors in ML applications – affected 294 papers across 17 scientific fields, including high-stakes fields like medicine.
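
Data leakage, in this context, means information from the evaluation data seeping into model training, which inflates reported accuracy. Below is a minimal sketch of one common form, fitting a preprocessing step before the train/test split, using scikit-learn; the example is illustrative and not drawn from the Princeton study itself.

```python
# Illustrative example of data leakage, not taken from the Princeton study:
# fitting a scaler on the full dataset lets test-set statistics influence
# training, which can quietly inflate the measured accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Leaky: the scaler sees the test rows before the split.
X_all_scaled = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X_all_scaled, y, random_state=0)
leaky = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)

# Correct: split first, then fit the scaler on training rows only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
clean = LogisticRegression().fit(scaler.transform(X_tr), y_tr).score(
    scaler.transform(X_te), y_te)

print(f"leaky accuracy: {leaky:.3f}, clean accuracy: {clean:.3f}")
```

On toy data like this the gap may be negligible, but with small datasets or heavier preprocessing the leaky pipeline can report accuracy the model will never achieve on genuinely unseen data.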

Models developed in a commercial setting can add to the problem. “For instance, most leading LLMs are developed by large technology companies like Google, Microsoft, Meta, and OpenAI. These models are proprietary systems, and as such, reveal limited information about their model architecture, training data, and the decision-making processes that would enhance understanding.”

To address these challenges, the report says scientists should adopt open science principles – for example, the UNESCO Recommendation on Open Science. It also suggests that grand challenges – such as the ML Reproducibility Challenge, which invites participants to reproduce papers published in 11 top ML conferences – could help.

In August last year, researchers warned that poor data quality was also a problem in AI-based research, and that difficulties in reproducing AI-assisted results arose from the random, or stochastic, approach used to train deep learning models.

The Stanford computer science team argued that standardized benchmarks and experimental design can alleviate such issues.
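
On the stochasticity point, here is a minimal sketch of how a team might pin the main sources of randomness in a training run. PyTorch is assumed purely for illustration; other frameworks expose similar switches, and full determinism on GPUs can still require extra, framework-specific settings.

```python
# Illustrative sketch: PyTorch is assumed purely for demonstration.
import random

import numpy as np
import torch

def seed_everything(seed: int = 42) -> None:
    """Pin the common sources of randomness in a training run."""
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy RNG
    torch.manual_seed(seed)           # PyTorch CPU (and default CUDA) RNG
    torch.cuda.manual_seed_all(seed)  # every visible GPU, if any
    # Trade some speed for repeatable cuDNN convolution results.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

seed_everything(42)
```

Even with seeds pinned, results can differ across hardware and library versions, which is why the report ties reproducibility to documenting the full computing environment as well.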

“Another direction towards improving reproducibility is through open source initiatives that release open models, datasets and education programmes,” the Royal Society’s research paper adds. ®
