Close Menu
Nabka News
  • Home
  • News
  • Business
  • China
  • India
  • Pakistan
  • Political
  • Tech
  • Trend
  • USA
  • Sports

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

What's Hot

2025 World AI Conference kicks off in Shanghai-Xinhua

July 27, 2025

Foreigner arrested for selling donkey meat in Islamabad

July 27, 2025

PM Shehbaz approves digital ecosystem for FBR reforms

July 26, 2025
Facebook X (Twitter) Instagram
  • Home
  • About NabkaNews
  • Advertise with NabkaNews
  • DMCA Policy
  • Privacy Policy
  • Terms of Use
  • Contact us
Facebook X (Twitter) Instagram Pinterest Vimeo
Nabka News
  • Home
  • News
  • Business
  • China
  • India
  • Pakistan
  • Political
  • Tech
  • Trend
  • USA
  • Sports
Nabka News
Home » AI models need more standards and tests, say researchers
Trend

AI models need more standards and tests, say researchers

i2wtcBy i2wtcJune 22, 2025No Comments4 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Email WhatsApp Copy Link
Follow Us
Google News Flipboard Threads
Share
Facebook Twitter LinkedIn Pinterest Email Copy Link


As the usage of artificial intelligence — benign and adversarial — increases at breakneck speed, more cases of potentially harmful responses are being uncovered.

Pixdeluxe | E+ | Getty Images

As the usage of artificial intelligence — benign and adversarial — increases at breakneck speed, more cases of potentially harmful responses are being uncovered. These include hate speech, copyright infringements or sexual content.

The emergence of these undesirable behaviors is compounded by a lack of regulations and insufficient testing of AI models, researchers told CNBC.

Getting machine learning models to behave the way it was intended to do so is also a tall order, said Javier Rando, a researcher in AI.

“The answer, after almost 15 years of research, is, no, we don’t know how to do this, and it doesn’t look like we are getting better,” Rando, who focuses on adversarial machine learning, told CNBC.

However, there are some ways to evaluate risks in AI, such as red teaming. The practice involves individuals testing and probing artificial intelligence systems to uncover and identify any potential harm — a modus operandi common in cybersecurity circles.

Shayne Longpre, a researcher in AI and policy and lead of the Data Provenance Initiative, noted that there are currently insufficient people working in red teams.

While AI startups are now using first-party evaluators or contracted second parties to test their models, opening the testing to third parties such as normal users, journalists, researchers, and ethical hackers would lead to a more robust evaluation, according to a paper published by Longpre and researchers.

“Some of the flaws in the systems that people were finding required lawyers, medical doctors to actually vet, actual scientists who are specialized subject matter experts to figure out if this was a flaw or not, because the common person probably couldn’t or wouldn’t have sufficient expertise,” Longpre said.

Adopting standardized ‘AI flaw’ reports, incentives and ways to disseminate information on these ‘flaws’ in AI systems are some of the recommendations put forth in the paper.

With this practice having been successfully adopted in other sectors such as software security, “we need that in AI now,” Longpre added.

Marrying this user-centred practice with governance, policy and other tools would ensure a better understanding of the risks posed by AI tools and users, said Rando.

We're pursing a path of AI development that's extremely harmful to a lot of people, says Karen Hao

No longer a moonshot

Project Moonshot is one such approach, combining technical solutions with policy mechanisms. Launched by Singapore’s Infocomm Media Development Authority, Project Moonshot is a large language model evaluation toolkit developed with industry players such as IBM and Boston-based DataRobot.

The toolkit integrates benchmarking, red teaming and testing baselines. There is also an evaluation mechanism which allows AI startups to ensure that their models can be trusted and do no harm to users, Anup Kumar, head of client engineering for data and AI at IBM Asia Pacific, told CNBC.

Evaluation is a continuous process that should be done both prior to and following the deployment of models, said Kumar, who noted that the response to the toolkit has been mixed.

“A lot of startups took this as a platform because it was open source, and they started leveraging that. But I think, you know, we can do a lot more.”

Moving forward, Project Moonshot aims to include customization for specific industry use cases and enable multilingual and multicultural red teaming.

Higher standards

Pierre Alquier, Professor of Statistics at the ESSEC Business School, Asia-Pacific, said that tech companies are currently rushing to release their latest AI models without proper evaluation.

“When a pharmaceutical company designs a new drug, they need months of tests and very serious proof that it is useful and not harmful before they get approved by the government,” he noted, adding that a similar process is in place in the aviation sector.

AI models need to meet a strict set of conditions before they are approved, Alquier added. A shift away from broad AI tools to developing ones that are designed for more specific tasks would make it easier to anticipate and control their misuse, said Alquier.

“LLMs can do too many things, but they are not targeted at tasks that are specific enough,” he said. As a result, “the number of possible misuses is too big for the developers to anticipate all of them.”

Such broad models make defining what counts as safe and secure difficult, according to a research that Rando was involved in.

Tech companies should therefore avoid overclaiming that “their defenses are better than they are,” said Rando.



Source link

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email WhatsApp Copy Link
i2wtc
  • Website

Related Posts

Trend

AI is radically changing entry-level jobs, but not eliminating them

July 26, 2025
Trend

Elon Musk’s Grok AI is now partnered with Kalshi and Polymarket

July 25, 2025
Trend

In AI attempt to take over world, there’s no ‘kill switch’ to save us

July 24, 2025
Trend

ServiceNow (NOW) Q2 earnings 2025

July 23, 2025
Trend

AI, Big Tech plays right now from Ark Invest’s top ranked ETFs

July 23, 2025
Trend

Amazon to buy AI company Bee that makes wearable listening device

July 22, 2025
Add A Comment
Leave A Reply Cancel Reply

Top Posts

2025 World AI Conference kicks off in Shanghai-Xinhua

July 27, 2025

House Republicans unveil aid bill for Israel, Ukraine ahead of weekend House vote

April 17, 2024

Prime Minister Johnson presses forward with Ukraine aid bill despite pressure from hardliners

April 17, 2024

Justin Verlander makes season debut against Nationals

April 17, 2024
Don't Miss

Trump says China’s Xi ‘hard to make a deal with’ amid trade dispute | Donald Trump News

By i2wtcJune 4, 20250

Growing strains in US-China relations over implementation of agreement to roll back tariffs and trade…

Donald Trump’s 50% steel and aluminium tariffs take effect | Business and Economy News

June 4, 2025

The Take: Why is Trump cracking down on Chinese students? | Education News

June 4, 2025

Chinese couple charged with smuggling toxic fungus into US | Science and Technology News

June 4, 2025

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

About Us
About Us

Welcome to NabkaNews, your go-to source for the latest updates and insights on technology, business, and news from around the world, with a focus on the USA, Pakistan, and India.

At NabkaNews, we understand the importance of staying informed in today’s fast-paced world. Our mission is to provide you with accurate, relevant, and engaging content that keeps you up-to-date with the latest developments in technology, business trends, and news events.

Facebook X (Twitter) Pinterest YouTube WhatsApp
Our Picks

2025 World AI Conference kicks off in Shanghai-Xinhua

July 27, 2025

Foreigner arrested for selling donkey meat in Islamabad

July 27, 2025

PM Shehbaz approves digital ecosystem for FBR reforms

July 26, 2025
Most Popular

PwC loses two-thirds of accounting revenue from mainland China-listed clients

July 18, 2024

Highway bridge collapses, several killed – DW – July 20, 2024

July 20, 2024

Floods and bridge collapse kill at least 25 in China, rescuers search for dozens missing

July 23, 2024
© 2025 nabkanews. Designed by nabkanews.
  • Home
  • About NabkaNews
  • Advertise with NabkaNews
  • DMCA Policy
  • Privacy Policy
  • Terms of Use
  • Contact us

Type above and press Enter to search. Press Esc to cancel.