Breaking news

OpenAI Releases GDPval Benchmark To Gauge AI Performance Against Human Experts

New Benchmark Sheds Light on AI’s Capabilities

OpenAI has unveiled GDPval, a new benchmark designed to evaluate its AI models against human professionals across a broad spectrum of industries. This initiative represents a critical step in understanding how far today’s AI is from matching or surpassing the work quality of experts in sectors such as healthcare, finance, manufacturing, and government.

Methodology and Industry Scope

The GDPval benchmark focuses on nine major industries contributing to America’s gross domestic product and tests AI performance in 44 distinct occupations, ranging from software engineering to nursing and journalism. In its initial version, GDPval-v0, industry professionals compared reports generated by AI models with those produced by their human counterparts. For instance, investment bankers were tasked with evaluating competitor landscape analyses for the last-mile delivery industry, a task chosen to reflect the complexity of real-world work.

Comparative Performance: AI Advances and Limitations

Results indicate promising progress: OpenAI’s GPT-5-high, an enhanced iteration of its flagship model, was rated better than or on par with industry experts in 40.6% of head-to-head comparisons. More notably, Anthropic’s Claude Opus 4.1 reached nearly 49% on the same measure. However, OpenAI acknowledges that these models are not yet positioned to replace human labor entirely, as the current iteration of GDPval covers only a narrow slice of actual job responsibilities.

Expert Insights and Future Directions

In a discussion with TechCrunch, OpenAI’s chief economist, Dr. Aaron Chatterji, noted that the benchmark’s favorable outcomes suggest professionals may soon delegate routine tasks to AI. This, he argued, will free up valuable time for higher-impact work. Tejal Patwardhan, OpenAI’s evaluations lead, also expressed optimism, pointing to the jump from GPT-4o’s 13.7% score to nearly triple that figure with GPT-5.
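The arithmetic behind these figures can be sketched in a few lines. This is a minimal illustration, not OpenAI's grading code: the `win_or_tie_rate` helper and the sample verdicts are hypothetical, and only the 13.7% and 40.6% scores come from the reporting above.

```python
from collections import Counter


def win_or_tie_rate(verdicts):
    """Share of comparisons where the model's output was judged
    better than ("win") or on par with ("tie") the human expert's."""
    counts = Counter(verdicts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no verdicts to score")
    return (counts["win"] + counts["tie"]) / total


# Hypothetical grader verdicts, for illustration only (not GDPval data).
sample = ["win", "loss", "tie", "loss", "win",
          "loss", "loss", "tie", "loss", "loss"]
sample_rate = win_or_tie_rate(sample)

# Scores reported in the article, in percent.
earlier_score = 13.7
gpt5_high_score = 40.6
improvement_factor = gpt5_high_score / earlier_score

print(f"sample win-or-tie rate: {sample_rate:.1%}")
print(f"GPT-5-high vs. earlier score: {improvement_factor:.2f}x")
```

On the reported numbers the ratio comes to just under 3, which is consistent with the "nearly triple" description.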

Benchmarking And The Road To Comprehensive AI Evaluation

While GDPval represents an early milestone, it joins a range of benchmarks that Silicon Valley companies already use to measure AI progress, such as AIME 2025 and GPQA Diamond, and aims to better quantify AI proficiency in real-world applications. OpenAI plans to expand GDPval to cover more industries and interactive workflows, aiming to bolster its claims about AI’s growing economic value.

As the benchmark evolves, GDPval could play an instrumental role in the ongoing debate around artificial general intelligence, highlighting the potential and limitations of AI models poised to reshape the modern workforce.

Webflow Strengthens Marketing Suite With Acquisition Of AI-Powered Vidoso

Strategic Acquisition For Enhanced Marketing

Webflow, a leading software platform for website building and hosting, has acquired AI-driven content-generation platform Vidoso to advance its suite of marketing offerings. The move signals Webflow’s strategic shift from being recognized solely as a website builder and CMS provider to emerging as a holistic, agentic marketing platform.

Integrating AI With Content Creation

Vidoso, founded in 2024, uses large language models to help organizations generate marketing materials such as images, presentations, video clips, blog posts and social media content. One of the platform’s features allows users to convert long-form content, including keynote presentations or panel discussions, into shorter formats such as video clips and blog posts. Following the acquisition, Vidoso’s four-person team will join Webflow, and the technology is expected to be integrated into the company’s broader content and marketing tools.

Driving Operational Efficiency In A Competitive Market

Webflow has raised more than $330 million in funding and has previously expanded its marketing capabilities through acquisitions and partnerships. Earlier initiatives included the acquisition of personalization platform Intellimize and the launch of integrations with advertising platforms such as Google Ads. The company is operating in an increasingly competitive market as startups develop AI tools for marketing automation. Competitors in this space include companies such as Kana, Hightouch and Blueshift. Webflow CEO Linda Tong said the company aims to build a platform that connects brand management, demand generation, product marketing and content development within a single system.

Closing The Gap With Branded AI Content

Vidoso’s CEO, Sharad Verma, explained that earlier iterations of AI delivered generic content that lacked alignment with individual brand systems. “Frontier models are trained on the average of the internet, not on the specifics of your brand,” Verma stated, emphasizing how Vidoso’s platform addresses this shortfall by ensuring consistent, governed, and production-ready content that aligns with existing marketing workflows.

A Forward-Looking Vision

Webflow views the acquisition as part of a broader shift toward AI-assisted marketing tools that combine content creation with performance insights. According to Tong, integrating these capabilities into a single platform allows companies to create marketing assets while analyzing their performance and refining future campaigns.
