OpenAI Releases GDPval Benchmark To Gauge AI Performance Against Human Experts

New Benchmark Sheds Light on AI’s Capabilities

OpenAI has unveiled GDPval, a new benchmark designed to evaluate its AI models against human professionals across a broad spectrum of industries. This initiative represents a critical step in understanding how far today’s AI is from matching or surpassing the work quality of experts in sectors such as healthcare, finance, manufacturing, and government.

Methodology and Industry Scope

The GDPval benchmark focuses on nine major industries contributing to America’s gross domestic product and tests AI performance in 44 distinct occupations—from software engineering to nursing and journalism. In its initial version, GDPval-v0, industry professionals compared reports generated by AI models with those produced by their human counterparts. For instance, investment bankers were tasked with evaluating competitor landscape analyses for the last-mile delivery industry, ensuring that the assessment reflects real-world complexity.

Comparative Performance: AI Advances and Limitations

Results indicate promising progress: OpenAI’s GPT-5-high, an enhanced iteration of its flagship model, was rated as good as or better than the work of industry experts in 40.6% of head-to-head comparisons. More notably, Anthropic’s Claude Opus 4.1 reached nearly 49% on the same measure. However, OpenAI acknowledges that these models are not yet positioned to replace human labor entirely, as the current iteration of GDPval covers only a narrow slice of actual job responsibilities.

Expert Insights and Future Directions

In a discussion with TechCrunch, OpenAI’s chief economist, Dr. Aaron Chatterji, noted that the benchmark’s favorable outcomes suggest professionals may soon delegate routine tasks to AI. This, he argued, will free up valuable time for higher-impact work. Tejal Patwardhan, OpenAI’s evaluations lead, also expressed optimism, emphasizing the performance leap from GPT-4o’s 13.7% score to nearly triple that figure with GPT-5.

Benchmarking And The Road To Comprehensive AI Evaluation

While GDPval represents an early milestone, it aligns with a broader effort among Silicon Valley titans to create robust testing frameworks, such as AIME 2025 and GPQA Diamond, that better quantify AI proficiency for real-world applications. OpenAI plans to expand GDPval to encapsulate more industries and interactive workflows, aiming to bolster its claims about AI’s growing economic value.

As the benchmark evolves, GDPval could play an instrumental role in the ongoing debate around artificial general intelligence, highlighting the potential and limitations of AI models poised to reshape the modern workforce.

Strained Household Finances: Eurostat Data Reveals Persistent Payment Delays Across Europe and in Cyprus

Improved Financial Resilience Amid Ongoing Strains

Over the past decade, Cypriot households have significantly increased their ability to manage debts—not only bank loans but also rent and utility bills. However, recent Eurostat data indicates that Cyprus continues to lag behind the European average when it comes to covering financial obligations on time.

Household Coping Strategies and the Limits of Payment Flexibility

While many families manage their fixed expenses with relative ease, one in three Cypriots struggles to cover unexpected costs. Routine payments such as mortgage installments, rent, and utility bills are being met, but with little room left for unplanned financial shocks.

Breaking Down Payment Delays Across the European Union

Eurostat reports that 9.2% of the EU population experienced delays with their housing loans, rent, utility bills, or installment payments in 2024. The situation is more acute among vulnerable groups: 17.2% of individuals in single-parent households with dependent children and 16.6% in households with two adults and three or more dependents faced payment delays. In every EU country, single-parent households had higher delay rates than the overall population.

Cyprus in the Crosshairs: High Rates of Financial Delays

Although Cyprus recorded a notable 19.1 percentage point improvement from 2015 to 2024 in delays related to mortgages, rent, and utility bills, the island nation still ranks among the five EU countries with the highest delay rates. As of 2024, 12.5% of the Cypriot population was in arrears on housing loans, rent, or utility bills. Greece tops the list with 42.8%, followed by Bulgaria (18.7%), Romania (15.3%), and Spain (14.2%). Notably, 19 of the 27 EU countries reported delay rates below 10%, with the Czech Republic (3.4%) and the Netherlands (3.9%) recording the lowest.

Selective Improvements and Emerging Concerns

Between 2015 and 2024, the share of the EU population with payment delays fell by 2.6 percentage points. Some countries nonetheless recorded increases: Luxembourg (+3.3 percentage points), Spain (+2.5 percentage points), and Germany (+2.0 percentage points), reflecting underlying economic pressures that continue to challenge financial stability.

Economic Insecurity and Unpreparedness for Emergencies

Another critical indicator explored by Eurostat is the prevalence of economic insecurity—the proportion of the population unable to handle unexpected financial expenses. In 2024, 30% of the EU population reported being unable to cover unforeseen costs, a modest improvement of 1.2 percentage points from 2023 and a significant 7.4 percentage point drop compared to a decade ago. In Cyprus, while 34.8% still report difficulty handling emergencies, this marks a drastic improvement from 2015, when the figure stood at 60.5%.

A Broader EU Perspective

Importantly, no EU country in 2024 had more than half of its population facing economic insecurity—a notable improvement from 2015, when over 50% of the population in nine countries reported such challenges. These figures underscore both progress and persistent vulnerabilities within European households, urging policymakers to consider targeted measures for enhancing financial resilience.

For further insights and detailed analysis, refer to the original reports on Philenews and Housing Loans.
