OpenAI launches new tools to boost AI agent development


OpenAI has introduced a set of new tools to help developers and enterprises build AI agents, automated systems that can complete tasks independently, using the company's artificial intelligence models and frameworks. The tools, launched as part of OpenAI's new Responses API, let businesses create AI agents that can conduct web searches, scan company databases, and navigate websites, mirroring OpenAI's own Operator product. The Responses API replaces OpenAI's previous Assistants API, which the company plans to phase out by mid-2026. The transition marks OpenAI's latest push to commercialize AI agent technology and encourage broader adoption by developers.

The term "AI agents" has generated significant industry buzz in recent years, though defining and demonstrating their capabilities has proven difficult. This week, Chinese startup Butterfly Effect faced scrutiny after its AI agent platform, Manus, failed to meet users' expectations. OpenAI is now under pressure to deliver AI agents that are both functional and reliable. "It's pretty easy to demo your agent," said Olivier Godement, OpenAI's head of API products, in an interview with TechCrunch. "To scale an agent is pretty hard, and to get people to use it often is very hard."

OpenAI has already introduced two AI agents in ChatGPT: Operator, which helps users navigate websites, and deep research, which compiles research reports. While these tools hinted at the potential of AI agents, their autonomy remained limited. The Responses API aims to expand on these foundations by enabling developers to build their own versions of Operator and deep research, potentially increasing the agents' autonomy and utility.

The Responses API gives developers access to GPT-4o search and GPT-4o mini search, the same AI models that power ChatGPT's web search function. These models browse the web in real time and return answers with cited sources. OpenAI claims they are more factually accurate than its models without search: on the company's SimpleQA benchmark, GPT-4o search scores 90% and GPT-4o mini search scores 88%, compared with 63% for the larger GPT-4.5 model.
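To make the web search capability described above more concrete, here is a minimal sketch of what a request body for such a call might look like. The endpoint URL, the field names, and the `web_search_preview` tool type reflect an understanding of the API around its launch and are assumptions that should be checked against OpenAI's current documentation. The helper `build_search_request` is hypothetical, and the code only builds the payload without sending it.

```python
import json

# Assumed endpoint; verify against OpenAI's API reference before use.
RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_search_request(question: str, model: str = "gpt-4o") -> dict:
    """Build (but do not send) a request body that enables web search.

    This is a hypothetical helper, not part of any OpenAI library.
    """
    return {
        "model": model,
        "input": question,
        # Tool type as understood at launch; treat it as an assumption.
        "tools": [{"type": "web_search_preview"}],
    }


body = build_search_request("What did OpenAI announce for agent developers?")

# Print the JSON that would be POSTed to RESPONSES_URL with an API key.
print(json.dumps(body, indent=2))
```

Keeping payload construction separate from the network call makes it possible to inspect or log exactly what an agent is about to ask before any request is made.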

The API also includes a file search feature that lets businesses scan internal documents quickly. OpenAI says it will not train its models on customer data. Additionally, developers can now access OpenAI's Computer-Using Agent (CUA) model, which powers Operator. This model generates mouse and keyboard actions, allowing developers to automate tasks such as data entry and app workflows.
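The idea of a model that "generates mouse and keyboard actions" can be illustrated with a small sketch of the application side: code that receives proposed actions and carries them out. Everything here is hypothetical, including the action dictionaries, the `FakeScreen` class, and `apply_action`. It is not OpenAI's schema or client code, and the fake screen only records actions instead of controlling a real machine.

```python
from dataclasses import dataclass, field


@dataclass
class FakeScreen:
    """Stand-in for a real desktop or browser; it only records actions."""

    log: list = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def apply_action(screen: FakeScreen, action: dict) -> None:
    """Execute one model-proposed action; reject anything unrecognized."""
    kind = action.get("type")
    if kind == "click":
        screen.click(action["x"], action["y"])
    elif kind == "type":
        screen.type_text(action["text"])
    else:
        raise ValueError(f"unsupported action: {kind!r}")


# Hypothetical actions a model might propose for a data-entry task.
proposed = [
    {"type": "click", "x": 120, "y": 240},
    {"type": "type", "text": "Invoice 1042"},
]

screen = FakeScreen()
for action in proposed:
    apply_action(screen, action)

print(screen.log)
```

Rejecting unknown action types, rather than ignoring them, is one simple safeguard given that the article notes the model remains prone to errors.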

For enterprises, OpenAI is offering the option to run the CUA model locally within their own systems, while the consumer version of CUA, available in Operator, remains limited to web-based actions.

Despite its advancements, the Responses API does not resolve all the challenges surrounding AI agents. Web search models, while more accurate than models without search, are still prone to errors: by OpenAI's own SimpleQA figures, GPT-4o search answers 10% of factual questions incorrectly. AI-driven search tools also struggle with short, navigational queries such as "Lakers score today," and some reports suggest that ChatGPT's citations are not always reliable.
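A quick calculation shows what those accuracy figures could mean at scale. The sketch below uses the SimpleQA scores reported in this article and assumes, purely for illustration, that benchmark accuracy carries over unchanged to real-world queries, which may not hold in practice. The function name `expected_errors` is hypothetical.

```python
# SimpleQA accuracy figures as reported in the article.
accuracy = {
    "GPT-4o search": 0.90,
    "GPT-4o mini search": 0.88,
    "GPT-4.5": 0.63,
}


def expected_errors(acc: float, n_queries: int) -> int:
    """Expected number of wrong answers, rounded to the nearest whole query."""
    return round((1 - acc) * n_queries)


# Illustrative assumption: 1,000 factual queries at benchmark accuracy.
errors_per_1000 = {
    name: expected_errors(acc, 1000) for name, acc in accuracy.items()
}

for name, errors in errors_per_1000.items():
    print(f"{name}: about {errors} wrong answers per 1,000 queries")
```

Under that assumption, the result is roughly 100 wrong answers per 1,000 queries for GPT-4o search, 120 for GPT-4o mini search, and 370 for GPT-4.5.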

Similarly, OpenAI acknowledges that its CUA model is not yet highly reliable for automating tasks on operating systems and remains susceptible to errors. However, the company says it is actively improving the technology.

To support developers, OpenAI is also launching the Agents SDK, an open-source toolkit designed to help integrate AI agents with internal systems, monitor their performance, and implement safeguards. The toolkit builds on OpenAI's Swarm framework for multi-agent orchestration, which was released in late 2024.

OpenAI sees AI agents as the next major evolution in artificial intelligence. CEO Sam Altman has even predicted that 2025 will be the year AI agents make their way into the workforce. Godement echoed this optimism, stating, "Agents are the most impactful application of AI that will happen." With the Responses API, OpenAI is shifting its focus from AI agent demonstrations to delivering real-world tools that enterprises and developers can deploy at scale.
