8news

Computer use in Codex

AI • OpenAI • May 12, 2026 at 08:44 PM • 11:23

TL;DR

OpenAI’s Codex now operates full computer interfaces, enabling users to automate tasks across apps with speed, multitasking, and app-level permission controls.

KEY POINTS

Codex expands from coding to full computer control

Codex, originally built as a coding assistant, has evolved into a general-purpose digital operator capable of interacting with any application on a user’s computer. It can click, type, and navigate graphical interfaces just as a human would, extending its usefulness beyond code into everyday workflows across local software environments.

Works across any app via graphical interaction

The system operates directly on graphical user interfaces, allowing it to use virtually any application without needing special integrations. This includes tools like UTM, Spotify, messaging apps, and productivity software. By visually interpreting interfaces and executing actions, Codex removes the need for APIs or manual scripting.

True multitasking with parallel app control

A key breakthrough is Codex’s ability to run multiple tasks simultaneously across different applications. It can, for example, create a virtual machine, play music, and set reminders at the same time. Unlike earlier systems that monopolized the device, Codex works in the background without interrupting the user’s own activity.

Independent cursor and non-intrusive operation

Codex uses its own on-screen cursor, distinct from the user’s, allowing both to operate concurrently. This enables seamless multitasking where the user retains control of their device while the AI performs automated actions independently.

Faster performance with new model integration

Faster models such as Codex Spark significantly boost execution speed. In the demo, Codex composes and sends a message within about a second, which the presenters describe as faster than a human could do it.

Enhanced accuracy through accessibility data

Instead of relying solely on visual screenshots, Codex leverages system accessibility frameworks to extract structured information about interface elements. This allows it to understand context, identify off-screen components, and interact with apps more precisely, improving reliability and efficiency.
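The idea of handing the model a textual description of the interface can be illustrated with a small sketch. This is not Codex's implementation and does not call any real accessibility API: the `UIElement` class, its fields, and the `describe` function are hypothetical names invented for this example. It only shows how a tree of interface elements might be flattened into text, including elements scrolled out of view, so that a text-only model could read it.

```python
from dataclasses import dataclass, field


@dataclass
class UIElement:
    """Hypothetical stand-in for one node of an accessibility tree."""
    role: str
    label: str
    y: int          # top edge in window coordinates
    height: int
    children: list["UIElement"] = field(default_factory=list)


def describe(element, viewport_height, depth=0):
    """Flatten the tree into indented text lines, flagging off-screen nodes."""
    visible = element.y < viewport_height and element.y + element.height > 0
    marker = "" if visible else " (off-screen)"
    lines = [f"{'  ' * depth}{element.role}: {element.label}{marker}"]
    for child in element.children:
        lines.extend(describe(child, viewport_height, depth + 1))
    return lines


# Illustrative data only: a made-up window with one row scrolled out of view.
window = UIElement("window", "Messages", 0, 600, [
    UIElement("text field", "Message", 540, 40),
    UIElement("button", "Send", 540, 40),
    UIElement("row", "Older conversation", 900, 60),
])
lines = describe(window, viewport_height=600)
print("\n".join(lines))
```

A screenshot of this made-up window would not contain the last row at all, whereas the text description lists it with an off-screen marker, which is the kind of extra context the paragraph above refers to.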

Streamlined onboarding and setup

Initial setup is designed to be simple. A guided panel points the user to the relevant system settings, and in the demo the required permissions are granted in two drag actions plus a system authorization, so users can enable computer control quickly while staying aware of the system changes being made.

Granular permission-based security model

Codex operates under a strict permission system where each application must be explicitly approved before access. It cannot view or interact with other apps unless authorized, ensuring sensitive data remains isolated and enhancing user trust.
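A per-app approval model like the one described above can be sketched as a simple allowlist gate. The following is a minimal illustration under stated assumptions, not OpenAI's code: the `AppPermissionGate` class, its methods, and the `ask` callback are hypothetical names made up for this example. The user is asked the first time an app is used, approved apps are remembered, and any action against an unapproved app is refused.

```python
class AppPermissionGate:
    """Hypothetical allowlist: an app must be approved before any action on it."""

    def __init__(self, ask_user):
        self._ask_user = ask_user   # callback taking an app name, returning bool
        self._approved = set()

    def is_allowed(self, app):
        # Prompt only when the app has not been approved yet.
        if app not in self._approved and self._ask_user(app):
            self._approved.add(app)
        return app in self._approved

    def act(self, app, action):
        if not self.is_allowed(app):
            raise PermissionError(f"{app} has not been approved")
        return f"{action} -> {app}"


# Illustrative usage: the simulated user approves UTM and Spotify only.
prompts = []

def ask(app):
    prompts.append(app)
    return app in {"UTM", "Spotify"}

gate = AppPermissionGate(ask)
gate.act("UTM", "click New VM")
gate.act("UTM", "type VM name")   # already approved, so no second prompt
```

In this sketch a denied app is simply asked about again on the next attempt. A real product could just as well remember denials, which is a design choice the source does not describe.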

Integration into core AI models

Capabilities once limited to specialized systems have been incorporated into mainline GPT models, making advanced computer interaction features more widely accessible through standard APIs. This unification simplifies development and expands potential use cases.

Productivity gains across workflows

Early use cases include automating repetitive tasks such as managing spreadsheets, configuring development environments, and handling multi-step workflows across several apps. The system is positioned as a time-saving tool that reduces manual effort in complex digital tasks.

Roadmap toward superhuman performance

Development is focused on achieving speeds 2 to 10 times faster than human interaction, with the goal of making AI-driven computer use indispensable for both professional and personal computing tasks.

Current availability and expansion plans

The feature is currently available on macOS, with plans to expand support to Windows systems in the near future, signaling broader adoption across platforms.

CONCLUSION

Codex’s ability to operate full computer interfaces marks a shift toward AI systems that actively execute tasks, positioning it as a central tool for automating complex, multi-application workflows.

Full transcript

And we'll roll in both cameras.
>> Great. Thank you.
>> Yeah.
>> Hi everyone, Romain here. Codex has quickly evolved from a coding agent into a real teammate, and not just a coding teammate anymore. You can use Codex for literally any task, and computer use is a big part of that shift. It takes Codex beyond your tools and files and into the real work you do with your local apps. Today I'm joined by Ari, who has spent a lot of time thinking about this problem. So Ari, why computer use? Tell me more about how this works.
>> Yeah, I'm so excited about computer use. Codex already had the ability to do so many things on your computer because it could run commands and write code, so it could solve all kinds of problems for you. What's new is that there's all this software on your computer that has a graphical user interface. It's something that, as a human, you use by looking at it, by moving your mouse, by clicking, by typing. And now Codex can do all of that for you too. So it can use literally any application on your computer, which is so powerful. It's really exciting to see this come together and to make something that people can use for so many different things.
>> One of the things that I found very delightful was the onboarding. For people watching this who want to get started with Codex and try these amazing features, the very first onboarding screen is very easy, right? Do you want to show it to us?
>> Yeah, I'd love to. Let's say this is the first time I'm using computer use. It's going to ask for my permission first, right?
>> And so when it does that, I'll get this window that says "Enable Codex Computer Use". And when I press Allow,
>> it animates the panel straight into the settings window, which helps you know where to look and what you're supposed to do next.
>> It tells you how to drag it
>> and it tells you how to drag the list.
And then you have to authorize because you're making changes to your system settings. And now I was able to set up the whole thing in two drags. And now you can see it's going and clicking and doing the task for me
>> and it's now done.
>> Amazing.
>> Yeah.
>> Cool. So let's see computer use in action. Do you have one task off the top of your head that you want to show us?
>> Yeah, absolutely. One thing that I do every so often is test software on older Mac operating systems, and for that I use virtual machines. I have an app I love called UTM, but it's a pain to create a virtual machine. I have to click through a bunch of things. I have to run the macOS setup assistant. And so,
>> sounds like a perfect use case.
>> Perfect use case. So now I can save a whole bunch of time by having the agent do it for me. I'm going to go into Codex and say, "Make a new Mac VM in UTM." When I type @, it shows me a list of the apps I have on my computer, and I can run the query and then it'll actually start using the app I selected. In this case, it's going to spin up UTM. And what we can see here is that once it starts using the app, you'll see the cursor fly in.
>> That's awesome.
>> It's so cool. What's cool about it is that it's different from my cursor. So Codex can click around without interrupting what I'm doing on my computer.
>> So you can keep on using your computer while Codex is working in the background.
>> Yeah, that's exactly right. You know, a lot of computer use implementations, in fact every computer use implementation I've ever seen, take over your entire computer. So you can't use your computer while the agent is using your apps.
>> And now it's already done. It sounds like it's downloading macOS.
>> It's downloading macOS. So once macOS finishes downloading, it can also complete the next step, which is actually setting up macOS for me.
>> Which saves so much time.
>> Should we try to do another one of these computer use tasks in the background? Can you do multiple?
>> Absolutely. So: "I want to focus, play some good music for work for me in Spotify." So now the agent's going to start using Spotify. But what's super powerful about this is that it can actually do things across multiple applications. It can run multiple cursors in multiple apps at the same time. So I'm going to say, "Add a reminder in the Reminders app tonight to look through my tax documents." The music's coming.
>> Music's going in Spotify.
>> It's starting to add some reminders for me. So now all of a sudden my Mac is this multitasking environment where I can do so many things at once and have agents do all the things that I don't want to be spending my time on.
>> Yeah.
>> Okay. That's so cool. Now you have three apps in the background that Codex has been driving, and everything you've done with the cursor has been so delightful too. Do you want to tell us more about this?
>> Yeah, we wanted to make something that felt fun to use, that felt natural. The motion of the cursor is important when you're watching it use your apps. You want to understand what it's doing. So we put some effort into finding these curves of motion that feel natural and kind of whimsical, where the arrow turns in the direction of motion so it looks like it's swimming across your screen. It makes it fun to use.
>> Yeah, it's really delightful also to have a better sense and understanding of what the agent is actually doing with every one of these apps.
>> Yeah, I really love it. One thing that I want to touch on is that you can use computer use with a faster model like Spark, right? Tell me more about how you guys thought about multimodal and accessibility combined.
>> Yeah, we've put together some really exciting things in the way that it works with the model, and it's only the beginning of the kind of work that we're able to do. Historically, computer use has been something that only works with screenshots. It takes advantage of the power of multimodal models, so the model can see the interface and click and type by coordinate, which is great. But it turns out there's all this hidden information about the interface of an application that is possible to extract through the accessibility framework. So we have spent a lot of time figuring out how to make use of this in a way that enhances the model's abilities. We pull a bunch of textual information describing the interface, and the model can use that to see even things that are scrolled off screen. It can understand more deeply the role of each element that's on screen. And so this just makes the model super accurate at performing tasks. And then the other benefit, which you were alluding to, is that because it doesn't necessarily require images, we can use non-multimodal models like Codex Spark, which are super fast. So all of a sudden you have this experience where computer use can, when you use one of those models, use software even faster than you can.
>> That's amazing. Do you want to try one of these tasks, for instance, where we would switch the model to Spark?
>> Yeah, absolutely.
>> "Remind Romain in Messages to try computer use for debugging apps." And so what we'll see is that before, computer use was pretty performant, but now with the Spark model it's superhuman. It uses the software literally faster than a human would. We see it here open the text, type the message, now to me,
>> and in a second it's sent.
>> That's pretty incredible.
>> Pretty sick. So it just did this in the background.
I was able to do other things on my computer at the same time and it's super
>> and we got it.
>> I have it.
>> Very nice.
>> Asking me to try computer use for debugging apps.
>> Sick.
>> Incredible. You brought so much from your knowledge of Sky into the Codex app. Now this is incredible. You're working with the research team now at OpenAI. Where do you see the future of computer use?
>> Yeah, for earlier products like Operator and ChatGPT agent, we used to train dedicated models for computer use. Since then, the research team has done this amazing work to bring those capabilities into the main GPT models. So now we're building this in Codex on the same models that are available through the API, and everyone can build these amazing computer use capabilities. That's been super nice, and also great streamlining for our workflow internally. I think it's amazing how fast we've been able to get this to work with the mainline models and with Spark. But I think we are going to want to get to a place where computer use is superhuman. I think that we can get to a place where computer use can operate a computer two, five, ten times as fast as a person. And I think that's where it's going to become indispensable. You're going to want to use it for so many computing tasks, for really everything you do in your life, and it's going to save you so much time and let you focus on the things that are important. So I'm really excited about what the roadmap looks like there.
>> One thing I wanted to touch on that people might be curious about is the safety approach to all of this. You have these amazing capabilities for Codex to now drive some apps on your Mac. How are you guys thinking about safety?
>> Yeah, it's such a good question.
And I feel like this type of technology has the potential to be kind of scary, because it's actually taking over the actions that you would do on your computer and it has access to so much stuff. So we feel it's really important that people feel comfortable using this technology, and we've spent a lot of time thinking about how to do that. One of the things that we've done here is that we've made computer use such that it can only access applications that you allow. Every time Codex goes to use an app for the first time, it asks for your permission, right? And when you say yes, Codex can see and type into that app, but it can't see or interact with any other app on your computer. So if you have some stuff that's maybe a little bit sensitive in one of your applications, you can feel very confident knowing that Codex can access your developer applications and your productivity applications without accessing anything that's more sensitive. And so that just builds a lot of trust, I think, for the user.
>> Absolutely. Yeah, that's pretty amazing, because it's not like streaming your entire desktop or accessing all of your files, anything like that. It's very much case by case, app by app. As you're trying to be productive, you're giving Codex the permission to do so. I mean, obviously this is a simple task, but now that we've seen the power of computer use, I'm curious: what have you used computer use for? What were the magical moments that you've experienced with it?
>> Yeah, I have all these spreadsheets that I use for financial tracking, and now I actually ask Codex to update them for me and I don't have to do it myself anymore. Super, super powerful.
>> That's incredible. I mean, it's hard to even imagine these days starting a task without Codex.
>> Yeah, that's so true.
Like nowadays when I want to start something new, whether it's programming or even something else on my computer, I feel like I want to turn to Codex first because it saves me so much time.
>> And we had the file system, we had the plugins to access all of these services online. It feels like the missing piece was computer use, to be able to access the local apps.
>> I definitely think so. Especially for me, I use a really wide variety of applications. I use a lot of web applications. I use a lot of Apple native apps. I actually track my spreadsheets in the Numbers app. And so now this just brings all of that online, all of it into a place where Codex can access it end to end.
>> Pretty incredible. Thanks, Ari. Computer use is one of those capabilities that are hard to fully appreciate until you try it. All of a sudden, your computer works in a whole new way. And it's not just Codex moving around your computer. It's Codex actually doing real work for you in the background without breaking your flow. So try it on your hardest task, maybe the one that has you bouncing around five apps and eats multiple hours of your day. We genuinely can't wait to see what you think. Computer use is available for Mac today, and we cannot wait to bring it to Windows users very soon. Thank you so much, Ari. See you next time.
