melvincarvalho on Nostr:
OmniParser is a model you can run yourself that splits the screen into logical elements. You can then use those elements to create actions that interact with applications on the user's behalf: pay a bill, book a restaurant, create some art. Lots of utility in there. The future is agentic.
https://microsoft.github.io/OmniParser/