2025-02-04 19:27:41

Here is your LLMs training other models:

https://arxiv.org/abs/2501.18096

"We present MILS: Multimodal Iterative LLM Solver, a surprisingly simple, training-free approach, to imbue multimodal capabilities into your favorite LLM. Leveraging their innate ability to perform multi-step reasoning, MILS prompts the LLM to generate candidate outputs, each of which are scored and fed back iteratively, eventually generating a solution to the task. This enables various applications that typically require training specialized models on task-specific data. In particular, we establish a new state-of-the-art on emergent zero-shot image, video and audio captioning. MILS seamlessly applies to media generation as well, discovering prompt rewrites to improve text-to-image generation… "
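The generate, score, feed back loop the abstract describes can be sketched roughly as follows. This is not the paper's implementation, only a minimal illustration: `iterative_solve`, `toy_generate`, `toy_score`, `VOCAB` and `TARGET` are hypothetical names, the toy generator stands in for the LLM, and the toy scorer stands in for whatever model rates a candidate against the image, video or audio.

```python
from typing import Callable, List, Tuple

Scored = Tuple[int, str]  # (score, candidate text)

# Hypothetical stand-ins: a tiny vocabulary and a hidden "ground truth".
VOCAB = ["dog", "cat", "beach", "running", "car"]
TARGET = {"dog", "beach", "running"}


def toy_score(caption: str) -> int:
    """Stand-in for the scorer: reward target words, penalize others."""
    words = set(caption.split())
    return 10 * len(words & TARGET) - len(words - TARGET)


def toy_generate(feedback: List[Scored]) -> List[str]:
    """Stand-in for the LLM: extend the best candidate seen so far."""
    if not feedback:
        return ["a photo"]
    best = feedback[0][1]
    words = set(best.split())
    return [f"{best} {w}" for w in VOCAB if w not in words]


def iterative_solve(
    generate: Callable[[List[Scored]], List[str]],
    score: Callable[[str], int],
    steps: int = 6,
    keep: int = 3,
) -> Scored:
    """Generate candidates, score them, and feed the top ones back."""
    feedback: List[Scored] = []
    for _ in range(steps):
        candidates = generate(feedback)
        scored = [(score(c), c) for c in candidates]
        # Merge with earlier results and keep only the highest scoring.
        feedback = sorted(set(feedback + scored), reverse=True)[:keep]
    return feedback[0]


if __name__ == "__main__":
    best_score, best_caption = iterative_solve(toy_generate, toy_score)
    print(best_score, best_caption)
```

Nothing here is trained: the only thing that changes between iterations is the list of scored candidates handed back to the generator, which is the sense in which the approach is "training-free".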

Author Public Key

npub19wavu4f7l6l43h24jyskn7fvzy37kcfp67aqjtmv2qgy4lp34nhsda8p6k

Seen on

wss://nos.lol wss://relay.damus.io

Published at

2025-02-04 19:27:41

Kind type

1 Short Text Note

Event JSON

{
  "id": "97272386f92af46929262b3713a3e0dce31cadefc4fc6df408a633efa34bc598",
  "pubkey": "2bbace553efebf58dd55912169f92c1123eb6121d7ba092f6c50104afc31acef",
  "created_at": 1738697261,
  "kind": 1,
  "tags": [
    [ "e", "c6f2cd1b4d54a9005531e5648230f3a82aa15bd5245b9a47c66982832b71d5df", "wss://relay.wellorder.net", "root" ],
    [ "p", "f728d9e6e7048358e70930f5ca64b097770d989ccd86854fe618eda9c8a38106" ],
    [ "r", "wss://nostr.wine/" ],
    [ "r", "wss://e.nos.lol/" ],
    [ "r", "wss://eden.nostr.land/" ],
    [ "r", "wss://nos.lol/" ],
    [ "r", "wss://nostr21.com/" ],
    [ "r", "wss://nostrue.com/" ],
    [ "r", "wss://purplepag.es/" ],
    [ "r", "wss://pyramid.fiatjaf.com/" ],
    [ "r", "wss://relay.bitcoinpark.com/" ],
    [ "r", "wss://relay.damus.io/" ],
    [ "r", "wss://relay.mostr.pub/" ],
    [ "r", "wss://relay.nostr.band/" ],
    [ "r", "wss://relay.nostrplebs.com/" ],
    [ "r", "wss://relay.primal.net/" ],
    [ "r", "wss://relay.snort.social/" ],
    [ "r", "wss://soloco.nl/" ],
    [ "r", "wss://bevo.nostr1.com/" ],
    [ "r", "wss://bitcoinmaximalists.online/" ]
  ],
  "content": "Here is your LLMs training other models:\n\nhttps://arxiv.org/abs/2501.18096\n\n\"We present MILS: Multimodal Iterative LLM Solver, a surprisingly simple, training-free approach, to imbue multimodal capabilities into your favorite LLM. Leveraging their innate ability to perform multi-step reasoning, MILS prompts the LLM to generate candidate outputs, each of which are scored and fed back iteratively, eventually generating a solution to the task. This enables various applications that typically require training specialized models on task-specific data. In particular, we establish a new state-of-the-art on emergent zero-shot image, video and audio captioning. MILS seamlessly applies to media generation as well, discovering prompt rewrites to improve text-to-image generation… \"",
  "sig": "16c392941bbe24b618d2f3fdfd3906e637aad8efc086ebc0a4428499232d4a3a32bbb0b03fc74adca1adb1b1cf30d5558636b8a0c22f57f310e9873bf81a08bf"
}

jcorgan on Nostr