jcorgan / Johnathan Corgan
npub19wa…8p6k
2025-02-04 19:27:41
in reply to nevent1q…wup8

Here is your LLMs training other models:

https://arxiv.org/abs/2501.18096

"We present MILS: Multimodal Iterative LLM Solver, a surprisingly simple, training-free approach, to imbue multimodal capabilities into your favorite LLM. Leveraging their innate ability to perform multi-step reasoning, MILS prompts the LLM to generate candidate outputs, each of which are scored and fed back iteratively, eventually generating a solution to the task. This enables various applications that typically require training specialized models on task-specific data. In particular, we establish a new state-of-the-art on emergent zero-shot image, video and audio captioning. MILS seamlessly applies to media generation as well, discovering prompt rewrites to improve text-to-image generation… "
Author Public Key
npub19wavu4f7l6l43h24jyskn7fvzy37kcfp67aqjtmv2qgy4lp34nhsda8p6k