Simon Willison on Nostr
Two new Mistral models today: Codestral Mamba is an Apache 2 licensed 7B code model using a Mamba architecture (not transformers!), and MathΣtral is for "math reasoning and scientific discovery"
My notes here, including an update to my llm-mistral plugin:
https://simonwillison.net/2024/Jul/16/codestral-mamba/
Published at 2024-07-16 17:07:28
Event JSON
{
  "id": "9c3106a5fca68d59d668b25a87e8ef22b46d87206a0377f1da79f95eb6fa2cf2",
  "pubkey": "8b0be93ed69c30e9a68159fd384fd8308ce4bbf16c39e840e0803dcb6c08720e",
  "created_at": 1721149648,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://fedi.simonwillison.net/users/simon/statuses/112797263380231483",
      "activitypub"
    ]
  ],
  "content": "Two new Mistral models today: Codestral Mamba is an Apache 2 licensed 7B code model using a Mamba architecture (not transformers!), and MathΣtral is for \"math reasoning and scientific discovery\"\n\nMy notes here, including an update to my llm-mistral plugin: https://simonwillison.net/2024/Jul/16/codestral-mamba/",
  "sig": "6d71a30a1766ba4f3ee153bc0d76119b2d4bfe59e317f39c773947128576fd3c96157a8df97071557184d97e0178cafd023e29ba03738595912aaa4401a9158e"
}
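
As a rough illustration of how the fields in the event above relate to the post, here is a minimal Python sketch. It assumes created_at is a Unix timestamp in seconds and that the "Published at" time is shown in UTC. The helper names published_at and proxy_sources are hypothetical, and the event dict is a trimmed copy that keeps only the fields the sketch reads.

```python
from datetime import datetime, timezone

# Trimmed copy of the event above: id, pubkey, content, and sig are omitted
# because this sketch does not read them.
event = {
    "created_at": 1721149648,
    "kind": 1,
    "tags": [
        [
            "proxy",
            "https://fedi.simonwillison.net/users/simon/statuses/112797263380231483",
            "activitypub",
        ],
    ],
}


def published_at(ev):
    """Convert created_at (assumed to be Unix seconds) to a UTC datetime."""
    return datetime.fromtimestamp(ev["created_at"], tz=timezone.utc)


def proxy_sources(ev):
    """Return (url, protocol) pairs for every "proxy" tag on the event."""
    return [
        (tag[1], tag[2])
        for tag in ev["tags"]
        if len(tag) >= 3 and tag[0] == "proxy"
    ]


print(published_at(event).strftime("%Y-%m-%d %H:%M:%S"))  # 2024-07-16 17:07:28
for url, protocol in proxy_sources(event):
    print(protocol, url)
```

The "proxy" tag here suggests the note was bridged from the ActivityPub status at that URL rather than written natively on Nostr; the sketch only reads the tag and does not fetch or verify anything.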