Sherry on Nostr: The word open source actually is slightly different when used in bitcoin world and AI ...
The term "open source" actually means slightly different things in the bitcoin world and the AI world.
In #bitcoin, open source means you can download the source code, run it on your own machine, and every time it runs, no matter where, it should produce the same result. (Think about it: it would be a huge disaster if one bitcoin address could show a different amount of satoshis depending on where the code ran.)
However, in the #AI world, when an LLM claims to be open source, it usually means you can download the model parameters for free and use them in your own project.
Does that mean it has the same level of transparency as other open source projects with a long history, like Linux or Bitcoin Core? No. You can imagine AI model parameters as the result of data + training algorithms + training parameters. Changing any one of them can lead to different results. And usually, models that claim to be open source only open source the final result, aka the parameters.
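To make the "data + training algorithms + training parameters" point concrete, here is a minimal Python sketch. It is only a toy: the `train` function, the two tiny datasets, and the learning rates are all made up for illustration and have nothing to do with how any real LLM is trained. It just shows that the same code gives the same parameters only when the data, the hyperparameters, and the random seed are all the same.

```python
import random

def train(data, learning_rate, epochs, seed):
    """Toy 'training run' (hypothetical): fit y = w * x + b with plain SGD."""
    rng = random.Random(seed)                  # the seed controls init and shuffling
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)                   # visit samples in a seeded random order
        for x, y in samples:
            error = (w * x + b) - y            # prediction error on this sample
            w -= learning_rate * error * x     # gradient step for w
            b -= learning_rate * error         # gradient step for b
    return (w, b)                              # the "released model parameters"

# Two made-up datasets: only the person who trained the model knows which was used.
data_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # roughly y = 2x + 1
data_b = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]  # roughly y = x + 1

baseline      = train(data_a, learning_rate=0.05, epochs=20, seed=0)
same_again    = train(data_a, learning_rate=0.05, epochs=20, seed=0)
other_data    = train(data_b, learning_rate=0.05, epochs=20, seed=0)
other_hparams = train(data_a, learning_rate=0.01, epochs=20, seed=0)
other_seed    = train(data_a, learning_rate=0.05, epochs=20, seed=1)

print("same everything :", baseline == same_again)
print("different data  :", baseline == other_data)
print("different hparam:", baseline == other_hparams)
print("different seed  :", baseline == other_seed)
```

If you only receive the final `(w, b)` pair, you can run the model, but you cannot rerun `train` and check how those numbers came to be. That is roughly the position you are in when only the weights are published.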
One of the reasons people are crazy about DeepSeek is that it opened up many details of its training process, and OpenAI scientists have confirmed that they use the same method (though OpenAI keeps it secret). However, we still don't know what kind of data they used during training, how they processed that data, or whether the model carries any (I don't know how to put this in strong words), let's say, political bias, so that it can influence society in the direction its creator wants.
Of course, technically, it is possible to train a model from scratch on uncensored data, but that costs at least 5 million dollars for a single training run.