Jack Rusher on Nostr: “Our experiments reveal that LSTM and Transformer language models (i) ...
“Our experiments reveal that LSTM and Transformer language models (i) systematically underestimate the probability of sequences drawn from the target language, and (ii) do so more severely for less-probable sequences.”
https://arxiv.org/abs/2203.12788

Published at 2025-02-11 15:54:04

Event JSON
{
  "id": "56698a0e688617412e8c3ed40e3b14906d1f36025b85f8a16c55a6ee0af6cf45",
  "pubkey": "a0a11fb47760fd47108c96d1674aaa25803dfc6459fa2bcee6e13d57534f119e",
  "created_at": 1739289244,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://berlin.social/users/jack/statuses/113986059895988280",
      "activitypub"
    ]
  ],
  "content": "“Our experiments reveal that LSTM and Transformer language models (i) systematically underestimate the probability of sequences drawn from the target language, and (ii) do so more severely for less-probable sequences.”\n\nhttps://arxiv.org/abs/2203.12788",
  "sig": "02c572e8aa29f282629d422430c4f17e8f78c1b9c509e86f8a36fd431afc32911174866039a2965a2c7d2280889b0a12320e51a3e3b9dd347cd4509a1e65b170"
}
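For context on the event JSON above: under the Nostr NIP-01 specification, the `id` field is the lowercase hex SHA-256 of the JSON serialization of the array `[0, pubkey, created_at, kind, tags, content]` with no extra whitespace. A minimal sketch of that computation (the sample field values here are illustrative, not taken from this event):

```python
import hashlib
import json

def nostr_event_id(pubkey, created_at, kind, tags, content):
    # Per NIP-01: serialize [0, pubkey, created_at, kind, tags, content]
    # as compact JSON (no spaces after separators, UTF-8, non-ASCII kept
    # literal), then take the SHA-256 digest as lowercase hex.
    payload = json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Illustrative call with placeholder values:
event_id = nostr_event_id(
    pubkey="a" * 64,
    created_at=1739289244,
    kind=1,
    tags=[["proxy", "https://example.com/status/1", "activitypub"]],
    content="hello",
)
```

The `sig` field is then a Schnorr signature over this 32-byte id, made with the key whose public half appears in `pubkey`.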