Russell O'Connor [ARCHIVE] on Nostr:
📅 Original date posted:2022-02-08
📝 Original message:On Mon, Jan 31, 2022 at 8:16 PM Anthony Towns <aj at erisian.com.au> wrote:
> On Fri, Jan 28, 2022 at 08:56:25AM -0500, Russell O'Connor via bitcoin-dev
> wrote:
> > > https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-July/019243.html
> > For more complex interactions, I was imagining combining this TXHASH
> > proposal with CAT and/or rolling SHA256 opcodes. If TXHASH ended up
> > supporting relative or absolute input/output indexes then users could
> > assemble the hashes of the particular inputs and outputs they care about
> > into a single signed message.
>
> That's certainly possible, but it sure seems overly complicated and
> error prone...
>
Indeed, and we really want something that can be programmed at redemption
time.
That probably involves something like how the historic MULTISIG worked, by
having a list of input / output indexes passed in along with length
arguments.
I don't think there will be problems with quadratic hashing here because, as
more inputs are listed, the witness in turn grows larger itself. The number
of stack elements that can be copied is limited by a constant (3DUP).
Certainly care is needed here, but also keep in mind that an OP_HASH256
does a double hash and costs one weight unit.
That said, your SIGHASH_GROUP proposal suggests that some sort of
intra-input communication is really needed, and that is something I would
need to think about.
While normally I'd be hesitant about this sort of feature creep, when we
are talking about doing soft-forks, I really think it makes sense to think
through these sorts of issues (as we are doing here).
> > I don't think there is much in the way of lessons to be drawn from how we
> > see Bitcoin Script used today with regards to programs built out of
> > reusable components.
>
> I guess I think one conclusion we should draw is some modesty in how
> good we are at creating general reusable components. That is, bitcoin
> script looks a lot like a relatively general expression language,
> that should allow you to write interesting things; but in practice a
> lot of it was buggy (OP_VER hardforks and resource exhaustion issues),
> or not powerful enough to actually be interesting, or too complicated
> to actually get enough use out of [0].
>
> > TXHASH + CSFSV won't be enough by itself to allow for very interesting
> > programs in Bitcoin Script yet, we still need CAT and friends for that,
>
> "CAT" and "CHECKSIGFROMSTACK" are both things that have been available in
> elements for a while; has anyone managed to build anything interesting
> with them in practice, or are they only useful for thought experiments
> and blog posts? To me, that suggests that while they're useful for
> theoretical discussion, they don't turn out to be a good design in
> practice.
>
Perhaps the lesson to be drawn is that languages should support multiplying
two numbers together.
Having 2/3rds of the language you need to write interesting programs doesn't
mean that you get 2/3rds of the interesting programs written.
But beyond that, there is a lot more to a smart contract than just the
Script. Dmitry Petukhov has a fleshed-out design for asset-based lending
on Liquid at https://ruggedbytes.com/articles/ll/, despite the limitations
of (pre-taproot) Elements Script. But to make it a real thing you need
infrastructure for working with partial transactions, key management, etc.
> > but
> > CSFSV is at least a step in that direction. CSFSV can take arbitrary
> > messages and these messages can be fixed strings, or they can be hashes of
> > strings (that need to be revealed), or they can be hashes returned from
> > TXHASH, or they can be locktime values, or they can be values that are
> > added or subtracted from locktime values, or they can be values used for
> > thresholds, or they can be other pubkeys for delegation purposes, or they
> > can be other signatures ... for who knows what purpose.
>
> I mean, if you can't even think of a couple of uses, that doesn't seem
> very interesting to pursue in the near term? CTV has something like half
> a dozen fairly near-term use cases, but obviously those can all be done
> just with TXHASH without a need for CSFS, and likewise all the ANYPREVOUT
> things can obviously be done via CHECKSIG without either TXHASH or CSFS...
>
> To me, the point of having CSFS (as opposed to CHECKSIG) seems to be
> verifying that an oracle asserted something; but for really simple boolean
> decisions, doing that via a DLC seems better in general since that moves
> more of the work off-chain; and for the case where the signature is being
> used to authenticate input into the script rather than just gating a path,
> that feels a bit like a weaker version of graftroot?
>
I didn't really mean this as a list of applications; it was a list of
values that CSFSV composes with. Applications include delegation of pubkeys
and oracles, and, in the presence of CAT and transaction reflection
primitives, presumably many more things.
> I guess I'd still be interested in the answer to:
>
> > > If we had CTV, POP_SIGDATA, and SIGHASH_NO_TX_DATA_AT_ALL but no OP_CAT,
> > > are there any practical use cases that wouldn't be covered that having
> > > TXHASH/CAT/CHECKSIGFROMSTACK instead would allow? Or where those would
> > > be significantly more convenient/efficient?
> > >
> > > (Assume "y x POP_SIGDATA POP_SIGDATA p CHECKSIGVERIFY q CHECKSIG"
> > > commits to a vector [x,y] via p but does not commit to either via q so
> > > that there's some "CAT"-like behaviour available)
>
I don't know if this is the answer you are looking for, but technically
TXHASH + CAT + SHA256 awkwardly gives you limited transaction reflection.
In fact, you might not even need TXHASH, though it certainly helps.
> TXHASH seems to me to be clearly the more flexible opcode compared to
> CTV; but maybe all that flexibility is wasted, and all the real use
> cases actually just want CHECKSIG or CTV? I'd feel much better having
> some idea of what the advantage of being flexible there is...
>
The flexibility of TXHASH is intended to head off the need for future soft
forks. If we had specific applications in mind, we could simply set up the
transaction hash flags to cover all the applications we know about. But it
is the applications that we don't know about that worry me. If we don't
put options in place with this soft-fork proposal, then they will need
their own soft-fork down the line; and the next application after that, and
so on.
If our attitude is to craft our soft-forks as narrowly as possible, limiting
them to only what the given tasks require, then we are going to end up
needing a lot more soft-forks, and that is not a good outcome.
> But all that aside, probably the real question is can we simplify CTV's
> transaction message algorithm, if we assume APO is enabled simultaneously?
> If it doesn't get simplified and needs its own hashing algorithm anyway,
> that would probably be a good reason to keep them separate.
>
> First, since ANYPREVOUT commits to the scriptPubKey, you'd need to use
> ANYPREVOUTANYSCRIPT for CTV-like behaviour.
>
> ANYPREVOUTANYSCRIPT is specced as committing to:
> nVersion
> nLockTime
> nSequence
> spend_type and annex present
> sha_annex (if present)
> sha_outputs (ALL) or sha_single_output (SINGLE)
> key_version
> codesep_pos
>
> CTV commits to:
> nVersion
> nLockTime
> scriptSig hash "(maybe!)"
> input count
> sequences hash
> output count
> outputs hash
> input index
>
> (CTV thus allows annex malleability, since it neither commits to the
> annex nor forbids inclusion of an annex)
>
> "output count" and "outputs index" would both be covered by sha_outputs
> with ANYPREVOUTANYSCRIPT|ALL.
>
> I think "scriptSig hash" is only covered to avoid txid malleability; but
> just adjusting your protocol to use APO signatures instead of relying on
> the txid of future transactions also solves that problem.
>
> I believe "sequences hash", "input count" and "input index" are all an
> important part of ensuring that if you have two UTXOs distributing 0.42
> BTC to the same set of addresses via CTV, that you can't combine them in a
> single transaction and end up losing one of the UTXOs to fees. I
> don't believe there's a way to resolve that with bip 118 alone, however
> that does seem to be a similar problem to the one that SIGHASH_GROUP
> tries to solve.
>
It was my understanding that it is only "input count = 1" that prevents
this issue.
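
To make the fee-loss scenario concrete, here is a minimal Python sketch of the arithmetic. It borrows the 0.42 BTC UTXOs and the 0.21/0.10/0.10 output template from the example quoted further down. The helper names (`sats`, `fee`, `input_count_ok`) are hypothetical, and the "commitment" is only modelled as a length check, not as anything resembling real consensus code.

```python
# Amounts are kept in satoshis to avoid floating-point rounding.
COIN = 100_000_000


def sats(btc_hundredths):
    """Convert an amount given in hundredths of a BTC to satoshis."""
    return btc_hundredths * COIN // 100


# Output template from the example: 0.21 to A, 0.10 to B, 0.10 to C.
template = [("A", sats(21)), ("B", sats(10)), ("C", sats(10))]
utxo_value = sats(42)


def fee(input_values, outputs):
    """The fee is whatever input value is not claimed by an output."""
    return sum(input_values) - sum(value for _, value in outputs)


def input_count_ok(input_values, committed_input_count=1):
    """Model of a commitment to the number of inputs (assumed to be 1)."""
    return len(input_values) == committed_input_count


# Spent separately, each UTXO funds the template once.
fee_separate = fee([utxo_value], template)

# Hypothetical: if only the outputs were committed to, both UTXOs could be
# spent in one transaction that contains a single copy of the template.
fee_combined = fee([utxo_value, utxo_value], template)

print(fee_separate)   # 1000000  (0.01 BTC)
print(fee_combined)   # 43000000 (0.43 BTC, i.e. one whole UTXO plus 0.01)
print(input_count_ok([utxo_value, utxo_value]))  # False
```

Under this simplified model, the combined spend is rejected as soon as the number of inputs is pinned to one, which is the point being made about "input count = 1".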
> SIGHASH_GROUP [1] would be an alternative to ALL/SINGLE/NONE, with the exact
> group of outputs being committed to determined via the annex.
> ANYPREVOUTANYSCRIPT|GROUP would commit to:
>
> nVersion
> nLockTime
> nSequence
> spend_type and annex present
> sha_annex (if present)
> sha_group_outputs (GROUP)
> key_version
> codesep_pos
>
> So in that case if you have your two inputs:
>
> 0.42 [pays 0.21 to A, 0.10 to B, 0.10 to C]
> 0.42 [pays 0.21 to A, 0.10 to B, 0.10 to C]
>
> then, either:
>
> a) if they're both committed with GROUP and sig_group_count = 3, then
> the outputs must be [0.21 A, 0.10 B, 0.10 C, 0.21 A, 0.10 B, 0.10
> C], and you don't lose funds
>
> b) if they're both committed with GROUP and the first is
> sig_group_count=3 and the second is sig_group_count=0, then the
> outputs can be [0.21 A, 0.10 B, 0.10 C, *anything] -- but in that
> case the second input is already signalling that it's meant to be
> paired with another input to fund the same three outputs, so any
> funds loss is at least intentional
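
The two cases above can be sketched in Python as follows. This is only an illustration under an assumed rule, read off from the example rather than taken from any specification: a `sig_group_count` of n > 0 claims the next n unclaimed outputs, and a count of 0 reuses the group of the previous input. The function name `group_outputs` is hypothetical.

```python
def group_outputs(sig_group_counts, outputs):
    """Return, for each input, the list of outputs its signature would cover.

    Assumed rule (not from a spec): a count n > 0 claims the next n
    unclaimed outputs; a count of 0 reuses the previous input's group.
    """
    groups = []
    next_free = 0
    for count in sig_group_counts:
        if count == 0:
            if not groups:
                raise ValueError("first input cannot reuse a previous group")
            groups.append(groups[-1])
            continue
        if next_free + count > len(outputs):
            raise ValueError("not enough outputs for this group")
        groups.append(outputs[next_free:next_free + count])
        next_free += count
    return groups


template = [("A", 21), ("B", 10), ("C", 10)]  # hundredths of a BTC

# Case (a): both inputs use sig_group_count = 3, so six outputs are needed.
case_a = group_outputs([3, 3], template + template)

# Case (b): the second input uses sig_group_count = 0 and shares the first
# group; the trailing output is covered by neither signature.
case_b = group_outputs([3, 0], template + [("anything", 1)])

print(case_a == [template, template])  # True
print(case_b == [template, template])  # True
```

With this rule, calling `group_outputs([3, 3], template)` raises `ValueError`, which mirrors case (a): two inputs that each claim three outputs cannot be satisfied by a single copy of the template.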
>
> Note that this means txids are very unstable: if a tx is only protected
> by SIGHASH_GROUP commitments then miners/relayers can add outputs, or
> reorganise the groups without making the tx invalid. Beyond requiring
> the signatures to be APO/APOAS-based to deal with that, we'd also need
> to avoid txs getting rbf-pinned by some malicious third party who pulls
> apart the groups and assembles a new tx that's hard to rbf but also
> unlikely to confirm due to having a low feerate.
>
> Note also that not reusing addresses solves this case -- it's only a
> problem when you're paying the same amounts to the same addresses.
>
> Being able to combine additional inputs and outputs at a later date
> (which necessarily changes the txid) is an advantage though: it lets
> you add additional funds and claim change, which allows you to adjust
> to different fee rates.
>
> I don't think the SIGHASH_GROUP approach would work very well without
> access to the annex, ie if you're trying to do CTV encoded either in a
> plain scriptPubKey or via segwit/p2sh.
>
> I think that would give 16 different sighashes, choosing one of four
> options for outputs,
>
> ALL/NONE/SINGLE/GROUP
> -- which outputs are committed to
>
> and one of four options for inputs,
>
> -/ANYONECANPAY/ANYPREVOUT/ANYPREVOUTANYSCRIPT
> -- all inputs committed to, specific input committed to,
> scriptpubkey/tapscript committed to, or just the
> nseq/annex/codesep_pos
>
> vs the ~155,000 sighashes in the TXHASH proposal.
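
The count of 16 is simply the product of the two four-way choices listed above. A short Python sketch that enumerates them, where the `INPUT|OUTPUT` label format is an assumption made for readability and "-" stands for the default of committing to all inputs:

```python
from itertools import product

OUTPUT_MODES = ["ALL", "NONE", "SINGLE", "GROUP"]
INPUT_MODES = ["-", "ANYONECANPAY", "ANYPREVOUT", "ANYPREVOUTANYSCRIPT"]

# Build one label per (input mode, output mode) pair; the default input
# mode "-" is omitted from the label.
combos = [
    out_mode if in_mode == "-" else f"{in_mode}|{out_mode}"
    for in_mode, out_mode in product(INPUT_MODES, OUTPUT_MODES)
]

print(len(combos))  # 16
print("ANYPREVOUTANYSCRIPT|GROUP" in combos)  # True
```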
>
> I don't think there's an efficient way of doing SIGHASH_GROUP via tx
> introspection opcodes that doesn't also introduce a quadratic hashing
> risk -- you need to prevent different inputs from re-hashing distinct but
> overlapping sets of outputs, and if your opcodes only allow grabbing one
> output at a time to add to the message being signed you have to do a lot
> of coding if you want to let the signer choose how many outputs to commit
> to; if you provide an opcode that grabs many outputs to hash, it seems
> hard to do that generically in a way that avoids quadratic behaviour.
>
> So I think that suggests two alternative approaches, beyond the
> VERIFY-vs-PUSH semantic:
>
> - have a dedicated sighash type for CTV (either an explicit one for it,
> per bip119, or support thousands of options like the proposal in this
> thread, one of which happens to be about the same as the bip119 idea)
>
> - use ANYPREVOUTANYSCRIPT|GROUP for CTV, which means also implementing
> annex parsing and better RBF behaviour to avoid those txs being
> excessively vulnerable to pinning; with the advantage being that
> txs using "GROUP" sigs can be combined either for batching purposes
> or for adapting to the fee market after the signature has been made,
> and the disadvantage that you can't rely on stable txids when looking
> for CTV spends and have to continue using APO/APOAS when chaining
> signatures on top of unconfirmed CTV outputs
>
> Cheers,
> aj
>
> [0] Here's bitmatrix trying to multiply two numbers together:
>
> https://medium.com/bit-matrix/technical-how-does-bitmatrix-v1-multiply-two-integers-in-the-absence-of-op-mul-a58b7a3794a3
>
> Likewise, doing a point preimage reveal via clever scripting
> pre-taproot never saw an implementation, despite seeming
> theoretically plausible.
>
> https://lists.linuxfoundation.org/pipermail/lightning-dev/2015-November/000344.html
>
> [1]
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-July/019243.html
>
>