What is Nostr?
Pieter Wuille [ARCHIVE] /
npub1tje…tl6r
2023-06-07 18:20:36
in reply to nevent1q…ysdf

Pieter Wuille [ARCHIVE] on Nostr: 📅 Original date posted:2019-09-18 📝 Original message:Hi all, In the draft for ...

📅 Original date posted:2019-09-18
📝 Original message:Hi all,

In the draft for bip-tapscript (see [1], current version [2]), we
propose removing the per-block sigops limit for tapscript scripts, and
replacing it with a "every script gets a budget of sigops based on its
witness size (one per 50 WU)". Since signatures (plus pubkeys) take
more WU than that, this is not a restriction for anything but
pathologically constructed scripts. Simultaneously, it removes the
multi-dimensional optimization problem that theoretically needs to be
solved to maximize revenue in block template construction.

With our recent work on Miniscript (see [3]), we discovered that the
variety of other script resource limits also introduce (weaker)
complex optimization requirements, but for script constructors instead
of miners. An overview:
1) Scripts are limited to 10000 bytes (and 3600 by standardness currently)
2) The total number of non-push opcodes in a script + the number of
keys participating in executed OP_CHECKMULTISIG(VERIFY) opcodes must
not exceed 201.
3) The size of the stack + altstack combined cannot exceed 1000
elements during execution (and the initial stack is limited to 100
elements by standardness currently)
4) The maximum size of elements on the stack is 520 bytes (and 80
bytes in the initial stack by standardness)

In a discussion about this with Andrew Poelstra we wondered whether
all these limits are still necessary in bip-tapscript. I believe the
only relevant ones are those that reduce total memory usage, or
verification CPU usage per witness byte. Total script verification CPU
usage isn't relevant I believe, because the same effect can be
accomplished by having a transaction (or block) with multiple inputs.

So let's go over the above resource limits, and see how they help with
limiting memory usage or CPu usage per byte.

# Script size limit

Memory usage for validation can grow with larger scripts, but only
indirectly by constructing extra stack data. Since those are
independently limited by (3), we don't need to consider those here.

There used to be a way through which larger scripts would cause larger
per byte verification cost, but it no longer applies, I believe. Due
to the scriptCode being replaced with a pre-hashed tapleaf hash, the
per-sigop hashing cost is now easily made independent of the size of
the script in implementations.

My suggestion is to drop the script size limit in tapscript, and
instead have it only be implicitly limited by transaction size limits.

# The 201 non-push opcodes limit

Ignoring how more opcodes can grow the stack and altstack (which are
already restricted by 3), I believe there is only one way that
additional (executed) opcodes can increase per-opcode execution time
in the current Bitcoin Core implementation [4], namely the "vfExec"
stack that keeps track of what sides of IF/NOTIF/ELSE/ENDIF execution
is currently passing through. As pointed out by Sergio Demian Lerner
[5], an O(1) algorithm can do this just as well (a variant of which is
implemented in PR 16902 [6]).

Taking such a patch into account, I don't think there are any problems
with removing the 201 ops limit for bip-tapscript scripts. Especially
given its strange semantics around OP_CHECKMULTISIG(VERIFY) (the keys
participating in those are each counted as 1 towards the 201 limit,
but only when executed, while all non-push opcodes are counted as 1
even when not executed), I think this is a nice simplification.

# The 1000 element limit for stack + altstack

A limit for the number of elements on the stack/altstack directly
affects memory usage. In a naive implementation without deduplication
as is used in Bitcoin Core now, every OP_3DUP can add 120 bytes of
memory usage plus the size of the data in the created elements
themselves (which can be a multiple of that number), leading to
several GB of memory usage for executing a maximal 4 MB script
(multiplied by the number of parallel executions). Even when using
reference-counting techniques to reduce duplication, 100 MB memory
usage is not unreasonable. I don't think those are acceptable numbers.

The stack size can also directly affect per-opcode execution time for
OP_ROLL, again shown by [5]. A block full of the most tightly packed
OP_ROLLS (which I believe is a repetition of OP_3DUP OP_ROLL OP_ROLL
OP_ROLL) operating on a stack of 1000 elements for me takes around 4.3
s of CPU time to verify. That's significant, but it's surprisingly
close to what a block packed with OP_CHECKSIGs (taking the 1 sigop /
50 WU limit into account) takes to verify on the same machine (3.8 s).
Even more remarkably, that time is also very close to how long a block
full of most tightly packed OP_HASH256s on 520 byte inputs take to
verify when disabling SHA256 hardware acceleration (3.6 s).

I believe we should keep this 1000 element stack limit for these
reasons. The 100 limit on input stacks can be increased to 1000 for
uniformity with the during-execution limit.

# The 520 byte stack element size limit

Given that there are no known use cases for stack elements larger than
65 bytes (and no opcodes apart from hashes that can even operate on
them), plus their impact on memory usage the execution time of
pathologically constructed scripts full of hashes (see above), I think
we should keep this limit.

Note that this limit can be changed using the OP_SUCCESSx mechanism, if need be.

# Summary

I propose the following changes to resource limits in bip-tapscript
(compared to segwit v0 scripts):

* Replace the separate sigops counter with a "executed sigops must not
exceed (witness size / 50 WU) + 1" rule (already in the BIP).
* Drop the 10000 byte limit for script size (and 3600 byte standardness limit)
* Drop the 201 non-push ops limit per script.
* Drop the 100 input stack elements standardness limit, and replace
with a (consensus) 1000 limit.

The rules limiting the stack + altstack number of elements during
execution to 1000 remains, as well as the 520 byte limit for elements
on the stack.

# References

[1] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-May/016914.html
[2] https://github.com/sipa/bips/blob/bip-schnorr/bip-tapscript.mediawiki
[3] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-August/017270.html
[4] https://github.com/bitcoin/bitcoin/blob/v0.18.1/src/script/interpreter.cpp#L281L1084
[5] https://bitslog.com/2017/04/17/new-quadratic-delays-in-bitcoin-scripts/
[6] https://github.com/bitcoin/bitcoin/pull/16902

Cheers,

--
Pieter
Author Public Key
npub1tjephawh7fdf6358jufuh5eyxwauzrjqa7qn50pglee4tayc2ntqcjtl6r