eric at voskuil.org [ARCHIVE] /
npub1r34…s8vu
2023-06-07 23:10:22
in reply to nevent1q…mdqu

📅 Original date posted: 2022-06-08
📝 Original message:

Hi Suhas/Gloria,

Good questions. I've started a new thread because it became something else...

Various ideas about packaging seem to be focused on the idea of an atomic message that is gossiped around the network like a transaction or block. From my perspective that seems to create a set of problems without good solutions, and it is not a proper analogy to those atomic structures. It may be worth taking the time to step back and take a close look at the underlying objective.

The sole objective, as expressed in the OP proposal, is to:

"Propagate transactions that are incentive-compatible to mine, even if they don't meet minimum feerate alone."

Effectively producing this outcome with an atomic packaging approach while at the same time maintaining network invariants seems unlikely, if not impossible.

Fees:

A node knows what fee rate a peer will accept, and announces individual txs that satisfy peer.feerate. Similarly a node knows its own feerate, and SHOULD drop any peer that announces txs that do not satisfy node.feerate.
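
The fee rule above can be sketched in a few lines of Python. This is an illustration only, not code from any node implementation: `Tx`, `should_announce`, and `should_drop_peer` are hypothetical names, and fee rates are assumed to be in satoshis per virtual byte.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    txid: str
    fee: int    # satoshis
    vsize: int  # virtual bytes

    @property
    def feerate(self) -> float:
        return self.fee / self.vsize


def should_announce(tx: Tx, peer_feerate: float) -> bool:
    """Announce a tx only if it satisfies the fee rate the peer advertised."""
    return tx.feerate >= peer_feerate


def should_drop_peer(announced: Tx, node_feerate: float) -> bool:
    """A peer that announces a tx below our own fee rate ignored our filter."""
    return announced.feerate < node_feerate


low = Tx("txA", fee=200, vsize=200)    # 1 sat/vB
high = Tx("txB", fee=2000, vsize=200)  # 10 sat/vB
print(should_announce(low, peer_feerate=5.0))   # False
print(should_announce(high, peer_feerate=5.0))  # True
```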

Orphans:

A node MAY drop a peer that announces txs that the node sees as orphans against its DAG. It SHOULD drop the orphan tx and MAY request missing ancestors. Presumably, after some amount of time connected to a peer, a node does not expect to see any more orphans from that peer, so these choices could evolve with the channel. However, a design that can only consider each tx in isolation will continue to cause orphan announcements on the channel: a tx below peer.feerate does not get announced to the peer, and later a descendant above peer.feerate does get announced to the peer - as an orphan.

BIP133 (feefilter):

"There could be a small number of edge cases where a node's mempool min fee is actually less than the filter value a peer is aware of and transactions with fee rates between these values will now be newly inhibited."

https://github.com/bitcoin/bips/blob/master/bip-0133.mediawiki

Whether the problem is "small" or not depends on the disparity between node fee rates, which is not a matter of protocol. This is an existing problem that can and should be dealt with in packaging, as part of the above objective.

Packaged Transaction Relay:

One might instead think of packaging as a per-connection function, operating over its transaction (input->output) DAG and the feerate of its own node and that of the peer. Logically a "package" is nothing more than a set of transactions (optimized by announcement). Only a node can effectively determine the packaging required by each of its peers, since only the node is aware of peer.feerate.

The only way to avoid dead-ending packages (including individual transactions, as is the objective) is for a node to package txs for each peer. The origination of any package is then just a wallet peer doing what a node does - packaging transactions that satisfy peer.feerate (i.e. that of its node).

Current transaction relay (txB->txA):
===============================
Node0
txA.feerate > node.feerate, and not orphaned (accept txA)
txA.feerate > peer1.feerate (announce txA to peer1)
txA.feerate < peer2.feerate (do not announce txA to peer2)
-----
txB.feerate > node.feerate (accept txB)
txB.feerate > peer1.feerate (announce txB to peer1)
txB.feerate > peer2.feerate (announce txB to peer2)

Node1
Sees/accepts txA and txB.

Node2
Never sees txA, sees/rejects txB (as an orphan).
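
The scenario above can be replayed as a small simulation. This is a sketch under assumptions: `Node`, `Tx`, and `relay_individually` are hypothetical names, and the fee rates are made-up values chosen to match the inequalities in the diagram.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tx:
    txid: str
    feerate: float       # sat/vB, illustrative values only
    parents: tuple = ()  # txids of unconfirmed parents


@dataclass
class Node:
    name: str
    feerate: float
    pool: dict = field(default_factory=dict)
    orphans: list = field(default_factory=list)

    def receive(self, tx: Tx) -> bool:
        """Accept tx if all unconfirmed parents are known, else reject as orphan."""
        if any(p not in self.pool for p in tx.parents):
            self.orphans.append(tx.txid)
            return False
        self.pool[tx.txid] = tx
        return True


def relay_individually(tx: Tx, peers: list) -> None:
    """Current relay: each tx is judged in isolation against peer.feerate."""
    for peer in peers:
        if tx.feerate > peer.feerate:
            peer.receive(tx)


node1 = Node("node1", feerate=1.0)
node2 = Node("node2", feerate=5.0)

tx_a = Tx("A", feerate=2.0)
tx_b = Tx("B", feerate=20.0, parents=("A",))  # txB spends txA

for tx in (tx_a, tx_b):
    relay_individually(tx, [node1, node2])

print(sorted(node1.pool))  # ['A', 'B']
print(node2.orphans)       # ['B']
```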

Packaged transaction relay (txB->txA):
===============================
Node0
txA.feerate > node.feerate, and not orphaned (accept txA)
txA.feerate > peer1.feerate (announce txA to peer1)
txA.feerate < peer2.feerate (do not announce txA to peer2)
-----
txB.feerate > node.feerate (accept txB)
txB.feerate > peer1.feerate (announce txB to peer1)
txB.feerate > peer2.feerate (do not announce txB to peer2) <= avoid predictable orphan
txA.feerate + txB.feerate > peer2.feerate (announce pkg(A, B) to peer2) <= create minimal package

Node1
Sees/accepts txA and txB.

Node2
pkg(A, B) > node2.feerate (accept txA, txB)
txA.feerate > peer3.feerate (announce txA to peer3)
txB.feerate > peer3.feerate (announce txB to peer3)

Sees/accepts pkg(A, B).

Node3
Sees/accepts txA and txB. <= avoided unnecessary packaging
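
The per-peer decisions in this example can be sketched as follows. This is a minimal illustration that assumes a simple parent-first chain (as with txB->txA); `Tx`, `aggregate_feerate`, and `plan_announcements` are hypothetical names and the fee values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    txid: str
    fee: int    # satoshis
    vsize: int  # virtual bytes


def aggregate_feerate(txs) -> float:
    """Fee rate of a set: total fee over total size, not a sum of rates."""
    return sum(t.fee for t in txs) / sum(t.vsize for t in txs)


def plan_announcements(chain, peer_feerate: float):
    """Split a parent-first chain into the smallest groups the peer accepts.

    A tx below peer.feerate is held back until a descendant lifts the
    aggregate over peer.feerate; anything still held at the end is withheld.
    """
    announcements, held = [], []
    for tx in chain:
        group = held + [tx]
        if aggregate_feerate(group) > peer_feerate:
            announcements.append([t.txid for t in group])
            held = []
        else:
            held = group
    return announcements


tx_a = Tx("A", fee=400, vsize=200)   # 2 sat/vB
tx_b = Tx("B", fee=4000, vsize=200)  # 20 sat/vB, spends txA
chain = [tx_a, tx_b]

print(plan_announcements(chain, peer_feerate=1.0))   # [['A'], ['B']]
print(plan_announcements(chain, peer_feerate=5.0))   # [['A', 'B']]
print(plan_announcements(chain, peer_feerate=50.0))  # []
```

The first call corresponds to peer1 (and to Node2 relaying to peer3, where the package is trimmed back to single txs), the second to peer2.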

Summary:

In this design, any node that receives an announcement for a pkg (or tx) later determined to be less than node.feerate SHOULD drop the announcing peer. Unlike with existing tx relay, a node can become "current" and subsequently see few if any tx or pkg orphans, and MAY at some point decide to drop any peer that announces one. Notice that packages are created dynamically, and any package that doesn't need to be grouped gets trimmed down to individual transactions. Furthermore any tx that is "stuck" can be freed by simply sending another tx. The nodes at which the tx has become stuck will just package it up and relay it to peers. In other words, there is no impact on wallet implementation apart from raising the aggregate fee using a descendant transaction.

This is barely a protocol change - it's primarily implementation. All that should be required is an additional INV element type, such as MSG_TX_PACKAGE.
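
As a rough sketch of what such an announcement could look like on the wire, the following serializes an inv payload using the existing layout (a CompactSize count followed by 4-byte type and 32-byte hash entries). The numeric value of `MSG_TX_PACKAGE` is a placeholder assumption; no value is assigned by this proposal, and `inv_payload` is a hypothetical helper.

```python
import struct

MSG_TX = 1
# Hypothetical: no value has been assigned for this type; placeholder only.
MSG_TX_PACKAGE = 0x7FFF0001


def compact_size(n: int) -> bytes:
    """Bitcoin variable-length integer (CompactSize)."""
    if n < 0xFD:
        return struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def inv_payload(inv_type: int, hashes) -> bytes:
    """Serialize an inv payload: count, then (type, 32-byte hash) entries."""
    out = compact_size(len(hashes))
    for h in hashes:
        if len(h) != 32:
            raise ValueError("inventory hashes are 32 bytes")
        out += struct.pack("<I", inv_type) + h
    return out


# Parent-first order, all entries belonging to one package.
package = [bytes([1]) * 32, bytes([2]) * 32]
payload = inv_payload(MSG_TX_PACKAGE, package)
print(len(payload))  # 73 (1-byte count + 2 entries of 36 bytes)
```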

Additional constraints:

* All elements of MSG_TX_PACKAGE in one INV message MUST be of the same package.
* A package MUST define a set that can be mined into one block (size/sigops constraint).
* A package SHOULD NOT contain confirmed txs (a race may cause this).
* A package MUST minimally satisfy peer.feerate.
* A partial tx order, as in the manner of the block.txs ordering, MUST be imposed.
* A node SHOULD drop a peer that sends a package (or tx) below node.feerate.
* A node MAY drop a peer that sends a non-minimal package according to node.feerate.

The partial ordering of block.txs introduces an ordering constraint that precludes full parallelism in validating input attachment. This is an implementation artifact that made its way into consensus. However, in the case of packaging, the set of txs is not presumed to be valid under the proof of work DoS guard. As such, constraints should minimize the work/traffic required to invalidate the message. The partial order constraint ensures that the DAG can be built incrementally, dropping the attempt (and peer as desired) as soon as the first orphan is discovered. As a result the network traffic and work required is not materially different than with tx relay, with two exceptions.
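
The incremental check that the partial order makes possible can be sketched as a single forward pass. The names `Tx` and `first_orphan` are hypothetical, and `known` stands in for whatever txids the node already has in its DAG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: tuple  # txids of the transactions whose outputs are spent


def first_orphan(package, known):
    """Return the txid of the first tx with an unknown input, else None.

    `known` holds txids already in the node's DAG (confirmed or unconfirmed).
    Because parents must precede children, one forward pass is enough and
    the pass stops at the first failure.
    """
    seen = set(known)
    for tx in package:
        if any(parent not in seen for parent in tx.inputs):
            return tx.txid
        seen.add(tx.txid)
    return None


confirmed = {"C0"}
tx_a = Tx("A", inputs=("C0",))
tx_b = Tx("B", inputs=("A",))

print(first_orphan([tx_a, tx_b], confirmed))  # None
print(first_orphan([tx_b, tx_a], confirmed))  # B
```

A receiver could stop requesting or validating the rest of the package as soon as `first_orphan` returns a txid.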

These exceptions are the two central aspects of this approach (avoiding predictable orphans and creating minimal packages). Both are graph search algorithms - basic computer science. Minimality requires only that the package does not introduce txs that are not necessary to reach the peer.feerate (as these can always be packaged separately). It does not require that nodes all generate the same packages. It does not require negotiation, package identity, cryptography, or hashing. As a graph search it should be O(n), where n is the size of the unconfirmed ancestry of the package, but should typically be much lower, if not a single step.
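
One way the graph search might look is sketched below. It is an assumption-laden illustration, not a specification: `minimal_package` is a hypothetical name, `pool` stands for the node's unconfirmed tx DAG, and `peer_has` for the txids the peer is presumed to already have.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    txid: str
    fee: int             # satoshis
    vsize: int           # virtual bytes
    parents: tuple = ()  # unconfirmed parents only


def minimal_package(pool, tip, peer_has, peer_feerate):
    """Collect `tip` plus the unconfirmed ancestors the peer lacks.

    Returns txids parent-first, or None if the aggregate fee rate does
    not exceed peer_feerate. Each tx is visited at most once.
    """
    ordered, visited = [], set()
    stack = [(tip, False)]
    while stack:
        txid, expanded = stack.pop()
        if expanded:
            ordered.append(txid)  # all needed parents already emitted
            continue
        if txid in visited or txid in peer_has or txid not in pool:
            continue              # peer has it, or it is confirmed
        visited.add(txid)
        stack.append((txid, True))
        for parent in pool[txid].parents:
            stack.append((parent, False))
    if not ordered:
        return None
    fee = sum(pool[t].fee for t in ordered)
    vsize = sum(pool[t].vsize for t in ordered)
    return ordered if fee / vsize > peer_feerate else None


pool = {
    "A": Tx("A", fee=200, vsize=200),
    "B": Tx("B", fee=200, vsize=200, parents=("A",)),
    "C": Tx("C", fee=5000, vsize=200, parents=("B",)),
}
print(minimal_package(pool, "C", set(), 5.0))   # ['A', 'B', 'C']
print(minimal_package(pool, "C", {"A"}, 5.0))   # ['B', 'C']
print(minimal_package(pool, "C", set(), 10.0))  # None
```

Nothing here requires two nodes to produce the same package for the same tx; the result depends only on local state and the peer's fee rate.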

Sufficiently-low-fee nodes will see only single txs. Moderate-fee nodes may cause partial breakup of packages. Sufficiently-high-fee nodes will cause peers (having received and completed the acceptance of a tx/pkg with pkg.feerate < peer.feerate) to navigate from each tx/package external input until reaching txs that are above peer.feerate or confirmed (both of which the peer is presumed to already have). If the pkg.feerate is sufficiently high to connect all external inputs to the intervening txs, they are added to the package and it is announced to the high-fee peer. Note that an individual tx.feerate > peer.feerate is insufficient to ensure that the peer should have the tx, as there may be ancestor txs that do not satisfy peer.feerate, and for which the tx was insufficient to cause them to be packaged. So a non-caching algorithm must be able to chase each package external input to a confirmed tx (or cache the unconfirmed ancestry fee rate at each tx). Note that fee rates are not directly additive; both size/weight and fee are required for summation (and aggregate sigops should be considered).
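
A short worked example of why fee rates cannot simply be added: the aggregate rate is total fee over total size. The values and the helper name `aggregate` below are invented for illustration, and sigops are ignored.

```python
from fractions import Fraction

# (fee in satoshis, size in virtual bytes); illustrative values only
parent = (100, 1000)  # 1/10 sat/vB: large and cheap
child = (2000, 100)   # 20 sat/vB: small and expensive


def aggregate(txs) -> Fraction:
    """Aggregate fee rate: sum of fees divided by sum of sizes."""
    return Fraction(sum(fee for fee, _ in txs), sum(size for _, size in txs))


naive = Fraction(*parent) + Fraction(*child)
print(naive)                       # 201/10
print(aggregate([parent, child]))  # 21/11
```

Here the child alone clears a peer.feerate of 5 sat/vB, but the pair does not (21/11 is below 2), so the child's fee is not enough to carry its parent to that peer.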

This makes no assumptions about current implementations. The design would call for maintenance of a transaction (input->output) DAG with tx.feerate on each tx. This could be the unconfirmed tx graph (i.e. "memory pool") though it does not require maintenance of anything more than the parameters necessary to confirm a set of validated txs within a block. It is very reasonable to require this of any participating node. A simple version negotiation can identify package-accepting/sending nodes.

I have thought about this for some time, but have not implemented the graph search, source code, or BIP. I just wrote this off the top of my head, so I am sure there are some things I have incorrect or failed to consider. But I think it's worth discussing at this point.

e

> -----Original Message-----
> From: bitcoin-dev <bitcoin-dev-bounces at lists.linuxfoundation.org> On
> Behalf Of Suhas Daftuar via bitcoin-dev
> Sent: Wednesday, June 8, 2022 8:59 AM
> To: Bitcoin Protocol Discussion <bitcoin-dev at lists.linuxfoundation.org>
> Subject: Re: [bitcoin-dev] Package Relay Proposal
>
> Hi,
>
> Thanks again for your work on this!
>
> One question I have is about potential bandwidth waste in the case of nodes
> running with different policy rules. Here's my understanding of a scenario I
> think could happen:
>
> 1) Transaction A is both low-fee and non-standard to some nodes on the
> network.
> 2) Whenever a transaction T that spends A is relayed, new nodes will send
> INV(PKGINFO1, T) to all package-relay peers.
> 3) Nodes on the network that have implemented package relay, but do not
> accept A, will send getdata(PKGINFO1, T) and learn all of T's unconfirmed
> parents (~32 bytes * number of parents(T)).
> 4) Such nodes will reject T. But because of transaction malleability, and to
> avoid being blinded to a transaction unnecessarily, these nodes will likely still
> send getdata(PKGINFO1, T) to every node that announces T, in case
> someone has a transaction that includes an alternate set of parent
> transactions that would pass policy checks.
>
> Is that understanding correct? I think a good design goal would be to not
> waste bandwidth in non-adversarial situations. In this case, there would be
> bandwidth waste from downloading duplicate data from all your peers, just
> because the announcement doesn't commit to the set of parent wtxids that
> we'd get from the peer (and so we are unable to determine that all our peers
> would be telling us the same thing, just based on the announcement).
>
> Some ways to mitigate this might be to: (a) include a hash (maybe even just a
> 20-byte hash -- is that enough security?) of the package wtxids (in some
> canonical ordering) along with the wtxid of the child in the initial
> announcement; (b) limit the use of v1 packages to transactions with very few
> parents (I don't know if this is reasonable for the use cases we have in mind).
>
> Another point I wanted to bring up is about the rules around v1 package
> validation generally, and the use of a blockhash in transaction relay
> specifically. My first observation is that it won't always be the case that a v1
> package relay node will be able to validate that a set of package transactions
> is fully sorted topologically, because there may be (non-parent) ancestors
> that are missing from the package and the best a peer can validate is
> topology within the package -- this means that a peer can validly (under this
> BIP) relay transaction packages out of the true topological sort (if all
> ancestors were included).
>
> This makes me wonder how useful this topological rule is. I suppose there is
> some value in preventing completely broken implementations from staying
> connected and so there is no harm in having the rule, but perhaps it would
> be helpful to add that nodes SHOULD order transactions based on topological
> sort in the complete transaction graph, so that if missing-from-package
> ancestors are already known by a peer (which is the expected case when
> using v1 package relay on transactions that have more than one generation
> of unconfirmed ancestor) then the remaining transactions are already
> properly ordered, and this is helpful even if unenforceable in general.
>
> The other observation I wanted to make was that having transaction relay
> gated on whether two nodes agree on chain tip seems like an overly
> restrictive criteria. I think an important design principle is that we want to
> minimize disruption from network splits -- if there are competing blocks
> found in a small window of time, it's likely that the utxo set is not materially
> different on the two chains (assuming miners are selecting from roughly the
> same sets of transactions when this happens, which is typical). Having
> transaction relay bifurcate on the two network halves would seem to
> exacerbate the difference between the two sides of the split -- users ought
> to be agnostic about how benign splits are resolved and would likely want
> their transactions to relay across the whole network.
>
> Additionally, use of a chain tip might impose a larger burden than is necessary
> on software that would seek to participate in transaction relay without
> implementing headers sync/validation. I don't know what software exists on
> the network, but I imagine there are a lot of scripts out there for transaction
> submission to the public p2p network, and in thinking about modifying such a
> script to utilize package relay it seems like an unnecessary added burden to
> first learn a node's tip before trying to relay a transaction.
>
> Could you explain again what the benefit of including the blockhash is? It
> seems like it is just so that a node could prioritize transaction relay from
> peers with the same chain tip to maximize the likelihood of transaction
> acceptance, but in the common case this seems like a pretty negligible
> concern, and in the case of a chain fork that persists for many minutes it
> seems better to me that we not partition the network into package-relay
> regimes and just risk a little extra bandwidth in one direction or the other. If
> we solve the problem I brought up at the beginning (of de-duplicating
> package data across peers with a package-wtxid-commitment in the
> announcement), I think this is just some wasted pkginfo bandwidth on a
> single-link, and not across links (as we could cache validation failure for a
> package-hash to avoid re-requesting duplicate pkginfo1 messages).
>
> Best,
> Suhas
>
>
> On Tue, Jun 7, 2022 at 1:57 PM Gloria Zhao via bitcoin-dev
> <bitcoin-dev at lists.linuxfoundation.org> wrote:
>
>
> Hi Eric, aj, all,
>
> Sorry for the delayed response. @aj I'm including some paraphrased
> points from our offline discussion (thanks).
>
>
> > Other idea: what if you encode the parent txs as a short hash of the
> > wtxid (something like bip152 short ids? perhaps seeded per peer so collisions
> > will be different per peer?) and include that in the inv announcement?
> > Would that work to avoid a round trip almost all of the time, while still giving
> > you enough info to save bw by deduping parents?
>
> > As I suggested earlier, a package is fundamentally a compact block (or
> > block) announcement without the header. Compact block (BIP152) announcement
> > is already well-defined and widely implemented...
>
> > Let us not reinvent the wheel and/or introduce accidental complexity. I see
> > no reason why packaging is not simply BIP152 without the 'header' field, an
> > updated protocol version, and the following sort of changes to names
>
> Interestingly, "why not use BIP 152 shortids to save bandwidth?" is
> by far the most common suggestion I hear (including offline feedback).
> Here's a full explanation:
>
> BIP 152 shortens transaction hashes (32 bytes) to shortids (6 bytes)
> to save a significant amount of network bandwidth, which is extremely
> important in block relay. However, this comes at the expense of
> computational complexity. There is no way to directly calculate a transaction
> hash from a shortid; upon receipt of a compact block, a node is expected to
> calculate the shortids of every unconfirmed transaction it knows about to
> find the matches (BIP 152: [1], Bitcoin Core: [2]). This is expensive but
> appropriate for block relay, since the block must have a valid Proof of Work
> and new blocks only come every ~10 minutes. On the other hand, if we
> require nodes to calculate shortids for every transaction in their mempools
> every time they receive a package, we are creating a DoS vector.
> Unconfirmed transactions don't need PoW and, to have a live transaction
> relay network, we should expect nodes to handle transactions at a high-ish
> rate (i.e. at least 1000s of times more transactions than blocks). We can't
> pre-calculate or cache shortids for mempool transactions, since the SipHash key
> depends on the block hash and a per-connection salt.
>
> Additionally, shortid calculation is not designed to prevent intentional
> individual collisions. If we were to use these shortids to deduplicate
> transactions we've supposedly already seen, we may have a censorship
> vector. Again, these tradeoffs make sense for compact block relay (see
> shortid section in BIP 152 [3]), but not package relay.
>
> TLDR: DoSy if we calculate shortids on every package and censorship
> vector if we use shortids for deduplication.
>
> > Given this message there is no reason
> > to send a (potentially bogus) fee rate with every package. It can only be
> > validated by obtaining the full set of txs, and the only recourse is
> > dropping (etc.) the peer, as is the case with single txs.
>
>
> Yeah, I agree with this. Combined with the previous discussion with
> aj (i.e. we can't accurately communicate the incentive compatibility of a
> package without sending the full graph, and this whole dance is to avoid
> downloading a few low-fee transactions in uncommon edge cases), I've
> realized I should remove the fee + weight information from pkginfo. Yay for
> less complexity!
>
>
> Also, this might be pedantic, but I said something incorrect earlier
> and would like to correct myself:
>
> >> In theory, yes, but maybe it was announced earlier (while our
> >> node was down?) or had dropped from our mempool or similar, either way
> >> we don't have those txs yet.
>
> I said "It's fine if they have Erlay, since a sender would know in
> advance that B is missing and announce it as a package." But this isn't true
> since we're only using reconciliation in place of flooding to announce
> transactions as they arrive, not for rebroadcast, and we're not doing full
> mempool set reconciliation. In any case, making sure a node receives the
> transactions announced when it was offline is not something we guarantee,
> not an intended use case for package relay, and not worsened by this.
>
> Thanks for your feedback!
>
> Best,
>
> Gloria
>
> [1]: https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#cmpctblock
> [2]: https://github.com/bitcoin/bitcoin/blob/master/src/blockencodings.cpp#L49
> [3]: https://github.com/bitcoin/bips/blob/master/bip-0152.mediawiki#short-transaction-id-calculation
>
> On Thu, May 26, 2022 at 3:59 AM <eric at voskuil.org> wrote:
>
>
> Given that packages have no header, the package requires identity in a
> BIP152 scheme. For example 'header' and 'blockhash' fields can be replaced
> with a Merkle root (e.g. "identity" field) for the package, uniquely
> identifying the partially-ordered set of txs. And use of 'getdata' (to
> obtain a package by hash) can be eliminated (not a use case).
>
> e
>
> > -----Original Message-----
> > From: eric at voskuil.org <eric at voskuil.org>
> > Sent: Wednesday, May 25, 2022 1:52 PM
> > To: 'Anthony Towns' <aj at erisian.com.au>; 'Bitcoin Protocol Discussion'
> > <bitcoin-dev at lists.linuxfoundation.org>; 'Gloria Zhao'
> > <gloriajzhao at gmail.com>
> > Subject: RE: [bitcoin-dev] Package Relay Proposal
> >
> > > From: bitcoin-dev <bitcoin-dev-bounces at lists.linuxfoundation.org> On Behalf
> > > Of Anthony Towns via bitcoin-dev
> > > Sent: Wednesday, May 25, 2022 11:56 AM
> >
> > > So the other thing is what happens if the peer announcing packages to us is
> > > dishonest?
> > >
> > > They announce pkg X, say X has parents A B C and the fee rate is garbage. But
> > > actually X has parent D and the fee rate is excellent. Do we request the
> > > package from another peer, or every peer, to double check? Otherwise we're
> > > allowing the first peer we ask about a package to censor that tx from us?
> > >
> > > I think the fix for that is just to provide the fee and weight when announcing
> > > the package rather than only being asked for its info? Then if one peer makes
> > > it sound like a good deal you ask for the parent txids from them, dedupe,
> > > request, and verify they were honest about the parents.
> >
> > Single tx broadcasts do not carry an advertised fee rate, however the
> > 'feefilter' message (BIP133) provides this distinction. This should be
> > interpreted as applicable to packages. Given this message there is no reason
> > to send a (potentially bogus) fee rate with every package. It can only be
> > validated by obtaining the full set of txs, and the only recourse is
> > dropping (etc.) the peer, as is the case with single txs. Relying on the
> > existing message is simpler, more consistent, and more efficient.
> >
> > > >> Is it plausible to add the graph in?
> > >
> > > Likewise, I think you'd have to have the graph info from many nodes if you're
> > > going to make decisions based on it and don't want hostile peers to be able to
> > > trick you into ignoring txs.
> > >
> > > Other idea: what if you encode the parent txs as a short hash of the wtxid
> > > (something like bip152 short ids? perhaps seeded per peer so collisions will
> > > be different per peer?) and include that in the inv announcement? Would
> > > that work to avoid a round trip almost all of the time, while still giving you
> > > enough info to save bw by deduping parents?
> >
> > As I suggested earlier, a package is fundamentally a compact block (or
> > block) announcement without the header. Compact block (BIP152) announcement
> > is already well-defined and widely implemented. A node should never be
> > required to retain an orphan, and BIP152 ensures this is not required.
> >
> > Once a validated set of txs within the package has been obtained with
> > sufficient fee, a fee-optimal node would accept the largest subgraph of the
> > package that conforms to fee constraints and drop any peer that provides a
> > package for which the full graph does not.
> >
> > Let us not reinvent the wheel and/or introduce accidental complexity. I see
> > no reason why packaging is not simply BIP152 without the 'header' field, an
> > updated protocol version, and the following sort of changes to names:
> >
> > sendpkg
> > MSG_CMPCT_PKG
> > cmpctpkg
> > getpkgtxn
> > pkgtxn
> >
> > > > For a maximum 25 transactions,
> > > >23*24/2 = 276, seems like 36 bytes for a child-with-parents package.
> > >
> > > If you're doing short ids that's maybe 25*4B=100B already, then the above is
> > > up to 36% overhead, I guess. Might be worth thinking more about, but maybe
> > > more interesting with ancestors than just parents.
> > >
> > > >Also side note, since there are no size/count params,
> >
> > Size is restricted in the same manner as block and transaction broadcasts,
> > by consensus. If the fee rate is sufficient there would be no reason to
> > preclude any valid size up to what can be mined in one block (packaging
> > across blocks is not economically rational under the assumption that one
> > miner cannot expect to mine multiple blocks in a row). Count is incorporated
> > into BIP152 as 'shortids_length'.
> >
> > > > wondering if we
> > > >should just have "version" in "sendpackages" be a bit field instead of
> > > >sending a message for each version. 32 versions should be enough right?
> >
> > Adding versioning to individual protocols is just a reflection of the
> > insufficiency of the initial protocol versioning design, and that of the
> > various ad-hoc changes to it (including yet another approach in this
> > proposal) that have been introduced to compensate for it, though I'll
> > address this in an independent post at some point.
> >
> > Best,
> > e
> >
> > > Maybe but a couple of messages per connection doesn't
> really seem worth
> > > arguing about?
> > >
> > > Cheers,
> > > aj
> > >
> > >
> > > --
> > > Sent from my phone.
> > >
> > > > _______________________________________________
> > > > bitcoin-dev mailing list
> > > > bitcoin-dev at lists.linuxfoundation.org
> > > > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Author Public Key
npub1r34khxrz9w39zpzezymqz04dcel95adfxf6qpjul9wdv2qn5vtps06s8vu