Hugo Nguyen [ARCHIVE] /
npub1jqx…jqeg
2023-06-07 23:13:38

📅 Original date posted: 2022-09-21
📝 Original message: Hello Craig,
Thank you for putting this proposal together. It is indeed another big
missing piece of the puzzle.

I would like to echo some of the comments already made by others (and you
yourself) on this thread, that this proposal seems to have some inherent
conflicts between the 2 goals it tries to achieve.

> *Allowing users to import and export their labels in a standardized way
ensures that they do not experience lock-in to a particular wallet
application. As a secondary goal, by using common formats this BIP seeks to
make manual or bulk management of labels accessible to users outside of
wallet applications and without specific technical expertise.*

IMHO, these conflicts exist because the first one is an
engineering requirement, while the second one is a UX / product requirement.

Engineering requirements typically prioritize data integrity,
reliability/robustness and performance. Do we want some sort of error
detection / correction codes? What data format would be the most robust and
least error-prone? Is CSV a good fit or not for this purpose? etc.
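
To make the error-detection question concrete, here is a minimal sketch of one possible approach. It is an illustration only: the `#sha256` trailer line and both function names are hypothetical and appear nowhere in the draft, and a digest like this detects corruption but cannot correct it.

```python
import hashlib


def add_digest(csv_text: str) -> str:
    """Append a trailer line carrying the SHA-256 of the preceding text.

    Assumes csv_text ends with a newline. The "#sha256" trailer is a
    hypothetical convention, not part of the proposed BIP.
    """
    digest = hashlib.sha256(csv_text.encode("utf-8")).hexdigest()
    return f"{csv_text}#sha256,{digest}\n"


def verify_digest(export_text: str) -> bool:
    """Return True if the trailer digest matches the body above it."""
    body, sep, trailer = export_text.rstrip("\n").rpartition("\n")
    if not sep or not trailer.startswith("#sha256,"):
        return False  # no trailer present, nothing to verify against
    expected = trailer.split(",", 1)[1]
    actual = hashlib.sha256((body + "\n").encode("utf-8")).hexdigest()
    return actual == expected
```

Note the tension this exposes: any manual edit in a spreadsheet invalidates the digest, which is exactly the kind of trade-off between robustness and convenience at issue here.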

UX requirements, on the other hand, typically prioritize convenience and
ease of use.

When we don't separate these concerns it can backfire and we might end up
with a Frankenstein standard that is the worst of both worlds. That is: not
quite robust in engineering terms, but also not quite user-friendly in
product terms either.

SLIP-132 is one such example. It tries to solve what are inherently
engineering challenges (how to manage the complexities that arose due to
the evolution of keys and scripts) by sadly offloading those complexities
onto the end users. The end result is user confusion (what kind of [?]PUB
do I need here?) and a nightmare for engineers to maintain (the
complexities are better managed via a high level language such as Output
Descriptors).
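
As a rough illustration of the difference, the sketch below turns a SLIP-132 style key into an output descriptor, so that the script type is stated by the descriptor rather than hidden in the key prefix. The version bytes are the mainnet single-sig values from SLIP-132. The function names, the `/0/*` receive-chain suffix and the choice of `pkh` for a plain xpub (which SLIP-132 leaves ambiguous) are assumptions made for the example. A real descriptor would normally also carry key origin information and a checksum.

```python
import hashlib

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
XPUB_VERSION = bytes.fromhex("0488b21e")

# SLIP-132 mainnet public version bytes -> descriptor template.
# Illustrative and incomplete: single-sig types only.
SLIP132_TEMPLATES = {
    bytes.fromhex("0488b21e"): "pkh({key}/0/*)",       # xpub (assumed P2PKH)
    bytes.fromhex("049d7cb2"): "sh(wpkh({key}/0/*))",  # ypub
    bytes.fromhex("04b24746"): "wpkh({key}/0/*)",      # zpub
}


def _checksum(payload: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def b58check_decode(text: str) -> bytes:
    """Decode Base58Check and return the payload without the checksum."""
    n = 0
    for ch in text:
        n = n * 58 + B58.index(ch)
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    raw = b"\x00" * (len(text) - len(text.lstrip("1"))) + raw
    payload, checksum = raw[:-4], raw[-4:]
    if _checksum(payload) != checksum:
        raise ValueError("bad Base58Check checksum")
    return payload


def b58check_encode(payload: bytes) -> str:
    """Encode a payload as Base58Check."""
    raw = payload + _checksum(payload)
    n = int.from_bytes(raw, "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = B58[r] + out
    return "1" * (len(raw) - len(raw.lstrip(b"\x00"))) + out


def to_descriptor(extended_key: str) -> str:
    """Re-encode the key with xpub version bytes and wrap it in a descriptor."""
    payload = b58check_decode(extended_key)
    version, rest = payload[:4], payload[4:]
    template = SLIP132_TEMPLATES.get(version)
    if template is None:
        raise ValueError("unsupported extended key version")
    return template.format(key=b58check_encode(XPUB_VERSION + rest))
```

With this shape, the user only ever handles one kind of extended key, and the wallet software decides and records the script type.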

Keeping this in mind, I also think having 2 separate BIPs for this is
better.

Cheers,
Hugo




On Mon, Aug 29, 2022 at 4:26 AM Craig Raw via bitcoin-dev <
bitcoin-dev at lists.linuxfoundation.org> wrote:

> Thanks for your feedback @Ali.
>
> I am attempting to achieve two goals with this proposal, primarily for the
> benefit of wallet users:
>
> Goal #1. Transfer labels between different wallet implementations
> Goal #2. Manage labels in applications outside of Bitcoin wallets (such as
> Excel)
>
> Much of the feedback so far has indicated the tension between these two
> goals - it may be that it is too difficult to achieve both, in which case
> Goal #1 is the most important. That said, I think further exploration is
> still necessary before abandoning Goal #2, because removing it would
> significantly reduce the value of this proposal and mean users need to rely
> on application-specific workarounds.
>
> > it is important that a version byte is defined
> If Goal #2 is to be achieved it's difficult to mandate this, particularly
> if one requires bit flags to be set. Should an importing wallet fail to
> import if the version byte is not present, even if all the data is
> otherwise correct? Although it is difficult to know in advance how a format
> may be extended, it is certainly possible to extend this format with
> additional types where the nature of hashes serves as unique identifiers
> (more on this below).
>
> > Don't mandate the file extension... There is no way to enforce this on
> a BIP level.
> I'm not quite sure what you mean here - for example BIP174, which is
> widely used, states "Binary PSBT files should use the .psbt file
> extension." Also, this contradicts Goal #2 - Excel and Numbers register as
> handlers for .csv, and so make it clear that the file is editable outside
> of a wallet.
>
> > ZIP does not have good performance or compression ratio
> Indeed, but it is very widely available. That said, gzip is supported
> widely too these days. Unfortunately, gzip does not offer encryption (see
> next answer).
>
> > ZIP is an archiving format, that happens to have its own compression
> format.
> I agree this is not ideal. My main reason for choosing ZIP was that it
> supports encryption. It seems to me that without considering encryption, an
> application must create label export files that allow privacy-sensitive
> wallet information to be readable in plain text. Being able to transfer
> labels without risking privacy is IMO valuable. I considered other
> encryption formats such as PGP, but they are much more niche and so again
> contradict Goal #2.
>
> > I don't see the benefit of encrypting addresses and labels together...
> additionally, the password you propose is insecure - anybody with access to
> the wallet can unlock it
> I'm not sure I understand your question, but both wallet addresses and
> wallet labels contain privacy-sensitive information that should be
> protected. With regard to the password, there is actually a more fundamental
> problem with using the wallet xpub - there is no equivalent for multisig
> wallets. For this reason I'll remove that requirement in future iterations.
>
> > Why the need for input and output formats? There is no difference
> between them on the wallet level, because they are always identified with a
> txid and output index.
> The input refers to the txid and the input index (in the set of vin), so
> the difference is the context in which they are displayed. A wallet will
> not necessarily store the spent outputs for a funding transaction
> containing a UTXO coming into the wallet, but it will contain references to
> the inputs as part of that transaction.
>
> > Another important point is that practically nobody labels inputs or
> outputs
> To the contrary, UTXOs are very frequently labelled, as they link and
> reveal information when spent. Inputs are much less frequently labelled,
> but there is no particular reason to exclude them.
>
> > there is a net benefit for the addresses to be exported in ascending
> order
> Indeed, but it makes achieving Goal #2 much more difficult for marginal
> benefit.
>
> > It's better to mandate that they should always be double-quoted, since
> only wallets will generate label exports anyway.
> Rather I think it's better to mandate RFC4180 is followed, as per
> recommendations in other feedback.
>
> > The importing code is too naive... it should utilize a dedicate item
> type field that unambiguously identifies the item
> It's unclear to me what you mean here. As I've indicated it is currently
> possible to disambiguate between addresses/transactions/etc without the
> need for a 3rd column, but in any case the hash functions used ensure that
> labels will not be associated incorrectly. Even in the unlikely event of
> some future address type being indistinguishable from a txid, it will
> simply not match any txids in the wallet.
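
For reference, the two-column disambiguation described in this reply can be written out as a small function. This is a direct Python translation of the draft's pseudocode, offered as a sketch. The function name and the returned strings are made up for the example.

```python
def classify_reference(reference: str) -> str:
    """Classify a reference using only its length and separator character.

    Mirrors the draft's pseudocode: anything shorter than a 64-character
    txid is treated as an address, exactly 64 characters as a txid, and
    longer references as an input ('<') or output ('>' or ':').
    """
    reference = reference.strip()
    if len(reference) < 64:
        return "address"
    if len(reference) == 64:
        return "transaction"
    if "<" in reference:
        return "input"
    return "output"
```

Applied to the draft's test vectors, `classify_reference` returns "address" for `1A69TXnEM2ms9fMaY9UuiJ7415X7xZaUSg` and "input" for the txid followed by `<0`. It performs no validation, so a malformed reference is still assigned a type and would simply fail to match anything in the wallet.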
>
> Craig
>
>
>
> On Wed, Aug 24, 2022 at 9:10 PM Ali Sherief <ali at notatether.com> wrote:
>
>> Hi Craig,
>>
>> This is a really good proposal. I studied your BIP and I have some feedback
>> on some parts of it.
>>
>> > The first line in the file is a header, and should be ignored on import.
>>
>> From past experience and lessons, most notably BIP39, it is important
>> that a version byte is defined somewhere in case someone wants to extend
>> the format in the future. Currently there is no version byte that someone
>> can increment to signal an extension. In the unique case of CSV files,
>> you should make the header line mandatory (I see you have already implied
>> this, but you should make it explicit in the BIP), but instead of a line
>> of column names such as Reference,Label, I suggest you make the
>> format like this:
>>
>> BIP-wallet-labels,<version>
>>
>> Since there are two columns per record, this works out nicely. The first
>> column can be the name of the BIP - BIPxxxx where the x's are numbers, and
>> the second column can be an unsigned 32-bit integer (most significant 8
>> bits reserved for version, the remaining for flags, or perhaps the entirety
>> for version - but I recommend leaving at least some bits for flags, even if
>> they all end up being just "reserved").
>>
>> You should make importing fail if the header line is not exactly as
>> specified - or as appropriate, should you decide on a different format for
>> the header.
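
The header suggestion above could be parsed along the following lines. This is a sketch under assumptions: it takes the literal name `BIP-wallet-labels` from the example line, and it picks the variant where the top 8 bits of the 32-bit value are the version and the lower 24 bits are flags. The function name is hypothetical.

```python
from typing import Tuple

HEADER_NAME = "BIP-wallet-labels"  # placeholder name used in the thread


def parse_header(line: str) -> Tuple[int, int]:
    """Parse 'BIP-wallet-labels,<uint32>' into (version, flags).

    Raises ValueError if the line is not exactly in the expected form,
    so that an importer can refuse the file.
    """
    name, sep, value = line.strip().partition(",")
    if name != HEADER_NAME or not sep:
        raise ValueError("missing or unrecognized header line")
    if not (value.isascii() and value.isdigit()):
        raise ValueError("version field must be an unsigned decimal integer")
    word = int(value)
    if word > 0xFFFFFFFF:
        raise ValueError("version field does not fit in 32 bits")
    version = word >> 24         # most significant 8 bits
    flags = word & 0x00FFFFFF    # remaining 24 bits
    return version, flags
```

For example, the value 16777217 (0x01000001) decodes to version 1 with the lowest flag bit set, which also shows how awkward such a field is to edit by hand in a spreadsheet.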
>>
>> > Files exported should use the <tt>.csv</tt> file extension.
>> Don't mandate the file extension (read below for why):
>>
>> > In order to reduce file size while retaining wide accessibility, the CSV
>> > file may be compressed using the ZIP file format, using the <tt>.zip</tt>
>> > file extension.
>> I see three problems with this. The first is more important than the
>> latter two because it makes them moot points, but I'll mention them anyway
>> so you get a background of the situation:
>> - The BIP is trying to specify the file format in which the export is
>> written to the filesystem. There is no way to enforce this on a BIP
>> level (besides, Unix operating systems don't even consider the file
>> extension, they use its mimetype). Also specifying this in the BIP will
>> prevent modular "Layer 2" protocols and schemes from encoding the export
>> labels into another format - for example Base64 or with their own
>> compression algorithm.
>>
>> Now for the two "moot problems":
>> - ZIP does not have good performance or compression ratio, there are
>> better algorithms out there like gzip (which also happens to be more
>> ubiquitous; nearly all websites are serving HTML compressed with gzip
>> compression).
>> - ZIP is an archiving format, that happens to have its own compression
>> format. Archiving format parsers can have serious vulnerabilities in their
>> implementation that can allow malware to swipe private keys and passwords,
>> since the primary target for this BIP is wallets. For example, there was
>> Zip Slip[1] in 2018, which allows for remote code execution. So the malware
>> can even hide in memory until private keys or passwords are written to
>> memory, then send them across the network. Assuming it's targeting a
>> specific wallet software it's not hard to carry out at all.
>>
>> There are two solutions for all this:
>> 1. The duct-tape solution: Use some compression algorithm like gzip
>> instead of ZIP archive format.
>> 2. The "throw it out and buy a new one" solution: Get rid of the optional
>> compression specs altogether, because users are responsible for supplying
>> the export labels in the first place, so all the compression stuff is
>> redundant and should be left up to the user if they desire it.
>>
>> I prefer the second solution because it addresses the problem
>> directly instead of putting duct tape on it like the first one.
>>
>> > This <tt>.zip</tt> file may optionally be encrypted using either AES-128 or
>> > AES-256 encryption, which is supported by numerous applications including
>> > Winzip and 7-zip.
>> > The textual representation of the wallet's extended public key (as defined
>> > by BIP32, with an <tt>xpub</tt> header) should be used as the password.
>> Not specific to AES, but I don't see the benefit of encrypting addresses
>> and labels together. Can you please elaborate why this would be desirable?
>>
>> Like I said though, it's better to leave it up to users to decide how to
>> store their exports, since BIPs can't enforce that anyway (additionally,
>> the password you propose is insecure - anybody with access to the wallet
>> can unlock it, which is not desirable to some users who want their own
>> security).
>>
>> > * Transaction ID (<tt>txid</tt>)
>> > * Address
>> > * Input (rendered as <tt>txid<index</tt>)
>> > * Output (rendered as <tt>txid>index</tt> or <tt>txid:index</tt>)
>> Why the need for input and output formats? There is no difference between
>> them on the wallet level, because they are always identified with a txid
>> and output index. To distinguish between them and hence write them with the
>> correct format would require a UTXO set and thus access to a full node,
>> otherwise the CSV cannot be verified to be completely well-formed.
>>
>> Another important point is that practically nobody labels inputs or
>> outputs because most people do not know that those things even exist, and
>> the rest don't bother to label them.
>>
>> But the biggest downside to including them is related to the problem of
>> information leaking which you make reference to here:
>> > In both cases, care must be taken when spending to avoid undesirable leaks
>> > of private information.
>> A CSV dump that has inputs/outputs and addresses mixed together can be
>> used to infer the owner of all those items. In fact, a CSV label dump is
>> basically a personal information store, so everything in it can be
>> correlated as coming from the same wallet. It is therefore important that
>> unnecessary types are kept out of the format. People are known to leave
>> files lying around on their computer that they don't need anymore, so these
>> files can find their way via telemetry to surveillance entities. While we
>> can't specify what users can do with their exports, we can control the
>> information leak by preventing certain types of items that we know most
>> users will never use from being exported in the first place.
>>
>> > The order in which these records appear is not defined.
>> Again, since the primary use case for this BIP is wallets, which likely
>> use hierarchical derivation schemes like BIP44, there is a net benefit for
>> the addresses to be exported in ascending order of their `address_type`. It
>> means that wallets can import them in O(n) time as opposed to the O(n^2)
>> time spent serially checking at which index each address appears. Of course,
>> this implies that all addresses up to a certain index have to be exported
>> into the CSV as well, but most wallets I know of, such as Core and Electrum,
>> already store addresses like that.
>>
>> Also if you do this, you will need to group all the transaction records
>> before the address records or vice versa - you can use lexicographical
>> sorting if you want (i.e. Addresses before Transactions). The benefit of
>> this separation of parts is that wallets can split the imported address
>> records from the transaction records internally, and feed them to separate
>> functions which set these labels internally.
>>
>> If you decide on doing it this way, then you need a 3rd column to
>> identify the item type, and also you should quote the label (see below). I
>> strongly recommend using numbers for identification as opposed to character
>> strings, so you don't have to worry about localization or character case
>> issues. There is always one unique number, but there could be multiple
>> strings that reference the same type. This will complicate importing
>> functions.
>>
>> If you insist on including Input and Output types then they can both be
>> specified as <txid>:<index> if you make this change. They won't be used to
>> determine the type anyway.
>>
>> > The fields may be quoted, but this is unnecessary, as the first comma in
>> > the line will always be the delimiter.
>> Don't implement it like that, because that will break CSV parsers which
>> expect a fixed number of columns in each record (2 in the header, while
>> some records would have more than 2). It's better to mandate that they
>> should always be double-quoted, since only wallets will generate label
>> exports anyway. If you plan to use headers then the 3rd column can be blank
>> for it (or you can split the version and flags from each other).
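
The quoting concern can be illustrated with Python's standard `csv` module, which handles embedded commas and double quotes in the RFC 4180 manner when every field is quoted. This is a sketch only. The function names are hypothetical and the `Reference,Label` header is the one from the draft.

```python
import csv
import io


def export_labels(records):
    """Write (reference, label) pairs with every field double-quoted."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
    writer.writerow(["Reference", "Label"])
    writer.writerows(records)
    return buf.getvalue()


def import_labels(text):
    """Read the pairs back, rejecting records without exactly two fields."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader, None)  # skip the header line
    records = []
    for line_no, row in enumerate(reader, start=2):
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 fields, got {len(row)}")
        records.append((row[0], row[1]))
    return records
```

A label such as `Rent, "March"` is written as `"Rent, ""March"""` and survives a round trip, whereas a reader that splits on the first comma and a generic CSV parser would disagree about it if the field were left unquoted.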
>>
>> > ==Importing==
>> >
>> > When importing, a naive algorithm may simply match against any reference,
>> > but it is possible to disambiguate between transactions, addresses, inputs
>> > and outputs.
>> > For example in the following pseudocode:
>> > <pre>
>> >   if reference length < 64
>> >     Set address label
>> >   else if reference length == 64
>> >     Set transaction label
>> >   else if reference contains '<'
>> >     Set input label
>> >   else
>> >     Set output label
>> > </pre>
>> The importing code is too naive and in its current form will prevent the
>> BIP from getting a number. It is perhaps the single most important part of
>> a BIP. When implementing an importer, it should utilize a dedicated item
>> type field that unambiguously identifies the item. So the naive importer is
>> not good; you need to use a 3rd column for that like I explained above, so
>> that the importer becomes robust.
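
A sketch of what an importer with an explicit item type column might look like follows. The numeric codes, the `ItemType` enum, the `Reference,Label,Type` column order and the function name are all assumptions for illustration. The thread only proposes that a numeric 3rd column exist.

```python
import csv
import io
from enum import IntEnum


class ItemType(IntEnum):
    """Hypothetical numeric codes for the 3rd column."""
    ADDRESS = 0
    TRANSACTION = 1
    INPUT = 2
    OUTPUT = 3


def import_typed_labels(text):
    """Return {ItemType: {reference: label}} from a three-column export."""
    labels = {item_type: {} for item_type in ItemType}
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader, None)  # skip the header line
    for line_no, row in enumerate(reader, start=2):
        if len(row) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(row)}")
        reference, label, type_code = row
        try:
            item_type = ItemType(int(type_code))
        except ValueError:
            raise ValueError(
                f"line {line_no}: unknown item type {type_code!r}"
            ) from None
        labels[item_type][reference] = label
    return labels
```

Because the type is read from the file rather than inferred from the reference, an unknown code fails loudly instead of being silently treated as an output.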
>>
>> In summary (exclamation marks indicate severity - one means low, two
>> means medium, and three means high):
>>
>> 1. Convert the header into a version line with optional flags, otherwise
>> nobody can extend this format without compatibility issues (!)
>> 2. Get rid of the specs related to file compression (!!!)
>> 3. Add a 3rd column for item type (address, transaction etc.) preferably
>> as numeric constants and grouping items of one type after items of another
>> type, or if you insist on strings, then only recognize their Titlecase
>> ASCII versions <spreadsheet software like Excel always tries to titlecase
>> the words> (!!)
>> 4. Require double quotes around the label (or single quotes if you
>> prefer, as long as spreadsheet software doesn't choke on them) (!!)
>> 5. Require sorting the records according to the order they are stored in
>> the wallet implementation. (!)
>> 6. Consider getting rid of Input and Output item types. (!)
>> 7. And last and most importantly, please write a more robust importer
>> algorithm in the example given by the BIP, because code in BIPs is
>> frequently used as a reference for software. (!!!)
>>
>> I hope you will consider these points in future revisions of your BIP.
>>
>> - Ali
>>
>> [1] https://github.com/snyk/zip-slip-vulnerability
>>
>> On Wed, 24 Aug 2022 11:18:43 +0200, craigraw at gmail.com wrote:
>> > Hi all,
>> >
>> > I would like to propose a BIP that specifies a format for the export and
>> > import of labels from a wallet. While transferring access to funds across
>> > wallet applications has been made simple through standards such as BIP39,
>> > wallet labels remain siloed and difficult to extract despite their value,
>> > particularly in a privacy context.
>> >
>> > The proposed format is a simple two column CSV file, with the reference to
>> > a transaction, address, input or output in the first column, and the label
>> > in the second column. CSV was chosen for its wide accessibility, especially
>> > to users without specific technical expertise. Similarly, the CSV file may
>> > be compressed using the ZIP format, and optionally encrypted using AES.
>> >
>> > The full text of the BIP can be found at
>> > https://github.com/craigraw/bips/blob/master/bip-wallet-labels.mediawiki
>> > and also copied below.
>> >
>> > Feedback is appreciated.
>> >
>> > Thanks,
>> > Craig Raw
>> >
>> > ---
>> >
>> > <pre>
>> > BIP: wallet-labels
>> > Layer: Applications
>> > Title: Wallet Labels Export Format
>> > Author: Craig Raw <craig at sparrowwallet.com>
>> > Comments-Summary: No comments yet.
>> > Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-wallet-labels
>> > Status: Draft
>> > Type: Informational
>> > Created: 2022-08-23
>> > License: BSD-2-Clause
>> > </pre>
>> >
>> > ==Abstract==
>> >
>> > This document specifies a format for the export of labels that may be
>> > attached to the transactions, addresses, inputs and outputs in a wallet.
>> >
>> > ==Copyright==
>> >
>> > This BIP is licensed under the BSD 2-clause license.
>> >
>> > ==Motivation==
>> >
>> > The export and import of funds across different Bitcoin wallet applications
>> > is well defined through standards such as BIP39, BIP32, BIP44 etc.
>> > These standards are well supported and allow users to move easily between
>> > different wallets.
>> > There is, however, no defined standard to transfer any labels the user may
>> > have applied to the transactions, addresses, inputs or outputs in their
>> > wallet.
>> > The UTXO model that Bitcoin uses makes these labels particularly valuable
>> > as they may indicate the source of funds, whether received externally or as
>> > a result of change from a prior transaction.
>> > In both cases, care must be taken when spending to avoid undesirable leaks
>> > of private information.
>> > Labels provide valuable guidance in this regard, and have even become
>> > mandatory when spending in several Bitcoin wallets.
>> > Allowing users to export their labels in a standardized way ensures that
>> > they do not experience lock-in to a particular wallet application.
>> > In addition, by using common formats, this BIP seeks to make manual or bulk
>> > management of labels accessible to users without specific technical
>> > expertise.
>> >
>> > ==Specification==
>> >
>> > In order to make the import and export of labels as widely accessible as
>> > possible, this BIP uses the comma separated values (CSV) format, which is
>> > widely supported by consumer, business, and scientific applications.
>> > Although the technical specification of CSV in RFC4180 is not always
>> > followed, the application of the format in this BIP is simple enough that
>> > compatibility should not present a problem.
>> > Moreover, the simplicity and forgiving nature of CSV (over for example
>> > JSON) lends itself well to bulk label editing using spreadsheet and text
>> > editing tools.
>> >
>> > A CSV export of labels from a wallet must be a UTF-8 encoded text file,
>> > containing one record per line, with records containing two fields
>> > delimited by a comma.
>> > The fields may be quoted, but this is unnecessary, as the first comma in
>> > the line will always be the delimiter.
>> > The first line in the file is a header, and should be ignored on import.
>> > Thereafter, each line represents a record that refers to a label applied in
>> > the wallet.
>> > The order in which these records appear is not defined.
>> >
>> > The first field in the record contains a reference to the transaction,
>> > address, input or output in the wallet.
>> > This is specified as one of the following:
>> > * Transaction ID (<tt>txid</tt>)
>> > * Address
>> > * Input (rendered as <tt>txid<index</tt>)
>> > * Output (rendered as <tt>txid>index</tt> or <tt>txid:index</tt>)
>> >
>> > The second field contains the label applied to the reference.
>> > Exporting applications may omit records with no labels or labels of zero
>> > length.
>> > Files exported should use the <tt>.csv</tt> file extension.
>> >
>> > In order to reduce file size while retaining wide accessibility, the CSV
>> > file may be compressed using the ZIP file format, using the <tt>.zip</tt>
>> > file extension.
>> > This <tt>.zip</tt> file may optionally be encrypted using either AES-128 or
>> > AES-256 encryption, which is supported by numerous applications including
>> > Winzip and 7-zip.
>> > In order to ensure that weak encryption does not proliferate, importers
>> > following this standard must refuse to import <tt>.zip</tt> files encrypted
>> > with the weaker Zip 2.0 standard.
>> > The textual representation of the wallet's extended public key (as defined
>> > by BIP32, with an <tt>xpub</tt> header) should be used as the password.
>> >
>> > ==Importing==
>> >
>> > When importing, a naive algorithm may simply match against any reference,
>> > but it is possible to disambiguate between transactions, addresses, inputs
>> > and outputs.
>> > For example in the following pseudocode:
>> > <pre>
>> >   if reference length < 64
>> >     Set address label
>> >   else if reference length == 64
>> >     Set transaction label
>> >   else if reference contains '<'
>> >     Set input label
>> >   else
>> >     Set output label
>> > </pre>
>> >
>> > Importing applications may truncate labels if necessary.
>> >
>> > ==Test Vectors==
>> >
>> > The following fragment represents a wallet label export:
>> > <pre>
>> > Reference,Label
>> > c3bdad6e7dcd7997e16a5b7b7cf4d8f6079820ff2eedd5fcbb2ad088f767b37b,Transaction
>> > 1A69TXnEM2ms9fMaY9UuiJ7415X7xZaUSg,Address
>> > c3bdad6e7dcd7997e16a5b7b7cf4d8f6079820ff2eedd5fcbb2ad088f767b37b<0,Input
>> > c3bdad6e7dcd7997e16a5b7b7cf4d8f6079820ff2eedd5fcbb2ad088f767b37b>0,Output
>> > c3bdad6e7dcd7997e16a5b7b7cf4d8f6079820ff2eedd5fcbb2ad088f767b37b:0,Output (alternative)
>> > </pre>
>> >
>> > ==Reference Implementation==
>> >
>> > TBD
>>
>> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
Author Public Key
npub1jqxs4ftunmm8qjyw9s80hpcayewkjfhpxund29l9qvzy7xqx4duq85jqeg