Fabrice Drouin [ARCHIVE] on Nostr
📅 Original date posted: 2016-09-20
📝 Original message:
Edit: our link to the Flare white paper should be
[1] http://bitfury.com/content/5-white-papers-research/whitepaper_flare_an_approach_to_routing_in_lightning_network_7_7_2016.pdf
Fabrice
On 18 September 2016 at 14:58, Pierre <pm+lists at acinq.fr> wrote:
> TLDR: It kind of works (there is an 80+% chance of finding a route to any
> other node in ~500ms).
>
> Hello guys,
>
> Flare[1] is a routing protocol for the Lightning Network which combines
> local neighborhood maps and paths to remote beacons. It is really
> interesting and the simulations are promising, so we wanted to give it a try
> and actually implement it in Eclair. We ended up with a close variant [2],
> and tested it on 2500 nodes on AWS.
>
> The main difference between our implementation and the original paper is in
> the way nodes communicate with each other. In the paper, an assumption is
> made that "there exists a way for any two nodes in LN to communicate with
> each other, perhaps with some prior setup (such as a distributed hashtable
> overlay maintained among LN nodes that would store mapping of node
> identifiers to their current IP addresses)". In our implementation there is
> no DHT: inter-node communication only occurs on existing open channels, and
> all routing-related messages are onion-encrypted and routed themselves, just
> like htlc messages. So a node can't talk directly to a node it doesn't
> already have a route to. This certainly has a cost, but it should also
> improve privacy, and overall we think that using a DHT isn't necessary.
> Also, the assumption that a node is reachable from the outside is rather
> strong, particularly for mobile/light wallets. Whereas by communicating
> over channels, a node can actively participate in global routing as soon
> as it has more than two channels.
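>
> As a minimal sketch of this design choice (hypothetical types, not the
> actual Eclair code), a routing message is wrapped hop by hop and forwarded
> over existing channels, so a node only ever talks to its direct channel
> peers, exactly like an htlc:
>
> object RoutedGossipSketch {
>
>   type NodeId = String
>
>   sealed trait RoutingMessage
>   case class TableUpdate(origin: NodeId, payload: String) extends RoutingMessage
>
>   // One layer per hop; in reality each layer would be onion-encrypted to the
>   // corresponding hop's key, here we only model the structure.
>   case class Onion(route: List[NodeId], inner: RoutingMessage)
>
>   // A node either consumes the message (it is the destination) or peels one
>   // layer and forwards it to the next hop over the corresponding channel.
>   def handle(self: NodeId, onion: Onion, send: (NodeId, Onion) => Unit): Option[RoutingMessage] =
>     onion.route match {
>       case `self` :: Nil          => Some(onion.inner)
>       case `self` :: next :: rest => send(next, Onion(next :: rest, onion.inner)); None
>       case _                      => None // not addressed to us, drop
>     }
> }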
>
> The neighbor discovery/beacons selection is pretty similar to what's in the
> paper, but a little bit simplified:
> - nodes don't ack table update messages (this is probably not needed since
> updates occur between adjacent nodes and any disconnection will be noticed)
> - nodes don't tell other nodes that they have been chosen as beacon (there
> is no beacon_set message). A given node knows it is a candidate because it
> received a beacon_req message, but it won't know if it was eventually
> selected.
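>
> As an illustration of the selection step itself, here is a minimal sketch
> (hypothetical helpers, not the actual FlareRouter code), assuming an
> XOR-style distance between node addresses as in the paper, computed here
> on hashed ids purely for illustration:
>
> import java.security.MessageDigest
>
> object BeaconSelectionSketch {
>
>   type NodeId = String
>
>   // XOR-style distance on hashed node ids (assumption for illustration only).
>   private def hash(s: String): BigInt =
>     BigInt(1, MessageDigest.getInstance("SHA-256").digest(s.getBytes("UTF-8")))
>
>   def distance(a: NodeId, b: NodeId): BigInt = hash(a) ^ hash(b)
>
>   // Keep the beacon_count known nodes closest to ourselves. A candidate will
>   // have seen our beacon_req, but is never told whether it was finally kept
>   // (there is no beacon_set message).
>   def selectBeacons(self: NodeId, known: Set[NodeId], beaconCount: Int): List[NodeId] =
>     known.toList.sortBy(distance(self, _)).take(beaconCount)
> }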
>
> We only focused on static ranking (finding a route formed of open channels
> between two given nodes), so it is very possible (probable?) that a route is
> not actually usable because the channels are not balanced. Basically we made
> the assumption that the network graph is undirected, which is not true and
> is a pretty strong assumption.
>
> The route selection has also been simplified. Flare proposed a 3-step
> process:
> (a) sender uses its own routing table
> (b) sender requests the receiver's table and uses a combination of the two
> tables
> (c) sender iterates over nodes it knows that are close to the receiver, and
> requests their tables
> In our test we only did (b), which is a strong tradeoff. It means that the
> time allowed to find a route is short and predictable, but it is very
> suboptimal in terms of probability of finding a route.
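>
> Concretely, step (b) boils down to something like the following sketch
> (hypothetical types, not the actual FlareRouter code): merge the sender's
> table with the one attached to the payment request, treat channels as
> undirected edges per the static-ranking assumption above, and run an
> unweighted shortest-path search (with identical channel weights, Dijkstra
> degenerates into a breadth-first search):
>
> import scala.collection.immutable.Queue
>
> object RouteFinderSketch {
>
>   type NodeId = String
>   case class Channel(a: NodeId, b: NodeId)
>
>   def findRoute(senderTable: Set[Channel], receiverTable: Set[Channel],
>                 source: NodeId, target: NodeId): Option[List[NodeId]] = {
>     // Union of both tables, seen as an undirected adjacency map.
>     val adj: Map[NodeId, Set[NodeId]] =
>       (senderTable ++ receiverTable)
>         .flatMap(c => Seq(c.a -> c.b, c.b -> c.a))
>         .groupBy(_._1)
>         .map { case (n, edges) => n -> edges.map(_._2) }
>
>     // Unweighted shortest path: breadth-first search over the merged graph.
>     @annotation.tailrec
>     def bfs(queue: Queue[List[NodeId]], visited: Set[NodeId]): Option[List[NodeId]] =
>       queue.dequeueOption match {
>         case None => None
>         case Some((path, _)) if path.head == target => Some(path.reverse)
>         case Some((path, rest)) =>
>           val next = adj.getOrElse(path.head, Set.empty[NodeId]).diff(visited)
>           bfs(rest ++ next.map(_ :: path), visited ++ next)
>       }
>
>     bfs(Queue(List(source)), Set(source))
>   }
> }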
>
> Below are the results of a few tests we ran on AWS. The procedure was as
> follows:
> 1) start a certain number of m1.medium instances with a standard linux AMI
> 2) after startup the instances run an init script that downloads the oracle
> jdk and our jar, then runs the jar
> 3) from a control server, we link nodes to each other following a given
> graph using json-rpc commands
> 4) wait a few minutes for the gossip to calm down (it can be pretty intense
> with radius=3 [3] and our implementation is not optimized)
> 5) pick 1000 random routes (random sender, random receiver); for each route,
> using json-rpc, we (a) ask the receiver for a "payment request" (H +
> routing_table) and then (b) ask the sender to find a route for this payment
> request, without actually making a payment (see the sketch after this list)
> 6) collect stats
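>
> A minimal sketch of step 5, with a hypothetical rpc interface (the actual
> json-rpc commands differ), just to make the measurement explicit:
>
> object RouteStatsSketch {
>
>   // Assumed minimal rpc surface for the test harness (illustration only).
>   case class PaymentRequest(paymentHash: String, routingTable: String)
>   trait NodeRpc {
>     def createPaymentRequest(): PaymentRequest               // H + routing_table
>     def findRoute(req: PaymentRequest): Option[List[String]] // route as node ids
>   }
>
>   // Returns (probability of finding a route, average lookup time in ms).
>   def run(nodes: Vector[NodeRpc], samples: Int = 1000): (Double, Double) = {
>     val rng = new scala.util.Random()
>     val results = (1 to samples).map { _ =>
>       val sender   = nodes(rng.nextInt(nodes.size))
>       val receiver = nodes(rng.nextInt(nodes.size))
>       val request  = receiver.createPaymentRequest()
>       val start    = System.nanoTime()
>       val route    = sender.findRoute(request)
>       (route.isDefined, (System.nanoTime() - start) / 1e6)
>     }
>     (results.count(_._1).toDouble / samples, results.map(_._2).sum / samples)
>   }
> }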
>
> Caveats:
> - nodes are connected all at once (step 3 above), and don't periodically
> reconcile their own routing table with that of their neighbors
> (neighbor_reset in the paper). This leads to nodes not knowing all the nodes
> in their radius, and probably has a significant negative impact on results;
> we figured this out afterwards :-s
> - there is no "proof of channels" (we assume nodes are not lying to each
> other about existing channels)
> - onion messages are not actually encrypted (although channel communications
> are)
> - it is a "LAN" setup so things wouldn't be that smooth in reality (although
> we are testing static routes so it doesn't actually matter that much)
> - small-world graphs may be a little optimistic in terms of actual network
> topology
> - ...
>
> Parameters and results (probability of finding a route to any given node):
> test  nodes  r  beacon_count  graph        result  t_avg
> A     1000   2  10            sw1000_6_02  87%     -
> B     2000   2  10            sw2000_6_02  76%     305ms
> C     2500   2  10            sw2500_4_01  ~20%    -
> D     2500   3  10            sw2500_4_01  43%     377ms
> E     2500   3  10            sw2500_6_02  83%     532ms
> Note: a swX_Y_0Z graph is a small-world graph generated using the
> Watts-Strogatz model with n=X, k=Y and beta=0.Z
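>
> For reference, a sketch of how such a graph can be generated (not the actual
> test harness): start from a ring lattice where each node is linked to its k
> nearest neighbours, then rewire each edge with probability beta. With this
> naming, sw2500_6_02 corresponds to wattsStrogatz(2500, 6, 0.2).
>
> object SmallWorldSketch {
>
>   // Watts-Strogatz small-world graph, returned as a set of undirected edges.
>   def wattsStrogatz(n: Int, k: Int, beta: Double, seed: Long = 0L): Set[(Int, Int)] = {
>     val rng = new scala.util.Random(seed)
>     def edge(a: Int, b: Int): (Int, Int) = if (a < b) (a, b) else (b, a)
>
>     // Ring lattice: node i connected to its k/2 nearest neighbours on each side.
>     val ring = for {
>       i <- 0 until n
>       j <- 1 to k / 2
>     } yield edge(i, (i + j) % n)
>
>     // Rewire each edge with probability beta, avoiding self-loops and duplicates.
>     ring.foldLeft(Set.empty[(Int, Int)]) { case (edges, (a, b)) =>
>       if (rng.nextDouble() < beta) {
>         val candidates = (0 until n).filter(t => t != a && !edges.contains(edge(a, t)))
>         if (candidates.nonEmpty) edges + edge(a, candidates(rng.nextInt(candidates.size)))
>         else edges + edge(a, b)
>       } else edges + edge(a, b)
>     }
>   }
> }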
>
> Network size (known nodes):
> test  adjacent  all
> A     -         -
> B     6.8       62.3
> C     -         -
> D     4.0       58.8
> E     6.0       97.0
> Note: expected values for column 'all' should be D=4^3=64+ and E=6^3=216+,
> see caveat above. The good thing is that a radius=2 might actually be
> enough, just like the paper says!
>
> Beacons:
> test  dist_avg  dist_max  dist_var
> A     -         -         -
> B     7.4       39        29.4
> C     -         -         -
> D     9.2       96        79.5
> E     8.1       45        36.0
>
> Route lengths:
> test  len_avg  len_max  len_var
> A     5.4      26.0     3.3
> B     6.2      30.0     6.2
> C     -        -        -
> D     17.9     74.0     109.5
> E     7.3      28.0     5.6
>
> Response time (this includes the sender requesting the receiver's routing
> table and running a Dijkstra search):
> test  t_avg  t_max
> A     -      -
> B     305ms  5158ms
> C     -      -
> D     377ms  7809ms
> E     532ms  5128ms
> Note: high max times are probably related to the use of low-performance
> instances that sometimes behave erratically (unfortunately we don't have
> the variance).
>
> In conclusion, we show that our modified version of Flare actually works on
> a decent-sized network. Because of the bug mentioned in the caveats, the
> value of the radius parameter should be taken with caution. One of our main
> concerns is the fact that the graph is in fact directed, and that at
> creation channels are actually one-way. This might make finding routes very
> difficult at bootstrap.
>
>
> Cheers,
>
> Pierre
>
>
> [1]
> http://bitfury.com/content/5-white-papers-research/bitfury_whitepaper_shared_send_untangling_in_bitcoin_8_24_2016.pdf
> [2]
> https://github.com/ACINQ/eclair/blob/wip-flare/eclair-node/src/main/scala/fr/acinq/eclair/router/FlareRouter.scala
> [3] https://s3-eu-west-1.amazonaws.com/acinq/public/network_out.PNG
>
> _______________________________________________
> Lightning-dev mailing list
> Lightning-dev at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/lightning-dev
>
--
ACINQ SAS, Siège Social 10 rue de Penthièvre 75008 PARIS.
Capital 51 437 Euros, 804 203 792 RCS Paris – APE 6202A – SIRET 804
203 792 00010 – TVA FR 34 804203792.