LocalChange Results, Alex Rejoins
LocalChange improves several benchmarks 35%
In our last update we described a performance optimization that we suspected would improve performance by shifting responsibility for tracking changes from the backend of the library to the frontend. We’re pleased to report that the localChange branch has been merged and is showing good results on the benchmarking front.
Benchmark | automerge1 | automerge1 | automergeWASM | automergeWASM |
---|---|---|---|---|
Version | 09-08-2020 | 01-07-2021 | 09-08-2020 | 01-07-2021 |
[B1.2] Insert string of length N (time) | 441 ms | 288 ms | 35 ms | 22 ms |
[B1.4] Insert N characters at random positions (time) | 947 ms | 164 ms | 224 ms | 164 ms |
[B3.1] 20√N clients concurrently set number in Map (time) | 516 ms | 492 ms | 8 ms | 8 ms |
These show significant improvements on tests B1.2 and B1.4, as both of these tests hammer the applyLocalChange path. No change was expected on B3.1, as it relies only on applyChanges and does no local change transform.
Also, the hard work by Martin Kleppmann can be seen in the greatly improved numbers in automerge1 over this timeframe.
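As a quick check on the headline figure, the relative improvement can be worked out from the automergeWASM columns of the table above. This is only a minimal arithmetic sketch; the helper name `percent_improvement` is ours and is not part of any Automerge API.

```python
def percent_improvement(before_ms: float, after_ms: float) -> float:
    """Return the reduction in run time as a percentage of the earlier time."""
    return (before_ms - after_ms) / before_ms * 100.0


# (09-08-2020, 01-07-2021) timings in ms, copied from the automergeWASM columns.
wasm_timings = {
    "B1.2": (35, 22),
    "B1.4": (224, 164),
    "B3.1": (8, 8),
}

for name, (before, after) in wasm_timings.items():
    print(f"{name}: {percent_improvement(before, after):.1f}% less time")
```

On these numbers B1.2 comes out a little above 35%, B1.4 somewhat below it, and B3.1 unchanged, which matches the expectation that only the applyLocalChange path was affected.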
Project initiator Alex Good returns full-time
Alex Good wrote the first version of Automerge-RS when we teamed up with him last spring.
We’re pleased to say that Alex is back with us on the team, working full time on the push to get automerge-rs to a 1.0 release. He has already cleaned up the tests, reorganized the code across the automerge-protocol package, and merged a huge rewrite of the automerge-frontend crate.
Blockers for a 1.0 release
Currently there are two projects that stand as hard blockers for a release of automerge-wasm simultaneous with automerge 1.0.
First, the whole-document compression mode is not yet implemented in rust. This method takes a stack of changes, runs the column compression algorithm on the changes themselves, and then merges the ops across all changes and compresses them as well. All the tests currently pass without this feature, but if a document were saved by the js version today, it could not be loaded by the wasm version.
Second, automerge-wasm needs to be packaged properly. Currently there is no official packaging, and users need to compile the wasm manually to make use of it.
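To give a feel for the second half of the whole-document compression step described above (merging the ops across all changes and compressing them column by column), here is a toy sketch. It is not Automerge's actual binary format: the dictionary layout, the field names `action` and `key`, and the use of plain run-length encoding are all assumptions made for illustration.

```python
from itertools import groupby


def merge_ops(changes):
    """Flatten the ops of every change into one list, in change order."""
    return [op for change in changes for op in change["ops"]]


def to_columns(ops, fields):
    """Turn a list of row-like ops into one list of values per field."""
    return {field: [op[field] for op in ops] for field in fields}


def run_length_encode(values):
    """Collapse runs of equal values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


# Hypothetical changes; real Automerge changes carry many more fields.
changes = [
    {"actor": "a1", "ops": [{"action": "set", "key": "x"}, {"action": "set", "key": "y"}]},
    {"actor": "a2", "ops": [{"action": "set", "key": "y"}, {"action": "del", "key": "y"}]},
]

columns = to_columns(merge_ops(changes), ["action", "key"])
encoded = {name: run_length_encode(column) for name, column in columns.items()}
print(encoded)
```

Storing each field as its own column tends to produce long runs of repeated values, which is why a column-oriented layout usually compresses better than encoding the ops one at a time.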
Next steps in performance work
One issue that needs to be addressed is that the wasm-pack binary throws out function names when doing a profile build. This is a known issue, but the workarounds have not worked for us yet. Without function names we have to base our profiling on debug builds, which will sometimes cause us to focus on tuning code that the compiler already knows how to optimize but does not in a debug build.
Even with this lack of visibility in profiling, it is starting to look like the js/wasm call interface is becoming the problem. Currently JS objects like the UncompressedChange and the Patch object get converted into a JSON string and then parsed, as the intermediary for both the arguments and the return value of each function call. Interestingly, binary changes do not do this as they are … well … just binary. The only cost is copying them directly between js memory and wasm memory. This is part of the reason why test B3.1 is so very fast. Alternate ways to pass data back and forth between the layers are being looked into.
In furtherance of this, Alex is currently duplicating the benchmark suite in pure rust so we can get a better idea of how much of the problem is the js/wasm interface.
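The difference between the two calling conventions can be sketched as follows. Python stands in for the js/wasm boundary here purely for illustration; the function names and the shape of the sample change are hypothetical and do not reflect the real automerge-wasm bindings.

```python
import json


def call_via_json(func, obj):
    """Simulate a boundary that only accepts strings: encode, call, decode."""
    encoded_arg = json.dumps(obj)            # caller serializes the argument
    result = func(json.loads(encoded_arg))   # callee parses it back into an object
    encoded_result = json.dumps(result)      # callee serializes the return value
    return json.loads(encoded_result)        # caller parses the return value


def call_via_bytes(func, payload: bytes) -> bytes:
    """Simulate a boundary that copies raw bytes in and out."""
    copied_in = bytes(bytearray(payload))    # one copy into the callee's memory
    result = func(copied_in)
    return bytes(bytearray(result))          # one copy back out


# Hypothetical stand-in for an uncompressed change object.
change = {"actor": "a1", "seq": 1, "ops": [{"action": "set", "key": "x", "value": 1}]}

print(call_via_json(lambda c: c, change))
print(call_via_bytes(lambda b: b, b"\x85\x6f\x4a\x83"))
```

Every structured call pays for two serializations and two parses, while the binary path pays only for two memory copies, which is consistent with the binary-only test B3.1 being so fast.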
Most of the tests, when run with 10x the data, take about 10x the time. Three tests, however, show much worse numbers than they should at larger sizes. These need to be looked into and remedied, and will hopefully be easy wins for performance.
Benchmark | automerge1 | automerge1 | automergeWASM | automergeWASM |
---|---|---|---|---|
Input size | N=1000 | N=10000 | N=1000 | N=10000 |
[B1.3] Prepend N characters (time) | 2080 ms | 270069 ms | 175 ms | 3624 ms |
[B1.5] Insert N words at random positions (time) | 1619 ms | 46180 ms | 404 ms | 26491 ms |
[B1.10] Prepend N numbers (time) | 2136 ms | 273761 ms | 209 ms | 4664 ms |
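One way to quantify how far these three tests are from linear scaling is to estimate the exponent k in time ~ N^k from the two measurements in the table above. This is a rough sketch: two data points only give a crude estimate, and the helper name `scaling_exponent` is ours.

```python
import math


def scaling_exponent(time_small_ms, time_large_ms, size_ratio=10):
    """Estimate k in time ~ N**k from two timings taken size_ratio apart."""
    return math.log(time_large_ms / time_small_ms) / math.log(size_ratio)


# (N=1000, N=10000) timings in ms, copied from the table above.
timings = {
    "B1.3 automerge1": (2080, 270069),
    "B1.3 automergeWASM": (175, 3624),
    "B1.5 automerge1": (1619, 46180),
    "B1.5 automergeWASM": (404, 26491),
    "B1.10 automerge1": (2136, 273761),
    "B1.10 automergeWASM": (209, 4664),
}

for name, (small, large) in timings.items():
    print(f"{name}: {large / small:.1f}x the time, exponent ~{scaling_exponent(small, large):.2f}")
```

An exponent of 1 means linear scaling. On these figures the automerge1 prepend tests come out a little above 2, which suggests roughly quadratic behaviour, and B1.5 is the worst case for automergeWASM at roughly 1.8.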