Add 'Dictionary'
parent
d725c7de32
commit
b49d924058
1 changed files with 9 additions and 0 deletions
9
Dictionary.md
Normal file
@ -0,0 +1,9 @@
Fatshark makes heavy use of hashing to turn arbitrary strings into fixed-size values. Specifically, they use Murmur in its 32-bit and 64-bit variants, which is very fast to compute while still distributing hashes fairly uniformly (i.e. with a low likelihood of collisions).
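To make the hashing step concrete, here is a minimal sketch of the public-domain MurmurHash64A algorithm in Python. Which exact Murmur variant and seed the engine uses is not stated here, so both the choice of MurmurHash64A and the default seed of `0` are assumptions made for illustration; the 32-bit variant is not shown.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
M = 0xC6A4A7935BD1E995  # multiplication constant from the reference MurmurHash64A
R = 47                  # shift constant from the reference MurmurHash64A


def murmur64a(data: bytes, seed: int = 0) -> int:
    """MurmurHash64A over `data`, returned as an unsigned 64-bit integer.

    Assumption: the seed defaults to 0; the engine's real seed is not given here.
    """
    length = len(data)
    h = (seed ^ (length * M)) & MASK64

    # Mix the input eight bytes at a time, read as little-endian words.
    full = length - (length % 8)
    for offset in range(0, full, 8):
        k = int.from_bytes(data[offset:offset + 8], "little")
        k = (k * M) & MASK64
        k ^= k >> R
        k = (k * M) & MASK64
        h ^= k
        h = (h * M) & MASK64

    # Fold in the remaining 1..7 bytes, if any.
    tail = data[full:]
    if tail:
        h ^= int.from_bytes(tail, "little")
        h = (h * M) & MASK64

    # Final avalanche so that every input bit affects the whole result.
    h ^= h >> R
    h = (h * M) & MASK64
    h ^= h >> R
    return h


# Hypothetical input string, printed as 16 hex digits.
print(f"{murmur64a(b'packages/example'):016x}")
```

Whatever the input length, the result always fits in 64 bits, which is what makes the hashes usable as fixed-size identifiers.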
However, to do things like loading packages or finding a specific file, we need to know the original string values. The Dictionary (`dictionary.csv` by default) is a collection of known (and guessed) strings together with their hash values, so that whenever a hash value is found in the game files, it can be looked up and mapped back to its string.
But the dictionary is far from complete, and likely never will be. Therefore, DTMT falls back to emitting the hash value itself whenever it encounters a Murmur hash that cannot be found in the dictionary. Because of this, decompiling and re-compiling a file does not always reproduce the original: hashing the stringified hash value yields a different hash than hashing the original string.
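The round-trip problem can be illustrated with a small sketch. To keep it short, it uses BLAKE2b as a stand-in 64-bit hash instead of Murmur, and it assumes the fallback writes the hash as 16 hex digits; the names `stand_in_hash`, `decompile_name`, and the sample strings are hypothetical and not part of DTMT.

```python
import hashlib


def stand_in_hash(text: str) -> int:
    """Stand-in 64-bit hash (BLAKE2b, NOT Murmur), used only for illustration."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def decompile_name(hash_value: int, dictionary: dict) -> str:
    """Return the known string, or the hash as hex text when it is unknown.

    Assumption: the fallback is rendered as 16 lowercase hex digits.
    """
    return dictionary.get(hash_value, f"{hash_value:016x}")


known = "packages/example"          # hypothetical string that is in the dictionary
unknown = "packages/not_in_dict"    # hypothetical string that is missing from it
dictionary = {stand_in_hash(known): known}

# Known string: decompiling yields the original text, so re-hashing matches.
known_roundtrip = stand_in_hash(decompile_name(stand_in_hash(known), dictionary))

# Unknown string: decompiling yields hex text, and hashing that text
# gives a different value than the hash found in the game files.
unknown_roundtrip = stand_in_hash(decompile_name(stand_in_hash(unknown), dictionary))

print(known_roundtrip == stand_in_hash(known))
print(unknown_roundtrip == stand_in_hash(unknown))
```

The first comparison holds and the second does not, which is why a file containing unknown hashes may not survive a decompile and re-compile cycle unchanged.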
CSV format for the dictionary: `string,murmur64,murmur32,group`.
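As a rough sketch, a file in this column order could be read as follows. It assumes that both hash columns are written as hexadecimal text and that rows which do not parse (such as a header row) can be skipped; neither detail is specified above. The `Entry` type, the `parse_dictionary` helper, and the sample row with its values and group name are made up for illustration.

```python
import csv
import io
from typing import List, NamedTuple


class Entry(NamedTuple):
    string: str
    murmur64: int
    murmur32: int
    group: str


def parse_dictionary(text: str) -> List[Entry]:
    """Parse `string,murmur64,murmur32,group` rows, skipping unparsable ones."""
    entries = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 4:
            continue  # blank or malformed row
        string, long_hex, short_hex, group = row
        try:
            # Assumption: hashes are stored as hexadecimal text.
            entries.append(Entry(string, int(long_hex, 16), int(short_hex, 16), group))
        except ValueError:
            continue  # e.g. a header row whose hash columns are not hex
    return entries


# Made-up sample data; the hash values and the group name are placeholders.
sample = (
    "string,murmur64,murmur32,group\n"
    "packages/example,00000000000000ab,000000cd,example_group\n"
)
for entry in parse_dictionary(sample):
    print(entry)
```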
The `group` is one of several values (see `dtmt dictionary add --help`) that segment hashes by their usage in the engine. While Murmur's uniformity is decent, Fatshark's compilation pipeline only ensures that hashes are unique within their respective usage groups. For example, the hashes for file names may overlap with the ones for localization IDs in `.strings` files. Therefore, DTMT has to respect the same groups to avoid false-positive matches during decompilation.
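One way to picture why the group matters is a lookup table keyed by the pair of group and hash rather than by the hash alone. The sketch below is not DTMT's implementation; the group names, strings, and the colliding hash value are invented for the example.

```python
from typing import Dict, Optional, Tuple

# Key lookups by (group, hash) so equal hashes in different groups stay apart.
Lookup = Dict[Tuple[str, int], str]


def resolve(lookup: Lookup, group: str, hash_value: int) -> Optional[str]:
    """Return the string for this hash within the given group, if known."""
    return lookup.get((group, hash_value))


# Invented data: the same hash value occurs in two different usage groups.
COLLIDING_HASH = 0x1234
lookup: Lookup = {
    ("example_file_group", COLLIDING_HASH): "packages/example",
    ("example_strings_group", COLLIDING_HASH): "loc_example_title",
}

print(resolve(lookup, "example_file_group", COLLIDING_HASH))
print(resolve(lookup, "example_strings_group", COLLIDING_HASH))
print(resolve(lookup, "example_other_group", COLLIDING_HASH))
```

With a table keyed by the hash alone, the second entry would overwrite the first, and a file name could be decompiled as a localization ID or the other way around.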