The Concept
A hash function takes an input of arbitrary length and produces a
fixed-length output (the hash, digest, or fingerprint). Two properties
make hash functions useful in engineering:
- Determinism: the same input always produces the
same output. Hashing identical content twice, at any time, yields the
same value.
- Uniform distribution: outputs are spread evenly
across the output space, so no single output bucket is
disproportionately likely. This prevents hotspotting when hashes are
used to assign items to buckets.
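
The two properties above can be observed directly. The sketch below is
illustrative only: the helper names bucket_for and bucket_counts, the
"user-N" keys, and the choice of SHA-256 from Python's hashlib as the
hash function are all assumptions made for this example, not something
the text prescribes.

```python
import hashlib
from collections import Counter


def bucket_for(key, num_buckets):
    """Map a string key to a bucket index in [0, num_buckets).

    Hypothetical helper: SHA-256 is an arbitrary choice of hash here.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer,
    # then reduce it to a bucket index.
    return int.from_bytes(digest[:8], "big") % num_buckets


def bucket_counts(keys, num_buckets):
    """Count how many keys land in each bucket."""
    counts = Counter(bucket_for(k, num_buckets) for k in keys)
    return [counts[b] for b in range(num_buckets)]


# Determinism: hashing the same key twice gives the same bucket.
same = bucket_for("user-42", 8) == bucket_for("user-42", 8)
print("deterministic:", same)

# Uniform distribution: 10,000 keys over 8 buckets should put
# roughly 1,250 keys in each bucket.
keys = [f"user-{i}" for i in range(10_000)]
print("bucket sizes:", bucket_counts(keys, 8))
```

Note that Python's built-in hash() is salted per process for strings by
default, so it is not a good fit when the same key must map to the same
bucket across runs or machines.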
Cryptographic hash functions add a third property:
- Collision resistance: it is computationally
infeasible to find two different inputs that produce the same output.
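This enables trust: if two files have the same SHA-256 hash, you can be
confident (but not certain) they have identical content. Collisions must
exist, because possible inputs outnumber possible outputs, but no
SHA-256 collision has been found.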
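
As a minimal sketch of the file comparison described above, the code
below hashes two files with SHA-256 and compares the digests. The
function names sha256_of_file and probably_same_content and the 64 KiB
chunk size are assumptions chosen for illustration. Only hashlib calls
from the Python standard library are used.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks.

    Reading in chunks keeps memory use flat for large files.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def probably_same_content(path_a, path_b):
    """True if both files have the same SHA-256 digest.

    Equal digests make identical content overwhelmingly likely,
    not mathematically certain.
    """
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

The same digest function can also be used to check a download against
a published checksum, by comparing its return value with the expected
hex string.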
The engineering applications of hash functions vary, but each one
exploits one or more of these three properties.