A hash function H can be abstracted as taking in a variable length input, and returning a fixed length output:
Hash functions are designed to be one-way (irreversible) function.
Example
→ the problem is if we know we can calculate → the function maps input to outputs of fixed size, so its a valid hash function.
Question
What makes a hash function good ?
A hash function is “good” if :
- It takes a lot of time to find collisions - The adversary shouldn’t be able to find a collision- this will depend on the computational power that the adversary has access to.
- It’s “hard” to find a pre-image of a random digest - It should not be feasible given current technology and a realistic timeframe to “undo” the hash function.
The pigeonhole principle
The pigeonhole principle states that there must be collisions if our domain is larger that our range. More formally, if items are put into containers, with , then at least one container must contains more than one item.
Types of attacks
- Collision Attack - If the adversary could find distinct messages and such that .
- Preimage Attack - Given a random digest , if the adversary could find a message such that .
- Prefix Collision Attack - A collision attack where the attacker can pick a prefix for the message. In a prefix collision attack, if the attacker knows the hash of some message , and the length of , they can compute the hash of a new message where is some additional data, without knowing the original message .
Note: Prefix collision arracks are just a subset of collision attack.
Message Authentication Codes
If messages are modified in transit, Message Authentication Codes (MACs) can be used in symmetric encryption schemes to verify that the message was indeed written by someone with access to the private key.
This acts like a check bit in networking, sand so recipients can reject messages that don’t compute to a valid MAC tag.
The MAC function takes in the key and message . Without an adversary knowing the key , they cannot generate a valid tag. This assumes both the sender and recipient have access to symmetric key .
Measuring performance
We can measure the performance of a computer using teraflops.
Teraflops (TFLOPS) measure computational power. More specifically, how many calculations (floating-point operations) a system can perform in one second.
TFLOPS can be a rough proxy for a system’s raw compute capability.