01 Hashing

A hash function H can be abstracted as taking in a variable length input, and returning a fixed length output:

H : {0, 1}^{*} \to {0, 1}^{n}

Hash functions are designed to be one-way (irreversible) function.

Example

$H (x) = x + 1$ → the problem is if we know $H (x)$ we can calculate $x$ $H (x) = x mod 2$ → the function maps input to outputs of fixed size, so its a valid hash function.

Question

What makes a hash function good ?

A hash function $H$ is “good” if :

It takes a lot of time to find collisions - The adversary shouldn’t be able to find a collision- this will depend on the computational power that the adversary has access to.
It’s “hard” to find a pre-image of a random digest - It should not be feasible given current technology and a realistic timeframe to “undo” the hash function.

The pigeonhole principle

The pigeonhole principle states that there must be collisions if our domain is larger that our range. More formally, if $n$ items are put into $m$ containers, with $n > m$ , then at least one container must contains more than one item.

Types of attacks

Collision Attack - If the adversary could find distinct messages $m$ and $m^{'}$ such that $H (m) = H (m^{'})$ .
Preimage Attack - Given a random digest $y$ , if the adversary could find a message $m$ such that $H (m) = y$ .
Prefix Collision Attack - A collision attack where the attacker can pick a prefix for the message. In a prefix collision attack, if the attacker knows the hash $H (m)$ of some message $m$ , and the length of $m$ , they can compute the hash of a new message $m^{'} = m ∥ p$ where $p$ is some additional data, without knowing the original message $m$ .

Note: Prefix collision arracks are just a subset of collision attack.

Message Authentication Codes

If messages are modified in transit, Message Authentication Codes (MACs) can be used in symmetric encryption schemes to verify that the message was indeed written by someone with access to the private key.

This acts like a check bit in networking, sand so recipients can reject messages that don’t compute to a valid MAC tag.

M A C (k, m)

The MAC function takes in the key $k$ and message $m$ . Without an adversary knowing the key $k$ , they cannot generate a valid tag. This assumes both the sender and recipient have access to symmetric key $k$ .

Measuring performance

We can measure the performance of a computer using teraflops.

Teraflops (TFLOPS) measure computational power. More specifically, how many calculations (floating-point operations) a system can perform in one second.

TFLOPS can be a rough proxy for a system’s raw compute capability.

Ayush Acharjya's Notes

Explorer

01 Hashing

The pigeonhole principle

Types of attacks

Message Authentication Codes

Measuring performance

Graph View

Table of Contents

Backlinks