**Abstract:**

Modern cryptography relies on the difficulty of certain problems arising in pure mathematics. I will describe how group theory and modular arithmetic give an easy to describe "one-way function" that can be used in a variety of ways, including for spam prevention and for creating a digital currency requiring no central authority. No background in group theory or cryptography will be assumed.

**Outline:**

- group theory
- one-way functions
- proof of work
- spam prevention
- digital cash

Groups are mathematical structures that are used to encode symmetries as well as properties of arithmetic. Here is the formal definition.

**Definition**: A **group** is a set $G$ with a binary operation $m : G \times G \to G$ satisfying the conditions below, where we write $a b$ for $m(a,b)$:

- There is an
**identity element**$e$ in $G$. That is, $e a = a = a e$ for every $a$ in $G$. - The operation is
**associative**. That is, $a(b c) = (a b)c$ for every $a$, $b$ and $c$ in $G$. - For every element $g$ in $G$, there is an element $h$ in $G$ such that $g h = e = h g$. We call $h$ an
**inverse**for $g$ and write $h = g^{-1}$.

When $G$ is a group, the identity $e$ and the inverses are all unique.

- The collection $\Z$ of integers under addition.
- The collection $\R$ of real numbers under addition.
- The collection $\R^\times$ of
*non-zero*real numbers under multiplication. - The collection of rotations of the plane about the origin, under composition.

- The collection $\N$ of natural numbers under addition.
- The collection $\R^\times$ of non-zero real numbers under addition.
- The collection $\Z$ of integers under subtraction: $(4-2)-1 \neq 4-(2-1)$.

For cryptography, we're going to use some other groups.

Choose a prime number $p$, like $p = 7$. Write $\Zp$ for the set $\{0, 1, \ldots, p-1\}$.

This is a group under *addition modulo $p$*, where you
take the remainder modulo $p$ after adding. E.g.

In [154]:

```
(4+5)%7 # 9 = 1*7 with 2 remainder
```

Out[154]:

$0$ is the identity element:

In [155]:

```
(4+0)%7
```

Out[155]:

Associativity holds because you can delay taking the remainder until the end:

In [156]:

```
print ((4+5)%7+6)%7, (4+5+6)%7, (4+(5+6)%7)%7
```

The negative (additive inverse) of 4 is 3:

In [157]:

```
(4+3)%7
```

Out[157]:

In general, the negative of $k$ is $p-k$, since $k + (p-k) = p \equiv 0$ modulo $p$.

The group $\Zp$ is called the **cyclic group of order $p$**.
It is called "cyclic" because if you start with $1$ and keep adding it to itself, you go through all the elements of the group and eventually end up where you started.

The group we *really* need is $\Zpx = \{1, 2, \ldots, p-1\}$ under multiplication modulo $p$. E.g.

In [158]:

```
(3*4)%7
```

Out[158]:

$1$ is the identity:

In [159]:

```
print (1*3)%7, (3*1)%7
```

Associativity is like with addition. But inverses are less clear!

In [160]:

```
print (1*3)%7, (2*3)%7, (3*3)%7, (4*3)%7, (5*3)%7, (6*3)%7
```

So, in $\Zpx$, the (multiplicative) inverse of $3$ is $5$! That is, $3^{-1} = 5$.

But why does *every* number have an inverse? This follows from the Euclidean algorithm, using that $p$ is prime.

For us, the important tool in $\Zpx$ is going to be **exponentiation**:

In [161]:

```
for i in range(7): print '3 to the', i, 'is', (3**i)%7
```

It loops around, giving every element of $\Z/7^\times$!

In other words, $\Z/7^\times$ is *also* a cyclic group,
and is **isomorphic** to the cyclic group $\Z/6$ using
the correspondence sending $i$ in $\Z/6$ to $3^i$ in $\Z/7^\times$.

This happens for any prime $p$: you can always find an
element $g$ such that
\begin{align}
\Z/(p-1) &\cong \Zpx \
i &\mapsto g^i \
\end{align}
is an isomorphism. Such a $g$ is called a **generator**.

Switching gears to cryptography, a **one-way function** is a function $f : A \to B$ such that:

- for each $a$ in $A$, $f(a)$ is easy to compute;
- given $b$ in the image of $f$, it is difficult to find an $a$ in $A$ such that $f(a) = b$.

More precisely, $f(a)$ should be computable in *polynomial time* in the length of $a$, while any polynomial time algorithm should have negligible probability of finding preimages.

It is an **open problem** whether any one-way function exists, and if true, implies $P \neq NP$, one of the biggest open problems in theoretical computer science.

But it is widely believed to be true, and in fact, it is believed that the **exponentiation** map sending $i$ to $g^i$ in a cyclic group is one-way for large primes $p$.

Let's explore this. First, is **exponentiation** quick to compute?

In [ ]:

```
def exp1(g,i,p):
result = 1.
for j in range(i):
result = (result * g)%p
return result
```

In [ ]:

```
p1 = 191; p2 = 1234577; p3 = 123456791; p4 = 12345678923
for p in [p1, p2, p3]:
print '%12d: ' % p,
time2('exp1(3, p-2, p)')
```

That method takes $p-2$ steps, which is **exponential** in the number of bits!

The trick is to use **Horner's** method.
For example,
$
3^{45} = 3^{32+8+4+1} = (3^{32}) (3^8) (3^4) (3^1),
$

and we compute the powers by repeated squaring:
$3^{32} = ((((3^2)^2)^2)^2)^2$ only requires five multiplications!

In [ ]:

```
def exp2(g,i,p):
result = 1
power = g
while i != 0:
if i%2 == 1:
result = (result * power)%p
power = (power * power)%p
i = i/2
return result
```

In [ ]:

```
for p in [p1, p2, p3, p4]: # even p4 is included!
print '%12d: ' % p,
time2('exp2(3, p-2, p)')
```

The inverse to this **exponentiation** map is called the
**discrete logarithm**: given $a$ in $\Zpx$, find $i$
such that $g^i = a$. We'll write $i = \log_g a$.

To compute the discrete logarithm, we can naively search for the answer:

In [ ]:

```
def dlog(g,a,p):
result = 1
i = 0
while result != a:
result = (result * g)%p
i = i + 1
return i
```

In [ ]:

```
time2('dlog(3,81,11113)')
```

In [ ]:

```
a = exp2(3, p3-2, p3)
time2('dlog(3, a, p3)')
```

A one-way function can be used for **proof of work**.

An example of this is the **Hashcash** anti-spam tool.

Each message I send has a header that looks like this:

`X-Hashcash: 1:24:161021:npagliar@uwo.ca::kaw+sLa1OykzF+pL:01LFRH`

This header proves that I used some cpu time before sending
the message to generate this **stamp**. It only took a second
or two, but spammers can't afford to spend one second per message,
so a spam filter is more likely to let my message through.

Here's how we could do this with the **discrete logarithm**.

We decide on a large prime $p$ and a element $g$ in $\Zpx$ so that the discrete log problem takes about a second to solve on average.

Then, we combine the current date and the recipient's address into a string of bits that we interpret as a number $a$ (modulo $p$).

Then we compute $i = \log_g a$ in $\Zpx$, and include $i$ in the header. The spam filter just has to check that $g^i = a$, which is fast.

Proof of work can also be used to create a digital currency like **Bitcoin**.

One way to create a digital currency is to create a **ledger** that records all of the transactions.

This is easy if you have a trusted central authority.
E.g., I could create **Dancoin** right now (if time).

But how to do it in a **distributed** way, with untrusted participants?

We make it so that adding new transactions to the ledger requires such a large amount of work that it takes our combined computing power about 10 minutes.

Then it's unlikely that one person can influence the ledger.

This is how **Bitcoin** and most other digital currencies work.

The method solves the **Byzantine generals** problem, the problem of coordinating information among distributed, untrusted agents with unreliable communication links.

Even better than a one-way function is a **cryptographic hash function**, which must in addition have the property that the output $f(a)$ appears to be essentially random.

It should not be possible to deduce anything about $a$ from $f(a)$, and changing $a$ slightly should change $f(a)$ in an unpredictable way.

This is not true for exponentiation, since increasing $i$ by $1$ multiplies $g^i$ by $g$.

An example of such a function is **SHA**:

In [ ]:

```
print sha('This is a test')
print sha('This is a test!')
print sha(42)
print sha(43)
```

This kind of **cryptographic hash function** is used for:

- Hashcash
- Digital currency
- Electronic signatures
- Verification of file integrity (e.g. downloads)
- Password storage and checking
- Proof of knowledge (e.g. publish the SHA of a document, without revealing the contents until later)

In [ ]:

```
from time import time, sleep
def time2(code):
st = time()
ret = eval(code)
print duration(st)
return ret
def duration(st):
dur = time() - st
if dur >= 1:
return '%.3gs' % (dur,)
dur *= 1000
if dur >= 1:
return '%.3gms' % (dur,)
dur *= 1000
if dur >= 1:
return '%.3gµs' % (dur,)
return '0'
```

In [ ]:

```
time2('sleep(0.15)')
```

In [ ]:

```
from hashlib import sha512, sha256
def sha(s):
return sha256(str(s)).hexdigest()[:50]
```

In [ ]:

```
sha('Testing'), sha(17)
```

In [ ]:

```
```