A bit of Math


Original: 02/01/20
Revised: no

Here are some elementary functions that appear in a few places on our website, for example when we talk about the computational complexity of algorithms; the variables \(x, x_1, x_2, \dots, x_n\) and the constants \(a, b, c\) are all assumed to take real values, ...

$$ \begin{align} f(x) &= ax + b && \text{(linear)} \\ f(x) &= ax^2 + bx + c && \text{(quadratic)} \\ f(x) &= e^x && \text{(exponential)} \\ f(x) &= \log x && \text{(logarithmic)} \end{align} $$
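To get a feel for how differently these functions grow, which is what matters when they describe the running time of an algorithm, here is a small Python sketch; the coefficients and sample inputs are arbitrary choices for illustration, not anything taken from the formulas above.

```python
import math

# Arbitrary coefficients, chosen only for illustration.
a, b, c = 2.0, 3.0, 1.0

def linear(x):      return a * x + b             # f(x) = ax + b
def quadratic(x):   return a * x**2 + b * x + c  # f(x) = ax^2 + bx + c
def exponential(x): return math.exp(x)           # f(x) = e^x
def logarithmic(x): return math.log(x)           # f(x) = log x

# Evaluate each function at a few input sizes to compare their growth.
for x in (1, 10, 100):
    print(f"x={x:>3}  linear={linear(x):>8.1f}  quadratic={quadratic(x):>10.1f}  "
          f"exp={exponential(x):>12.4g}  log={logarithmic(x):>6.3f}")
```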

... whereas the following functions appear mostly as activation functions in the layers of a neural network; notice that the logistic function and the softmax function squash their inputs onto the interval \((0,1)\), so their outputs can be interpreted as probabilities:

$$ \begin{align} f(x) &= \max(0, x) && \text{(ReLU)} \\ f(x) &= \frac{e^x}{e^x + 1} && \text{(logistic)} \\ f(x) &= \frac{e^{2x} - 1}{e^{2x} + 1} && \text{(hyperbolic tangent)} \\ f_i(x_1, \dots, x_n) &= \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}} && \text{(softmax)} \end{align} $$
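Here is a minimal Python sketch of these activation functions using NumPy; the max-subtraction inside the softmax is a common numerical-stability trick added here for illustration, not something implied by the formula itself, and it does not change the result.

```python
import numpy as np

def relu(x):
    # max(0, x), applied elementwise
    return np.maximum(0.0, x)

def logistic(x):
    # e^x / (e^x + 1), equivalently 1 / (1 + e^-x)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # (e^(2x) - 1) / (e^(2x) + 1)
    return np.tanh(x)

def softmax(x):
    # e^(x_i) / sum_j e^(x_j); subtracting the max avoids overflow
    z = np.exp(x - np.max(x))
    return z / np.sum(z)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))
print(logistic(x))
print(tanh(x))
print(softmax(x), softmax(x).sum())  # softmax outputs lie in (0,1) and sum to 1
```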

The following two videos are helpful for understanding the gradient descent method used to train a neural network.
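As a complement to the videos, here is a minimal sketch of plain gradient descent in Python; the toy quadratic loss, the starting point, and the learning rate are illustrative choices, not taken from the videos.

```python
import numpy as np

TARGET = np.array([1.0, -2.0])   # the loss below has its minimum here

def loss(w):
    # A toy quadratic loss, illustrative only.
    return np.sum((w - TARGET) ** 2)

def gradient(w):
    # Gradient of the loss above, computed by hand: 2 * (w - TARGET).
    return 2.0 * (w - TARGET)

w = np.zeros(2)          # starting point
learning_rate = 0.1      # step size

for step in range(50):
    w = w - learning_rate * gradient(w)   # move against the gradient

print(w, loss(w))        # w approaches (1, -2), the loss approaches 0
```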