Information theory is a mathematical theory that concerns the measurement and processing of information. It was established by Claude Shannon and includes the source coding theorem, the channel coding theorem, and the channel capacity theorem [1].


We assume $ p(m) $ is the probability of the $ m $th event, where the set of all events is denoted $ M $.

With the above assumption, the self-information is given by

$ I(m) = \log \left( \frac{1}{p(m)} \right) = - \log( p(m)) $

where $ p(m) $ is the probability of the $ m $th event and $ \sum_{m \in M} p(m) = 1 $.
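The definition above can be sketched in a few lines of Python; this is an illustrative example (not from the source), using base-2 logarithms so that self-information is measured in bits:

```python
import math

def self_information(p, base=2):
    """Self-information I(m) = -log(p(m)); in bits when base=2."""
    return -math.log(p, base)

# A fair coin flip: each outcome has probability 1/2,
# so observing either outcome conveys exactly 1 bit.
print(self_information(0.5))

# A rarer event (p = 1/8) is more "surprising": 3 bits.
print(self_information(1 / 8))
```

Note that rarer events carry more information: as $ p(m) \to 0 $, $ I(m) \to \infty $, and a certain event ($ p(m) = 1 $) carries no information at all.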

The entropy of the set of all events is given by

$ H(M) = E[ I(M)] = \sum_{m \in M} p(m) I(m) = - \sum_{m \in M} p(m) \log p(m) $

where $ E[\cdot] $ denotes the expectation operator.
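The entropy formula can likewise be sketched in Python; again an illustrative example with base-2 logarithms (bits), using the convention that terms with $ p(m) = 0 $ contribute nothing to the sum:

```python
import math

def entropy(probs, base=2):
    """H(M) = -sum_m p(m) log p(m), skipping zero-probability events."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin has maximal entropy for two events: 1 bit.
print(entropy([0.5, 0.5]))

# A biased coin is more predictable, so its entropy is lower.
print(entropy([0.9, 0.1]))
```

Entropy is thus the average self-information over all events: it is largest when the events are equally likely and shrinks as the distribution becomes more predictable.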


  1. C. E. Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, July, October 1948