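The information carried by an event $\boldsymbol{x}$ with probability $p(\boldsymbol{x})$ can be quantified as:

I(\boldsymbol{x}) = -\log p(\boldsymbol{x})

This means that the less likely the event is (low $p(\boldsymbol{x})$), the more information it carries (high $I(\boldsymbol{x})$). Conversely, if $p(\boldsymbol{x})$ is high, $I(\boldsymbol{x})$ will be near $0$, since a very probable event doesn't carry much information.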
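
As a quick numerical illustration of this relationship, here is a minimal sketch that evaluates the negative log-probability for a few values. The function name `information` and the choice of the natural logarithm (so the result is in nats) are assumptions made for the example; the text does not fix a base.

```python
import math


def information(p: float) -> float:
    """Negative log-probability of an event with probability p (in nats).

    The natural log is an assumption here; use math.log2 for bits.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log(p)


# Rare events carry more information than near-certain ones.
for p in (0.01, 0.5, 0.99):
    print(f"p = {p:<4}  information = {information(p):.4f} nats")
```

The printed values shrink toward zero as the probability approaches one, which matches the intuition described above.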

Given a probability distribution $p$ over a set of events $\boldsymbol{x}$, we define its entropy $H(p)$ as the probability-weighted average of the information carried by each event:

H(p) = -\sum_{\boldsymbol{x}} p(\boldsymbol{x}) \log p(\boldsymbol{x})
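
To make the formula concrete, the sketch below computes the entropy of a discrete distribution given as a list of probabilities. The function name `entropy`, the natural logarithm, and the two example distributions are assumptions for illustration only. Events with zero probability are skipped, following the usual convention that they contribute nothing to the sum.

```python
import math


def entropy(probs):
    """Entropy H(p) = -sum_x p(x) log p(x) of a discrete distribution (in nats)."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Skip zero-probability events (convention: 0 * log 0 = 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical example distributions over four events.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.97, 0.01, 0.01, 0.01]

print(f"uniform: {entropy(uniform):.4f} nats")
print(f"skewed:  {entropy(skewed):.4f} nats")
```

The uniform distribution gives the larger value: when every event is equally likely, the average information per event is highest, while a distribution dominated by one very probable event has entropy close to zero.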