Definition

The bit is the standard unit of information. Given an event $E$ that occurs with probability $P(E)$, the number of bits the event conveys is given as follows:

$$I(E) = -\log_2 P(E) = \log_2 \frac{1}{P(E)}$$

Intuition

Let’s say that you want to transmit some “information” (e.g., tomorrow’s weather) to another person. However, your transmission device is very slow, so you want to transmit as little data as possible. In this scenario, is there a minimum amount of data that is sufficient to carry all the information you want to transmit? If so, how can you find it?

The above definition of “bits” of information can give us some insight. According to this definition, an event conveys 1 bit of information if it “cuts” your space of possibilities in half. In other words, the information reduces your uncertainty by a factor of two: of two equally likely possibilities, only one remains.
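
The halving intuition can be checked numerically. The following is a minimal sketch, assuming the standard self-information formula log2(1/p) for an event of probability p. The function name `self_information_bits` is only illustrative and does not come from any library.

```python
import math


def self_information_bits(p: float) -> float:
    """Bits conveyed by an event of probability p (assumes 0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    # log2(1 / p) counts how many times the space of possibilities was halved.
    return math.log2(1.0 / p)


# Each extra halving of the probability adds exactly one more bit.
for halvings in range(4):
    p = 0.5 ** halvings
    print(f"{halvings} halving(s): p = {p}, bits = {self_information_bits(p)}")
```

An event that was certain anyway (p = 1) conveys zero bits, and every further halving of the space of possibilities adds one bit.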

[Figure: bits-probability-space] Image Source: Solving Wordle using information theory

Returning to the scenario above, suppose that tomorrow will be either sunny (50%) or rainy (50%). Now, a weather station tells you that it will be rainy tomorrow (call this event $E$). This event has reduced your uncertainty about tomorrow’s weather by a factor of two. Hence, we can say that the weather station has transmitted a single bit of information to you.

[Figure: bits-scenario-visualised] Image Source: A Short Introduction to Entropy, Cross-Entropy and KL-Divergence

Equivalently, it is possible to transmit the same information in a single bit. If the weather station had sent a string ("rainy\0", 48 bits at 8 bits per character) instead, it would have used 48 times as many physical bits as it actually needed!
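
The comparison can be reproduced in a few lines. This is only a sketch, assuming an ASCII encoding (one byte per character) and two equally likely outcomes. The `ONE_BIT_CODE` table is a hypothetical encoding that sender and receiver would have to agree on beforehand.

```python
import math

# The C-style string from the text: five letters plus a NUL terminator.
message = "rainy\0".encode("ascii")
physical_bits = len(message) * 8  # 6 bytes at 8 bits per byte

# Information content of the forecast, assuming a 50/50 sunny/rainy split.
information_bits = math.log2(1 / 0.5)

# Hypothetical agreed-upon code: a single physical bit is enough.
ONE_BIT_CODE = {"sunny": 0, "rainy": 1}

overhead = physical_bits / information_bits
print(f"string: {physical_bits} bits, information: {information_bits} bit")
print(f"the string uses {overhead:.0f}x the bits needed; "
      f"'rainy' could be sent as {ONE_BIT_CODE['rainy']}")
```

The single-bit code only works because both sides already know the two possible outcomes; the string spends its extra bits spelling out something the receiver could have looked up in the shared table.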

References