Before you commit your precious time to read this post on Shannon Entropy, I need to warn you that this is one of those posts that market nerds like myself will get a kick out of, but which probably won’t add much of practical value to your trading. The purpose of this post is to scratch the surface of the markets from an information theoretic perspective, using tools developed by none other than the father of the digital age, Claude Shannon. Specifically, we’re going to tinker with the concept of Shannon Entropy.

Shannon (the man, not the entropy) was one of those annoying people that excels at everything he touches. Most notably, he was the first to describe the theory of electrical circuit design (in his Master’s thesis at the age of 21, no less). Later, around 1948, he discovered Information Theory, which leverages his unique-at-the-time understanding that computers could express numbers, words, pictures, even audio and video as strings of binary digits. Not being one to let his genius go to waste, he and his buddy Ed Thorpe secretly used wearable computers to beat roulette in Vegas casinos by synchronising their computers with the spins of the wheel. They also beat the casinos by counting cards at Blackjack. Between Information Theory and digital circuit design (and less so his gambling escapades), Shannon’s work essentially ushered in the digital world we find ourselves in today.

Measured in bits, Shannon Entropy is a measure of the information content of data, where information content refers more to what the data could contain, as opposed to what it does contain. In this context, information content is really about quantifying predictability, or conversely, randomness. This concept of information is somewhat counter-intuitive. In everyday usage, we equate the word information with meaning: information has some meaning, otherwise it’s not really information. In Shannon’s Information Theory, information relates to the effort or cost to describe some variable, and Shannon Entropy is the minimum number of bits that are needed to do so, on average. This sounds a bit whacky, but will become clearer as we introduce some equations and examples. The key is to divorce the information theoretical definition of information from our everyday concept of meaning.

Illustrating the Concept of Shannon Entropy

Say we have some random variable, like a coin toss. Your friend tosses the coin and hides the result from you. You can discern the outcome of any individual coin toss by asking just one binary (yes-no) question: was the outcome a head? Equally, we can ask whether it was a tail and still end up with the same knowledge. The number of binary questions we need to ask to describe each outcome is one. Here and in the examples below, consider that ‘the cost of encapsulating information’ is analogous to ‘the number of questions required to describe a random variable’.

Next, consider a deck of cards with the jokers removed. Our friend shuffles the deck, draws a card and records its suit without showing us. Our friend then replaces the card, reshuffles the deck and repeats. Our friend’s record of the suits drawn from the deck is then a random variable with four equally probable outcomes.
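These two examples can be checked numerically. The following is a minimal sketch, not code from the original post: the helper name and the example yes/no questions in the comments are mine. It uses the standard formula $H = -\sum_i p_i \log_2 p_i$, which gives one bit for the coin and two bits for the suit.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over outcomes with p > 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Fair coin: two equally likely outcomes -> 1 bit, i.e. one yes/no question.
print(shannon_entropy([0.5, 0.5]))    # 1.0

# Suit of a card drawn from a shuffled deck: four equally likely outcomes
# -> 2 bits, i.e. two yes/no questions (for example "is it red?", then
# "is it hearts?" or "is it spades?" depending on the first answer).
print(shannon_entropy([0.25] * 4))    # 2.0
```

In other words, the entropy in bits matches the number of binary questions needed to pin down each outcome.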
One way to motivate the logarithm in this definition is to think about surprise. How surprising is an event? Informally, the lower the probability you would’ve assigned to an event, the more surprising it is, so surprise seems to be some kind of decreasing function of probability. It’s reasonable to ask that it be continuous in the probability. And if event $A$ has a certain amount of surprise, and event $B$ has a certain amount of surprise, and you observe them together, and they’re independent, it’s reasonable that the amount of surprise adds. From here it follows that the surprise you feel at event $A$ happening must be a positive constant multiple of $-\log \mathbb{P}(A)$.

Now consider the case where every letter of an alphabet $A$ is equally likely, and we encode messages from $A$ using characters from an alphabet $B$. In this case the entropy depends only on the sizes of $A$ and $B$: writing $n$ for the size of $A$ and $m$ for the size of $B$, it is $\log_m(n)$. To prove this is the correct function for the entropy, we consider an encoding $E: A^r \rightarrow B^s$ that encodes blocks of $r$ letters in $A$ as $s$ characters in $B$. The size of the domain is $n^r$ and the size of the range is $m^s$. The range of $E$ must be greater than or equal to the size of the domain, or otherwise two different messages in the domain would have to map to the same encoding in the range. We choose $s$ to satisfy the inequality this forces, $m^s \ge n^r$, or equivalently $s \ge r \log_m(n)$, so that the cost per letter $s/r$ approaches $\log_m(n)$ as the blocks grow.
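A small sketch makes that limit concrete. This is an illustration under the assumptions above (the function name is mine, not from the text): it finds the smallest $s$ with $m^s \ge n^r$ and shows $s/r$ settling onto $\log_m(n)$.

```python
import math

def min_block_length(n, m, r):
    """Smallest s such that m**s >= n**r: the shortest block of characters from B
    (size m) that can encode every block of r letters from A (size n)."""
    s = 0
    while m ** s < n ** r:
        s += 1
    return s

# Four equally likely suits encoded in binary: s / r is exactly log2(4) = 2,
# matching the two yes/no questions needed to pin down a suit.
print([min_block_length(4, 2, r) / r for r in (1, 5, 25)])     # [2.0, 2.0, 2.0]

# A five-letter alphabet in binary: s / r approaches log2(5) as r grows.
print([min_block_length(5, 2, r) / r for r in (1, 10, 100)])   # [3.0, 2.4, 2.33]
print(math.log2(5))                                            # 2.3219...
```

Working with exact integer comparisons rather than floating-point logarithms keeps the minimal block length correct even for large $r$.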