Hi and welcome back to my blog. This is part 2 of a series where I talk about algorithms for trading bitcoin. You can read part 1 here.

Last time we covered a naive market marking algorithm which suffered from adverse selection problems. Academia contains a great many papers which try to define and address this problem (Glosten Milgrom, Kyle, Das) using probability theory and other techniques. I’m going to cover exactly one such technique here.

Before we can start discussing the details, we must define the assumptions the authors of these papers make about the markets they operate in.

## Market model

In all of the above papers, the authors divide the market up into broadly two kinds of traders.

- Informed traders
- Uninformed traders

Informed traders know **in advance** what the future direction of the market will be, whereas uninformed traders do not. Informed traders are always 100% correct with their estimates, whereas uninformed traders are 50% correct on average.

The proportion of informed and uninformed traders in the market is assumed to be known, although no details are given about how to calculate this property.

What does this have to do with adverse selection, you might ask? Well the answer is that when you find your buy orders getting picked off and then the price moves down, it is (most likely) **informed traders** who do the picking, since they have **advance knowledge** of the future direction of the market.

## A learning market marker in the Glosten Milgrom model

The technique I’m going to cover is the one by Samnay Das, in the paper A Learning Market-Maker in the Glosten-Milgrom Model. The idea is based on the much earlier work of Glosten and Milgrom, who describe a simple technique for finding the best estimate of where the true value (i.e. the price) of the market lies between two fixed extremes, one high, one low, taking into account the adverse selection problem. Das extends this idea to deal with more than just two fixed bounds which makes it usable in practice.

These papers are extraordinarily dry and are completely full of impenetrable looking pages of mathematics, but really the idea is incredibly simple.

### Break-even

The design of this market maker is to break even; providing liquidity to the market without suffering overwhelming losses. Adding a profit motive isn’t difficult, though.

### Probability theory

The only thing you need to know about probability theory to understand this technique is three simple things:

- Probability is expressed as a number between
**0 and 1**, 1 being a complete certainty, zero being an (estimated) impossibility. - The probability of two events happening
**in sequence**is calculated by**multiplying**their individual probabilities together. - The probability of
**either of two events**happening is calculated by**adding**their probabilities together.

E.g in a coin toss, the probability of heads or tails is

0.5, or 50%chance of either event. So, the probability of tossing two heads in a row is0.5 x 0.5 = 0.25, or 25% chance.The probability of tossing either heads or tails is

0.5 + 0.5 = 1, i.e. a complete certainty, for obvious reasons!

### A review of market makers

A market maker doesn’t have (or at least tries not to have) a directional view on the market; the idea is to post two orders, one buy (priced under market value), one sell (priced over market value) at the same time with the intention of making profit when **both orders get filled**.

Above a buy and sell order are filled at the same time, making the spread as profit.

With this in mind, the market maker needs to calculate **two prices** every time it is updated; one buy price and one sell price and then place orders at those prices.

### The principle idea

The principle idea of the paper I’m about to describe is illustrated in this diagram.

- The greek letter Î± is the proportion of informed traders.
- P
_{a}is our selling price, or**ask** - P
_{b}is our buying price, or**bid** - V represents true market value (unknown quantity not shown above)

Informed traders will only buy from us when our selling price is **too low** compared to true value, and they will only sell to us when our buying price is **too high**. They do this with a probability of 1 when either of those conditions is true.

Uninformed traders buy from us and sell to us no matter what true value actually is. They buy and sell with a probability of 0.5.

Given these assumptions, we can calculate the probability of a buy/sell at any price just by adding the probabilities of informed and uninformed trades taking place.

When one of our asks gets filled (a trader bought from us, the top region in Figure 1) the two probabilities we need are:

- Î± x 1
- (1-Î±) x 0.5

**Î±** is the proportion of informed traders, **1** is the probability of an informed trader buying in this region so these probabilities get multiplied together.

**(1-Î±)** is the proportion of uninformed traders and **0.5** is the probability of an uninformed trader buying in this region, so these get multiplied together.

So the total probability of our ask getting filled is **Î± + (1-Î±) x 0.5** because we need the probability of one event **or** the other taking place (so we add them). We can write this more compactly:

**P(Buy | V > P _{a}) = Î± + (1-Î±) x 0.5**

Using Figure 1 as a reference, we can also work out the other probabilities, leaving us with this collection:

**P(Buy | V > P _{a}) = Î± + (1-Î±) x 0.5**

**P(Buy | V <= P**

_{a}) = (1-Î±) x 0.5**P(Sell| V < P**

_{b}) = Î± + (1-Î±) x 0.5**P(Sell| V >= P**

_{b}) = (1-Î±) x 0.5These are our

conditional probabilities

So, now we can calculate the probability of the true value V being anywhere given a particular type of order. But, how does this help us with figuring out how to price our bid and ask?

## Putting it all together

The basis for the implementation of this idea is to keep track of the probability **at each price level**. Since prices of BTC (or any asset) can start at 0 and go on to infinity we’re faced with a problem because we don’t have infinite memory to store all the probabilities. In practice, we must chose a window of prices within which to operate and we should have a probability entry for every step between our minimum and maximum prices in the smallest unit the exchange supports (which in this case is 0.01, since we’re working with the Huobi bitcoin exchange).

The technique assumes you have access to the true value at time 0 (V_{0}), in practice we can just use the mid-price to centre our window and the range can be 4 times the distance between the maximum and minimum prices in the last N trades.

The probabilities are initially unknown so are initialised to follow a normal distribution covering the entire window of prices.

This distribution should remain **normalised **at all time – i.e the sum of all probabilities in our window should be 1.

### Estimating true market value

Now that we have our probability distribution, we can estimate true market value very easily just by looping over the window and multiplying each price with the probability at that price and then sum.

public decimal m_TrueValue { get { decimal trueValue=0; foreach (KeyValuePair<decimal, decimal> kvp in m_window) { trueValue += kvp.Key * kvp.Value; } return trueValue; } } |

Codey way of saying the same thing.

### Calculating our bid and ask prices

Now that we have a way to estimate true market value, we can calculate what our bid and ask prices should be. All we need to do is to take this true value estimate and imagine that a buy or sell order has occured.

I write ‘imagine’ here because we are not updating our probability window, just reading from and multiplying it as though a buy/sell had occurred.

In both bid and ask cases, we must refer back to our set of conditional probabilities above. For the case of calculating our ask we have two probabilities:

**P(Buy | V > P _{a}) = Î± + (1-Î±) x 0.5**

**P(Buy | V <= P**

_{a}) = (1-Î±) x 0.5Note that we use P(Buy | …) because this represents a buy from a trader at our

asking price.

So, now we can do a similar weighted sum through our window of prices/probabilities, but also imagine that a sell order has occurred which we do by multiplying the existing probabilities by these conditional probabilities, remembering to divide through by the sum of the weights we used:

public decimal CalcAsk(decimal currentAsk) { decimal ask=0, pask=0; foreach (KeyValuePair<decimal, decimal> kvp in m_window) { decimal finalP; if (kvp.Key > currentAsk) { decimal pBuyVGreater = m_alpha+(1-m_alpha)*0.5M; finalP = pBuyVGreater*kvp.Value; } else { decimal pBuyVLEqual = (1-m_alpha)*0.5M; finalP = pBuyVLEqual*kvp.Value; } ask += kvp.Key*finalP; pask += finalP; } ask /= pask; return ask; } |

The conditional sell order probabilities shift the weighted sum upwards from the true value; this is because the conditional probability is **greater** when the price is above our current asking price and **less** when it is below. If this is unclear at the moment, it should become more obvious when I talk about how the algorithm adjusts it’s prices when an order is actually filled (as opposed to imaginarily).

To calculate the bid price, we can use a precisely parallel technique using the other set of probabilities.

### Adjusting prices when an order gets filled

The entire algorithm is based around the premise that it’s estimates are always wrong. This may sound odd, but actually it makes perfect sense in a market where there are more than 0 informed traders. If the market were entirely uninformed traders, the market would be a completely fair game, or martingale (not to be confused with the maringale betting system) and random buy/sell orders would break even on average.

This is not the case where you have more than 0 informed traders because they will pick off your badly priced orders, leading to the adverse selection problem.

With this in mind, the algorithm presented here always assumes it’s prices are wrong (how wrong depends on the proportion of informed traders). Therefore if a buy order is filled, it’s prices are too high and if a sell order gets filled, it’s prices are too low.

The way that this is expressed is through the conditional probabilities. As I mentioned earlier, for any positive value of Î±:

**P(Buy | V > P _{a}) = Î± + (1-Î±) x 0.5**

Is higher than

**P(Buy | V <= P _{a}) = (1-Î±) x 0.5**

which says that prices above P_{a} are more likely than prices below.

So, when a sell order gets picked off, i.e. a trader buys from us, the algorithm loops over the probability window multiplying the existing probabilities (this time, in place so the resulting value is stored back into the window, unlike when we computed the bid/ask prices) by the conditional probabilities above, for both **V < P _{a}** and

**V >= P**. This increases the probability that the true asking price is above P

_{a}_{a}, causing the weighted average prices of both P

_{a}and P

_{b}to move up, thereby attempting to compensate for adverse selection.

Shown here are the two conditional probabilities with Î± = 0.2. The higher scale for prices above P

_{a}mean our bid/ask will move up after renormalisation.

Over time the algorithm aggregates the probabilities through sequences of buys and sells to narrow in on the true value.

## Narrowing and expanding the spread

The spread of our asking and buying prices can be increased/decreased by scaling the region between our ask and bid more/less than the region outside it. The spread might need narrowing if trading is occurring far within our bid/ask prices without filling them. It might need expanding if there is a change in volatility.

In addition, if prices move too close to the bounds of our probability window, distortion will occur in the estimation of true value because the space nearest the bound will be far less than the space farthest away from it – for accurate estimation the spacing should be equal (within some tolerance) so the algorithm will have to regularly adjust the window if necessary.

The details of these processes are left as an exercise for the reader.

## Results

The deciding factor is of course whether this technique actually works in practice. Does the algorithm break even as it was designed to do? As ever, this algorithm has been live tested on Huobi bitcoin exchange and the results are below.

Unfortunately the answer to the question is no. Although it breaks-even in a mean reverting market (actually, it makes a slight profit), as soon as a sustained trend begins, adverse selection takes over and large losses are incurred.

## Analysis

Possible reason for the failure of the algorithm to adapt to the market as it was designed to do are:

- The proportion of informed traders is a constant which is assumed to be known ahead of time, but actually this proportion is dynamic and hidden.
- This implementation only considers perfectly informed and uninformed traders, whereas the full algorithm of Das includes noisy informed traders as well.
- There could be a bug with my implementation.
- Bitcoin may be such a volatile asset, that this algorithm is unsuitable.
- There is the possibility that academic papers only get published when the techniques they describe stop being profitable(!)

## Next time

Next time I’m going to explore whether estimating the proportion of informed traders using VPIN has a favorable effect on this algorithm, and whether VPIN can be used in genernal as an indicator of an impending trend.

Until next time, have fun!

Cheers, Paul.

I’m looking forward to part three of this article. The math you propose is quite interesting, and similar to half formed ideas I’ve had myself.

I’d love to hear more.

I love this series of tutorials! I wrote my own algorithms/bots a while back, but they barely broke even. There’s a wealth of info in here I didn’t know about, so I may give it another go sometime soon.

Any idea when part 3 will be out?

Thanks!

-Tyler

great post, can’t wait for the next entry. idea for future direction:

does statarb work on bitcoin/ripple maybe? (been meaning to research this myself but no time lately)