Financial Hacking — Full Kindle Highlights By Chapter


Part I — Vanilla World

Ch 1: Risk
  • Risk is the “why?” behind finance. Without risk, there is no reason for finance or insurance or derivatives to exist at all.
  • Ultimately risk is just the set of your unhedged exposures, including ones you don't know about.
  • This holds as a life lesson too: when people ask you questions expecting to hear a particular answer, they won't listen to what you are saying until you first speak the magic words their brains are tuned in to.
  • Finance is the study of risk, even though risk does not have a perfect definition.
  • if you instead try to tweak the algorithm perhaps to squeeze past returns to zero, or institute minimum and maximum thresholds, then you are simply creating a new game for me to play. As a trader, I like games, and I will play yours and not only will I win, I will crush you…. How would I do it? I'd just look through thousands of assets until I found one that looks better under your requirements than under my own more reasonable judgment.
    • if you were thinking that this might be part of the problem that caused the recent financial turmoil, you'd be right. Having an objective set of regulations means the regulator can, and will, be gamed. Indeed it can be shown that any regulation of risk will always result in greater systemic risk.
  • a mere portfolio of assets is not a derivative, even though its value is clearly derived from those assets. A financial derivative needs to be something you can trade by itself. If the portfolio were offered to you in the form of a swap (a swap is just a financial contract between two parties) then that swap would be a derivative, because the value of the swap ultimately derives from the value of the component assets. But the portfolio itself is not a derivative.
  • Maybe the best illustration is to think of a Happy Meal at McDonald's. If you just lay out a burger, small fries, and kids drink on a table, that is a portfolio. To get it, you have to make several orders. But if you put it all in a box with a toy, suddenly, it is something you can order immediately. It is more than just a portfolio; it has become a derivative.
  • A forward can have any strike price, but there will always be one unique strike price that results in the forward contract itself having a zero current value. This unique strike price is called the fair forward price.
  • If Tuesday is the ex-date, and Friday is the record date, then if you buy a share on Monday, you will still be a holder as of Friday (Thursday, even), and so you will eventually get the dividend. But if you buy it on Tuesday, then you will be too late. You will not receive the dividend. That is exactly what should happen with forwards, and it is exactly what happens in the stock market. All of those blinking lights and obsessive analysis on TV and radio over every little tick — not a single one of those prices are real spot prices. They are all forward prices. [Kris: Because of T+3 or T+2 settlement all the prices of stocks are technically forwards not spot prices. If you don't believe this the evidence is in how stocks drop after the ex-date]
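
The fair forward price highlighted above can be made concrete with a minimal sketch. It assumes a non-dividend-paying asset, a flat continuously compounded interest rate, and no frictions; these assumptions, the function names, and the numbers are illustrative and do not come from the book. Under them, a long forward struck at K is worth S − K·e^(−rT) today, so the one strike that makes the contract worth zero is K = S·e^(rT).

```python
import math

def forward_value(spot, strike, rate, years):
    """Value today of a long forward: receive the asset, pay `strike` at maturity.

    Assumes no dividends and a flat continuously compounded rate (illustrative).
    """
    return spot - strike * math.exp(-rate * years)

def fair_forward_price(spot, rate, years):
    """The unique strike at which forward_value is zero under the same assumptions."""
    return spot * math.exp(rate * years)

# Hypothetical inputs chosen only for illustration.
spot, rate, years = 100.0, 0.05, 1.0
fair = fair_forward_price(spot, rate, years)

print("fair forward price:", round(fair, 4))
print("value at fair strike:", forward_value(spot, fair, rate, years))
print("value at strike 100:", round(forward_value(spot, 100.0, rate, years), 4))
```

Any other strike gives the contract a nonzero value today: a strike below the fair forward price makes the long side worth something, a strike above it makes the short side worth something.
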
Ch 2: Arbitrage
  • Arbitrage is defined in academia as “riskless profit” but it is used in practice to mean “good deal that should make money if all goes well.”
  • This is more like a hostile religious question where one of two neighboring religions deny the legitimacy of the other one and will not rest until it is wiped out. On such religious questions, tolerance is futile and counterproductive. If you don't object to the Crusades, you end up Christian or dead.
  • Quantum physicist and Nobel laureate Niels Bohr once said, essentially, that the opposite of a true statement is obviously false, but the opposite of a profound truth is another profound truth. The existence of arbitrage opportunities is a profound truth. Arbitrage is simultaneously both impossible and prevalent. Bohr argued that only in allowing and confronting the paradox head-on can we grow in our understanding.
  • “How wonderful that we have met with a paradox. Now we have some hope of making progress.” Niels Bohr
  • How can arbitrage be both impossible and prevalent? We have met with a paradox, but where is our progress? There is an answer. The answer is frictions.
  • We all have friends and family members who are slowly but surely ruining their lives, at least from our perspective, but there is nothing we can do. Your brother takes the wrong job. Your friend marries the wrong spouse. Your cousin is an alcoholic. Your neighbor overeats. How are you going to “arbitrage” them? You can't buy or borrow people. If you could, you would buy your neighbor, put him on a healthy diet and exercise regimen, and in a few months, return him in better shape, and pocket the difference in profit. You would have made a better man of him. But who are you going to buy him from? Who would you sell him to after? Alternatively, why not borrow him from himself, then fatten him up, make him cheaper, and return him, again pocketing the difference? These are the two things you could do with stocks, but not with people. It's because of frictions. You can't change other people, not easily, and you can't profit from improving them.
  • In the financial world, you may have a colleague who could replace his concentrated holdings in company stock with a diversified portfolio, but he refuses to do so. If you can't sell short the company stock, you can't do the arbitrage. [Note the implication -- the diversifiable risk is an arbitrage (although subject to limitation). Imagine being long a parallel portfolio that is diversified and scaled to the vol of the concentrated holding, and short the concentrated holding. How does such a long-short portfolio perform? What short rate would ruin it?]
  • The resolution of the paradox is this: yes, there are arbitrage opportunities, but not every person can exploit every arbitrage opportunity. Some are made just for you, some just for others, and some for whoever can find them.
  • Hillel answered, and his response encoded what has come to be known as the golden rule: “What is hateful to you, do not do unto others. This is the whole Torah; the rest is commentary. Now go and study.” There are three important aspects here. First, the real Golden Rule of Hillel is not what you might usually think. He does not say to treat others as you would like them to treat you. Instead, he says to refrain from treating others as you would not like them to treat you. It is the difference between a command to do good and a command to abstain from evil. It is impossible to fulfill the duty to do good; one can always do more, and the goodness itself subjectively depends on others. But it is possible to fulfill the duty to abstain from evil: one can simply not hurt others, and the harm, if done, is more objectively noticeable.
  • Second, Hillel's wisdom frames all ethical knowledge and teachings around this simple principle. In this way, when details begin to confuse, as they always tend to do, one can retreat to the big picture to see how it all fits in.
  • If Hillel were a trader today, and a non-trader were to ask him to teach him all there is about financial hacking while standing on one foot, I imagine Hillel might answer something like this: “Accumulate risks that are hateful to others; dispose of risks that are hateful to you. That is the whole of financial hacking; the rest is commentary. Now go and trade.” As profound as that may sound, the key part of the golden rule of financial hacking is not the first sentence, nor the second, nor the third; it is the semicolon. “Here is a lesson in creative writing. First rule: Do not use semicolons. They are transvestite hermaphrodites representing absolutely nothing. All they do is show you've been to college.” Kurt Vonnegut
  • That semicolon is the epitome of financial hacking. It takes two somewhat related concepts and joins them together in an uneasy alliance that is not quite as tight-knit as a comma and not quite as arms-length as a period; not quite as together as with an “and” and not quite as opposite as with a “but.” In our world, these two somewhat related concepts are fundamentally related yet relatively mispriced assets; the semicolon is the trade. It is all well and good to say these two assets ought to have the same price; how do you actually do it?
Ch 3: Trading Puzzles
  • I can tell you where the S&P 500 will settle in one month. How much would you pay for this information? (And then, what would you do with it?)
  • Let's say you give a number like $10 million, and I accept it. The S&P 500 is currently at 1000. I gaze deeply into your eyes and tell you the truth: in one month, the S&P 500 will close that day's trading at a level of…. 1000. Oops! Now what? How are you going to make money? You owe me $10 million in a month, and I will collect. There is no point in buying or selling futures at the same price at which you expect them to expire. So what can you do? [He doesn't mention the strategy of announcing your shot on social media and using it to gain followers. The value of this will depend on whether you have something to monetize or follow it up with...and if you do not already have a following, it's likely you don't have skill in monetizing one, so again the value of the follower windfall depends on its beneficiary.]
  • All you can do is hope the market moves in the meantime, and it really is a hope, because you have no other information about what is going to happen over the course of the next month, not the volatility, nor the volume, nor the highs and lows. All you know is that it will be at 1000 again a month from now.
  • So how do you time your entry points? Say you have $1 million of liquid assets and say that this much money would let you support up to $10 million in notional, because futures have a haircut of about 10 percent.
  • Suppose you are very lucky and the S&P 500 jumps down to 900 before you even have a chance to put in your order. Now you would want to buy. But how much? Do you put your entire amount on the line, such that even a single tick against you triggers a margin call?
  • Ultimately you can perhaps do best if you are able to buy and sell options, but there won't always be a liquid options market at every strike you need at the asset that you want to trade, and besides, we haven't really discussed options yet.
  • These kinds of practical issues are ignored in standard textbook discussions of riskless profit opportunities but they are precisely the issues that financial hackers worry about most. And you will almost surely never experience anything with this level of certainty at any time in your career. There will always be doubts about your model, your inputs, and your forecasts.
  • The usual algorithm for trading a spread is this: put mid-market orders on both sides and as soon as one is hit, move the price on the other to cross the market. In this way, you can end up paying half of the bid-offer spread on only one leg rather than on both.
  • According to standard theoretical concepts of arbitrage, none of those questions matters. According to real-world practical experience, you can't even begin to trade until you have answered all of them. [Talking about the HSBC share class arb in the 90s]
  • What is the cost of this dollar-for-dollar trading? What is the downside? Maybe you make less money on the convergence than otherwise. But this is not true! In our example, the share-for-share profit would have always been $5, regardless of when or how the convergence occurred. But because you traded dollar-for-dollar and the market went up, you actually made more money! Could dollar-for-dollar be a superior way of trading the discrepancy? Not necessarily. Had the market gone down by 10 percent, and then the two share classes converged, you would have made a 10 percent smaller profit. In fact, you can think of the dollar-for-dollar trade as being exactly the same as the share-for-share trade, plus an additional long amount in the expensive share equal to the amount of the dollar discrepancy. Specifically, with A costing $100 and B costing $105, the dollar-for-dollar position is the share-for-share position of long one share of A and short one share of B, plus a long position in 5/105 shares of B. The difference in the profit between share-for-share and dollar-for-dollar comes precisely from this net long position. So why be long? Because of the market risk.
  • If you know the convergence will happen within a few months, you are probably better off just trading it share-for-share, and taking the short-term mark-to-market risk of your portfolio moving against you in the meantime because of broad market movements. But if you think the convergence may take years, you may be better off with the smaller daily mark-to-market profit variation, at the cost of greater terminal profit variation. With share-for-share, you know for sure what your profit will be when the two share classes merge. With dollar-for-dollar, your profit depends on the level of the market when the convergence happens. If the convergence happens when the market is low, your profit will be low.
  • There is one other wrinkle to the dollar-for-dollar way of trading, and that's the fact that you have to rebalance if you want to maintain your dollar-for-dollar exposure. A share-for-share trade never needs to be rebalanced, but in dollar-for-dollar, if the discrepancy widens to say $10 from $5, you need to either buy more of the A share or buy back some of the B shares to maintain an equal dollar exposure. This additional trading can bring with it its own profit profile, one that depends neither on the overall direction of the market nor on the discrepancy, but on their correlation and on whether you are willing to increase your bet!
  • Say you always want to maintain a $100 exposure in each share class. Then if the discrepancy tends to widen when the market rises, you would be buying back the B shares at a high price, and selling more of them at a low price. That would hurt you over the long run. But let's say you are willing to increase your position size. Then if the market is up and the discrepancy is wider, you can buy more of the A shares instead of the B shares. And when the market and the discrepancy come down, you can sell more of the B shares. The result will be that your overall position size is larger, but you haven't necessarily lost money yet. There is no easy straightforward answer. There are only issues and considerations for you to weigh. That's what makes it fun!
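
The share-for-share versus dollar-for-dollar comparison in the highlights above can be checked with a few lines of arithmetic. This sketch uses the $100 and $105 prices from the text and assumes both share classes end at a single convergence price, with no rebalancing, financing, or transaction costs along the way. The function names and the three convergence prices are hypothetical.

```python
def share_for_share_pnl(a0, b0, converge_price):
    """Long 1 share of A, short 1 share of B, both closed at the convergence price."""
    return (converge_price - a0) - (converge_price - b0)

def dollar_for_dollar_pnl(a0, b0, converge_price):
    """Long 1 share of A, short a0/b0 shares of B, so both legs start at a0 dollars."""
    short_b_shares = a0 / b0
    return (converge_price - a0) - short_b_shares * (converge_price - b0)

a0, b0 = 100.0, 105.0  # prices from the highlighted example

# Convergence with the market 10 percent lower, unchanged, and 10 percent higher
# (measured relative to B's starting price; an illustrative choice).
for price in (94.5, 105.0, 115.5):
    sfs = share_for_share_pnl(a0, b0, price)
    dfd = dollar_for_dollar_pnl(a0, b0, price)
    # Decomposition from the text: share-for-share plus long 5/105 shares of B.
    extra_long = (b0 - a0) / b0 * (price - b0)
    print(price, round(sfs, 4), round(dfd, 4), round(sfs + extra_long, 4))
```

Under these assumptions the share-for-share profit is the same at every convergence price, while the dollar-for-dollar profit scales with the level at which the two classes converge, which is the net long exposure the text describes.
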

Part II — Vanilla Derivatives

Ch 4: Black-Scholes
  • If you have an interview for a derivatives position in five minutes, and you know nothing about options, this is the chapter for you. You probably won't get the job but at least you won't look foolish. And they may simply be screening out fools in this round; you may just buy yourself enough time to learn something more before they call you back.
  • Some options are European and some options are American. European options can only be exercised at maturity, the date at which the option expires. American options can be exercised at any time, up to and including the maturity. How do you remember which is which? The American Revolution was fought to attain more freedom than Europe; American options give you more freedom than European ones.
  • Hedging means making trades to reduce your risk. Just like landscapers trim bushes and hedges so that the only greenery that's left is the one they want, so too do traders hedge away all of the unwanted risk so that the only exposure that's left is the one they want.
  • A common interview question is: “Can you prove the put-call parity?”…the financial hacking proof is quick, visual, and insightful.
  • The big intuitive insight from put-call parity is that it doesn't matter whether you trade in puts or in calls: the two are in some deep sense identical, despite their surface differences. Optionality is the key. The kink is the key. Once you have a kink, you can go long or short and you can fiddle with forwards to make it go up or down or left or right as you see fit.
  • Fundamentally, there is no difference between puts and calls, and this insight itself is not dependent on a particular model or restricted to only particular parameter values. It is a deep truth.
  • If you understand this derivation deeply, you understand the basics of nearly every aspect of finance, including arbitrage, risk management, valuation, hedging, Itō's lemma, short selling, mergers, market microstructure, portfolio management, yield curves, hedge funds, behavioral finance, and more. And all this in just five short equations.
  • The assumptions we make in finance essentially just try to formalize in as simple a way as possible our intuitive concepts. For example, how can we formalize the basic concepts of expected and unexpected price change? One simple way is this. Let's assume that the expected price change is some proportion of the previous price, where the proportion is a constant annual drift multiplied by the exact amount of time elapsed, expressed as a portion of a year. And let's assume that the unexpected price change is some scaled white noise.
  • The two most important insights are that: (1) the white noise dw is basically like the square root of dt, and (2) (dt)^n = 0 for any exponent n > 1. The first insight follows because random white noise has independent increments with variance proportional to the elapsed time. That means its standard deviation is proportional to the square root of the elapsed time. The elapsed time when looking at things on a continuous basis is just dt. So dw is basically the square root of dt.
  • The second insight follows because, when dt is less than one, its square root is a larger number. So we need to take dw's into account. We even need to take (dw)² into account, because that is like dt. But everything at a higher power than dt? That's too small.
  • These two insights are the basis of what is known as Itō's lemma, which explicitly states what the dynamics of V(S, t) will be for any function V of the underlying price S and the time t.
  • let's recall that there are three kinds of d's: (1) There is the Greek Δ (pronounced delta), meaning a small change. We might say Δt is a small change in time, like an hour or a day. (2) There is the lower-case d, meaning either a very small and essentially unnoticeable change, so that dt is on the order of a millisecond, or it can mean a total derivative, like d(x²) = 2x dx. (3) Finally, there is the partial ∂, meaning a partial derivative, holding all other inputs constant.
  • Equation (4.2) defines the dynamics of S. What would be the dynamics of V, which is itself a function of S and t? To figure it out, we just take the total derivative of V(S, t), which means taking the sum of each of the partial derivatives: dV = (∂V/∂t)dt + (∂V/∂S)dS + ½(∂²V/∂S²)dS²
  • We ignore all higher-order terms because of the rule that everything smaller than dt, like dt², is for all intents and purposes zero. But remember well that dw² = dt.
  • What is dS², the final term in Equation (4.3)? There are four cross products in dS * dS, all but one of them involving dt times either a dw or another dt, meaning that the result would be either dt^(3/2) or dt², which are both zero. The only non-zero term is the one involving the square of the white noise term; thus dS² = σ²S²dt.
  • Isaac Newton and Gottfried Leibniz both claim to have invented calculus, but with different notation. Newton's notation involves putting dots on top of symbols, so something like ẋ to denote the first derivative of x. Leibniz's notation uses little d's in front of the variables, so something like dx to denote the derivative of x. We primarily use Leibniz's notation today in almost all areas where mathematical derivatives matter.
  • The beauty of the d-like notation is that it almost seems as if we can just “factor out” the d's. And in our case, we can.
  • In general, when we see the dynamics of an asset or a portfolio that does not have a dw term, what are we to infer? In our model, the only risk is in the white noise. We specifically defined the unexpected price change to be proportional to dw. If there is no white noise, there is no unexpected price change, and so there is no risk. Therefore, we have a riskless portfolio. How much should a riskless portfolio earn? The riskless amount: it should earn the riskfree rate r multiplied by the amount of time elapsed dt multiplied by the initial value of the security.
  • In other words, if you think of the stuff in parentheses on the left hand side in Equation (4.5) as being a portfolio ∏ of its own, which it is, then the change in value of that portfolio must equal its initial value times the riskfree rate times the elapsed time, i.e. d∏ = ∏rdt.
  • But the left hand side of Equation (4.5) is equal to the left hand side of Equation (4.4), so the two respective right hand sides must be equal too. So we can eliminate the common dt's and rearrange terms to get: ∂V/∂t + ½σ²S²(∂²V/∂S²) + rS(∂V/∂S) − rV = 0, which is the Black-Scholes PDE.
  • For extra credit, you can rewrite it like this: Θ + ½σ²S²Γ + rSΔ − rV = 0, where Θ = ∂V/∂t (theta) is the time decay, Γ = ∂²V/∂S² (gamma) is the convexity of the derivative with respect to the underlying price, and Δ = ∂V/∂S (delta) is the hedging ratio, the number of shares you need to sell (see Equation 4.5) in order to create a riskless portfolio.
  • One of the most common mistakes that even highly experienced practitioners make is to act as if the assumptions of Black-Scholes (lognormal, continuous distribution of returns, no transactions costs, etc.) mean that we can always arbitrarily assume the underlying grows at the riskfree rate r instead of a subjective guess as to its real drift μ. But this is not quite accurate. The insight from the Black-Scholes PDE is that the price of a hedged derivative does not depend on the drift of the underlying. The price of an unhedged derivative, for example, a naked long call, most certainly does depend on the drift of the underlying.
  • Let's say you are naked long an at-the-money one-year call on Apple, and you will never hedge. And suppose Apple has very low volatility. Then the only way you will profit is if Apple's drift is positive…if it drifts down, your option expires worthless. But if you hedge the option with Apple shares, then you no longer care what the drift is. You only make money on a long option if volatility is higher than the initial price of the option predicted.
  • It has some Greeks in it — things like theta and delta and gamma. But where is rho, the sensitivity of the derivative to changes in the risk free rate? Where is vega, the sensitivity of the derivative to changes in the volatility? Did we make a mistake somewhere in the derivation? After all, vega is perhaps the single most important Greek, after delta, of course. If options are all about volatility, and vega is a measure of the sensitivity to volatility, shouldn't vega be somewhere in the PDE? No, it shouldn't. The PDE is the result of a model, and that model assumed a constant volatility, and a constant risk free rate. Vega and rho, respectively, measure the change in the model output price when we change the input parameter. Parameter Greeks like vega and rho are fundamentally different from sensitivity Greeks like delta and theta. They are far more important. Sensitivity Greeks are simultaneous outputs from the model. They are supplemental information to go along with the model price. If you think of the model as a person, then the model price is its body, and the model sensitivity Greeks are its accentuating makeup and jewelry and clothes. The model is proud to calculate and display them for you. The underlying will bounce around; time will pass; the model knows this and happily exhibits your sensitivities to those events. But parameter Greeks are an embarrassment to the model. They are the cracks beneath the makeup, the scars beneath the clothes, the hollowness inside the body. They expose the fact that the model is false.
  • The model refuses to calculate these directly. It denies they even exist. Essentially the only way to calculate them is to kill the model and create a new one, with new parameters, and look at the differences.
  • Yet these parameter Greeks like rho and vega are the most important ones, because they let you step out beyond the model. They let you measure your exposure to your uncertainty about the correct parameters to the model. It's like buying a brand new car. The car will tell you certain things about itself. It has an odometer and a fuel indicator and many other gauges. Some even report the air pressure in the tyres. But it won't tell you what would happen if there was sudden flooding on the roads or an accident up ahead. Yet those are questions that you as an operator need to ask. And the only way you can find out, for a general kind of model, is to try it out, either for real or through simulations. In some special cases, such as the Black-Scholes formula, you can compute closed-form expressions for these parameter Greeks. But don't let that fool you. You are still exploring things that are outside of the model. And it is precisely because they are outside of the model that these Greeks are so critical to risk management: if the model was correct, you wouldn't need much risk management at all.
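
The distinction between sensitivity Greeks and parameter Greeks can be illustrated by computing vega the way the highlights describe: build the model twice with different volatility inputs and look at the difference in price. The sketch below assumes the standard Black-Scholes formula for a European option on a non-dividend-paying asset; the function names, the bump size, and the inputs are illustrative choices, not the book's. It also checks put-call parity numerically and compares the bumped vega with the closed-form expression that happens to exist for this model.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1_d2(spot, strike, rate, vol, years):
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * years) / (vol * math.sqrt(years))
    return d1, d1 - vol * math.sqrt(years)

def bs_price(spot, strike, rate, vol, years, kind="call"):
    """Black-Scholes price of a European call or put (no dividends assumed)."""
    d1, d2 = d1_d2(spot, strike, rate, vol, years)
    discounted_strike = strike * math.exp(-rate * years)
    if kind == "call":
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)

def bumped_vega(spot, strike, rate, vol, years, bump=1e-4):
    """Parameter Greek by brute force: reprice with vol bumped up and down."""
    up = bs_price(spot, strike, rate, vol + bump, years)
    down = bs_price(spot, strike, rate, vol - bump, years)
    return (up - down) / (2.0 * bump)

def closed_form_vega(spot, strike, rate, vol, years):
    d1, _ = d1_d2(spot, strike, rate, vol, years)
    return spot * norm_pdf(d1) * math.sqrt(years)

# Hypothetical inputs: at-the-money one-year option.
spot, strike, rate, vol, years = 100.0, 100.0, 0.02, 0.20, 1.0
call = bs_price(spot, strike, rate, vol, years, "call")
put = bs_price(spot, strike, rate, vol, years, "put")

# Put-call parity: call minus put equals spot minus the discounted strike.
parity_gap = (call - put) - (spot - strike * math.exp(-rate * years))
print("parity gap:", parity_gap)
print("bumped vega:", bumped_vega(spot, strike, rate, vol, years))
print("closed-form vega:", closed_form_vega(spot, strike, rate, vol, years))
```

The bump-and-reprice route works for any pricing function, including ones with no closed-form Greeks, which is why it is a common fallback when the model itself treats the parameter as a constant.
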
Ch 5: Simulation
  • the power of model parameters. The idea is not that the model is correct, or that the assumptions can never be violated, but simply that the model is useful in explaining the risks.
  • The parameters help the intuition. Often the right intuition is that the market price is basically correct, more or less, and if the model is physically unable, for any intuitively reasonable choice of parameters, to match the market price, then it is a bad model. Why? Because it is not useful. The Black-Scholes model is a very good model — not because it is right, but because it, and its parameters, are useful. But this is a very weak test, and only the first test we need to apply to any arbitrary black box model to gauge its value.
  • Every model depends on certain assumptions. A hedging model also tells you what trades you need to do every day after you put on a position — with the Black-Scholes model, the most important hedge is neutralizing the delta of the derivative with an offsetting position in the underlying.
  • So what if we simulate paths for the underlying exactly in accordance with the assumptions required by the model and hedge exactly as the model specifies? What will be the resulting profit or loss from buying and hedging a derivative? If it's not exactly zero, something must be amiss, either with the model or with our simulation. Let's consider a concrete example. Say we buy an at-the-money call, then randomly simulate a single path for the underlying security with lognormal returns having constant volatility. Along that path, we will hedge our delta to the model-computed delta with the same constant volatility. This means we will be net neutral delta as far as the model is concerned. In other words, we will do exactly what Black-Scholes says we ought to do, in a Black-Scholes world, on a plain vanilla option. Then we will do the same thing across lots of different randomly simulated paths. The question is: how much money will you make across all those different simulated paths? (a) You will always make $0 exactly. (b) You will make $0 on average, but not on every path. The essence of the question is: does the model predict perfect replication, or just an average result?
  • the drift term of the underlying only disappears when your net delta is zero. In other words, an unhedged option cannot be priced with no-arbitrage methods. More specifically, if you only do the dynamic delta hedging, but you are not actually hedging anything, then you are not delta-neutral each day, but are instead constantly taking directional market risk. The size of your bet changes as your imagined option delta changes.
  • So the first claim is equivalent to announcing that you have found some kind of timing strategy to just trade shares of the underlying security, with no options involved. But we have assumed that the underlying security returns are distributed randomly. You can't consistently make money just by buying and selling shares when prices move about randomly.
  • The resolution of the second puzzle is the same. If there were a better way to hedge than simply neutralizing the delta as often as possible, then you could construct a pure stock trading strategy on a random underlying that would still make money. Here's how: take a long position in an option and hedge it with the new method, take a short position in the same option and hedge that with the standard Black-Scholes method. Then your net position every day has zero options and some amount of stock equal to the excess of the new method over the Black-Scholes method. At best, this excess position is zero; most of the time, it is some arbitrary position in the market. And again, as we have assumed that the underlying follows the random fluctuations of the Black-Scholes model, there can't be any stock-only trades that consistently make money.
  • How can we get a random number drawn from a lognormal distribution, or any other distribution other than the uniform? We can use the following trick. Think of the cumulative distribution function of any distribution, for example, the lognormal distribution. It looks something like this: The cumulative probability starts at zero and grows up to one. What we want to get is a random number along the x-axis. But notice two things: the y-axis is always between zero and one, and for every y value, there is a unique corresponding x value. The trick is then to pick a uniformly random number between zero and one as the cumulative probability, and then find the matching x value that generates that probability. In Excel 2010, the formula would look like this: =LOGNORM.INV(RAND(),0,0.2) It works by taking the inverse of the lognormal CDF relative to a random cumulative probability, given the assumed parameters of the lognormal distribution. In this example, we assumed µ = 0 and σ = 0.2.
  • To simulate a daily return, we need to scale the volatility down to a single day. The Excel 2010 code for this is: =LOGNORM.INV(RAND(),0,0.2/SQRT(252))
  • The mathematical way is this: what is the expected value of a random variable following a lognormal distribution derived from a normal distribution with mean μ and standard deviation σ? Mathematica immediately gives the answer: e^(μ+σ²/2). In other words, even when there is no drift, so that μ = 0, the expected value of the lognormal distribution is affected by the volatility. Thus, to simulate paths with truly zero drift, we have to subtract σ²/2 from the drift. That's the mathematical approach. So now we have learned something many people overlook or forget: when you are simulating lognormal returns, you always have to subtract away half the variance from the drift. This is far and away the most common mistake in simulating returns.
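
The inverse-CDF trick and the half-variance correction from the highlights above can be sketched outside Excel. This is a minimal Python version using only the standard library: it draws a uniform probability, maps it through the inverse normal CDF, and exponentiates, which mirrors what =LOGNORM.INV(RAND(),0,0.2) does. The function name, sample size, and seed are arbitrary choices made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def simulate_gross_returns(mu, sigma, n, seed):
    """Draw n lognormal gross returns exp(X), X ~ Normal(mu, sigma), by inverse-CDF sampling."""
    rng = random.Random(seed)
    normal = NormalDist(mu, sigma)
    draws = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # inv_cdf needs a probability strictly between 0 and 1
            u = rng.random()
        draws.append(math.exp(normal.inv_cdf(u)))
    return draws

sigma = 0.2
n = 100_000

# mu = 0: the average gross return is close to exp(sigma^2 / 2), not 1.
naive_mean = fmean(simulate_gross_returns(0.0, sigma, n, seed=1))

# mu = -sigma^2 / 2: the average gross return is close to 1, i.e. truly zero drift.
corrected_mean = fmean(simulate_gross_returns(-0.5 * sigma**2, sigma, n, seed=1))

print("mean with mu = 0:", naive_mean)
print("theoretical exp(sigma^2/2):", math.exp(0.5 * sigma**2))
print("mean with mu = -sigma^2/2:", corrected_mean)
```

For daily steps the same function applies with sigma divided by the square root of 252, as in the second Excel formula, and with the half-variance correction computed from that daily sigma.
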
Ch 6: Puzzles and Bugs
  • In short, the inability to hedge perfectly continuously impacts your trading by introducing random risk. This risk decreases if you hedge more frequently, but only as fast as the square root. Therefore, if you want to halve your risk, you have to hedge four times as often.
  • Noise from hedging a one-year option on a daily basis instead of continuously is about the same as one volatility point. So, as a first test, we can ask if one vol point of noise renders Black-Scholes unusable. At first glance, it does not seem like too much noise; after all, when we estimate future volatility to be 20 percent and it ultimately delivers 19 percent or 21 percent, we would not likely conclude that we were wrong. There is noise in realized volatility too.
  • If you make one volatility point in expected profit and the standard deviation of your profit is one volatility point, then your Sharpe ratio is about one. (The Sharpe ratio is the expected excess profit over the risk-free rate divided by the standard deviation, but we are assuming the risk-free rate is zero anyway.) A Sharpe ratio of one is a pretty good trade. Not the best, not the worst. Pretty good. If the only risk from doing the option is the noise that comes from hedging daily rather than continuously, then this is likely a trade you will still want to do.
  • we can conclude that yes, there is a risk to not hedging continuously, but even on a daily basis, it is not too large, it is comparable to the bid-offer spread or the expected profit, and it can be largely diversified away.
  • "If I had an hour to solve a problem and my life depended on it, I would use the first 55 minutes to formulate the right question, because as soon as I have identified the right question, I can solve the problem in less than five minutes." - Albert Einstein
  • When the underlying is drawn from a 30-volatility distribution but we hedge to only a lower 20-volatility, the standard deviation of the return is three times as high, at more than 15%. Even more curious, the average is no longer centered around zero: the median return is negative. What is going on?
  • Let's focus now on the third row where we hedge to a higher volatility than the truth. This is the prototypical case of hedging to the model. Presumably, we believe the future realized volatility should be higher than the implied volatility of the price we bought it at. So we might consider hedging to our best forecast. In that case, when the hedge volatility is 30 and the simulation volatility is 20, the P&L as a function of the last price looks like a heart. It is not symmetric around the x-axis and it doesn't taper off at the sides. For terminal stock prices far from the initial stock price, we are guaranteed to have made a profit, and for terminal stock prices near the initial stock price, we are much more likely to have a loss.
  • The short-form intuition is this: you bought a call and hedged it. So you are betting on higher volatility. When volatility ends up higher, even if only for random reasons, you benefit, and when it ends up lower, you lose. That intuition continues to hold even if you hedge at the wrong vol. If, for example, the true vol is 30 but you hedge to 20, you are just introducing noise. The slope between your P&L and the realized vol is still positive, but not as sharply defined.
  • if you want to minimize your mark-to-market P&L, you may choose to hedge to the market even if you think the market volatility is wrong. How do you trade off these two risks, the mark-to-market risk versus the at-maturity risk? Ultimately, you probably will decide based on the maturity of the option you are hedging. If the option will expire in a month or two, you will almost surely be able to weather any intermittent mark-to-market volatility, so you will lean towards hedging to model. If the option will expire in many years, you will likely lean towards hedging to market, at least until the expiry gets closer.
  • And what do people do in practice? They hedge their bets on how to hedge. One common rule of thumb is to hedge halfway between the model and the market delta. Then you're never exactly hedged, but you're never too far away either.
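The square-root claim at the start of this chapter can be tested with a small simulation. The sketch below is an illustration under stated assumptions rather than the book's own model: zero interest rates, a one-year at-the-money call, a constant 20 percent volatility used both to simulate and to hedge, and hypothetical helper names (`bs_call`, `bs_delta`, `hedged_pnl`, `pnl_std`). It buys the call at its Black-Scholes value, delta-hedges at discrete intervals, and compares the standard deviation of the final P&L for two hedging frequencies.

```python
import math
import random
from statistics import NormalDist, pstdev

STD_NORMAL = NormalDist()


def _d1(s, k, vol, t):
    return (math.log(s / k) + 0.5 * vol * vol * t) / (vol * math.sqrt(t))


def bs_call(s, k, vol, t):
    """Black-Scholes call value, assuming zero rates and no dividends."""
    d1 = _d1(s, k, vol, t)
    return s * STD_NORMAL.cdf(d1) - k * STD_NORMAL.cdf(d1 - vol * math.sqrt(t))


def bs_delta(s, k, vol, t):
    """Black-Scholes call delta under the same assumptions."""
    return STD_NORMAL.cdf(_d1(s, k, vol, t))


def hedged_pnl(n_steps, vol, rng, s0=100.0, k=100.0, t=1.0):
    """Final P&L of a long call, delta-hedged n_steps times until expiry."""
    dt = t / n_steps
    s = s0
    pnl = -bs_call(s0, k, vol, t)  # pay the premium up front
    for i in range(n_steps):
        delta = bs_delta(s, k, vol, t - i * dt)
        z = rng.gauss(0.0, 1.0)
        # Zero-drift lognormal step (half the variance subtracted).
        s_next = s * math.exp(-0.5 * vol * vol * dt + vol * math.sqrt(dt) * z)
        pnl -= delta * (s_next - s)  # P&L of the short stock hedge
        s = s_next
    return pnl + max(s - k, 0.0)  # collect the option payoff


def pnl_std(n_steps, n_paths, vol=0.2, seed=1):
    """Standard deviation of hedged P&L across simulated paths."""
    rng = random.Random(seed)
    return pstdev([hedged_pnl(n_steps, vol, rng) for _ in range(n_paths)])


if __name__ == "__main__":
    coarse = pnl_std(16, 2000)
    fine = pnl_std(64, 2000)
    print("std with 16 hedges:", coarse)
    print("std with 64 hedges:", fine)
    print("ratio:", coarse / fine)
```

If the square-root rule holds, hedging four times as often should roughly halve the noise, so the printed ratio should come out near 2, up to Monte Carlo error.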

Part III — Exotic Derivatives

Ch 7: Single-Asset Exotic Options
  • Once you've worked your way up to be the head of your own trading desk, there will come a time when you begin hiring people to work for you. How can you identify the useful ones in the flood of interns and assistants that will besiege you? Or, if you are an intern or an assistant, how can you elevate yourself above your competitors? One simple assignment you can give everybody is to have them explore and evaluate an exotic option. There are dozens of standard exotics and hundreds more that are customized to clients. You can pretty easily assign different exotics to each of the different people so that everyone has to do their own work. And because most exotics do not have well-known, simple, closed-form solutions, this exercise will help you gauge both the creativity and the determination of the people you have tasked with the challenge. Some will do a passable job. Some will make subtle mistakes. Some will make gross mistakes and not even notice. Some will go the extra mile and look at things from a new angle.
  • The purpose of this chapter is not to price any or even all of the exotic options listed here. It is instead to show how a little simple financial hacking can go a long way towards understanding. These examples are just that — examples. The goal is to be able to quickly gain intuition about new derivatives for which no pricing algorithm has yet been written.
  • It's very common that no static combinations of plain vanilla options result in exotic options payoffs. That may well be the best definition of exotic options. After all, straddles or bull spreads are not really exotic options; they are just options positions. However, it is equally true that exotic options are often combinations of other exotic options.
Ch 8: Multi-Asset Exotic Options
  • Every option whose payoff depends on more than one underlying is an exotic option…Single-asset options at first seemed to be about exposure to the underlying, but turned out to be basically all about volatility. Multi-asset options at first seem to be about complex combinations of exposures to various underlyings, but will turn out to be basically all about correlation.
  • a spread between two lognormal securities is almost definitely not lognormally distributed. Why? Because the difference can be negative. A lognormal distribution does not allow negative numbers. Stock prices are well-modeled as lognormal because they cannot be negative. But the difference between two stock prices may be negative with substantial probability
  • Different formulations of spread options can lead to different closed-form formulas or computational techniques for determining the relevant pricing and hedging. But our goal here is not to review such methodologies. The known techniques change, and can be found elsewhere. Indeed, Mathematica for example now has built-in support for pricing many exotic options. Developing new and more efficient techniques is always a useful research endeavor, but it is not financial hacking. Our focus in this section, indeed in this chapter and in the book overall, is to build intuition so that we can quickly gain deep understanding of even brand new products, faster than the competition. To that end, we don't look to find delicate new pricing formulas, but rather rigorous and useful ways of looking at the problem.
  • we are able to capture the essence of this trade in just these few quick graphs. Meanwhile, if you were to try to derive a complete closed-form solution to this asset, it would take you months, if not years, at which point the pricing would have stabilized, the opportunity would have vanished, and the hot money would have moved on to yet another kind of derivative. Alternatively, if you were more of a fly-by-the-seat-of-your-pants kind of trader with little regard for models or even hacking, you might think it is a good trade and not notice the potential danger. Financial hacking gives you the best combination, in my humble opinion. You are able to quickly price a brand-new derivative, evaluate its hedging shortcomings, and perhaps most importantly, potentially negotiate with the counterparty offering the derivative to incorporate a different group of assets. From the counterparty's perspective, they may not even recognize the implicit dependence on the cross-sectional relationships. They may be perfectly willing to examine a different portfolio of assets. And that different portfolio may be far less likely to cause you a hedging nightmare.
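The earlier point in this chapter, that a spread between two lognormal prices cannot itself be lognormal, is easy to see numerically. A rough sketch, assuming two zero-drift lognormal terminal prices linked by a hypothetical correlation parameter `rho` (the starting prices, volatilities, and function name are illustrative, not from the book): it counts how often the spread ends below zero.

```python
import math
import random


def negative_spread_fraction(s1=100.0, s2=100.0, vol1=0.2, vol2=0.3,
                             rho=0.5, t=1.0, n=50_000, seed=0):
    """Fraction of simulated outcomes in which price 1 minus price 2 is negative."""
    rng = random.Random(seed)
    negatives = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second normal draw with correlation rho to the first.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        p1 = s1 * math.exp(-0.5 * vol1**2 * t + vol1 * math.sqrt(t) * z1)
        p2 = s2 * math.exp(-0.5 * vol2**2 * t + vol2 * math.sqrt(t) * z2)
        if p1 - p2 < 0.0:
            negatives += 1
    return negatives / n


if __name__ == "__main__":
    print("equal starting prices:", negative_spread_fraction())
    print("first price doubled:  ", negative_spread_fraction(s1=200.0))
```

Each price stays positive on every path, yet the spread is negative on a substantial share of them when the two prices start close together, which a lognormal variable could never be.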

Part IV — Exotic Worlds

Ch 9: The Best Trade In The World
  • An exotic world is one where the Black-Scholes assumptions break down, particularly the assumption about constant volatility. Unfortunately, when that assumption breaks down, so do a lot of our intuitions. This chapter's goal is to build your intuitions back up so that you will not fall victim to fake arbitrage opportunities. It is centered on trying to convince you of the discovery of the best trade in the world. It is up to you to debunk the claims.
  • So far we have assumed that the true volatility of the underlying asset, or assets, has been constant. The realized volatility has noise, but only because we simulate on non-continuous time intervals such as on a daily basis; the noise diminishes for smaller time intervals.
  • "It's because of this hedging to market versus hedging to model thing that I did. I looked into it and ran some simulations. If you hedge to the wrong volatility, you introduce some noise." "So I might make only eight or nine vol points, or as many as eleven or twelve?" This was not where you were going with your criticism but now you are trapped. “Uh,” you finally manage. "Yeah. Something like that."
Ch 10: Variance Swaps
  • If you think of the timeline involved in new products, those who implement well-established pricing formulas are several years late to the party. Those who completely derive what will eventually become well-established pricing formulas are probably about a year late. The purpose of this book is to make you ready to be the first one to trade, when the new product just comes out, and its mispricing is likely at a maximum, or at the very least at its most volatile. It is in times like those that a prepared, flexible financial hacker and trader can pick attractive spots. Besides, there will always be new derivatives, new products, and new opportunities. If your tools only apply to particular kinds of opportunities, you are simply less useful than you could be.
  • the instantaneous volatility today is probably right around 20, because if we stay around here, that's what the market expects us to realize. But the instantaneous volatility if we end up near the further option's strike is some number X such that the average of 20 and X is 15, because 15 is what the volatility of the entire path ought to be. So obviously X is 10. The instantaneous local volatility when the underlying is at the higher strike price should be 10. And by the same logic, the instantaneous local volatility when the underlying is at the lower strike price should be 30,
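The averaging argument above is simple arithmetic, sketched here in Python. The rule of thumb assumed is the one the passage uses: the implied volatility for a strike is taken to be the average of the local volatility today and the local volatility at that strike. The function name is hypothetical, and the 25 percent implied volatility for the lower strike is an inference from the passage's answer of 30, not a number quoted in it.

```python
def local_vol_at_strike(local_vol_now, implied_vol_at_strike):
    """Solve (local_vol_now + x) / 2 = implied_vol_at_strike for x."""
    return 2.0 * implied_vol_at_strike - local_vol_now


if __name__ == "__main__":
    # Higher strike: implied volatility of 15 against 20 today.
    print(local_vol_at_strike(20.0, 15.0))
    # Lower strike: implied volatility of 25 (assumed) against 20 today.
    print(local_vol_at_strike(20.0, 25.0))
```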
Ch 11: Esoteric Worlds and Derivatives
  • It is not necessarily the case that the derivatives we will talk about are even more exotic. Indeed they can be vanilla. But the principles we will extract from them are of a more fundamental nature, much like mysticism aims not to dispel other truths but only to help one understand reality on another level.
  • A Warrant, But Not for Your Arrest: Warrants are just call options issued by the company itself. The primary difference between warrants and listed call options is the effect of dilution: the warrants, when exercised, are exchanged into new shares to be issued by the company, rather than previously existing shares to be delivered by the counterparty.
  • Think of what happens at maturity. Suppose both companies have soared over the preceding five years. Warrant holders and option holders all want to exercise. Let's say both originally had 10 shares outstanding. The option holder of company B would receive ten percent of the value of the company once he chooses to exercise. But the warrant holder of company A would receive one new share out of the now total eleven shares, so he would have about nine percent of the company.
  • dilution affects the value of the warrant such that it costs less than an equivalent option on a different company that had never issued a warrant. But dilution affects the value of the listed option on the same firm in exactly the same way as it affects the warrant! We can even prove it by arbitrage. Let's say the warrant costs less than the option. Then we buy the warrant and sell the call and pocket a small difference in premium. Now, whenever the option holder informs us they want to exercise, we will exercise the warrant. The option holder was expecting existing shares, not new shares, but once the new shares are issued, they are all fungible, meaning exchangeable with each other. And if the option holder allows the options to expire worthless, we too can let the warrant expire worthless. In other words, it is a perfect hedge.
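The ownership arithmetic in the maturity example above (ten shares outstanding, one share delivered on exercise) can be written out explicitly. This is only a worked illustration with a hypothetical function name; it uses exact fractions so the ten percent versus roughly nine percent comparison is visible.

```python
from fractions import Fraction


def stake_after_exercise(shares_outstanding, shares_received, issues_new_shares):
    """Fraction of the company held after exercising.

    A listed option delivers existing shares, so the share count is unchanged;
    a warrant issues new shares, so the total share count grows.
    """
    total = shares_outstanding + (shares_received if issues_new_shares else 0)
    return Fraction(shares_received, total)


if __name__ == "__main__":
    option_stake = stake_after_exercise(10, 1, issues_new_shares=False)
    warrant_stake = stake_after_exercise(10, 1, issues_new_shares=True)
    print("option holder: ", option_stake, float(option_stake))
    print("warrant holder:", warrant_stake, float(warrant_stake))
```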