• Uncertainty Wednesday: Pay Extra to Read (Or Fight to Protect Net Neutrality) November 22, 2017 12:22 pm
    Just imagine for a moment the world we could easily find ourselves in. You love my series of blog posts called "Uncertainty Wednesday," but when you try to access it, instead of seeing the content you receive a notice from your ISP (the company you pay to access the Internet) that Continuations is not included in your current plan. You need to upgrade to a more expensive plan to see any content hosted on Tumblr.

    This is not some far-fetched hypothetical possibility. Without Net Neutrality that's exactly what will happen over time. We do not need to speculate about it; we can see it in countries that do not have Net Neutrality. Here is a picture from a carrier in Portugal.

    Now you might say: but isn't it good if this makes services cheaper to access? What if someone can only afford 5 Euros per month? Here at least they are getting some access.

    But asking the question this way is buying into the ISPs' argument that they should get to decide which services you can access. Any one of the bundles above effectively requires a certain amount of bandwidth from the carrier. It should absolutely be the case that a carrier can give you less bandwidth for less money. But then, with whatever bandwidth you have purchased, you should be able to do as you please.

    I have explained here on Continuations extensively why Net Neutrality is required for last mile access due to the lack of competition. So I am not going to rehash that again; you can read it at your leisure, and so far without having to pay extra.

    Net Neutrality is once again under attack. Ajit Pai, Chairman of the FCC, has announced his plan to "restore internet freedom," which is, as it turns out, not your freedom as a consumer to use the bandwidth you have purchased as you see fit, but rather the freedom of your ISP to charge you for whatever it wants to.

    So if you don't want to wind up in the Portugal situation from above, go ahead and call Congress. Thankfully the website Battle for the Net makes this super easy.
Do it!
  • Wenger Design: Men’s Wear November 20, 2017 12:16 pm
    I have mentioned here on Continuations before that we have been home schooling our children. The main reason for doing so is to give them plenty of time to pursue their interests. Interests that over time can deepen into passions and have the possibility of ultimately providing purpose. For our son Peter one of those interests has been fashion. He has been learning how to sketch, cut, sew, etc. since age 8 and now at 15 has put together his third collection. This one is Men's Wear and for the first time he is making it available for sale.

    I particularly like the Bomber Jacket above. I am definitely not cool enough though to wear the Kilt.

    You can find more pieces from the collection at Peter's web site Wenger Design.
  • Twitter’s Verified Mess November 17, 2017 3:12 pm
    One of the problems with a relatively open platform such as Twitter is impersonation. I can claim to be somebody else, upload their picture to my profile and tweet away. This is particularly problematic for public figures and businesses, but anyone can be subject to impersonation. Years ago, Twitter decided that it would "verify" some accounts. While a good idea in principle, Twitter's implementation sowed the seeds of the current mess. First, Twitter chose to go with a heavily designed checkmark that looks like a badge. Second, this badge appeared not just on a person's profile but prominently in all timeline views as well. Third, the rollout appeared geared towards Twitter users who were somehow cool or in-the-know. Fourth, Twitter seemingly randomly rejected some verification requests while accepting others.

    The net result of all of these mistakes was that the verified checkmark became an "official Twitter" badge. Instead of simply indicating something about the account's identity, it became a stamp of approval. Twitter doubled down on that meaning when it removed the "verified" check from some accounts over their contents, most notably in January of 2016 with Milo Yiannopoulos.

    Just now Twitter has announced a further doubling down on this ridiculously untenable position. Twitter will now deverify accounts that violate its harassment rules. This is a terrible idea for two reasons. First, it puts Twitter deeper into content policing in a way that's completely unmanageable (e.g., what about the account of someone who is well behaved on Twitter but awful off-Twitter?). Second, it defeats the original purpose of verification. Is an account not verified because it is an impostor or because Twitter deverified it?

    What should Twitter have done instead? Here is what I believe a reasonable approach would have been. First, instead of a beautifully designed badge, have a simple "Verified" text on a person's profile. Second, do not include this in timeline views. It is super easy from any tweet to click through to the profile of the account. Third, link the "verified" text in the profile to some information such as the date of the verification and its basis. For instance, "Albert Wenger - verified October 11, 2012 based on submitted documents." This type of identity-only verification would be quite scalable using third party services that Twitter could contract for (and users could pay for if necessary to help defray cost). Twitter could also allow users to bring their own identity to the service, including from decentralized systems such as Blockstack. It would also make it easy for people to report an account strictly for impersonation. Harassment on the platform is a real problem, but it is a separate problem and one that should be addressed by different means.
  • Uncertainty Wednesday: Sample Mean under Fat Tails (Cont’d) November 15, 2017 12:12 pm
    Today's Uncertainty Wednesday will be quite short as I am super swamped. Last week I showed some code and an initial graph for sample means of size 100 from a Cauchy distribution. Here is a plot (narrowed down to the -25 to +25 range again) for sample size 10, and here is one for sample size 1,000.

    Yup. They look essentially identical. As it turns out, this is not an accident. The sample mean of the Cauchy distribution has itself a Cauchy distribution. And it has the same shape, independent of how big we make the sample! There is no convergence here. This is radically different from what we encountered with the sample mean for dice rolling. There we saw the sample mean following a normal distribution that converged ever tighter around the expected value as we increased the sample size.

    Next week we will look at the takeaway from all of this. Why does the sample mean for some distributions (e.g. uniform) follow a normal distribution and converge, but not so for others? And, most importantly, what does that imply for what we can learn from data that we observe?
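The lack of convergence is easy to check numerically. Here is a small sketch of my own (not the code from last week's post); it measures the spread of the sample means via the interquartile range, since the Cauchy distribution has no variance to speak of:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mean_spread(sample_size, runs=10_000):
    """Interquartile range of the sample means of standard Cauchy draws."""
    means = rng.standard_cauchy((runs, sample_size)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    return q75 - q25

# The spread stays roughly constant (around 2) no matter the sample size
for n in (10, 100, 1000):
    print(n, round(sample_mean_spread(n), 2))
```

For a normally distributed variable the same spread would shrink by about a factor of sqrt(10) at each step; here it stays put.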
  • Call Your Senators to Preserve Equity Compensation for Startups November 13, 2017 1:10 pm
    The latest Senate version of the "Tax Cuts and Jobs Act" has a stab in the eye for startups. It proposes to tax certain stock options and RSUs at the time of vesting. An earlier House version also contained this provision, but the House removed it.

    Startups are a key part of innovation. Often joining a startup means accepting lower cash compensation for a higher potential upside. This upside usually comes in the form of stock options or other stock based compensation such as restricted stock units (RSUs). For these to be effective means of offsetting lower current compensation, they need to provide upside with no downside. In particular, an employee should not owe taxes on the appreciation of the capital until they actually have liquidity in the asset. Everything else runs the risk of having to pay taxes on paper gains that subsequently evaporate. This problematic situation exists today already for many employees who leave companies and have to exercise their options, and there have been legal efforts under way to change that.

    The Senate version of the bill does the exact opposite. It now moves the tax payment for options to the point of vesting. So imagine working for a highly successful startup. At each vesting date you would owe a tax payment on the difference between your option strike price and the now fair market value. These could be substantial payments! Not only do you not have the money to pay those unless you are already wealthy, but also you have no idea what those shares will eventually be worth. It could easily be much less again. Possibly zero! We have seen plenty of companies that had been valued in the 100s of millions and some in the billions of dollars that went to 0 without ever achieving liquidity along the way.

    Now it is somewhat unclear whether this would affect all options or only so-called non-qualifying options. If it only affects the latter, then one possible way to fix the issue would be to dramatically increase or entirely remove the cap on the amount of equity that can be awarded in an incentive stock option (it is currently $100,000, which is why many executive grants wind up being non-qualifying).

    I don't know if this tax bill has a chance of passing. I suspect that it does, as it appears less controversial than the healthcare bill. If you want startups to continue to be able to readily use deferred equity compensation then I encourage you to call your Senators right away and let them know you are opposed to Section III(H)(1) of the "Tax Cuts and Jobs Act."
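To make concrete what a vesting-date tax bill could look like, here is a back-of-the-envelope sketch. The strike price, fair market value, share count, and tax rate below are all hypothetical illustration values, not figures from the bill:

```python
# Hypothetical numbers, purely illustrative
strike_price = 1.00          # exercise price per share at grant
fair_market_value = 21.00    # per-share value at the vesting date
shares_vesting = 10_000      # shares vesting this period
marginal_tax_rate = 0.37     # assumed ordinary-income tax rate

paper_gain = (fair_market_value - strike_price) * shares_vesting
tax_due_at_vesting = paper_gain * marginal_tax_rate

print(f"Paper gain at vesting: ${paper_gain:,.0f}")        # $200,000
print(f"Tax due at vesting:    ${tax_due_at_vesting:,.0f}")  # $74,000
```

A $74,000 cash tax bill on shares that cannot be sold, and that may ultimately be worth nothing, is exactly the problem described above.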
  • Board Effectiveness Tip #5: Have a Lead Director November 10, 2017 12:32 pm
    Two years ago I wrote a series of blog posts on board effectiveness tips. I am adding a new one today: if you have a large board, make sure you have a lead director. "What is a lead director?" you may ask. Informally speaking, it is the director who makes sure that the board reaches consensus on important issues. Some companies formally elect a director to the lead role, but this is uncommon for startups.

    Startups that have raised multiple rounds of financing can wind up with large boards with three, four or more investors on them. In these cases a dysfunction that I have observed more than once is that each investor waits for some other investor to take the lead. And as a result key decisions are either delayed or not made at all, often with dire consequences. This usually happens around really important and difficult decisions, such as replacing a member of the management team, changing strategy, doing a down round, or accepting or rejecting an unsolicited M&A offer (especially if that offer is not super attractive).

    So if you have a large board, ask yourself who the lead director is. Who will you go to in such a situation to make sure your board members are engaged? And when you go to them, will they have the time and inclination to act as lead director? If you can't answer that, I highly recommend you find someone among your board members before a crisis arises.
  • Uncertainty Wednesday: Sample Mean under Fat Tails November 8, 2017 11:59 am
    In today's Uncertainty Wednesday we are putting some of the ideas from the last few weeks together: we are looking at the behavior of the sample mean of a fat tailed distribution. To do this we will again use a bit of Python code. Unlike our first sample mean example, where we looked at the roll of a die, we will need some help here to draw samples from a more complicated distribution. Thankfully the Python ecosystem has the wonderful SciPy libraries, which if you don't know already you should check out in any case.

    So here's the code for drawing 100,000 samples of size 100 each from the Cauchy distribution:

```python
from scipy.stats import cauchy
import numpy as np

size = 100
runs = 100000
digits = 1

dist = {}
for run in range(runs):
    r = cauchy.rvs(size=size)
    mean = np.mean(r)
    rounded = round(mean, digits)
    if rounded in dist:
        dist[rounded] += 1
    else:
        dist[rounded] = 1

for mean in sorted(dist):
    print("%s: %s" % (mean, dist[mean]))
```

    I am rounding everything to only 1 digit to produce a histogram. And here is a chart from a run of the above program.

    What is going on? There seems to be a spike around 0, which is where the distribution is centered, but there also are outcomes where the sample mean from 100 draws is greater than 25,000 and others where it is smaller than -75,000! And pretty much all the values along the way seem to have occurred also.

    Let's zoom in on the spike to see its shape better. Here are just the counts for sample means between -25 and +25. This looks very much like a chart of the Cauchy distribution itself. Remember that when we did this for the rolls of a die (a uniform distribution) we observed that the distribution of the sample mean not only looked normally distributed but that the distribution became tighter as we increased the size of the sample.

    Next Wednesday we will try the same here. We will look at both smaller and larger sample sizes to see what the effect is.
  • Forking vs. Voting in Blockchains November 6, 2017 1:05 pm
    There was an interesting post on the YCombinator blog by Ramon Recuero about the evolution of blockchain protocols through forking and copying. The post does not mention the alternative possibility of binding voting as a mechanism for the evolution of blockchains. There are several projects, including the troubled Tezos, where the blockchain protocol will be able to evolve via on-chain voting.

    Voting is an important mechanism to be explored as an alternative to forking. In his famous treatise Exit, Voice, and Loyalty, Albert Hirschman describes how members of an organization or consumers of a product/service can respond to a deterioration in quality. They can either choose to exercise voice, that is speak up and demand changes, or they can exit and join a different organization or use a different product/service.

    For blockchains, forking is the native implementation of "exit" but voting will be the way to achieve "voice." Change is most effectively accomplished when both mechanisms are available. Forking (exit) is very disruptive and should be chosen only as a last resort after voting (voice) has been tried and failed. This is why I am excited to see projects that are working to implement on-chain voting for protocol evolution. This is an important missing capability.
  • Uncertainty Wednesday: Interlude (Random Variable vs Distribution) November 1, 2017 6:29 pm
    Super short Uncertainty Wednesday post today as I have a crazy busy week featuring our annual meeting (where we meet with our Limited Partners). The last few weeks we have been digging into sample means and expected values. We saw some surprising things already, such as random variables that do not have an expected value.

    During these posts I have sometimes used the terms random variable and probability distribution interchangeably, despite previously having given two separate definitions (see links). So what gives? Technically they are different concepts, but for some commonly used probability distributions, such as the normal distribution, all random variables based on the distribution differ only in one or two parameters (for the normal distribution: mean and standard deviation). These differences turn out to be boring, and so using the terms interchangeably seems OK. Put differently, often the difference between distributions is more important than the difference between the same distribution with different parameters. For instance, all random variables based on Cauchy distributions (fat tailed) are very different from all random variables based on normal distributions. That difference is huge compared to the difference between normally distributed random variables.

    Important caveat: there are distributions for which changes in the parameter make a big difference, such as the power law distribution. In those cases random variables based on the same distribution will differ a lot from each other.
  • Some Thoughts on the State of Bitcoin and Ethereum October 30, 2017 11:38 am
    Upfront disclosure: I am long both Bitcoin and Ethereum (personally and also indirectly via USV).

    In preparation for the annual meeting at USV, we have been putting together some slides on the cryptocurrency market. Looking back at the last year, I was most surprised by the run-up in Ethereum as part of the ICO craze. I did not see that coming to nearly the extent that it did. While the ICO is an important innovation, there has definitely been an excess both in the amount of money raised for some individual projects and in projects raising that either have very little chance of succeeding or are outright scams.

    So where are we today? At least temporarily there seems to be a slowdown in ICOs. This could turn out to be just a lull before more activity resumes, but it could also be a welcome return to more sanity (if the latter, there is likely going to be an overcorrection). In either case Ethereum faces a strong headwind not only from this change in sentiment but also from relatively costly and slow on-chain computation. The bull case for Ethereum is that sometime in 2018 we will see a couple of Ethereum based projects launch successfully and get broad adoption *AND* progress is made on Ethereum scaling (either directly or through projects such as Raiden or Plasma). The bear case is that at least one, or possibly both, of these don't happen.

    How about Bitcoin? Oddly, I think that Bitcoin continues to be misunderstood by many people in the cryptocurrency space who want it to be more than it has to be for it to succeed. It is one of those cases where the more you know, the more you are likely to overthink it. Yes, Bitcoin has all sorts of drawbacks as a blockchain, but it is the one cryptocurrency with a widely understood use case: censorship resistant store of wealth. Fiat currencies, precious metals and real estate (including land) all have more government control and/or are more difficult to move around and transact in than Bitcoin. With everything crazy that's going on in the world politically, the demand for censorship resistant wealth storage is high and growing.

    Bitcoin has issues resulting from mining concentration and the attendant attempts to create additional wealth ex nihilo through forks. The large amounts of money involved have made it nearly impossible to have rational discussions on questions of technical merit. The bull case for Bitcoin is therefore easier than the one for Ethereum: all it takes is for the current forking noise to die back down and one chain to continue to be recognized as Bitcoin (above all other contenders). As a side note: should Bitcoin's self-inflicted troubles mount, it will make Zcash and Monero a lot more attractive.

    In summary then: for the time being I am cautiously bullish on Bitcoin and at best neutral on Ethereum. As always though, please don't take this as investment advice and keep in mind that all cryptocurrencies continue to be highly risky.
  • Blockstack Update October 26, 2017 2:30 pm
    I last wrote about Blockstack early in 2016 when the team announced the goals of the project. Since then a lot of progress has been made. The team has released a browser which is now available for Mac, Windows and Linux. They have published several papers, including a couple in peer reviewed journals. The latest paper introduces the Blockstack token. In the meantime people have started to build applications using the BlockstackJS framework. If you are interested in writing an application you should check out the available bounties. Here is a new video in which Ryan and Muneeb explain the Blockstack project.

    If you are interested based on this update, the Blockstack team has provided a lot of information about the upcoming Blockstack token sale.
  • Uncertainty Wednesday: Fat Tails October 25, 2017 12:08 pm
    Last Uncertainty Wednesday we encountered a random variable that does not have an expected value. Now if you read that post you might ask: was this just an artificially constructed example, or do random variables like that actually occur? Well, the example I gave was an extreme form of a power law, and power law distributions are increasingly found in the economy as we transition to a digital world. Due to network effects, the winning company in a space is many times the size of the runner-up, and there is a long tail of smaller competitors. The distribution of views on a site such as YouTube similarly follows a power law. So, increasingly, does the wealth distribution.

    Here is another example of a distribution that at first glance looks like it ought to have an expected value. Just eyeballing it, it would seem that the expected value is 0. But that's, well, wrong. In fact this distribution, known as the Cauchy distribution, does not have an expected value (it does not have a variance either!).

    Now you might have noticed that this looks a lot like the normal distribution, which we had encountered earlier. That had a well defined expected value and variance, so what gives? Well, consider the following graph which compares the two distributions. You can see that the normal distribution has more probability concentrated right around 0 and then declines very rapidly. The Cauchy distribution by contrast declines less rapidly in probability in the tails. It is an example of a so-called fat tailed distribution.

    In the Cauchy distribution, if we try to form the expected value for outcomes above 0 the infinite sum goes to positive infinity, and below 0 it goes to negative infinity. The two do not offset each other; instead the sum is not defined. So this is what both last Uncertainty Wednesday's example and today's example have in common: extreme events have sufficiently high probability that the expected value is not defined. Next Wednesday we will see some practical implications of this for observed sample means and what we can learn from them.
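The difference in tail weight is easy to quantify with SciPy, which this series already uses. Here is a quick sketch comparing the two-sided tail probability of landing more than k units from the center for the standard normal and standard Cauchy distributions:

```python
from scipy.stats import norm, cauchy

# sf(k) is the survival function, 1 - cdf(k); doubling it gives the
# probability of an outcome more than k units from the center on either side
for k in (2, 5, 10):
    tail_normal = 2 * norm.sf(k)
    tail_cauchy = 2 * cauchy.sf(k)
    print(f"k={k}: normal {tail_normal:.2e}, cauchy {tail_cauchy:.2e}")
```

At k = 5 the normal tail is below one in a million while the Cauchy tail is still above twelve percent. That slowly decaying tail is exactly what keeps the expected value from being defined.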
  • Rebecca Kaden: New Partner at USV October 23, 2017 11:04 am
    Last week was a big week for us at Union Square Ventures. We had the MongoDB IPO on Thursday and on Friday Rebecca Kaden announced that she has joined USV as a partner. Rebecca is a New Yorker and after nearly a decade on the West Coast, she is looking forward to coming back here. You can read more about it in Rebecca’s own words over on the USV blog. I am super excited for today’s Monday meeting because it will be the first one with Rebecca!
  • MongoDB IPO and New York Tech October 20, 2017 11:32 am
    I spent much of yesterday morning at the NASDAQ for the IPO of our portfolio company MongoDB (now trading as MDB). This is a big milestone for technology companies in New York. We have had New York IPOs before, including Etsy in our portfolio, but MongoDB is the first core technology (as opposed to applied technology) company based in New York City to reach this milestone. As someone who came to New York in 1999 at the height of the dotcom bubble, when it first felt like the city could be a force in tech, and then lived through the tech winter that followed in the early 2000s, this moment feels particularly sweet.

    Why does this matter? Because it is one more step along the way of demonstrating that geography is no longer destiny. You do not have to move to San Francisco / Silicon Valley to start a tech company. New York is not unique in this regard; we are just maybe a bit further along than other cities in the United States. Globally there are many places that are building healthy tech ecosystems, including Toronto, London, Berlin and Beijing.

    My congratulations and thanks to the team at MongoDB that has worked tirelessly to make this possible. Keep up the great work!
  • Uncertainty Wednesday: A Random Variable without Expected Value October 18, 2017 12:14 pm
    I ended the previous Uncertainty Wednesday post asking whether an expected value always exists. Going back to the definition, the expected value is the "probability weighted average of a random variable." So let's construct an example of a random variable which does not have an expected value. We will consider a probability distribution with infinitely many discrete outcomes, in which the first outcome has probability 1/2, the second 1/4, the third 1/8 and so on. This is a valid probability distribution because all the probabilities sum up to 1:

    1/2 + 1/4 + 1/8 + 1/16 + … = 1

    Whether or not an expected value exists depends on what the numeric values of the outcomes are. Consider for instance the random variable where the first outcome is 1, the second is 2, the third is 3 and so on. For this random variable we have an expected value, because

    EV = 1/2 * 1 + 1/4 * 2 + 1/8 * 3 + 1/16 * 4 + … = 2

    Why is that so? Because even though our random variable includes ever larger outcomes, these very large outcomes occur with very small probability, and so the probability weighted average is a convergent infinite sum.

    But now consider what happens when the outcomes themselves grow exponentially. Let's consider the case where the first outcome is 2, the second is 4, the third is 8 and so on. Now we have

    EV = 1/2 * 2 + 1/4 * 4 + 1/8 * 8 + 1/16 * 16 + …
    EV = 1 + 1 + 1 + 1 + …

    Clearly the EV here is no longer a convergent sum but rather diverges towards infinity.

    Now you might say: Albert, that's not an example of an expected value that doesn't exist, the expected value is simply infinite. This might take us into a separate discussion of the meaning of infinity, which might be fun to have, including the more sophisticated objection to the example which would claim that all real processes have some finite upper bound.

    For now though let's focus on a different question: how does the sample mean behave for the random variable we just defined? This is a well defined question. A sample has, by definition, a finite number of observations (that's what it means to be a sample). So each sample will have a mean. What is the implication of the expected value diverging for the behavior of the sample mean?
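One way to explore that question is by simulation. As a sketch of my own (not code from the post): drawing k from a geometric distribution with p = 1/2 yields k with probability 1/2^k, so the payout 2^k reproduces exactly the random variable constructed above.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_mean(n):
    """Mean of n draws of the variable where outcome 2**k has probability 1/2**k."""
    k = rng.geometric(p=0.5, size=n)  # k = 1, 2, 3, ... with prob 1/2, 1/4, 1/8, ...
    return np.mean(2.0 ** k)

# Every individual sample mean is finite, but typical values keep
# drifting upward (roughly like log2(n)) instead of converging
for n in (100, 10_000, 1_000_000):
    print(n, round(sample_mean(n), 1))
```

This is the classic St. Petersburg setup: since every draw is at least 2, any single sample has a perfectly well defined mean, yet the means of bigger and bigger samples never settle down.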
  • Blog Panic (Backing Up Cloud Services) October 16, 2017 11:24 am
    I woke up this past Friday (the 13th) to a DM on Twitter saying that Continuations was down. I immediately tried to open the site on my phone and was greeted by an ominous error message. At first I figured that maybe Tumblr was down. A quick check of other Tumblrs revealed that not to be the case. At this point a somewhat queasy feeling started to set in. After a quick shower I went to my laptop and tried to log into my account, only to see a terrifying sight. At this point I was in a full blown state of panic. I have been writing here for a long time and my last full backup of Continuations was several years old!

    Thankfully I know several people connected to Tumblr and they kindly offered to help. Continuations was fully restored within a couple of hours, but those were some scary hours in which I kicked myself for not following the advice that I give every USV portfolio company, which is to make sure to back up all their data.

    I have since learned that Tumblr does an excellent job keeping data around, making it easy for them to restore things after an accidental deletion (apparently some automated bot deletion system had malfunctioned and removed Continuations). Still, I am feeling much better now that I once again have a current backup of Continuations.

    This experience has made me think about other cloud services that I use extensively, such as Google and Dropbox. I am now wondering if I should back these up to each other for increased redundancy. I am curious to hear from anyone who does that as to why and how they have set it up.
  • Uncertainty Wednesday: Sample Mean (Part 3) October 12, 2017 12:21 am
    Last Uncertainty Wednesday we dug deeper into understanding the distribution of sample means. I ended with asking why the chart for 100,000 samples of size 10 looked smoother than the one for samples of size 100 (just as a refresher, these are all rolls of a fair die). Well, for a sample of size 10, there are 51 possible values of the mean: 1.0, 1.1, 1.2, 1.3 … 5.8, 5.9, 6.0. But with a sample size of 100 there are 501 possible values for the mean. So with the same number of samples (100,000) the distribution will not be approximated as closely. We can fix this by upping the number of samples to, say, 1 million. Thanks to the amazing speed of a modern laptop even 100 million rolls of a die take just a couple of minutes (this still blows my mind). Here is the resulting chart.

    Much smoother than before! We could make it even smoother by upping the number of runs further.

    OK. So what would happen if we went to sample size 1,000? Well, by now this should be easy to predict. The distribution of sample means will be even tighter around 3.5 (the expected value of the distribution) and in order to get a smooth chart we have to further up the number of runs.

    So what is the limit here? Well, this gets us to the law of large numbers, which essentially states that the sample mean will converge to the expected value as the sample grows larger. There is a strong version and a weak version of the law, a distinction which we may get to later (plus some more versions of the law).

    For now though the important thing to keep in mind is that when we have small sample sizes, the sample mean may be far away from the expected value. And as we saw above, even for a super simple probability distribution with 6 equally likely outcomes there is considerable variation in the sample mean, even for samples of size 100! So it is very easy to make mistakes by jumping to conclusions from small samples.

    Next Wednesday we will see that the situation is in fact much worse than that. Here is a hint: every sample has a mean (why?), but does every probability distribution have an expected value?
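A compact way to watch the law of large numbers at work for the die is to track how the spread of the sample mean shrinks as the sample grows. This is a sketch along the lines of the simulations in this series, written with NumPy (the helper name is mine):

```python
import numpy as np

rng = np.random.default_rng(1)

def die_sample_means(sample_size, runs=10_000):
    """Sample means of `runs` independent samples of fair-die rolls."""
    rolls = rng.integers(1, 7, size=(runs, sample_size))  # values 1..6
    return rolls.mean(axis=1)

# The mean of the sample means stays near 3.5, while their standard
# deviation shrinks by about a factor of sqrt(10) at each step
for n in (10, 100, 1000):
    means = die_sample_means(n)
    print(n, round(means.mean(), 3), round(means.std(), 3))
```

Contrast this with the Cauchy case coming up in later posts, where no amount of extra sample size tightens the distribution.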
  • Support idyll - Interactive Narratives October 9, 2017 11:40 pm
    I am excited about a new open source project called idyll. Here is how Matthew Conlen, the lead author, describes it: "Idyll is a tool that makes it easier to author interactive narratives for the web. The goal of the project is to provide a friendly markup language — and an associated toolchain — that can be used to create dynamic, text-driven web pages. Idyll helps you create documents that use common narrative techniques such as embedding interactive charts and graphs, responding to scroll events, and explorable explanations. Additionally, its readable syntax facilitates collaboration between writers, editors, designers, and programmers on complex projects."

    The project seems like an important step in the direction of an interactive learning environment that seamlessly combines text, mathematical formulas, code, and graphics. Creating such an environment and then using it to share knowledge about the consilience of math, physics, computation and more is one of my three passion projects.

    An example of an idyll document explains the etymology of the trigonometric functions. In a future version of idyll it will be easy to show and even edit the code behind the unit circle graph on the right.

    If you are as excited about idyll as I am, please help me support the project via the idyll Open Collective page.
  • Uncertainty Wednesday: Sample Mean (Cont’d) October 4, 2017 11:26 am
    Last Uncertainty Wednesday, we started to look at the behavior of the mean of a sample by repeatedly drawing samples. We used a sample of 10 rolls of a fair die. We know that the expected value of the probability distribution is 3.5, but we saw that the sample mean can deviate substantially from that in a small sample. In particular, with 10 rolls we got sample means both close to 1 (almost every roll is a 1) and close to 6 (almost every roll is a 6).

    The fact that the sample mean itself is random and has a distribution shouldn't be surprising, and yet it is the source of a great deal of confusion. Let me illustrate this with the discussion of weather and climate. I had defined climate as the probability distribution of possible weather events. The realized weather then is a sample. So we should not at all be surprised to see variability in the weather relative to past averages. And yet we use terms such as "unseasonably" cold or "unseasonably" hot all the time, which imply that there is something out of whack with what was observed. The challenge then in analyzing climate change based on data is to separate variability within the existing distribution (climate) from changes in the distribution (climate). We will get back to that in future posts, but first we have more on sample means.

    What happens if we make our sample larger? Instead of a sample size of 10 rolls, let's consider a sample size of 100 rolls. Below are graphs contrasting the results of 100,000 runs for sample size 10 and sample size 100. We again see a distribution in the sample mean, but it is much tighter around the expectation of 3.5, with almost all observed sample means for size 100 falling between 3 and 4, as compared to 2 and 5 for sample size 10.

    It is really important to let this all sink in deeply. Even at a sample size of 100 rolls, there is significant variation in the sample mean. The good news is that the distribution of the sample mean is centered on the expected value. This is often referred to as the sample mean being an unbiased estimator of the expected value. We will dig into when and why that's the case, as it is not true for all underlying probability distributions (almost certainly *not* true for weather). The bad news though is that even when the sample mean is an unbiased estimator of the expected value, on any one sample that you draw, if it is the *only* sample, you have no idea whether you are above or below the expected value.

    Keep in mind that all this analysis we are currently conducting is based on a known distribution. That is hardly ever the problem we actually confront. Instead, we have explanations which lead us to prior beliefs about distributions and we need to use the observations to update those beliefs. More to come on sample means and what we can learn from them next Wednesday. Until then, here is a question to ponder: why did the graph for sample size 10 come out smoother than the one for sample size 100?
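The claim about where the sample means land can be checked directly. Here is a small sketch (the helper function is my own) estimating the fraction of fair-die sample means that fall in the stated ranges:

```python
import numpy as np

rng = np.random.default_rng(7)

def fraction_within(sample_size, lo, hi, runs=50_000):
    """Fraction of fair-die sample means that land in the interval [lo, hi]."""
    means = rng.integers(1, 7, size=(runs, sample_size)).mean(axis=1)
    return float(np.mean((means >= lo) & (means <= hi)))

print(fraction_within(10, 2.0, 5.0))   # almost all size-10 means fall between 2 and 5
print(fraction_within(100, 3.0, 4.0))  # almost all size-100 means fall between 3 and 4
```

Both fractions come out above 99 percent, which matches the picture: the size-100 means are packed into an interval a third as wide as the size-10 means.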
  • Las Vegas Mass Shooting and Life’s Fundamental Asymmetry October 2, 2017 6:46 pm
    I was going to write a post today about how liquidity in financial markets goes down as concentration of holdings goes up, but then I woke up to the news of the mass shooting in Las Vegas last night. I have written extensively before about the need for better gun control, so I won't rehash that today, other than to note that the statistics for 2017 already show over 11,000 deaths year-to-date.

    What yesterday's mass shooting does make eminently clear yet again, though, is just how much damage and trauma a single person can inflict using modern technology. It is so much easier to destroy a life than to build one. Split seconds of pulling a trigger, versus decades of nurturing and growth. This fundamental asymmetry is one that we as humanity need to pay more attention to as we make more technological progress.

    The asymmetry between destruction and creation will never go away. It is baked deeply into the fabric of reality. There are myriad arrangements of the molecules found in a human body and only a tiny fraction of those arrangements amount to a person who is alive and well. So as we have more and more power at our disposal, we need to think carefully about how to prevent ever more destruction brought about by individuals (and small groups). Unfortunately, there are no simple answers here and we will be forced to look at uncomfortable trade-offs.