Soda-straws: mathematical models for 5th graders

2016 January 20
by Daniel Lakeland

I'm getting ready to run an in-class experiment with some 5th graders from the local elementary school. It's going to be about fundamental aspects of doing science, and it's going to be both very simple, and very deep at the same time. I hope it's successful. The idea goes like this:

We'll talk about measurement, and about doing science. We'll talk about the science facts that they learn (there are electrons and protons, the heart is a muscle, it pumps blood, clouds are made of water that evaporates from the ocean etc). We'll also talk about tools to do science with, measurement, how to connect measurements to facts we're interested in (models) and how we sometimes can't measure directly the thing we are interested in, and also how we can't measure things accurately (and the difference between accurate and precise).

The activity we're going to do is to figure out how much a piece of soda straw weighs by measuring something that we can measure easily and accurately (its length).

Now, ask the kids: if you took 1mm off the end of the straw, and weighed it super-accurately, how would you calculate how much 205mm of straw would weigh? Get the idea that 205mm of straw has the same weight as 205 1mm pieces into their heads, and the idea that no matter where you slice the straw, if it's 1mm long it will weigh basically the same (except with the accordion part).

We'll cut some soda straws. I've found it probably makes better sense to ask very specific tasks from specific people, so ask each person to cut a straw to about one specific length: 10, 25, 40, 60, 80, or 100 mm. Then measure the actual length. If you just ask them to cut a variety of lengths you'll wind up with a lot of 45, 50, 55mm type straws; you want a range from the mid teens to 90 or 100mm. We also want about 20 pieces that are around 5-10mm, just short pieces we'll be choosing randomly later.

Ask the kids for their guess as to how much 100mm of straw weighs. Get an order of magnitude estimate at least. (1g? 100g? 0.1g? etc give them a point of reference by measuring a pencil or a penny on the scale). Come up with an "upper end" estimate u and in your head think of the uniform prior on (0,u) as the prior for the slope of the line in cg/100mm.

Then, using a scale that measures down to 0.01 g (available on Amazon for like $10) we'll weigh the soda straws. Use a small cup as a weigh boat, and tare the scale to the cup weight. I used a medicine measuring cup from children's Advil, but a Dixie cup or lightweight plastic or styrofoam would be ok too.

Each time you weigh a straw, select at random 5 of the 20 small pieces of soda straw and add those to the cup. Write down the weights in cg (hundredths of a gram) next to the length in mm.

Now we've got a table of numbers. Plot the numbers on graph paper. I just ran this experiment today, and I strongly recommend printing out customized graph paper with the scales already in place. Kids had trouble preparing the scales to fit onto the page and cover the range of the data.

Once you've got the data points on the paper, discuss the added extra 5 pieces with them, and how this means all the data points are "too high". Also talk with them about how the line you're looking for has to go through (0,0) because zero length of straw has zero weight. Consider also that because we always chose 5 pieces, the errors should all be similar in size to the weight of 5x the "average length" of the straw bits.

Choose candidate lines and have each child graph a different candidate line on the graph paper. Have the children calculate the error for each data point, the distance in cg between the data point and the prediction on the line for that data point. Write those down in your table. Also calculate the slope of the line (cg/100mm for example, read off the prediction for a 100mm length).

Now, calculate 3mL and 7mL where m is the slope of your line, and L is the guess at the "average length" of the small bits you polluted the measurement with. If any of the individual errors is outside that range, call it an unacceptable line. Then calculate the total error over all N data points: if the total error is not between 4NmL and 6NmL it's not an acceptable line. Otherwise call it an acceptable line.

Put all the acceptable lines on one graph paper. Choose the one in the middle of the range. This is our estimate of the median of the posterior distribution of line slopes under a declarative model:

p(m) ~ uniform(0,u)
p(error | m,L) ~ uniform(3mL,7mL)
p(\sum error | m,L) ~ uniform(4NmL,6NmL)
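For anyone who wants to check the kids' hand calculations afterwards, here is a rough sketch of the accept/reject procedure in Python. The data, the candidate slopes, and the 7.5mm guess for L are all made up for illustration, and I've put the slope in cg per mm (rather than cg/100mm) so that m times L comes out directly in cg.

```python
def is_acceptable(slope, lengths_mm, weights_cg, bit_length_mm):
    """Apply the two acceptance rules to one candidate line through (0,0).

    slope is in cg per mm; bit_length_mm is the guess L at the average
    length of one of the small polluting pieces.
    """
    # Error = how far each measured weight sits above the line's prediction.
    errors = [w - slope * x for x, w in zip(lengths_mm, weights_cg)]
    unit = slope * bit_length_mm  # weight of one "average" small piece
    # Rule 1: every individual error must lie between 3mL and 7mL.
    if any(e < 3 * unit or e > 7 * unit for e in errors):
        return False
    # Rule 2: the total error must lie between 4NmL and 6NmL.
    n = len(errors)
    return 4 * n * unit <= sum(errors) <= 6 * n * unit


def middle_acceptable_slope(candidates, lengths_mm, weights_cg, bit_length_mm):
    """Return the middle acceptable slope, or None if no line is acceptable."""
    ok = sorted(m for m in candidates
                if is_acceptable(m, lengths_mm, weights_cg, bit_length_mm))
    return ok[len(ok) // 2] if ok else None


# Hypothetical table: straw lengths in mm, polluted weights in cg.
lengths = [10, 25, 40, 60, 80, 100]
weights = [19, 25, 31, 39, 47, 55]
candidates = [0.34, 0.36, 0.38, 0.40, 0.42, 0.44, 0.46]

best = middle_acceptable_slope(candidates, lengths, weights, bit_length_mm=7.5)
print("middle acceptable slope (cg/mm):", best)
```

If the function returns None, that is the "no acceptable lines" situation, and it's worth going back to the table to look for a transposed digit or swapped weights.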

Ask them to predict the weight of one of the longer straws, say near 100mm, and then weigh just that straw without the polluted measurements, and see how well we predict it using the median posterior value gotten above.

Watch out for outliers, where kids transpose digits or switch the weights between straws etc. This model won't handle those at all: there may be no lines that are acceptable if you have outliers.

Differential Entropy and nonstandard analysis

2016 January 14
by Daniel Lakeland

For reasons discussed previously I believe that every scientific measurement lives on a finite sample set. But, it is tiresome to work with enormous explicit finite sample sets, like for example the actual values that a 64 bit IEEE floating point number can take on... They're not actually evenly spaced for example. What we tend to do is deal with discrete sample spaces with explicit values when the set is small enough (2 or 10 or 256 or something like that) and deal with "continuous" distributions as approximations when there are lots of values, and the finite set of values are close enough together (for example a voltage measured by a 24 bit A/D converter in which the range 0-1V is represented by the numbers 0-16777215 so that the interval between sample values is about 0.06 micro-volts, which corresponds to 0.06 micro amps for a microsecond into a microfarad capacitor, or around 374000 electrons).
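The A/D converter arithmetic is easy to check. This is just a back-of-envelope sketch: the only thing not in the text is the value of the elementary charge. With the step left unrounded the electron count comes out a little under the figure above, which used the rounded 0.06 micro-volts.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per electron (SI value)

codes = 2 ** 24                      # a 24 bit converter has this many codes
step_volts = 1.0 / (codes - 1)       # 0-1V spread across codes 0..16777215

capacitance_f = 1e-6                 # one microfarad
charge_c = capacitance_f * step_volts        # Q = C * V for one voltage step
current_a = charge_c / 1e-6                  # same charge delivered in 1 microsecond
electrons = charge_c / ELEMENTARY_CHARGE_C

print(f"step:      {step_volts * 1e6:.4f} micro-volts")
print(f"current:   {current_a * 1e6:.4f} micro-amps for one microsecond")
print(f"electrons: {electrons:.0f}")
```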

Because of this, the nonstandard number system of IST corresponds pretty well to what we're doing typically. Suppose for example x ~ normal(0,1) in a statistical model. We can pick a large enough number, like 10, and a small enough number like 10^{-6} and grid out all the individual values between -10 and +10 in steps of 0.000001 and very rarely is anyone going to have a problem with this discrete distribution instead of the normal one. Anyone who does have a problem should remember that we're free to choose a smaller grid, and their normal RNG might be giving them single precision floating point numbers that have 24 bit mantissas anyway... IST formalizes this by some stuff (axioms, lemmas etc) that proves the existence, in IST, of an infinitesimal number that is so small no "standard" math could distinguish it from zero, and yet it isn't zero.

So, now we could say we have the problem of picking a distribution to represent some data, and we know only that the data has mean 0 and standard deviation 1. We appeal to the idea that we'd like to maximize a measure of uncertainty conditional on mean 0 and standard deviation 1. In discrete outcomes, there's an obvious choice of uncertainty metric, it's one of the entropies

E = -\sum_{i=1}^{N}p_i\log(p_i)

Where the free choice of logarithm is equivalent to a free choice of a scale constant which is why I say "entropies" above. Informally, since the log of a number between 0 and 1 (a probability) is always negative, then the negative of the log is positive. The smaller you make each of the p values, the bigger you make each of the -\log(p) values. So maximizing the entropy is like pushing down on all the probabilities. The fact that total probability stays equal to 1 limits how hard you can push down. So in the end the total probability is spread out over more and more of the possible outcomes. If there are no constraints, all the probabilities become equal (the uniform probability). Other constraints limit how hard you can push down in certain areas (ie. if you want a mean of 0 you probably can't push the whole range around 0 down too hard) so you wind up with more "lumpy" distributions or whatever depending on your constraints.

The procedure for maximizing this sum subject to the constraints is detailed elsewhere. The basic technique is to take a derivative with respect to each of the p_i values and set all the derivatives equal to 0. To add the constraints, you use the method of Lagrange multipliers. The result would be each p_i = \exp(-Z-k(x_i-\mu)^2) where the k will depend on \sigma=1 in our case, and the Z is chosen to normalize the total probability to 1.
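A sketch of that calculation, assuming the constraints are written as normalization, mean \mu and variance \sigma^2. The multiplier names \lambda_0 and \lambda_1 are my own labels here, not taken from any particular reference:

```latex
\begin{aligned}
\mathcal{L} &= -\sum_{i=1}^{N} p_i\log(p_i)
  - \lambda_0\Big(\sum_{i=1}^{N} p_i - 1\Big)
  - \lambda_1\sum_{i=1}^{N} p_i (x_i-\mu)
  - k\Big(\sum_{i=1}^{N} p_i (x_i-\mu)^2 - \sigma^2\Big) \\
\frac{\partial\mathcal{L}}{\partial p_i} &= -\log(p_i) - 1 - \lambda_0
  - \lambda_1 (x_i-\mu) - k (x_i-\mu)^2 = 0 \\
p_i &= \exp\big(-(1+\lambda_0) - \lambda_1 (x_i-\mu) - k (x_i-\mu)^2\big)
\end{aligned}
```

When the x_i are placed symmetrically around \mu, the mean constraint is satisfied with \lambda_1 = 0, and writing Z = 1+\lambda_0 gives the form quoted above. The variance constraint then fixes k, and normalization fixes Z.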

Now, suppose you want to work with a "continuous" variable. In nonstandard analysis we can say that our model is that the possible outcomes are on an infinitesimal grid with grid size dx and constrained to be between the values [-N,N] for N a nonstandard integer. So the possible values are x_i = -N + i dx for all the i values between 0 and M = 2N/dx. We define a nonstandard probability density function p(x) to be a constant over each interval of length dx, and the probability to land at the grid point in the center (or left side or some fixed part) of the interval is p(x)dx.

Now we calculate the nonstandard entropy

E = -\sum_{i=0}^{M}\log(p(x_i)dx) p(x_i)dx

Now clearly the argument to \log(p(x_i)dx) is infinitesimal since p(x_i) is limited and dx is infinitesimal, so -\log(p(x_i)dx) is nonstandard (very very large and positive). But, it's a perfectly good number. There is a finite number of terms in the sum so the sum is well defined. The value of the sum is of course a nonstandard number, but we could ask, how to set the p(x_i) values such that the sum achieves its largest (nonstandard) value. Clearly p(x) is going to be the same kind of expression as before, because we're doing the same calculation (hand waving goes here feel free to formalize this in the comments) so we're going to wind up with:

p^*(x) = \exp(-Z- k (x-\mu)^2)

Where p^*(x) refers to the nonstandard function which is constant over each interval. The standardization of this p^*(x) is going to be the usual normal distribution.

The point is, just because the entropy is nonstandard doesn't mean it doesn't have a maximum, and so long as the maximum occurs for some function of x whose standardization exists, we can take the standard probability density that is chosen as the maximum entropy result we should use, and this procedure is justified in large part because of the way that the continuous function is being used to approximate a grid of points anyway!

If you don't like this result, you could always use the relative entropy (ie. replace the logarithm expression with \log(p(x)dx/q(x)dx)) relative to a nonstandard uniform distribution whose height is q(x) = \frac{1}{2N} across the whole domain [-N,N]. This seems to be the concept referred to by Jaynes as the limiting density of discrete points. Then, the dx values in the logarithm cancel, and the entropy value itself isn't nonstandard, but the distribution q(x) is, so it's still a nonstandard construct. Since q(x) is just a constant anyway, it's basically just saying that by rescaling the original one via a nonstandard constant, we can recover a standard entropy to be maximized. But... and this is key, we are never USING the numerical entropy value itself, except as a means to pick out a probability density which turns out to have a perfectly well defined standardization, namely the normal distribution.
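You can watch a finite version of this happen numerically. The sketch below is my own illustration with ordinary (standard!) numbers: it puts a normal(0,1) shape on a grid over [-10,10], then computes the plain grid entropy and the entropy relative to the uniform q = 1/(2N) as dx shrinks. The function names are made up for this example. The grid entropy keeps growing like -\log(dx), while the relative version settles down, and adding \log(dx) back to the grid entropy recovers the textbook differential entropy \frac{1}{2}\log(2\pi e).

```python
import math


def grid_normal_probs(dx, half_width=10.0):
    """Probabilities proportional to exp(-x^2/2) on a grid over [-N, N]."""
    n = int(round(2 * half_width / dx))
    xs = [-half_width + i * dx for i in range(n + 1)]
    weights = [math.exp(-0.5 * x * x) for x in xs]
    total = sum(weights)
    return [w / total for w in weights]


def grid_entropy(probs):
    """E = -sum p_i log(p_i) over the grid points."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def relative_grid_entropy(probs, dx, half_width=10.0):
    """Same sum, but with log(p_i / q_i) where q_i = dx / (2N) is uniform."""
    q = dx / (2 * half_width)
    return -sum(p * math.log(p / q) for p in probs if p > 0.0)


DIFFERENTIAL_ENTROPY = 0.5 * math.log(2 * math.pi * math.e)

for dx in (0.1, 0.01, 0.001):
    probs = grid_normal_probs(dx)
    e = grid_entropy(probs)
    print(f"dx={dx}: E={e:.4f}  E+log(dx)={e + math.log(dx):.4f}  "
          f"relative={relative_grid_entropy(probs, dx):.4f}")
```

The shape that maximizes either quantity is the same, which is the only thing we ever use the entropy for.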


Ignore so-called Unemployment "statistics". Labor force participation rate is higher in Sweden than the US

2016 January 1
by Daniel Lakeland

This Washington Post article talks about how the US has "Scandimania" where people are perhaps overlooking problems and talking about how great it would be to be more like Sweden and Denmark etc.

One of the points made was: "unemployment is 5.6 percent in the United States, vs. 8.1 percent in Sweden, 8.9 percent in Finland and 6.4 percent in Denmark"

Well, let me tell you, Unemployment is a totally and utterly BOGUS number. Labor Force Participation Rate however, is very straightforward: what fraction of the people over 15 years old are in the labor force, meaning either employed or looking for work? Turns out the World Bank will happily plot this for you.

Data from World Bank

Note that Sweden is higher than the US, and Denmark is essentially the same. Finland has consistently been 4 or 5 percentage points lower than the US.

To 2nd ESSID or not to 2nd ESSID... that is the question

2015 December 15
by Daniel Lakeland

If you have a wireless router with both 2.4GHz and 5GHz bands, then the question arises as to whether you should allocate different ESSID (network names) to the two radios.

Up to now, I've been an advocate for a single ESSID and let the clients decide which network to get on. But, I've found that there is a case for forcing some clients to one band or another.

In particular, the FireTV Stick that we have is in a fixed location where the 5GHz band reception is good, and there are no other 5GHz devices nearby. Yet, it will sometimes flop back and forth between the 2.4 and 5GHz bands, generally resulting in stuttering video and/or problems with other 2.4GHz only devices.

Solution: Run OpenWRT or another free software high quality router distro, and run an EXTRA ESSID on the 5GHz band (only works on hardware that supports more than one network name per radio). This gives you the best of both worlds. Both 2.4 and 5 GHz are available to mobile clients with dual radios, but if you want to force something, especially a fixed location radio, onto the 5GHz band, you can connect to the extra 5GHz only ESSID. By bridging that extra ESSID onto your LAN, it just acts as a different access point to the same network.

So, with the FireTV forced onto foo5 and your cellphone on foo, can you use the FireTV remote app and connect? Yes, thanks to the network bridging they are on the same broadcast network so they see each other.

Religiosity and Intelligence and Natural Selection

2015 December 10
by Daniel Lakeland

A number of studies have found statistical relationships between religiosity and reduced IQ or increased gullibility at least with respect to population averages. These studies may well be flawed, and they may not have very well thought out causality behind them. But let's just pretend that they're actually correct for a moment. It would seem at first glance that if there is a common cause for both religiosity and reduced IQ we'd find this a "bad" thing, that is, intelligence is generally considered good, and therefore reduced intelligence is "non good". Occasionally atheist organizations will use such types of arguments.

I'm here to mainly just point out how naive that concept is. Here is a scientific argument: if a trait remains in the population after millions of years of evolution, you've gotta assume that it offers some important value. More specifically, there are many facets of "the good" in society, so picking one dimension "IQ" and saying that reducing it makes things worse... is basically ignoring this. If we were all as intelligent as Stephen Hawking and equally as immobile, we'd be up shit creek without a paddle. Finally, because we're social animals, we share the benefits of various traits amongst our society. Having some very smart people working on say solving certain scientific problems with diseases or figuring out how to build important technologies balances out with having very caring people who advocate for the mentally ill, or who do charitable work that can help people educate their children out of a poverty trap, or simply act to help everyday people cope with adversity.

Each of us basically gets some amount of each of the "good" things about being human, and some amount of the "bad" things as well (propensity for heart disease, anger management issues, willingness to wear socks with sandals, color-blindness, inability to manage money etc etc) and the fact that we cooperate together allows us to benefit from the strength of others.

With that in mind, atheists and religious people can go forth and cooperate, treating each other with respect, and recognizing that failure to mesh on certain dimensions, no matter how important those dimensions are to the individual, actually makes society a more robust and beneficial place. The more people can recognize the areas of their own shortcomings, and look for members of their family or their friends who can help them in those matters, the better off we will be, and that goes for both the ultra-rational seeking help from religious people when it comes to interacting with society as well as religious people seeking help from rationalists when it comes to figuring out what verifiable facts are true or not.


Even Mother Jones thinks the statistic of ~ 1 mass shooting per day is crap

2015 December 4
by Daniel Lakeland

From an Op-Ed in the NYT

The owner of single-handedly decided to re-define mass shooting for political reasons and then this was picked up and posted by the Washington Post, and all over the internet by people on Facebook etc.

Like "practically significant" vs "statistically significant" confusing the dialog on scientific results, this new definition of mass shooting confuses the dialog on a particular type of crime. It's true that the shootings at involved 4 or more people being struck by bullets. But, practically speaking, when a typical person hears "mass shooting" they believe that you mean essentially what the FBI call an "active shooter incident": a situation where a person shows up at some highly public location with the expressed interest in just shooting as many people as they can relatively indiscriminately.

The vast majority of what tracks are things like gang-turf-wars, drug deals gone bad, people getting revenge on someone they know, and in general people with a specific criminal motive against some small set of specific people. The vast vast majority of those incidents are simply what used to be called "gun crime" which is a problem, but it's not the SAME problem as what people think of when they hear "mass shooting".

Do motives matter? They do if you want to understand how to prevent the crimes. Preventing people from getting a gun and killing the person who beat their brother into a coma is going to require some different actions from preventing people from becoming religiously radicalized and showing up to kill a large number of innocent non-criminal people in a public space.

So, from a polluting-the-dialog perspective, gets bad marks from me. On the other hand, if you're interested in studying "gun crime" it sounds like they have a pretty interesting database.

US gun ownership and violence issues

2015 December 3
by Daniel Lakeland

I'm forbidden by my wife from commenting on Facebook posts regarding gun control. It's probably a good thing. Facebook is fast becoming an emotional outlet for meaningless political echo-chambering and responses are usually both uninformed and vitriolic.

But, hopefully at least on my blog I can say some things I've been thinking and avoid vitriol. In fact, to avoid an emotion fueled shitstorm I am going to put my opinion and some of the facts they are based on out here, but simply close the article to comments. I don't have the energy to carry on any kind of debate on this topic and I don't want it to be a major focus of my blog in any way. I'll allow trackbacks and pingbacks, so if you want to comment on this, post your own blog entry on your own blog and link to mine.

The US is in a pretty bad portion of the state-space with regards to gun ownership. We have about 1 gun per person in this country. But, we have a pretty long history (about 100 years) of strong attempts at gun control, most of which have begun to unravel as the aftermath of the Heller decision plays out.

All of this puts us in a position where if you are a law abiding citizen living in one of the larger metropolitan areas such as Los Angeles, Chicago, New York, San Francisco Bay Area, Boston, or many other similar areas, you have a decent chance of being able to purchase and OWN a firearm, but about zero chance of being able to have that firearm available to you in public to help you in case of violence. Furthermore, you live in a place where criminals have essentially no trouble acquiring firearms, and they do so principally through black markets, not legal sales.

On the other hand, if you are a person interested in committing violence, you have no qualms about whether you are technically violating a gun carrying ordinance or whatnot, and, you have relatively easy access as I linked above, principally via black markets.

There are basically two important categories of violent offender, and one additional important category:

  1. Criminals committing various criminal acts of which violence is a byproduct (drug dealers, burglars, gang turf wars etc).
  2. Mentally Unstable / Political / Mass shooters (basically suicide and political statement by public attack).
  3. Suicides (private / alone / involving no other victims).

Category 1 (~ 10,000 victims / yr) criminals tend to use illegally acquired weapons, often purchased from friends and acquaintances, and to a great extent entering this secondary market via purchases by people who can pass background checks and are then willing to pass the firearms to the criminals (see linked articles above). A typical case might be either a girlfriend, or a sibling or whatnot.

Category 2 (~ 100 victims / yr) criminals tend to buy weapons legally, though possibly they should have been denied the sale due to history of mental illness.

Category 3 (~ 20,000 victims / yr) suicides probably involve people using their own legally acquired guns which they acquired years before. They then become depressed due to whatever life circumstances, and commit suicide. This includes a certain number of soldiers with PTSD, and people who have other challenging life issues (loss of a family member, loss of economic livelihood, bankruptcy, chronic illness, addiction etc).

Clearly, although Category 2 in my experience seems to get the most media attention, with articles all over all the major media outlets nationally regardless of where the incident occurred, it is also the least important in terms of sheer numbers. For every person killed in a mass shooting there are more or less 100 people killed via violent inter-personal crime, and 200 people who commit suicide with a firearm.

What's different though is the background of the victims:

In Category 1 we're talking mostly about people who have criminal backgrounds, principally males between 15 and 25 years of age, committed mostly with handguns, and we're talking about 4 times as many black people as white people. (if links break, they're all from graphs at these wikipedia articles on Crime, and Gun Violence)

In Category 2 we're talking about a relatively much smaller number of people, but they are typically very different: employees at places where shooters work or used to work, children at elementary schools, people shopping at public stores and restaurants, adults studying at Universities, people at community functions, churches, synagogues, mosques, women's health clinics, etc etc. See the FBI study on "active shooters"

In Category 3 (suicides) the issue is very very clearly one of failure of our mental health care system. The Wiki article on epidemiology of suicide certainly seems to point to similar levels of suicide between countries with widely different gun ownership and access. Suicide seems to be something where gun access affects the choice of method, but does not particularly affect the overall rate within the country. For example: New Zealand, United States, and Netherlands all have similar rates but wildly different gun access, and from the same graph, Japan a country with virtually no citizen access to firearms has about 2x the rate of suicide. People commit suicide for various reasons, but not principally because they own a firearm.

What do we learn from all this? From the attention that mass shootings get, we conclude that people care much more about victims who are not criminals than they do about the gang, drug, poverty driven violence that is 100 times more prevalent in the US. Plus, we also conclude that sensationalism of the events drives media sales.

The typical response on the part of citizens who are not gun owners is to demonize gun ownership, and to call for increased restriction on legal gun purchases etc. But, public shootings are probably the area where increased possession of firearms by the general public in public places is likely to do the most good. Since the victims are more or less law abiding and going about their daily business, they are also the ones most likely under our current laws to be legally disarmed, particularly when the events occur specifically at locations where people are commonly disarmed by law: schools, universities, religious buildings, and shopping areas with posted no-concealed-carry-allowed signs (such as the case in Colorado in the shootings at the movie theater screening a Batman movie, where the shooter drove around looking for those signs).

Opponents of CCW (concealed carry) sometimes point to articles like this one from Mother Jones which they claim shows that no mass shootings in the last 30 years have been stopped by armed citizens but this article has a serious statistical problem, namely, it conditions on the outcome! If you look only at incidents in which multiple people were killed, you will find, unsurprisingly, that none of those incidents were stopped before multiple people were killed! There is no question that resisting an armed shooter who is intent on hurting people is dangerous. But, it's pretty clearly a lot more dangerous if you're unarmed. If you go looking for incidents where armed citizens stopped shootings, they aren't hard to find, such as those found by this article at the Washington Post. A clear issue is that when someone stops an attack early, we don't know what might have happened if the attack hadn't been stopped whereas when no-one stops an attack which goes on and 20 or 30 people are killed or injured, it's very clear what the intent was. This is an important statistical bias in any analysis.

FBI has a study on "active shooters" which gives a relatively clean view of important incidents in which people went into public with the explicit goal of killing people publicly. It is no surprise that the FBI data is a little less sensationalistic than typical media reports. It's worth a look, because it highlights the fact that the main targets of these events are typically places where people work or used to work, and places where you can expect to find relatively helpless people (mostly elementary and high schools).

Increased CCW on the other hand, is unlikely to result in any increase in crime from Category 1, because the criminals that commit these homicides are already easily getting their guns, and won't apply for CCW as they won't pass the required checks, just as they won't buy their firearms from the legal market. In fact, the general liberalization of CCW over the last 20 years or so has coincided with a time of general reductions in overall homicide (probably mostly unrelated to CCW rates and more related to economic conditions and a general decline in violence since the mid 1990's).

All of this is to say that the vast majority of what goes on post public-shooting is well-deserved indignation but not particularly helpful in terms of determining how best to respond to either the overall level of violence in the US, or the mass-shooting issue in particular.

In the us-vs-them political shitstorms that arise post sensationalistic shooting incident, one thing is clearly thrown out the window which I hope to make explicit here:

The common ground is that virtually NONE of us want to see public shootings. None of us want high levels of crime, and none of us want high levels of suicide. That is true on both "pro gun" and "anti gun" sides of the political spectrum. But, we're not in a situation where there are magic wands available to make those things go away. Since the issue is complex, it is entirely reasonable to have different views on what should be done with respect to gun laws even while having the SAME GOALS!

We're also in a situation where thankfully overall levels of violence are declining, and legal precedent is being set every day showing that the 2nd amendment protects an individual right, that the right extends to the states, and extends beyond simply owning and possessing firearms at home, and that ultimately many of the gun control laws passed during the period 1980-2000 at the peaks of criminal violence rates are going to be overturned on legal grounds. Where will we be at that point? Productive gun laws are going to have to come from a proper understanding of the rights that are protected, and how armed citizens can best be helpful, because we're going to see even wider spread CCW once the vast number of legal cases finally work their way through courts. Fortunately, the experience over the last decade or so clearly shows that increased CCW does not by itself lead to any significant increase in violent crime.

What we do to productively combat violence in the US while staying within the scope of constitutional law will not be decided by emotional Facebook posts. My personal feeling on the active-shooter issue is that we need to look hard at the motivating factors in these incidents, and the warning signs that might have been available to us to prevent them, and some of us will get CCW licenses, and will need to be responsible for using those safely. As for violent crime in general, reducing the criminalization and economic black-market gains from drug sales is going to be a key factor in reducing violence levels.

A (much better) grant evaluation algorithm

2015 October 19
by Daniel Lakeland

There are lots of problems with the way that grants are evaluated in NIH or NSF study sections. A few (typically 3) people read each grant, they give a score, then if the score is high enough, they discuss the grant with the rest of the group, then the rest of the group votes on the results of that discussion, then you add up all the scores and get a total score, then you rank the scores and fund the top 5-15% based on available funds (or some approximation of this fairy-tale).

To make it past the first round (ie. into discussion) you need to impress all 3 of the randomly selected people, including people who might be your competitors, who might not know much about your field, who might hold a grudge against you... And then, you need those people to be good advocates for you in the discussion... It's a disaster of unintended potential biases. Furthermore, the system tends to favor "hot" topics, and spends too little time searching the wider space of potentially good projects.

Here is an alternative that I think is far far better:

  1. A study section consists of N > 5 people with expertise in the general field (as it does now).
  2. Each grant submitted by the deadline is given a sequential number.
  3. Take the Unix time of the grant deadline expressed as a decimal number, and the last names of all authors on grant submissions in ascending alphabetically sorted order with upper-case ASCII characters, and compute the SHA512 hash (or other secure crypto hash) of this entire string. Then using AES or another secure block cipher in CBC feedback mode, with the first 128 bits of the hash as the key, and the rest of the hash as a starting point for the ciphertext, encrypt the sequence x,x+1,x+2,x+3... starting at x= the rest of the SHA hash. This defines a repeatable and difficult to muck-with random number sequence for a random number generator.
  4. Each grant is reviewed by 5 people chosen at random. (In sequential order, choose a grant number, then choose 5 distinct people at random to review it, returning them to the pool afterwards... repeat with the next grant.)
  5. Allow each reviewer to score the grant on the usual criteria (feasibility, innovation, blablabla) with equal weight put on the various criteria. For each grant add up the total score for each of the 5 scorers.
  6. For each grant, take the median of the 5 scores it was assigned. This prevents your friends or foes or clueless people who don't understand the grant, or whatever from having too much influence.
  7. Divide each grant's score by the maximum possible score.
  8. Add 1 to the score.
  9. Divide the score by 2. You now have a score between 0.5 and 1. Half of the maximum score is influenced by the reading of the grant, and half is constant, reflecting the assumption that most of the grants are of similar quality. This prevents too much emphasis on the current hot topic.
  10. While there is still money left: select a grant at random with probability proportional to the overall scores of the remaining grants. Deduct the grant's budget from the total budget and fund that grant, removing it from the pool. Repeat with the next randomly chosen grant until all the money is spent.
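
The steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names and grant ids are made up for illustration, the names are joined with no separator (the text doesn't specify one), and Python's built-in generator seeded from the SHA-512 hash stands in for the AES-based stream of step 3, so it is repeatable but not cryptographically secure. The text also doesn't say what to do when a drawn grant costs more than the money left; here it is simply dropped from the pool.

```python
import hashlib
import random
from statistics import median


def seeded_rng(deadline_unix: int, last_names: list[str]) -> random.Random:
    """Step 3 (simplified): a repeatable generator built from public data.

    Assumption: random.Random (Mersenne Twister) replaces the AES stream
    described in the text. Names are assumed to be plain ASCII.
    """
    material = str(deadline_unix) + "".join(sorted(n.upper() for n in last_names))
    digest = hashlib.sha512(material.encode("ascii")).digest()
    return random.Random(int.from_bytes(digest, "big"))


def weight(reviewer_totals: list[float], max_score: float) -> float:
    """Steps 6-9: median of the reviewer totals, rescaled to lie in [0.5, 1]."""
    return (median(reviewer_totals) / max_score + 1) / 2


def lottery(grants: dict[str, tuple[float, float]], budget: float,
            rng: random.Random) -> list[str]:
    """Step 10: draw grants with probability proportional to their weight.

    grants maps a grant id to (weight, requested budget).
    Assumption: a drawn grant that no longer fits the remaining budget
    is removed from the pool without being funded.
    """
    pool = dict(grants)
    funded = []
    while pool and budget > 0:
        ids = list(pool)
        pick = rng.choices(ids, weights=[pool[g][0] for g in ids], k=1)[0]
        _, cost = pool.pop(pick)
        if cost <= budget:
            budget -= cost
            funded.append(pick)
    return funded


# Hypothetical example: three grants, reviewer totals out of 50.
rng = seeded_rng(1453248000, ["Smith", "Jones", "Lee"])
grants = {
    "G1": (weight([40, 42, 38, 10, 41], 50), 300_000.0),
    "G2": (weight([20, 25, 22, 30, 21], 50), 400_000.0),
    "G3": (weight([45, 44, 46, 43, 47], 50), 500_000.0),
}
print(lottery(grants, 800_000.0, rng))
```

Because every weight lies between 0.5 and 1, the best-scored grant in the pool is at most twice as likely to be drawn as the worst-scored one on any given draw.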

Why is this a good idea? So much of grant scoring is influenced by things other than the science: whether the person writing the grant has published in this field a lot, whether they are well known and liked by the committee members, whether they have been funded in the past, whether they are working on a hot topic, whether they're a new investigator, how MANY papers they've published (not so much how good those papers were), whether they have a sexy new technique to be applied, etc etc.

But the truth is that most grants are probably similar, mostly lousy-quality projects. It's hard to do science: very few experiments are going to be pivotal, revolutionary, or open new fields of research. There's going to be a lot of mediocre stuff that's very similar in quality, and so ape-politics is going to have a big influence on total score across the 5 reviewers.

But, the review process does offer SOME information. When at least 3 of 5 randomly chosen reviewers recognize that a grant is seriously misguided, that's information you should use. Taking the median of 5 scores, and using it as 50% of the decision making criterion seems like a good balance between a prior belief that most grants are of similar quality, and specific knowledge that the 5 reviewers are going to bring to the table after reading it.
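
To see why the median behaves this way, here is a small Python illustration with made-up reviewer totals out of 50 (the numbers are hypothetical, chosen only to show the effect). One or two hostile or clueless reviewers barely move the median, but once three of five agree the grant is bad, the median follows them.

```python
from statistics import mean, median

# Hypothetical reviewer totals (out of 50) for the same grant.
one_outlier = [40, 42, 38, 41, 5]    # one reviewer with a grudge
two_outliers = [40, 42, 38, 5, 5]    # two reviewers with a grudge
three_agree = [40, 42, 5, 5, 5]      # a majority thinks it is misguided

for totals in (one_outlier, two_outliers, three_agree):
    # The mean is dragged down by every low score; the median is not,
    # until the low scores are a majority.
    print(totals, "median:", median(totals), "mean:", mean(totals))
```

The medians come out as 40, 38 and 5, while the mean drops steadily with each low score.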

Dos Equis

2015 October 19
by Daniel Lakeland


Some order of magnitude estimates on Universal Basic Income

2015 October 16
by Daniel Lakeland

Ok, so I've advocated with some of my friends for a Universal Basic Income (UBI). The basic idea is this: if you are an adult citizen of the US, you get a social security number, you register a bank account, and you get a monthly pre-tax direct deposit from the government. A flat amount that everyone receives just for being a citizen. The goal here is to vastly simplify the requirements for providing a basic social safety net, as well as to eliminate the complexity of programs like the progressive income tax with millions of specialty deductions etc.

The UBI eliminates the fear of being without income in the very short term (days, weeks, months), and lets people take risks, be entrepreneurial, take care of families, weather bad events better, etc. It also takes care of pretty much everything that a progressive tax rate structure is supposed to do (help poorer people who spend a lot of their income on basic necessities). So once a UBI is in place, you can VASTLY simplify the administration of an income tax, and you can eliminate all sorts of specialized subsidies that currently require a lot of administrative overhead (checking that people qualify, running housing projects, providing specialty healthcare programs etc).

The UBI doesn't work for the mentally ill, so they will continue to need specialty help in addition, but for everyone else, it's a very efficient way to do what we're currently doing in very inefficient ways.

But, this isn't a post about the merits, it's a post about order of magnitude estimates for the quantities of money involved.

According to Google there are 3.2 \times 10^8 people in the US.

The federal budget is currently about 3.7 \times 10^{12} dollars, with about 0.63 \times 10^{12} in defense, and the rest in various social services and interest on debt.

As an order of magnitude estimate of a good choice for the UBI, let's take the 2015 federal poverty guidelines. That's about 12\times 10^3 dollars per year, or about $1k / mo.

So, if we just started shipping out cash to everyone at the rate of 12k/yr how big is that as a fraction of the federal after-defense budget?

 12\times 10^3 \times 3.2\times 10^8 / ((3.7-0.6) \times 10^{12}) \approx 1.2
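
The same arithmetic as a few lines of Python, in case you want to plug in other numbers. The variable names are mine, and the defense figure is rounded to 0.6 \times 10^{12} to match the formula above.

```python
# Order-of-magnitude inputs taken from the text above.
population = 3.2e8          # people in the US
ubi_per_year = 12e3         # dollars per person per year
federal_budget = 3.7e12     # dollars per year
defense = 0.6e12            # dollars per year, rounded as in the formula

total_cost = population * ubi_per_year
ratio = total_cost / (federal_budget - defense)

print(f"UBI cost: {total_cost:.2e} dollars/yr")
print(f"Fraction of non-defense budget: {ratio:.2f}")
```

This gives a cost of about 3.84 \times 10^{12} dollars per year and a ratio of about 1.24, which rounds to the 1.2 quoted above.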

So, to first order, the entire non-defense budget is about the same as the amount of money you'd need to spend on a UBI. But the UBI can replace a *lot* of other government programs. Social security, medicare, housing and human services, a big majority of what we're spending this budget on is basically doing an inefficient job of helping people.

I don't advocate gutting all of the government programs and replacing them with a UBI, but I imagine I could easily get on board with gutting 60 or 70% of them and replacing with a UBI.

Besides reducing the overhead of government, you'd need to increase revenue. The UBI would drive sales, and a flat federal sales tax would be a very simple way to take care of this extra need for income. A sales tax would also be a consumption-based tax, which has good economic consequences (it encourages saving and investing, vs income taxes which discourage earning and encourage consumption!)

So, our order of magnitude estimates show that this is a feasible plan. It's not something that would be easy to transition to in a blink, but it could be done a lot more easily than setting up a universal medicare system, for example. A UBI accomplishes things that both the liberal and conservative groups in politics want: helping people, while being efficient, and encouraging growth and entrepreneurialism. It's an idea whose time has come:

(See this WaPo article on how a UBI-like thing helped Native American populations for some empirical information.)