One of the best books in Applied Mathematics and Mechanics ever:

2014 December 2
by Daniel Lakeland

From G.I. Barenblatt's recently released book: "Flow, Deformation, and Fracture" published by Cambridge, footnote 3, pg 4:

There is nowadays considerable discussion concerning the subject of applied mathematics. In fact, its proper understanding is clarified if we remember the famous saying of J. W. Gibbs: "Mathematics is also a language." If so, then on the one hand pure mathematicians can be identified with linguists, who study the laws of the formation, history, and evolution of languages. On the other hand applied mathematicians, building models of phenomena, are analogous to writers and poets who use language to create novels, poetry, critical essays etc. Applied mathematics is the art of constructing mathematical models of phenomena in nature, engineering, and society

A novelist needs a facility with words and an insight into the human condition to create a great novel. An intense study of the family tree of ancient Indonesian and Southeast Asian languages, tracing the evolution of words and writing systems, would generally not help. At the same time, indigenous peoples in a rainforest may have intense knowledge of ecological interconnections, but without a facility for written language they will never communicate it to the rest of the world.

On the other hand, Applied Mathematics isn't like, say, science journalism, in which writers report on what others have done; it's more like investigative journalism, in which journalists uncover and put together the facts, organize the ideas, and explain them to the world.

Dear Firefox, you just nuked my entire 5 years of Zotero archives!

2014 October 21
by Daniel Lakeland

Luckily I keep good backups. But recently I started Firefox and it said something like "you haven't used Firefox in a while, would you like to start over with a new profile to take advantage of our new features?" Since I mostly use Chromium, I said "yes". Days later I tried to start Zotero and it had no data directory (because by default it uses the one in .firefox/<profile directory>/zotero).

Thanks to rdiff-backup I was able to recover my zotero directory and put it in .zotero/zotero, but without backups this would have been a BIG deal. Firefox should have done something like move the old directory to a backup location, not nuke it entirely.

Dimensional analysis and the Ebola epidemic

2014 October 18
by Daniel Lakeland

To follow up on my discussion of the Ebola uncertainty, let's take a look at some very basic differential equations that we can use to get an idea of the factors that go into making up an epidemic.

First, we'll model a population as having infected people I and uninfected people U, and we'll measure these populations as fractions of the total population. So initially I=\epsilon and U=1-\epsilon, where \epsilon is small (maybe 10^{-6} or 10^{-8}). Now, how does the infected population grow?

\frac{dI}{dt} = k_{IU} I U = k_{IU} I (1-I)

The assumption here is that in a short unit of time, each I person comes into contact with a certain number of U people, and for the initial stages at least, this drives the infection. Note that in later stages the I population will begin to be reduced as they die off, and there is more going on. We're interested mainly in the initial stages, because we'd like to avoid a major epidemic killing off a few percent of the world's population.

Now, I and U are unitless (they are ratios of counts of people), and t has units of time, so k_{IU} has units of "per time". It represents the rate at which infected people mix with uninfected people, times the fraction of those mixings which result in transmission. In theory, the fraction of mixings that results in transmission is the definition of R_0 from my previous post (EDIT: not quite, R_0 is actually the fraction of mixings that result in infection, times the average number of mixings throughout a total epidemic... but we could imagine that's constant...), so we can replace k_{IU} with R_0 r_{IU}, where r_{IU} is the rate of mixing.

\frac{dI}{dt} = R_0 r_{IU}I(1-I)

I starts out near zero, and we're interested in how the infection grows. Hopefully we will do something to squash it before it reaches more than 0.005, or half a percent, of the population, so we can assume (1-I) \approx 1 initially, that is, for small t.

\frac{dI}{dt} \approx R_0 r_{IU}I

This is the equation for exponential growth. We can make it dimensionless by choosing t_0 = 1/r_{IU} as the unit of time, and we get:

\frac{dI}{dt'} \approx R_0 I, \quad t' = O(1)

So all epidemics are similar at some time scale and are controlled by R_0. This reassures only the naivest of mathematicians, because the approximation is only valid for t' = O(1). In a situation in which the mixing time t_0 is small, this could mean we have only, say, a few days before I \approx 0.02, at which point we have a SERIOUS problem (2% of the population actively has Ebola, and that would be devastating). The point is, the equation has to change before t' gets too big in dimensionless time.
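
As a rough numerical check on that "few days" intuition, here is a minimal sketch in Python; the values of R_0, the mixing rate r_{IU}, and the initial infected fraction are made-up assumptions for illustration, not estimates:

import numpy as np

# All numbers here are illustrative assumptions, not estimates.
R0 = 2.0      # new infections per mixing, in the rough sense used above
r_IU = 0.5    # mixings per day, so the mixing time t0 = 1/r_IU = 2 days
I0 = 1e-6     # initial infected fraction

# Closed-form solution of dI/dt = R0*r_IU*I*(1-I): time to reach a target fraction.
def days_to_reach(I_target, I0=I0, rate=R0 * r_IU):
    return np.log((I_target / (1 - I_target)) / (I0 / (1 - I0))) / rate

print("days until I is 0.5%%: %.1f" % days_to_reach(0.005))
print("days until I is 2%%:   %.1f" % days_to_reach(0.02))

With these made-up numbers the infected fraction crosses 2% in roughly ten days, and that time scales inversely with R_0 r_{IU}: halve the mixing time and you halve the time available to respond.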

So R_0 is useful as an index of how infective the virus is, but NOT of how quickly it will spread, since there is also the mixing time to be considered. In Western countries we'd have to imagine that the mixing time could be much lower than in West Africa, and so an effective response would have to be much faster.

In addition, another dimensionless group is important, namely t_r R_0/t_0, where t_r is the time it takes to effectively institute response measures and t_0 is the mixing time. The larger this is (the longer the response takes in dimensionless time), the bigger the problem will be.

Fortunately, there is also some suggestion that R_0 would be smaller in the US. In West Africa many tribal groups wash and prepare their dead, then kiss the bodies to say goodbye... not a good idea with Ebola. Also, there have been attacks on healthcare workers by people who believe Ebola is either a hoax or something spread by the government. Those things probably won't happen in the US.

All this is to say, there is a lot of uncertainty, with the mixing time t_0 and the infectivity R_0 both having different values in Western countries than in West Africa. So the actual number of days or weeks we will have to respond effectively, and to change the equation governing the growth of the infected population, is unknown. One thing we DO know, though, is the faster the better. And this is where the CDC and other officials are not inspiring a lot of confidence in the US population. The general population's cry of "we need to do something about this NOW" is well justified. Given that Ebola has been around for decades, there should be an established plan and some contingencies that have already been thought out. That this doesn't seem to be the case is not confidence inspiring.

The myth of R_0 and Ebola infectiousness

2014 October 17
by Daniel Lakeland

R_0 (the basic reproduction number) is a parameter used in mathematical models of infection. In theory it's the time-integrated average number of people who will be infected by each new case. An R_0 < 1 suggests the infection will die out, and greater than 1 suggests it will spread. But R_0 is a tricky thing to calculate. Wikipedia gives references for how it's calculated, and in fact it seems that the different methods of calculation give different results even for a given infection, so a comparison across diseases is likely not comparing something that can really be measured accurately.

But beyond the difficulty of actually calculating such a parameter, there's the uncertainty involved when an epidemic moves from one environment where you've got a lot of data (say, West African Ebola) to another environment which has very different social dynamics and where you have very little data (say, Ebola in international airline travel). Bayesian methods can be used to help give a sense of the uncertainty in the parameter once you've got enough cases to do calculations... but I'm going to hope we will have to rely primarily on prior information in this outbreak, because that would mean there were never enough cases outside West Africa to estimate from. Unfortunately, we are going to have to put a wide prior on R_0 in the global case, because we just don't know how highly mobile and interacting societies compare to West African villages in the spread of this disease.
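
For concreteness, here is a minimal sketch of what "a wide prior on R_0" might look like; the lognormal form and its parameters are my own illustrative assumptions, not anything estimated from outbreak data:

import numpy as np

rng = np.random.default_rng(1)

# A deliberately wide prior for R0 in the "highly mobile societies" setting:
# roughly centered near the West African estimates but very uncertain in both
# directions. The functional form and the numbers are assumptions for illustration.
prior_R0 = rng.lognormal(mean=np.log(1.8), sigma=0.6, size=100_000)

print("prior median R0:", np.median(prior_R0))
print("90% prior interval:", np.percentile(prior_R0, [5, 95]))
print("prior probability R0 < 1:", np.mean(prior_R0 < 1.0))

The point isn't the particular numbers; it's that the prior should be wide enough to admit both "dies out on its own" and "spreads rapidly" until data from the new environment say otherwise.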

Sand tapping experiments and MCMC

2014 September 26
by Daniel Lakeland

It's a well-known phenomenon in granular materials that if you fill up a tube with sand and then tap the tube repeatedly, the sand will settle down to a certain stable height in the tube. Typically the variability between the "least dense" and "most dense" states is a few percent of the height. So for example you might start with 10 cm of sand, tap it for a while, and wind up with 9 cm of sand. Note that it's also possible, though difficult, to get your sand into a state where it actually expands as you tap it, but generally doing so requires you to crush the sand into the tube initially; when simply poured into the tube, the sand will generally be at or below the equilibrium density.

During my PhD I spent a lot of time thinking about how to model this process. One of the key issues is that we have essentially no information about the sand: the position, orientation, shape, and material properties (elasticity, surface/friction properties, etc.) of the individual grains. It's tempting to say that this is similar to the situation in the ideal gas, where we have no idea where, how fast, or in what direction any of the atoms are. That's true, as far as it goes. But whereas in the ideal gas we have no interactions between the gas molecules, in the static sand condition we have essentially nothing but interactions between the sand grains. At first glance it seems hopeless to predict what will happen when what will happen is caused by interactions, and we have virtually no information about those interactions.

However, it does also depend on what you want to predict, and for someone interested in say soil liquefaction, the main thing to predict is how some disturbance such as a shear wave will affect the density of the soil, and in particular when that soil is saturated with water.

So consider a sand tapping experiment. We have a short-ish column of sand at uniform porosity \phi (the fraction of the volume taken up by voids), and we tap this tube of sand with a hammer blow of kinetic energy dE. This energy is small compared to the total gravitational potential of the deposit relative to the bottom of the tube (you won't be lifting the whole tube off the table and putting it into near-earth orbit), but large compared to the gravitational potential of a single grain sitting at the top of the tube (you may very well bounce the grains sitting at the surface up a few millimeters). Given this energy, the sand grains bounce around a bit. Most of the sand grains will not move very far; you won't have a grain go from the bottom of the tube to the top, for example, and the average center-of-mass distance traveled is likely to be considerably less than a typical grain diameter. However, the orientations of the grains may change by larger fractions; it wouldn't be completely unheard of for a grain to rotate 180 degrees around some axis.

This tapping process is in many ways like the process of a random "proposal" in MCMC. It moves the grains to a nearby state, one in which the total energy is within about dE of the initial energy. It makes sense to ask the question: "Given that the final state is somewhere in a very high dimensional state space which has energy within about dE of the initial energy, what is the d\phi that we're likely to observe?"

It is, in general, hopeless to try to compute this from first principles for realistic sands; you might get somewhere doing it for idealized spherical beads or something like that. But it isn't hopeless to observe what actually happens for some sample of sand and then describe some kind of predictive model. In particular, it seems like what we'd want is a kind of transition kernel:

P(\phi_1 | \phi_0, dE)

at least for \phi_0,dE in a certain range.
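
I haven't collected that data, but to make the idea concrete, here is a toy sketch of what a fitted transition kernel could look like. The functional form (porosity relaxing toward an equilibrium value at a rate that scales with the tap energy) and every number in it are made-up assumptions:

import numpy as np

rng = np.random.default_rng(0)

# Toy transition kernel P(phi1 | phi0, dE): porosity relaxes toward an
# (assumed) equilibrium value phi_eq by a fraction that grows with the
# tap energy dE, plus some grain-scale noise. All numbers are made up.
phi_eq = 0.38      # assumed equilibrium porosity
alpha = 0.05       # assumed relaxation per unit of (scaled) tap energy
noise = 0.002      # assumed spread of the outcome of a single tap

def tap(phi0, dE, rng=rng):
    """One draw of phi1 given phi0 and a dimensionless tap energy dE of O(1)."""
    mean = phi0 - alpha * dE * (phi0 - phi_eq)
    return rng.normal(mean, noise)

# Simulate a tapping experiment: 200 taps starting from a loose packing.
phi = 0.44
history = [phi]
for _ in range(200):
    phi = tap(phi, dE=1.0)
    history.append(phi)

print("porosity after 200 taps: %.3f (started at %.3f)" % (history[-1], history[0]))

The real project would be to estimate the form and parameters of tap() from actual tapping data rather than assuming them, which is exactly the experiment described below.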

So, while I didn't get around to doing it in my PhD dissertation, I may very well need to go out and buy a bag of sand, a clear plastic tube, some kind of small hammer, and a bit of other hardware and have a go at collecting some data and seeing what I get.

Randomized Chess

2014 September 5
by Daniel Lakeland

I've been sick a lot recently, in part thanks to having small children. In any case, one thing I've been doing is revisiting Chess. I honestly am pretty clumsy at Chess, but it's one of those things I always felt I should probably do. When I was younger most of my friends played a stronger game than I did, and it was hard to enjoy when I was getting beaten all the time. Now, thanks to Moore's law and clever programming, even the very, very top players are useless against a 4-core laptop running Stockfish.

So we can all agree that it's no fun getting blasted out of the water every time, but also that we can use computers to make things better and more interesting for humans, since that's what they're for, right?

There are lots of proposals for randomized or alternative-starting-position Chess games. For example, Chess960 (Fischer random chess) is a variant with 960 possible starting positions. The idea is to avoid making Chess a game where a big advantage comes from memorizing opening moves from some opening database. I'm more or less for this in my play. I enjoy playing Chess well enough, but I have absolutely NO interest in poring over variation after variation in a big book of opening theory. Some people like this stuff, and they can of course continue to play regular chess.

On the other hand, for people like me, consider the following method of starting the game (a rough code sketch follows the list):

  1. Set up the board in standard starting position.
  2. Using a computer, play N random legal pairs of moves (turns), with or without captures.
  3. Using a chess program on the computer, find the "score" for this position.
  4. Accept the position if the chess program decides that the score is within \epsilon of 0.0 (positive is good for white, negative is good for black; this is standard output from chess engines); otherwise go to step 1.
  5. Assign the color randomly to the human (or to one of the humans if you're not playing against a computer).
  6. Start the game by allowing white to move.
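
Here is a rough sketch of steps 1 through 4 using the python-chess library and a UCI engine such as Stockfish; the engine path, search depth, \epsilon, and N below are placeholder assumptions:

import random
import chess
import chess.engine

ENGINE_PATH = "stockfish"   # assumed path to a UCI engine binary
EPSILON = 30                # acceptable |score| in centipawns (assumption)
N_PAIRS = 8                 # number of random move pairs to play (assumption)

def random_balanced_position(engine):
    while True:
        board = chess.Board()                        # step 1: standard starting position
        for _ in range(2 * N_PAIRS):                 # step 2: N random move pairs
            if board.is_game_over():
                break
            board.push(random.choice(list(board.legal_moves)))
        if board.is_game_over():
            continue
        info = engine.analyse(board, chess.engine.Limit(depth=15))   # step 3: score it
        score = info["score"].white().score(mate_score=100_000)
        if abs(score) <= EPSILON:                    # step 4: accept or start over
            return board, score

engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
try:
    board, score = random_balanced_position(engine)
    print(board.fen())
    print("engine eval (centipawns):", score)
finally:
    engine.quit()

The handicap variant described in the next paragraph is the same loop with abs(score - h) in place of abs(score).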

Note that this variation can also be used to handicap games, by accepting a starting position if it is within \epsilon of some handicap value h and then assigning the weaker player to the color that has the advantage. It's also possible to play N random moves and then allow the computer to play K more moves until the score evens out properly, if you can get support from the chess engine. Finally, it's also possible to search a large database of games for one in which, after N moves, the position evaluates to within \epsilon of the appropriate handicap value, rather than generating random moves.

I suspect N=6 to N=10 would be the appropriate number of moves to use.

Now, who will implement this in SCID or the like?

What it takes for a p-value to be meaningful.

2014 September 3
by Daniel Lakeland

Frequentist statistics often relies on p values as summaries of whether a particular dataset implies an important property about a population (often that the average is different from 0).

In a comment thread on Gelman's blog (complete with a little controversy) I discussed some of the realistic problems with that, which I'll repeat and elaborate here:

When we do some study in which we collect data d and then calculate a p value to see whether the population has some particular property, we calculate the following:

1-P(s_1(d),s_2(d),\ldots,s_n(d))

Where P is a functional form for a cumulative distribution function, and s_i are sample statistics of the data d.

A typical case might be 1-p_t(\sqrt{n(d)}\,\bar{d}/s(d), n(d)-1), where \bar d is the sample average of the data, s(d) is the sample standard deviation, n(d) is the number of data points, and p_t is the standard t distribution CDF with n-1 degrees of freedom.
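
In code, that typical case looks something like the following minimal sketch (the data here are simulated purely to show the mechanics):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d = rng.normal(loc=0.3, scale=1.0, size=20)      # simulated data, purely illustrative

n = len(d)
t_stat = np.sqrt(n) * d.mean() / d.std(ddof=1)   # \bar{d} / (s(d)/\sqrt{n(d)})
p_value = 1 - stats.t.cdf(t_stat, df=n - 1)      # 1 - P(s_1(d), ..., s_n(d))

print("t = %.3f, one-sided p = %.4f" % (t_stat, p_value))
# stats.ttest_1samp(d, 0.0) gives the equivalent two-sided version.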

The basic idea is this: you have a finite population of things, and you can sample those things and measure them to get values d. You do that for some particular sample, and then want to know whether future samples will have similar outcomes. In order for the p value to be a meaningful way to think about those future samples you need:

  • Representativeness of the sample. If your sample covers a small range of the population's total variability, then obviously future samples will not necessarily look like your current sample.
  • Stability of the measurements in time. If the population's values are changing on the timescale between now and the next time you have a sample, then the p value is meaningless for the future sample.
  • Knowledge of a good functional form for p. When we can rely on things like central limit theorems, and certain summary statistics therefore have sampling distributions that are somewhat independent of the underlying population distribution, we will get a more robust and reliable summary from our p values. This is one reason why the t-test is so popular.
  • Belief that there is only one, or at least a small number, of possible analyses that could have been done, and that the choice of sample statistics and functional form was not influenced by information about the data: p_q=1-P_q(s_{iq}(d)) represents, in essence, a population of possible p values from analyses indexed by q. When there is a wide variety of possible values of q, the fact that one particular p value was reported with "statistical significance" only indicates to the reader that it was possible to find some q that gave the required small p_q.

The "Garden of Forking Paths" that Gelman has been discussing is really about the size of the set of possible q, independent of the number of values that the researcher actually looked at. It's also about the fact that, having seen your data, it is plausibly easier to choose an analysis which produces a small p_q, even without looking at a large number of q values, when the plausible set of potential q is large.

Gelman has commented on all of these, but there's been a fair amount of hoo-ha about his "Forking Paths" argument. I think the symbolification here makes things a little clearer: if there are a huge number of q values which could plausibly have been accepted by the reader, and the particular q value chosen (the analysis) was not pre-registered, then there is no way to know whether p is a meaningful summary about future samples representative of the whole population of things.
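
To see how the size of the set of plausible q matters even when nothing is going on, here is a small simulation; the particular menu of analyses is a made-up stand-in for ordinary researcher degrees of freedom:

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def smallest_p(d):
    """Smallest p value over a small, made-up menu of plausible analyses q."""
    candidates = [
        stats.ttest_1samp(d, 0.0).pvalue,                    # plain t-test
        stats.wilcoxon(d).pvalue,                            # nonparametric variant
        stats.ttest_1samp(d[np.abs(d - d.mean()) < 2 * d.std(ddof=1)], 0.0).pvalue,  # "outliers" removed
        stats.ttest_1samp(d[:20], 0.0).pvalue,               # "interim look" at the first 20 points
    ]
    return min(candidates)

# Pure-noise data: how often does the best-looking of the four analyses reach p < 0.05?
n_sims = 2000
hits = sum(smallest_p(rng.normal(0.0, 1.0, size=30)) < 0.05 for _ in range(n_sims))
print("fraction of null datasets where some q gives p < 0.05:", hits / n_sims)

Even with this small, highly correlated menu the nominal 5% error rate is noticeably inflated, and the garden a researcher could plausibly have walked through is usually much larger than four paths.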

What problems are solved by a Bayesian viewpoint?

Representativeness of the sample is still important, but if we have knowledge of the data collection process, and background knowledge about the general population, we can build that knowledge into our choice of data model and prior. We can, at least partially, account for our uncertainty in representativeness.

Stability in time: A Bayesian analysis can give us reasonable estimates of model parameters for a model of the population at the given point in time, and can use probability to do this even though there is no possibility of going back in time and making repeated measurements at the same time point. Frequentist sampling theory often confuses things by implicitly assuming time-independent values, though I should mention it is possible to explicitly include time in frequentist analyses.

Knowledge of a good functional form: Bayesian analysis does not rely on the concept of repeated sampling for its conception of a distribution. A Bayesian data distribution does not need to reproduce the actual unobserved histogram of values "out there" in the world in order to be accurate. What it does need to do is encode true facts about the world which make it sensitive to the questions of interest. See my example problem on orange juice, for instance.

Possible alternative analyses: In general, Bayesian analyses are rarely summarized by p values, so the idea that the p values themselves are random variables, and that we have a lot of them to choose from, is less relevant. Furthermore, Bayesian analysis is always explicitly conditional on the model, and the model is generally something with some scientific content. One of the huge advantages of Bayesian models is that they leave the description of the data to the modeler in a very general way. So a Bayesian model essentially says: "if you believe my model for how data d arises, then the parameter values that are reasonable are a,b,c\ldots". Most Frequentist results can be summarized by "if you believe the data arise by some kind of simple boring process, then you would be surprised to see my data". That's not at all the same thing!

Wait, you need to what??

2014 August 15
by Daniel Lakeland

According to this opinion piece in The Telegraph by a knighted peer and inventor of the bagless vacuum, the UK needs to "double the number of engineering graduates coming out of our universities each year, for the next twenty years".

OK, let's see here. There must be at least one engineering school in the UK, and it must graduate at least 100 students a year, because otherwise how would it stay in business?

So let's calculate: \sum_{i=1}^{20} 100 \times 2^i. We could do all kinds of nice little integration techniques to approximate this, or we could just ask Maxima for the answer:

(%i1) sum(100*2^i,i,1,20);
(%o1) 209715000
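
(For anyone who distrusts computer algebra, the closed form of the geometric series agrees: \sum_{i=1}^{20} 100 \times 2^i = 100(2^{21}-2), which is 209,715,000.)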

Yep, so at a minimum the UK needs to graduate another 209 million engineers total, in a country that currently has only 63 million people and had a growth rate of 0.6% last year?

And what if we maybe assume that more than 100 people graduated from engineering school last year? Like maybe it was 1000 or even 10000? The whole thing is linear in the factor of 100, so we might need to graduate 2 billion or even 20 billion engineers over the next 20 years. Perhaps they're planning to puree them into Soylent Green to feed the masses?

Monty Python had something useful to say at this point.

Let's just hope these engineers have better numeracy than the peerage.

The future is now, or maybe next week or next year... what do we do about it?

2014 August 13
by Daniel Lakeland

This video was posted to my FB news feed by a friend who works in technology. It gives voice to a lot of thoughts I've been having recently on the future of the economy.

Here are some things I've been thinking about related to this:

It's easy for me to imagine a coming world, in my lifetime, in which the vast majority of the frequently repeated tasks currently performed by humans are performed by computer and robot. This includes:

  • Transporting things (by truck, train, within warehouses, picking, packing, shipping, etc)
  • Selling things (from what Amazon already does, to cars, houses, industrial supplies...)
  • Making decisions about which things are needed where, or how much of a thing to make or acquire (much of store management, middle management).
  • Cleaning things (janitors, maids, etc)
  • Diagnosing illness and treating common illness (let's say 80 to 90% of general practice medicine)
  • Manufacturing things, even or especially customized things (3D printers, general purpose CAD/CAM, robotic garment construction)
  • Basically, anything you do more than 1 day a week...

In that context, first of all, vast numbers of people will be unemployed or looking for alternative employment, and second, the only alternative employment will be doing things that are "one off", where the cost of teaching a machine to do them doesn't pay off. So humans will need to find a series of relatively random things to do, things which you won't typically repeat more than a few times a month or a year.

Furthermore, it now becomes impossible to rely on working at a "regular job" to provide a regular level of income to feed, clothe, house, educate, medicate, and otherwise sustain the basic needs of a human being. So, all else equal, the average wages humans will earn will go steadily down.

At the same time, the cost of producing things will go down too. A decent pair of glasses might be something you can 3D print the frame for and order the lenses for online, assembling it all yourself in less time than it takes to get a pair from a current Lens Crafters, at a price of, say, $3.00 instead of $50 or $250 (for designer frames), while choosing from a vast selection of designs. So the bottom might drop out of the price of everyday goods. Note that you can already buy cheap eyeglasses online for around $10 if you don't care about relatively high-quality lens material and coatings.

The question is, what will be the value of the ratio \bar P/ \bar S, that is, the average price (P) of a basket of important common goods that you consume in a typical year, divided by the average salary (S) in dollars per year. This is a dimensionless ratio and describes whether you are breaking even or not (>1 = losing, <1 = winning). Both of these quantities are in theory going to be decreasing. But what matters is not just the new asymptotic value, 100 years from now or more, but also the dynamics of these changes during the next 10, 20, 50, or 100 years. It is entirely plausible that the bottom drops out of the market for labor quicker than it drops out of the market for goods, for example. The result could be worse than the Great Depression, during a period in which we nevertheless have vast potential growth in real wealth through automation!

One problem area seems to be social and political technology. Conservative ideas about how the world should be ("work hard and get ahead" sorts of thoughts) could very well be highly counterproductive during these changes. The future may well be in "find stuff no one has thought to automate yet, and do that for a little while until it isn't needed anymore", where "a little while" might be anywhere from an hour to a couple of months or a year, but probably won't be ten or twenty years.

We already see some of this in things like Etsy, where individuals produce one-off or short batches of custom goods. Not that Etsy is by itself changing the economy dramatically, but even the world of people buying used books from libraries, marking them up by a few pennies, selling them on Amazon with shipping and handling fees, and pocketing a few dollars per book is an example of humans going out and doing one-off tasks (namely, combing through boxes of books for likely candidates). Even that, with its regular nature, is fairly automatable, and it only exists because we don't legally allow technology to scan and reproduce those books electronically (copyright, anyone?).

One political advance I see being needed is relatively simple and efficient redistribution of wealth. We're already redistributing a lot of wealth through tax systems, welfare systems, etc. But we could set some baseline standard and create a Universal Basic Income, or build it into our taxation system (see my previous post on optimal taxation with redistribution). The idea is to give everyone a simple cushion that makes the risky, entrepreneurial aspect of doing a wide variety of non-repeated things more workable, and to let people improve on their basic level of income through this entrepreneurial process, with people essentially running a vast number of small businesses utilizing the tools that a bot-based production system creates.

Like it or not, I just don't see 9-to-5 jobs having much of a future beyond my generation, but we should probably embrace that idea and make it work for us as a society, not rebel against the inevitable. Doing so will require new social structures, new identities, new laws, and new ideas about work.

Unfortunately, it doesn't seem to work for Entsophy

2014 August 11
by Daniel Lakeland

Oh well, I got a laugh out of it at least.