When working with social data, you frequently get binned information. For example, a survey might tell you the quartiles or deciles of the age distribution for males in the US, or the percentage of people whose age falls between certain fixed values. If you're lucky, you might also find out something about the people within each quartile (such as the mean age).

Suppose you'd like to predict some quantity based on this kind of information. For example, you might have a graph of the percentage of people with a given educational attainment born before each year, the age quartiles of the current population, and some predictive equation for, say, income based on educational attainment and age. You'd then like to calculate the average income for males between ages 20 and 45 today, and for males between ages 20 and 45 twenty years ago.

This is a made up example, but typical of the kind of thing I'm thinking of. In particular, you might like to do something like take panel data and infer trajectories through time for individuals, even though you don't have repeated measures. So for example you might generate virtual people born in 1940 and then have them go through earnings trajectories which put together replicate the panel data in 1960, 1970, 1980, 1990, 2000 and estimate something like what the distribution of household wealth would have been if some kind of policy were different (and you have say some simple causal model for what the savings would have been if the policy were different).

The answer of course is that you use maximum entropy. But the maximum entropy distribution of interest is a complicated one, and you might like to do numerical maximization of the entropy.

If you want to do something like this in Stan, where you're simultaneously doing inference on parameters using Bayesian methods and finding the parameters of a distribution that maximize some measure of entropy for some prior... how do you go about it? I don't think there is an easy answer.

It might be good to come up with a simpler and more tractable example problem. So for example, suppose you know that the quartiles of age in some population are 23, 44, and 70 years of age, and that the average age between 0 and 23 is 9, between 23 and 44 is 31, between 44 and 70 is 62, and over 70 is 81.
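
For orientation: if the distribution were not restricted to any particular family, the maximum entropy density under exactly these constraints has a standard form, exponential in $x$ within each bin. This is only a sketch. It assumes ages live on $[0,\infty)$ and that the constraints hold exactly, and the notation $B_1=[0,23)$, $B_2=[23,44)$, $B_3=[44,70)$, $B_4=[70,\infty)$ with bin means $m = (9, 31, 62, 81)$ is mine rather than part of the problem statement:

$$p(x) = \frac{1}{4}\,\frac{e^{\lambda_i x}}{\int_{B_i} e^{\lambda_i u}\,du} \quad \text{for } x \in B_i, \qquad \text{with } \lambda_i \text{ chosen so that } \frac{\int_{B_i} x\, e^{\lambda_i x}\,dx}{\int_{B_i} e^{\lambda_i x}\,dx} = m_i$$

For the open-ended top bin this is a shifted exponential with $\lambda_4 = -1/(81-70) = -1/11$. The harder question is how to get close to something like this inside a restricted parametric family while also doing Bayesian inference.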

Suppose also that we have some function Q(x), and we want, in Stan, to approximately identify the Gaussian mixture model with 3 components (8 degrees of freedom: the mean and SD of each mixture component, plus the two free mixture weights) that maximizes entropy subject to those 7 constraints, and to calculate in the generated quantities the mean value of Q(x) from a sample of 1000 points drawn from the maxent distribution. I'll even add in some leeway, as if the numbers above are rounded off, so your maxent distribution only needs to satisfy the constraints to within ±0.5% and ±0.5 years.
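
Before worrying about Stan, it may help to have a plain Python check of whether a candidate mixture satisfies the constraints, plus a Monte Carlo estimate of its entropy. The sketch below is not a solution to the problem. It assumes the first bin runs from minus infinity to 23 (a Gaussian mixture puts some mass below zero), reads "±0.5%" as an absolute tolerance of 0.005 on each quartile mass, and uses made-up mixture parameters (`GUESS_W`, `GUESS_MU`, `GUESS_SD`) that are not claimed to be anywhere near optimal.

```python
import math
import random

EDGES = [-math.inf, 23.0, 44.0, 70.0, math.inf]   # quartile boundaries from the text
TARGET_MEANS = [9.0, 31.0, 62.0, 81.0]            # within-bin mean ages from the text


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def bin_mass_and_mean(weights, means, sds, lo, hi):
    """Probability mass and conditional mean of a Gaussian mixture on (lo, hi]."""
    mass = 0.0
    partial = 0.0  # integral of x * density over the bin
    for w, mu, sd in zip(weights, means, sds):
        a = (lo - mu) / sd
        b = (hi - mu) / sd
        m = norm_cdf(b) - norm_cdf(a)
        mass += w * m
        partial += w * (mu * m - sd * (norm_pdf(b) - norm_pdf(a)))
    return mass, partial / mass


def constraint_report(weights, means, sds, mass_tol=0.005, mean_tol=0.5):
    """One (mass, mean, satisfied) tuple per bin."""
    report = []
    for i, target in enumerate(TARGET_MEANS):
        mass, mean = bin_mass_and_mean(weights, means, sds, EDGES[i], EDGES[i + 1])
        ok = abs(mass - 0.25) <= mass_tol and abs(mean - target) <= mean_tol
        report.append((mass, mean, ok))
    return report


def mixture_logpdf(x, weights, means, sds):
    dens = sum(w * norm_pdf((x - mu) / sd) / sd
               for w, mu, sd in zip(weights, means, sds))
    return math.log(dens)


def entropy_mc(weights, means, sds, n=1000, seed=0):
    """Monte Carlo estimate of differential entropy: -E[log p(X)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        x = rng.gauss(means[k], sds[k])
        total -= mixture_logpdf(x, weights, means, sds)
    return total / n


# Hypothetical starting guess, for illustration only.
GUESS_W = [0.3, 0.4, 0.3]
GUESS_MU = [12.0, 40.0, 75.0]
GUESS_SD = [8.0, 15.0, 10.0]

if __name__ == "__main__":
    for mass, mean, ok in constraint_report(GUESS_W, GUESS_MU, GUESS_SD):
        print(f"mass={mass:.3f} mean={mean:.1f} ok={ok}")
    print("entropy estimate:", entropy_mc(GUESS_W, GUESS_MU, GUESS_SD))
```

An optimizer (or a Stan model with soft constraint terms) would then search over the 8 free parameters for the highest entropy among candidates whose report is all `ok`.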

The answer: not very important. I couldn't find data back before 1960, but Pew had this graph from 1960 to 2010.

This suggests that back in the 1940s and 1950s, when the withholding system and progressive tax brackets were thought up (EDIT: 1913 was the first income tax in the US; 1942 was when they invented withholding and set the bottom bracket to double-digit percentages. See Wikipedia), the fraction of households that would be materially affected by the second earner being kept out of the workforce by high marginal tax rates was negligible. In particular, I'm just going to guess and say maybe 80% of family households had only the father employed in 1950. And even in those where the mother was interested in work, how many women actually had the education and qualifications to have jobs rivaling their husbands' earning power, such as lawyers, surgeons, engineers, advertising executives, or whatever it was that made you the big bucks in 1950? It couldn't have been more than on the order of 1% or so.

EDIT: This graph from Wikipedia shows only about 5% of the whole population had completed a bachelor's degree.

This link to Wikipedia shows that the "modern" tax code in which the first tax bracket was double-digit percentages started in 1942.

Based on my discussion of eating your own labor, one of the major points is that the second earner in a household faces a massive tax burden that makes it generally unwise for households to have two earners unless each of them earns a very low wage. What would be a better scheme? Really we'd like two people in a household to face symmetric work incentives. This would enable households where both people have advanced degrees or other large earning potential to each do the thing that society most wants them to be doing (namely, working to provide valuable services).

This symmetry property works as follows. Suppose we have workers W1 and W2, and they earn salaries S1 and S2. We can define a function Keep(S) that describes how much of the salary a single worker keeps:

$$Keep(S) = \int_0^S \left(1 - t(x)\right)\, dx$$

where $t(x)$ is the marginal tax rate on an extra dollar earned by a person who is already earning $x$ dollars.

Now, the symmetry property we want is

$$Keep_2(S1 + S2) = Keep(S1) + Keep(S2)$$

where $Keep_2$ is the keep function applied to a household's combined salary. That is, whether you aggregate the salaries into a single household and use the household keep function, or you consider them split out as two totally separate earners, you still get the same total amount kept by the two earners.
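
To make the symmetry property concrete (the household keeps the same total whether the two salaries are taxed jointly or separately), here is a small numerical check. It is only a sketch: the bracket schedules are invented, and it assumes the household is taxed by applying the single-earner schedule to the combined salary, which is a simplification of any real tax code.

```python
def keep(salary, brackets):
    """Take-home pay: sum (1 - marginal rate) over the part of salary in each bracket.

    brackets is a list of (lower_bound, marginal_rate) sorted by lower_bound.
    """
    kept = 0.0
    for i, (lower, rate) in enumerate(brackets):
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if salary <= lower:
            break
        kept += (min(salary, upper) - lower) * (1.0 - rate)
    return kept


def symmetry_gap(s1, s2, brackets):
    """Household keep on the combined salary minus the sum of individual keeps."""
    return keep(s1 + s2, brackets) - (keep(s1, brackets) + keep(s2, brackets))


# Hypothetical schedules, for illustration only.
FLAT = [(0.0, 0.30)]
PROGRESSIVE = [(0.0, 0.10), (50_000.0, 0.30), (100_000.0, 0.50)]

if __name__ == "__main__":
    print("flat gap:       ", symmetry_gap(75_000, 75_000, FLAT))
    print("progressive gap:", symmetry_gap(75_000, 75_000, PROGRESSIVE))
```

With the flat schedule the gap is zero up to floating point rounding. With the made-up progressive schedule, two earners at 75,000 each keep 20,000 less when their salaries are aggregated than when they are taxed separately.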

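Now, this is something like linearity. The property of linearity would be

$$Keep(S1 + S2) = Keep(S1) + Keep(S2)$$

and

$$Keep(a \cdot S) = a \cdot Keep(S)$$

and

$$Keep_2(S) = Keep(S)$$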

But, our fairness/symmetry requirement imposes a weaker requirement than linearity. For example suppose $Keep_2(x) = x$ for $x < 2$ and $Keep_2(x) = 2$ for x > 2, and $Keep(x) = x$ for $x< 1$ and $Keep(x) = 1$ for x > 1, (that is, you keep 100% of the first dollar each person earns and nothing after that). Then the symmetry we want holds, but of course this is a seriously pathological example.

Now let's examine a further principle. The idea is that you should keep SOME of every dollar that you earn; otherwise there is no incentive to do something valuable to society for which you can get paid, since the pay would all go to the government. This principle can be translated to the following mathematics:

$$\frac{dKeep}{dx} > 0$$

which says that the Keep function is always going up at least a little bit.

These two principles are sufficient to show that both the $Keep$ and the $Keep_2$ functions are lines with the same slope. We can do this by supposing that S1 is anything you like, and S2 is one dollar, which we will take to be $dx$, an essentially infinitesimal quantity. The symmetry property $Keep_2(S1 + dx) = Keep(S1) + Keep(dx)$, expanded to first order in $dx$, becomes

$$Keep_2(S1) + \left. \frac{dKeep_2}{dx}\right |_{S1} dx = Keep(S1) + Keep(0) + \left. \frac{dKeep}{dx}\right |_0 dx$$

which shows that $Keep_2(S1) = Keep(S1)$ and $\left. \frac{dKeep_2}{dx}\right |_{S1} = \left. \frac{dKeep}{dx}\right |_0$, which is to say that the $Keep$ and the $Keep_2$ functions are the same, and they have the same slope at any arbitrary point S1 as they do at x=0 (so it's a line). You can also get that the keep functions are the same from $Keep_2(S1+0) = Keep(S1) + Keep(0)$ and $Keep(0)=0$.

From this, I argue, the only income/wage tax compatible with the family value of "not punishing a second earner in a married household" is a flat tax on wages (i.e. a constant tax rate: you're always taxed say p% of your total earnings).

In my opinion, feminists should be arguing strongly for flat taxation, as women are more likely to be second earners, and women, particularly women who are educated, have student loans, have deferred incomes to gain skills, and so forth, are the ones most harmed by wage taxes with increasing rates for increasing household income. Furthermore, society needs the skills of all the highly skilled people we can get. Highly skilled people tend to marry other highly skilled people. We can't afford to be educating anyone to get skills and then leave them unable to take advantage of those skills because they're the second earner in a high earning household. That means we either give up on highly skilled people having families (completely untenable!), or we give up the value to society they would have provided in services if their marginal tax rates were constant, and make them pay for their student loans from their spouses' salaries instead of what would have been their high earning potential.

Note: those who complain about the "regressive" nature of flat taxes should defer their complaints. The amount you consume doesn't need to be equal to the wages you keep. You can also spend money you receive from the government or from non-wage sources, and I will argue for a Universal Basic Income to complement flat taxation in a future post. Furthermore, it's entirely plausible that the harm to society from making half the potential skilled workforce stay out of work is much larger than any harm from regressive taxation.

In Los Angeles, a middle class family of 4 can eat out at a restaurant for around $50 (likely a little more if you are eating only relatively healthy food).

In a month eating out every meal, you would spend: $50 x 3 x 30 = $4500/mo

Now, shopping for groceries for a family of 4 costs about $900/mo. Cooking for a family of 4 takes about 3 hours a day including washing up. This is an hourly "after tax wage" of $(4500-900)/(3*30) = $40/hr, and a monthly "after tax earnings" of $4500 - $900 = $3600/mo.
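
The arithmetic so far is easy to redo with your own local prices. The sketch below just restates it in Python. The function names are mine, and `pretax_equivalent` is a simple gross-up at a single marginal rate, which is how the 40% figure is used in the next paragraph.

```python
def implied_cooking_wage(meal_cost, meals_per_day, days, grocery_cost, cooking_hours_per_day):
    """Return (monthly cost of eating out, monthly savings from cooking, implied hourly wage)."""
    eating_out = meal_cost * meals_per_day * days
    savings = eating_out - grocery_cost
    hours = cooking_hours_per_day * days
    return eating_out, savings, savings / hours


def pretax_equivalent(take_home, marginal_rate):
    """Before-tax income needed to net take_home when every extra dollar is taxed at marginal_rate."""
    return take_home / (1.0 - marginal_rate)


if __name__ == "__main__":
    eating_out, savings, wage = implied_cooking_wage(
        meal_cost=50, meals_per_day=3, days=30, grocery_cost=900, cooking_hours_per_day=3
    )
    print(eating_out, savings, wage)            # 4500 3600 40.0
    monthly_gross = pretax_equivalent(savings, 0.40)
    print(round(monthly_gross), round(monthly_gross * 12))
```

The last line reproduces the roughly $6000/mo, or $72,000/yr, before-tax figure.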

A family with one wage earner with an upper middle class job (8 hour days + commute, earning $75,000/yr before taxes) and one stay-at-home parent has a marginal tax rate (in LA) of about 25% federal + 9% SS & Medicare + 6% CA state = 40%; this could go up to around 50% for upper-middle class incomes. (Let's not forget the employer part of SS & Medicare, but it doesn't enter this calculation.) So, to earn $3600/mo take-home, the second worker in the family needs a before-tax income of $3600/(1-.4) = $6000/mo, or $72,000/yr.

How many jobs are there where you can work 3 hours a day and earn $72,000/yr? The GDP/capita is $56,000, and median household income is also about $56,000, and most wage earners give up more than 8 hours a day (esp. including commute, and for those earning $75k as salaried workers, let's face it, 8 hr days are the minimum).

And this is just comparing cooking your own food to eating out at restaurants. Hiring someone to come and do laundry and clean a house twice a week easily costs $500/mo, which before taxes would be $500/.6 = $833/mo or $10k/yr. So, cooking and cleaning for a family of 4 for say 4 hours a day, vs 8-10 hour days with a salary of $82,000, and by the way you'll wind up financially breaking even? Hmmmmm, and people wonder why there's so much economic anxiety and so many Men Without Work? As more women get career jobs, it will become more and more common for husbands to stay home. Why would they go to work when that would mean losing both time and money?

I had a call from a client who is dealing with a quality control issue in the repair and re-installation of a large number of windows at a hotel. My recommendation came down to this: the only way to ensure consistency through time in installation was to randomly test as things were repaired.
In particular, no, you can't say that there were 200 windows installed in the last few months, but only 20 of them were eligible for testing on the day you went out, and of the 20, 7 were randomly selected for testing, and all 7 passed, and then extrapolate to what the condition of the full 1000 windows will be at the end of the project a year from now. Unlike a random number generator, whose consistency is mathematically guaranteed by algorithm design and testing, in the real world windows are installed through time, in different temperatures, different amounts of rain, with materials coming from different suppliers, with crews that get sick or have vacations, with materials that are delivered and sit out in the sun or don't sit out in the sun, with windows installed on different faces of the building getting different amounts of heat or wind, different amounts of dust on the surfaces that the sealant has to adhere to, different crews who do or do not know how to properly clean the surfaces... Unlike a computer RNG, there is no future guarantee of consistency in real-world conditions.

I also argued for choosing how much resources to allocate to testing by balancing the cost of testing against the expected risk cost of having to re-repair windows done wrong, rather than say relying on some formal NHST power calculation or the like (these questions typically come in the form "how many windows would we have to test to have 95% confidence?", which isn't a complete question, but even if you flesh it out so that it has an answer, it's still the wrong way to think about the problem). I find it makes sense to reorient the question towards something like: "what is the best number of windows to test to keep our total cost low, including the cost of both testing and of re-repairing things when we later discover they were done wrong?" Fortunately, engineers tend to find Bayesian interpretations of probability and cost optimization methods intuitive and appealing.
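
To show the flavor of that reoriented question, here is a toy expected-cost calculation in Python. Everything in it is an assumption made for illustration, not the analysis done for the client: the defect rate within a batch gets a Beta(a, b) prior, the policy is "if any tested window fails, rework the whole batch now; if all pass, accept it and fix defects later at higher cost", and all dollar figures are invented.

```python
import math


def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def expected_total_cost(n, batch_size, a, b, test_cost, rework_cost, late_fix_cost):
    """Expected cost of testing n windows from one batch, under a Beta(a, b) prior on defect rate p."""
    # P(all n tested windows pass) = E[(1 - p)^n]
    p_all_pass = math.exp(log_beta(a, b + n) - log_beta(a, b))
    # E[p * (1 - p)^n]: defect rate weighted by the chance the batch slips through
    slipped = math.exp(log_beta(a + 1, b + n) - log_beta(a, b))
    testing = n * test_cost
    rework_now = (1.0 - p_all_pass) * batch_size * rework_cost
    fix_later = slipped * (batch_size - n) * late_fix_cost
    return testing + rework_now + fix_later


def best_sample_size(batch_size, a, b, test_cost, rework_cost, late_fix_cost):
    """Number of windows to test that minimizes expected total cost."""
    return min(
        range(batch_size + 1),
        key=lambda n: expected_total_cost(
            n, batch_size, a, b, test_cost, rework_cost, late_fix_cost
        ),
    )


# Hypothetical scenario: 50 windows per batch, prior mean defect rate of 10%.
SCENARIO = dict(batch_size=50, a=1.0, b=9.0,
                test_cost=200.0, rework_cost=300.0, late_fix_cost=3000.0)

if __name__ == "__main__":
    n_star = best_sample_size(**SCENARIO)
    print("test", n_star, "windows; expected cost",
          round(expected_total_cost(n_star, **SCENARIO)))
```

The answer moves with the prior and with the cost ratios, so the conversation shifts from "what confidence level?" to "what do testing and late repairs actually cost us?"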
So, it turns out that the NVG599 doesn't seem to hand out big prefixes like /60 or /61, but it will hand out several individual /64 prefixes. So, now I have full native speed ipv6, with up to 15 prefixes for internal networks (the 16th is for the WAN side network that the ATT router "owns").

Example config in wide-dhcpv6, if you have WAN on eth0 and LAN on eth1 and you want the eth1.10 vlan to have its own prefix. The lines "ia-pd 0" and "ia-pd 1" define two requests for prefixes:

# Default dhpc6c configuration: it assumes the address is autoconfigured using
# router advertisements.

profile default
{
  information-only;

  request domain-name-servers;
  request domain-name;

  script "/etc/wide-dhcpv6/dhcp6c-script";
};

interface eth0 {
  send rapid-commit;
  send ia-na 0;
  send ia-pd 0; ## request our main prefix
  send ia-pd 1; ## request a second prefix
};

id-assoc na 0 {
  ## puts an address on the wan side eth0
};

## define the prefix we want for our main prefix
id-assoc pd 0 {
  prefix ::/64 infinity;
  # Internal interface (LAN)
  prefix-interface eth1 {
    sla-len 0;
    sla-id 1;
    ifid 1;
  };
};

## define the prefix we want for our second one
id-assoc pd 1 {
  prefix ::/64 infinity;
  # Internal interface (LAN)
  prefix-interface eth1.10 {
    sla-len 0;
    sla-id 1;
    ifid 1;
  };
};

EDIT: although the below system WORKS, it gave me ABYSMAL performance. On my gigabit connection I was getting ~500Mbps over ipv4 and around 20Mbps over the ipv6 tunnel set up this way (test your ipv4/6 speeds at Comcast's Speed Test which tests both types of connections). Going back to getting my ipv6 from ATT's router gave me full speed ipv6. Evidently there's some traffic shaping on the ATT side that doesn't apply to my Arris router. DON'T set up the following unless you need more subnets more than you need full speed.

So, the router supplied by ATT was an Arris NVG599, it has 6rd set up by default. ATT is set up with its own 6rd /28 prefix such that by appending your DHCPv4 address you can get a /60 prefix.
The Arris router supplied will delegate exactly one of the 16 /64 prefixes to your machine via a DHCPv6 request, which you can get wide-dhcp-client to do for you. This of course is fine if you just want a single /64 but what if you want something like a guest wifi VLAN with its own ipv6 prefix? You have a /60 available to you, but how to make it work?

First off, go to the firewall settings on the Arris box and under ip passthrough mode turn on passthrough with DHCPS-fixed, and choose your Linux router box as the machine to receive the public IP. Now, turn OFF the ipv6 services on the Arris under "Home Network". Restart the Arris box and the router so you get fresh DHCPv4 address on your router.

Now, you're running a Debian based system of course 😉 so you'll want to set up a 6rd tunnel, get yourself a /60 prefix, and then manually delegate one of those prefixes to your LAN interface. You can potentially manually delegate additional prefixes to VLANs or other interfaces on your router box as well. Here's how:

Make sure you've installed ipv6calc

apt-get install ipv6calc

In /etc/network/interfaces

auto tun6rd
iface tun6rd inet manual
  up /etc/network/6rdup
  down ip tunnel del tun6rd

Now, you need the script /etc/network/6rdup, mine looks like:

#!/bin/bash -x

ATT6RDPREF="2602:300::/28"
ATT6RDRELAY="12.83.49.81"
PUBLICIFACE=eth0
OURLANIFACE=eth1

PUBLICIP=$(ip addr show $PUBLICIFACE | sed -n -E -e "/(192.168)/! s: *inet ([0-9.]*)/.*:\1:p")
OUR6RD=$(ipv6calc -q --action 6rd_local_prefix --6rd_prefix ${ATT6RDPREF} ${PUBLICIP} | sed -e "s/::/::1/")

OURDELEGATE1=$(echo $OUR6RD | sed -e "s|.::.*|1::1/64|")

MTU=1472 ## it's what the router uses

echo ${PUBLICIP}
echo "IP Tunnel: ${OUR6RD} via tun6rd"

ip tunnel add tun6rd mode sit local ${PUBLICIP} ttl 64
ip tunnel 6rd dev tun6rd 6rd-prefix ${ATT6RDPREF}
ip addr add ${OUR6RD} dev tun6rd
ip link set tun6rd up
ip link set dev tun6rd mtu $MTU
ip route add ::/0 via ::${ATT6RDRELAY} dev tun6rd
ip addr add ${OURDELEGATE1} dev $OURLANIFACE
exit 0 ## do more error checking if you like


Your mileage may vary, and you may need to debug this stuff. In particular, I'm not doing much error checking, and I'm not removing the ipv6 address from the internal interface when the link comes down. Bringing links up and down several times on your router might cause trouble. Either fix that or just do a reboot instead of monkeying with individual interfaces (after all, you want to make sure you can restart the thing and get a properly working network).

With this all in place, together with dnsmasq to handle router advertising and do DHCP/DHCPv6 on my local lan, and Firehol to handle the firewall, I get a fully routed ipv6 subnet on my lan with firewall that passes only very limited inbound traffic, and full outbound traffic... with no appreciable change in latency. The 6rd relay is an ATT anycast ipv4 address so it picks out the "closest" ipv4 relay for you to use. In my case "ping6 facebook.com" has a 9-10ms round trip for example.

The IPv6 standard has a concept called ULA (Unique Local Address) similar in nature to the 10.0.0.0/8 or 192.168.0.0/16 address spaces in IPv4. For IPv6 these are in the address space fc00::/7. They are addresses that are defined only locally within an organization and don't route on the wide internet. In general, it seems like these addresses are a bad idea to use reflexively. But what are some actual use cases?

One that I can think of is to deal with the 6rd addresses typically handed out by consumer ISPs. The way these work is that each ISP has a prefix, and then your router creates a sub-prefix by taking the ISP prefix and appending your dynamic IPv4 address. If the ISP prefix is short enough, you can have a few bits of address space to play with for yourself. A 6rd prefix looks like

{ISPPREFIX} {YOURIPV4} {LOCALBITS}
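
As a rough illustration of that layout, the bit arithmetic can be written with Python's standard `ipaddress` module. This is a sketch, not how any ISP or router actually implements it: it assumes all 32 bits of the IPv4 address are embedded, the function name `sixrd_prefix` is made up, and 192.0.2.1 is a documentation address standing in for a real DHCP lease. The 2602:300::/28 prefix is the ATT one used in the 6rd script above.

```python
import ipaddress


def sixrd_prefix(isp_prefix, ipv4):
    """Delegated 6rd prefix: ISP prefix bits followed by the 32 bits of the IPv4 address."""
    isp = ipaddress.IPv6Network(isp_prefix)
    v4 = int(ipaddress.IPv4Address(ipv4))
    new_len = isp.prefixlen + 32
    if new_len > 64:
        raise ValueError("ISP prefix too long to leave room for a /64")
    base = int(isp.network_address) | (v4 << (128 - new_len))
    return ipaddress.IPv6Network((base, new_len))


if __name__ == "__main__":
    delegated = sixrd_prefix("2602:300::/28", "192.0.2.1")
    print(delegated)                      # 2602:30c:0:2010::/60
    lans = list(delegated.subnets(new_prefix=64))
    print(len(lans), lans[1])             # 16 2602:30c:0:2011::/64
```

A /28 ISP prefix plus 32 address bits leaves 4 local bits, which is where the sixteen /64 subnets come from. Change the IPv4 address and every one of them changes.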

Well, as you can probably guess, every time your power goes out for a couple of hours you might lose your IPv4 DHCP lease and now your whole network needs to be renumbered when you come back up with a new ipv6 prefix.

Ideally, when you sign up with your ISP, they'd give you a fixed static IPv6 /56 or even /48 prefix, which you would keep until you decide to move to a different ISP, but instead of that administrative hassle, they've invented a way to use the existing IPv4 DHCP infrastructure, which makes their lives easier, but your life less certain.

Sure, responding to new prefixes is do-able. But also, what about ISP outages? Just because your ISP goes down when a squirrel chews through your connection, doesn't mean you want to lose access to say your printer or scanner or file-share within your home or small business (ok, sure you've got an ipv4 10.0.0.0/8 set up anyway... but honestly that won't be forever).

Enter the ULA. You create a random prefix in the ULA space (get one randomly generated here) and then you set this up as an additional prefix for your local network. Now you can provide your laser printer or Samba share or web/security camera with a static local address that doesn't depend on what your ISP (or the squirrels!) had for breakfast, and your other hosts can auto-configure via SLAAC so that they can access these printers/cameras/shares via the ULA prefix.
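
If you'd rather generate the prefix yourself, a minimal sketch in Python follows. It simply draws the 40-bit global ID at random inside fd00::/8 (the locally assigned half of fc00::/7), which is a simplification of the hash-based procedure suggested in RFC 4193. The function names are my own.

```python
import ipaddress
import secrets


def random_ula_prefix():
    """Random /48 in fd00::/8: 8 bits of 'fd', then a 40-bit random global ID."""
    global_id = secrets.randbits(40)
    base = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((base, 48))


def ula_subnet(prefix, subnet_id):
    """The /64 with the given 16-bit subnet ID inside a /48 ULA prefix."""
    if prefix.prefixlen != 48 or not 0 <= subnet_id < 2**16:
        raise ValueError("expected a /48 prefix and a 16-bit subnet id")
    base = int(prefix.network_address) | (subnet_id << 64)
    return ipaddress.IPv6Network((base, 64))


if __name__ == "__main__":
    prefix = random_ula_prefix()
    print("site prefix:", prefix)
    print("LAN subnet: ", ula_subnet(prefix, 1))
```

Generate it once and write it down: the whole point is that this prefix never changes, so it should go into your router config as a fixed value rather than being regenerated at boot.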

If you're big enough to have your own assigned IPv6 prefix, then great, you can use those numbers, but if your ISP is someone like ATT who is using 6rd and your IPv6 prefix is inevitably going to depend on some IPv4 DHCP lease that is totally unpredictable, then ULA can give you a predictable redundant local network that always looks the same regardless of what happens on the wide internet. That's a good thing for the majority of us.

If you look on the internet there are lots of professional network people, especially at Colleges and Universities, who are trying to limit the bad effects of people streaming movies on their network. Netflix in particular is very popular and takes up to around 3GB/hour of streaming HD video (~ 7 Mbps). Even a regular SD video might take 1 GB/hr (2.2 Mbps). A few hundred students all lounging around between classes could consume a full gigabit/s internet connection.
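
The unit conversions behind those figures are easy to check. The snippet below is just that arithmetic in Python, assuming decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second); the function names are mine.

```python
def gb_per_hour_to_mbps(gb_per_hour):
    """Average bitrate in Mbps for a stream that moves gb_per_hour gigabytes each hour."""
    return gb_per_hour * 1e9 * 8 / 3600 / 1e6


def streams_per_link(link_mbps, stream_mbps):
    """How many simultaneous streams of stream_mbps add up to link_mbps."""
    return link_mbps / stream_mbps


if __name__ == "__main__":
    hd = gb_per_hour_to_mbps(3)   # about 6.7 Mbps
    sd = gb_per_hour_to_mbps(1)   # about 2.2 Mbps
    print(round(hd, 1), round(sd, 1))
    print(round(streams_per_link(1000, hd)))   # HD streams that fill 1 Gbps
    print(round(4 * hd, 1))                    # four HD streams
```

About 150 HD streams fill a gigabit link, and four HD streams come to roughly 27 Mbps, which is the figure used for the bandwidth cap further down.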

But, how to do it? If you have a smallish network with a single uplink point, it's not too hard. A combination of dnsmasq, Firehol, and FireQOS on your router together with the Linux ipset functionality can both prioritize the video streams so that they don't stutter, and at the same time limit the total bandwidth so that other activities don't suffer. Netflix, and YouTube with broad content delivery networks, can both easily hit 200 or more Mbps for 3 seconds while buffering up HD videos. That's 3 seconds of drop-out or stuttering on your VOIP call or lock-up in your interactive game. Those high peaks of bandwidth usage can saturate WiFi connections, make interactive games or voice communications break down and generally cause problems. Forcing these streams to buffer for a longer time at lower peak bandwidth means plenty of room for small interactive packets to interleave. Solving this issue is a good way to improve interactivity and voice quality. Here's how:

Preparations:

• You have a Linux machine as a router, either a commercial dedicated wifi-router with OpenWRT or a small Intel based server, such as a Mini-ITX or even a regular desktop machine with several NICs.
• You have dnsmasq, Firehol and Fireqos installed, as well as ipset and all associated required packages ("apt-get install firehol fireqos dnsmasq ipset" on Debian).
• You have configured your network so the computers using it get their ipv4 addresses from dnsmasq via DHCP, and also use dnsmasq as the local nameserver, so all the machines on your network ask dnsmasq on your router when they need names resolved.

There is a feature of dnsmasq which will add the looked-up IP addresses to an ipset if the name comes from a particular domain. To use this feature we'll create two ipsets (one for IPv6, one for IPv4) to hold addresses from the domains googlevideo.com (from which YouTube gets its video content) and nflxvideo.net (where Netflix serves its video). Add the following lines to dnsmasq.conf:

ipset=/nflxvideo.net/videostream4,videostream6
ipset=/googlevideo.com/videostream4,videostream6

To create the ipsets we'll use Firehol, and then we'll use Firehol to mark packets adding something like this to the config before the first interface definition:

ipset4 create videostream4 hash:ip
ipset6 create videostream6 hash:ip

dscp 0 PREROUTING inface MY_WAN_IFACE
## kill off the DSCP mark inbound, don't trust others to classify our packets

#mark my voice packets high priority
dscp4 48 PREROUTING udp src MY_ASTERISK_BOX

#mark video packets medium priority using AF41
dscp6 class AF41 PREROUTING src6 ipset:videostream6
dscp4 class AF41 PREROUTING src4 ipset:videostream4


We've marked packets from these video streaming sources with dscp mark AF41 (decimal value 34). This is the highest "assured forwarding" class, with low drop probability. It's recommended for use with Live Video by Cisco. My assumption here is that your spouse will not like it if video stutters thanks to your messing around on the router. The voice packets get an even higher DSCP=48 priority which ensures WiFi WMM puts them in the VOICE queue.
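
For reference, the DSCP numbers quoted here follow a simple pattern, sketched below in Python. The code point formulas are the standard ones (AFxy = 8x + 2y, CSn = 8n). The WMM queue table is the commonly used default that maps the top three DSCP bits to an access category, and it is an assumption about how your particular WiFi hardware and driver behave. All function names are mine.

```python
def af_dscp(af_class, drop_precedence):
    """DSCP value for assured forwarding class AF<af_class><drop_precedence>."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF classes are 1-4, drop precedence 1-3")
    return 8 * af_class + 2 * drop_precedence


def cs_dscp(selector):
    """DSCP value for class selector CS<selector>."""
    if not 0 <= selector <= 7:
        raise ValueError("class selectors are 0-7")
    return 8 * selector


def tos_byte(dscp):
    """Value of the whole IPv4 TOS / IPv6 traffic class byte (DSCP in the top 6 bits)."""
    return dscp << 2


# Assumed default mapping from the top 3 DSCP bits to WMM access categories.
WMM_QUEUE = {0: "BE", 1: "BK", 2: "BK", 3: "BE", 4: "VI", 5: "VI", 6: "VO", 7: "VO"}


def wmm_queue(dscp):
    return WMM_QUEUE[dscp >> 3]


if __name__ == "__main__":
    af41 = af_dscp(4, 1)
    print(af41, hex(tos_byte(af41)), wmm_queue(af41))   # 34 0x88 VI
    cs6 = cs_dscp(6)
    print(cs6, hex(tos_byte(cs6)), wmm_queue(cs6))      # 48 0xc0 VO
```

So the voice mark of 48 is the class selector CS6, and under that default mapping it lands in the voice queue while AF41 lands in the video queue.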

Finally, we'll use FireQOS to give these packets a moderately high priority but limit them to some total bandwidth. Typically maybe 4 HD streams would be enough for a single family, so around 27 Mbps would be reasonable; allowing for some overhead, maybe 33 Mbps would make sense.

Note, both Netflix and YouTube are modern infrastructure with plenty of IPv6 connectivity, so you will need to use an "interface46" declaration in FireQOS to prioritize BOTH types of IP packets. I have a bonded interface "bond0" that is my LAN output. In FireQOS I have QOS on the LAN side output:

interface46 bond0 lanout output rate 2500Mbit qdisc fq_codel minrate 1mbit
class voicertp commit 1mbit ceil 10mbit
match4 udp src MY_ASTERISK_BOX
class video commit 1mbit ceil 33mbit
match dscp AF41
....


In addition to this causing routed video stream traffic to be limited to 33mbit it also puts a dscp AF41 mark on the packet which causes it to have reasonably high priority over WiFi (the VI queue intended for video use) under the WMM QoS system in 802.11n and later, so downstream on your network it will also be treated as an important but not top-priority packet.

That's it! Using "fireqos status lanout" you can watch as packets go through the video class at maximum 33mbps instead of the default class at 250mbps or whatever, leaving you with plenty of spare bandwidth for interactivity over a typical WiFi connection.