Author @MathProgramming, book: pimbook.org. Optimization @Google. PhD from UI Chicago. Newsletter: buttondown.email/j2kun

Portland, OR
Joined March 2014
This week on Saturday Math Writing Showcase, we're continuing with Differential Privacy, but focusing on the chapter prose since the code is done. And boy is there a _lot_ to write about
Part of the difficulty is that most of the motivating stories about differential privacy focus on anonymization failures. AOL, Netflix, Mass. health records, all of those are bad, but DP was a lateral solution to those problems.
While those failed attempts all published individual data records, DP usually focuses on publishing aggregate statistics. If, for whatever reason, what you need is closer to individual data records, I don't know of existing DP mechanisms that help (open to tips!)
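To make "publishing aggregate statistics" concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. The function names, the `epsilon` parameter, and the toy usage below are illustrative assumptions, not code from the book.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential samples with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`."""
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one record changes a count by at most 1
    # (sensitivity 1), so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

For example, `private_count(ages, lambda a: a >= 65, epsilon=1.0)` releases a noisy number of people aged 65 or over. A smaller `epsilon` means more noise and stronger privacy.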
So instead I'm planning to split this into two chapters. First, find another prod use case better motivated by the AOL/Netflix stories (maybe k-anonymization for passwords?). For DP, focus on the single really compelling use case: the US Census
And just casually mention some of the other use cases I'm aware of (business "how busy" histograms for Google Maps, COVID datasets, and DP-based SQL engines)
Apparently Alabama thought it would screw up their redistricting methods, claiming that their access to census data must be "free from manipulation by unlawful statistical methods that affect districting decisions."
I suppose they did not understand that the pre-DP methods already affected redistricting accuracy, and it's not clear what they mean by an "unlawful" statistical method. AFAICT there is no law regulating what statistical methods are lawful.
The court rejected Alabama's claim, because the law that lets you complain about statistical methods that affect apportionment was only intended for individuals, not states, to file suit.
Also, apparently, you can't ask for an injunction (court forcing a change) against a plan to do something; it has to have already been done. You can't sue over a possible future injury, only a realized injury. Alabama also "sat on their hands" for 5 years before suing.
It appears (to my naive eye) that the court just sidestepped the core issue (is differential privacy a lawful use of statistics in the census) and dismissed the suit on standing. So it could come back in a few years after some harm is shown to Alabama's districts??
However, in discovery in this case, John Abowd, the Chief Scientist at the Census Bureau, disclosed the details of their internal investigation on reconstructing the secret census records from the released statistics. @TedOnPrivacy summarized it in desfontain.es/privacy/us-cen…
That's all for today's writing session. I feel mostly done with the Census research background, and finished the draft of the chapter for everything except the math details (the easy part!)
For those interested in a deeper dive into the math, @thegautamkamath has a course from 2020 with videos and supplementary reading gautamkamath.com/CS860-fa202…
Also, h/t to @elizabeth_joh. By listening to her podcast @trumpconlaw, I (a fool with no legal training) can kinda sorta read these district court opinions and make sense of them.
This was also my first attempt to dig through supplemental court documents, and apparently they charge you _per page_??? Access to court documents should be free!!!
Jeremy Kun retweeted
Mathematical expressions can now be displayed in Markdown on GitHub using the $$ and $ delimiters - all with the help of the wonderful @MathJax project. github.blog/2022-05-19-math-…
Got a bonus at work, what should I spend it on? (non-cash so I have to spend it on actual stuff)
The older I get the more I find Rust's interface idea (traits) appealing. Allow a proliferation of thin interfaces, letting clients implement them for upstream libraries if needed.
Next up is the chapter prose, and I still have a lot of reading to do from @zephoria on the census controversies.