Subscribe: The Geomblog
http://geomblog.blogspot.com/feeds/posts/default
Language: English
Tags:
authors  blind review  blind  don  double blind  double  learning  might  much  paper  papers  people  review  theory  work
Rate this Feed

Feed Details and Statistics
Preview: The Geomblog

# The Geomblog

## Ruminations on computational geometry, algorithms, theoretical computer science and life

Updated: 2018-04-20T03:38:01.276-06:00

A #metoo testimonial that hits close to home...

2018-02-14T07:00:37.576-07:00

Double blind review: continuing the discussion

2018-01-22T09:00:18.345-07:00

My first two posts on double blind review triggered good discussion by Michael Mitzenmacher and Boaz Barak (see the comments on these posts for more).  I thought I'd try to synthesize what I took away from the posts and how my own thinking has developed.First up, I think it's gratifying to see that the the basic premise: "single blind review has the potential for bias, especially with respect to institutional status, gender and other signifiers of in/out groups" is granted at this point. There was a time in the not-so-distant past that I wouldn't be able to even establish this baseline in conversations that I'd have.The argument therefore has moved to one of tradeoffs: does the installation of DB review introduce other kinds of harm while mitigating harms due to bias?Here are some of the main arguments that have come up:Author identity carries valuable signal to evaluate the work. This argument manifested itself in comments (and I've heard it made in the past). One specific version of it that James Lee articulates is that all reviewing happens in a resource-limited setting (the resource here being time) and so signals like author identity, while not necessary to evaluate the correctness of a proof, provide a prior that can help focus one's attention. My instinctive reaction to this is "you've just defined bias". But on reflection I think James (and others people who've said this) are pointing out that abandoning author identity is not for free. I think that's a fair point to make. But I'd be hard pressed to see why this increase in effort negates the fairness benefits from double blind review (and I'm in general a little uncomfortable with this purely utilitarian calculus when it comes to bias).As a side note, I think that focusing on paper correctness is a mistake. As Boaz points out, this is not the main issue with most decisions on papers. What matters much more is "interestingness", which is very subjective and much more easily bound up with prior reactions to author identity. Some reviewers may be aware of author identity and others might not. This inconsistency could be a source of error in reviewing.Boaz makes this point in his argument against DB review. It's an interesting argument, but I think it also falls into the trap of absolutism: i.e imperfections in this process will cause catastrophic failure. This point was made far more eloquently in a comment on a blog post about ACL's double blind policy (emphasis mine). I think this kind of all-or-nothing position fails to consider one of the advantages of blind review. Blind review is not only about preventing positive bias when you see a paper from an elite university, it’s also about the opposite: preventing negative bias when you see a paper from someone totally unknown. Being a PhD student from a small group in a little known university, the first time I submitted a paper to an ACL conference I felt quite reassured by knowing that the reviewers wouldn’t know who I was. In other words, under an arXiv-permissive policy like the current one, authors still have the *right* to be reviewed blindly, even if it’s no longer an obligation because they can make their identity known indirectly via arXiv+Twitter and the like. I think that right is important. So the dilemma is not a matter of “either we totally forbid dissemination of the papers before acceptance in order to have pure blind review (by the way, 100% pure blind review doesn’t exist anyway because one often has a hint of whom the authors may be, and this is true especially of well-known authors) or we throw the baby out with the bathwater and dispense with blind review altogether”. I think blind review should be preserved at least as a right for the author (as it is know), and the question is whether it should also be an obligation or not.Prepublication on the arXiv is a desirable goal to foster open access and the speedy dissemination of information. Double blind review is irrevocably in conflict with non-anonyous pre-print dissemination.Th[...]

Double blind review at theory conferences: More thoughts.

2018-01-09T00:41:39.915-07:00

I've had a number of discussions with people both before and after the report that Rasmus and I wrote on the double-blind experiment at ALENEX. And I think it's helpful to lay out some of my thoughts on both the purpose of double blind review as I understand it, and the logistical challenges of implementing it.What is the purpose of double blind review? The goal is to mitigate the effects of the unconscious, implicit biases that we all possess and that influence our decision making in imperceptible ways. It's not a perfect solution to the problem. But there is now a large body of evidence suggesting thatAll people are susceptible to implicit biases, whether it be regarding institutional status, individual status, or demographic stereotyping. And what's worse that we are incredibly bad at assessing or detecting our own biases. At this point, a claim that a community is not susceptible to bias is the one that needs evidence. Double blind review can mitigate this effect. Probably the most striking example of this is the case of orchestra auditions, where requiring performers to play behind a screen dramatically increased the number of women in orchestras. What is NOT the purpose of double blind review? Double blind review is not a way to prevent anyone from ever figuring out the author identity. So objections to blinding based on scenarios where author identity is partially or wholly revealed are not relevant. Remember, the goal is to eliminate the initial biases that come from the first impressions. What makes DB review hard to implement at theory venues? Theory conferences do two things that are different from other communities. Werequire that PC members do NOT submit papersallow PC members to initiate queries for external subreviewers. These two issues are connected. If you don't allow PC members to submit papers, you need a small PC. If you have a small PC, each PC member is responsible for many papers. If each PC member is responsible for many papers, they need to outsource the effort to be able to get the work done. As we mentioned earlier, it's not possible to have PC members initiate review requests if they don't know who might be in conflict with a paper whose authors are invisible. So what do we do? There's actually a reasonably straightforward answer to this. We construct the PC as usual with the usual restrictions.We construct a list of “reviewers”. For example, "anyone with a SODA/STOC/FOCs paper in the last 5 years” or something like that. Ideally we will solicit nominations from the PC for this purpose.We invite this list of people to be reviewers for SODA, and do this BEFORE paper submissionauthors will declare conflicts with reviewers and domains (and reviewers can also declare conflicts with domains and authors) at bidding time, the reviewers will be invited to bid on (blinded) papers. The system will automatically assign people. PC members will also be in charge of papers as before, and it’s their job to manage the “reviewers” or even supply their own reviews as needed. Any remaining requests for truly external sub reviewing will be handled by the PC chairs. I expect this number will be a lot smaller.Of course all of this is pretty standard at venues that implement double blind review. But what if a sub-area is so small that all the potential reviewers are conflictedwell if that's the case, then it's a problem we face right now. And DB review doesn't really affect it. What about if a paper is on the arXiv? We ask authors and reviewers to adhere to double blind review policies in good faith. Reviewers are not expected to go hunting for the author names, and authors are expected to not draw attention to information that could lead to a reveal. Like with any system, we trust people to do the right thing, and that generally works. But labeling CoI for so many people is overwhelming.It does take a little time, but less time than one expects. Prac[...]

Report on double blind reviewing in ALENEX 2018

2018-01-07T13:54:24.304-07:00

+Rasmus Pagh and I chaired ALENEX 2018, and we decided to experiment with double blind review  for the conference. What follows is a report that we wrote on our experiences doing this. There are some useful notes about logistics, especially in the context of a theoretically-organized conference on experimental algorithms.ALENEX 2018 Double Blind ReviewFor ALENEX 2018, we decided to experiment with a double blind review process i.e one in which authors and reviewers were unaware of each others’ identity. While double blind review is now almost standard in most computer science conferences, it is still relatively uncommon in conferences that focus on theoretical computer science and related topics. The motivationIn the original argument we presented to the ALENEX Steering Committee, we presented the following reasons for why we wanted double blind review:1. Eliminating bias. Andrew Tomkins did an experiment for WSDM this year and wrote a report on it: https://arxiv.org/abs/1702.00502One particular observation:"Reviewers in the single-blind condition typically bid for 22% fewer papers, and preferentially bid for papers from top institutions. Once papers were allocated to reviewers, single-blind reviewers were significantly more likely than their double-blind counterparts to recommend for acceptance papers from famous authors and top institutions. The estimated odds multipliers are 1.66 for famous authors and 1.61 and 2.10 for top universities and companies respectively, so the result is tangible”2. Common practice.Virtually every CS community except theory is doing double blind review, including most of ML (NIPS, ICML), DB (VLDB, SIGMOD), Systems (NSDI), etc. While theory papers have their own idiosyncrasies, we argued that ALENEX is much closer in spirit and paper structure to more experimental venues like the ones listed.3. Limited burden on authors and reviewers for an experimentThere was no real logistical burden. We were not blocking people from posting on the arXiv, or talking about their work. We’re merely requiring submissions be blinded (which is easy to do). For reviewers also, this is not a problem - typically you merely declare conflicts based on domains and that takes care of the problem of figuring out who’s conflicted with what paper (but more on this later).4. Prototyping.While theoryCS conferences in general do not make use of double blind review, ALENEX is a small but core venue where such an experiment might reveal useful insights about the viability of double blind overall. So we don’t have to advocate changes at SODA/STOC/FOCS straight up without first learning how it might work.5. PC submissions.We are allowing PC members to submit papers, and this has been done before at ALENEX. In this case double blind review is important to prevent even the appearance of conflict.The processBefore submission: We provided a submission template for authors that suppressed author names. We also instructed authors on how to refer to prior work or other citations that might leak author identity - in brief, they were asked to treat these as any third-party reference. We also asked authors to declare conflicts with PC members. After submission/before reviews: We recognized that authors might not be used to submitting articles in double blind mode and so went over each submission after it was submitted and before we opened up bidding to PC members to make sure that the submissions were properly blinded. In a few cases (less than 10/49) we had to ask authors to make modifications. During review process: The next step was to handle requests for subreviewers. Since PC members could not determine CoIs (conflicts of interest) on their own, all such requests were processed through the PC chairs. A PC member would give us a list of names and we would pick one. (so more private information retrieval than a zero knowledge protocol!)IssuesA number of issues came up that appear to be unique to the theory conference review process. We[...]

Cake cutting algorithms in prison

2017-10-22T23:49:41.985-06:00

This past Friday, I gave a lecture on cake cutting algorithms at the Timpanogos Women's Facility as part of a lecture series organized by my Utah colleague Erin L. Castro and her Utah Prison Education Project. The project's mission is to
... provide quality, sustained, and meaningful higher educational opportunities to individuals incarcerated in Utah state prisons. Through embodying the mission of the University of Utah, the Project assists incarcerated students and non-incarcerated volunteers to live lives of impact, both in prison and post-incarceration, by fostering leadership, civic engagement, and critical inquiry. UPEP aims to create lasting impact in our state and our communities by investing in people and providing them the tools necessary for empowerment and lifelong learning.
I think this is incredibly important work.  We don't need to get into a much larger discussion about rehabilitation versus punishment theories of justice to appreciate how providing access to education might allow incarcerated students the ability to turn their life around, or even find opportunities for work once they leave prison so that they have a way to support themselves without falling back into criminal activities. Maybe the amount of education they get in prison might even one day be a predictive factor in deciding whether they will reoffend!

When I first mentioned this on Facebook, many people were curious about what it was like. I can report here that my lecture was.... more or less exactly like a lecture would happen in any of the other places I lecture. Students came in with a lot of fear of math (which is why I thought I'd talk about recreational math). There was an actual cake and lots of nervousness when I had students run the algorithms on the cake. There were lots of questions about the different models of fair division, and some confusion about whether we could trust the results of the process.

In other words, every kind of question one might expect in any setting. The students were engaged and interested. They hadn't had too much math experience except what they did in high school, but they were able to follow along quite well and come up with their own algorithms as we advanced further into the lecture. I enjoyed myself, and I hope they did too!

But of course this was at a prison, so there were some other details.

• Arranging the cake (and a whiteboard) took some work, and Erin and her group (as well as the lieutenant at the prison) did their magic to make it happen. There was some discussion about who would use the plastic butter knife: eventually the students did.
• I couldn't bring my laptop in, and used a whiteboard (which worked fine for my talk). But I hadn't planned on using one, and maybe if I did want to do a slide presentation that would have been arranged as well.
• There was some discussion about where the students would sit, and how much spacing was needed. There was even some minor discussion about whether we could all get together for a group photo at the end (we did!).

(image)

On music, mathematics and teaching.

2017-09-29T09:00:34.916-06:00

I'm a perpetual student when it comes to my guitar-playing. I started off learning acoustic guitar, and taught myself a little bass in college. When I was in the college band our music advisor played some classical guitar and that got me hooked.I've had a number of teachers through grad school and beyond, but I've always plateaued out at a level where I'm competent but no better. At some point I realized that what motivated me to play was the right kinds of music (this I also learnt when watching my children learn an instrument), and that inexorably led me to my new quest: learning flamenco guitar.Flamenco is a very passionate style of playing - and classical guitar can seem bloodless and sedate in comparison. It also requires many different right hand techniques that are not common in classical guitar problem.The net result is that I'm back to being a beginning student again - struggling with mechanics, hand position and note playing. It's a lot of frustration with the occasional moment of transcendence. I whine at my teacher in the way students whine at me, and he's sneaky enough that now he just asks me "so what would you tell your own students" and that shuts me up.Which brings me to the point of this post (what??? posts need a point?). We spent a lesson last week talking about extracting expression and feeling from the instrument. I kept asking him about what tools I could use (beyond the usual tone control by moving up and down the fretboard and using volume) to express more emotion, and what emotion that would be. His response was first to show me this beautiful video of an interviewer "talking" to Paco De Lucia's guitar width="320" height="266" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/KLtlM5r8rdc/0.jpg" src="https://www.youtube.com/embed/KLtlM5r8rdc?feature=player_embedded" frameborder="0" allowfullscreen>and then explain to me that I have to dig deep within myself to find the way I can relate to the music. And then it hit me (painfully). Aditya Bhaskara and I are running a theory seminar on reading classic theory papers where (much like my previous seminar) there's a strong emphasis on getting to the core ideas and intuitions that drive a result. I'm constantly exhorting students (even more so than Aditya - I think it's interesting to see how different people absorb messages from a paper) to find the core intuition in the paper and be able to express it in a short "story" almost. And that's essentially what my teacher is exhorting me to do. In both cases, the expert is trying to get the student to transcend the purely mechanical aspects of (reading the paper/playing the instrument) and get to the deeper (mathematical/emotional) truth of the (paper/piece). And it's hard precisely because the student in either case is still struggling with the mechanical, and doesn't yet have the facility with the tools to let them fall away. Does this mean I'll be a more enlightened teacher? I doubt it :). But I do have a little more sympathy for my students.  [...]

"X is a social construct" and the perils of mining behavior.

2017-09-29T00:02:06.205-06:00

After the infamous Google memo (and frankly for much longer if you work in algorithmic fairness), the idea of something being a "social construct" has popped up again, and I will admit that I've struggled with trying to understand what that means (damn you, focused engineering education!)Ta-Nehisi Coates' article about race is a short and excellent read. But I also want to highlight something much closer to home. BYU Radio's Julie Rose did an interview with Jacqueline Chen (at the U) on her recent work on perceptions of race in the US vs Brazil.The interview is here (and it's short - starting at around 20 minutes in) and in it Prof. Chen very masterfully lays out the way in which race is perceived and how it changes based on changes in context. The interview is based on a recently published paper ($$).One important takeaway: the way in which one's racial identity is perceived varies greatly between the US (which appears to be influenced by parental information) vs Brazil (where skin color appears to be the dominant factor). More importantly, the idea of race as immutable vs changeable, a categorical attribute versus a continuous one, all vary.And that's what we mean by saying that X (here, race) is a social construct. It's not saying that it's fictitious or less tangible. But that it's defined by the way we talk about it in society.Why is this important? When we collect data as a way to predict behavior, we're making an implicit claim that behavior can be predicted (and explained) by intrinsic and often immutable descriptors of an individual. We use (or don't use) "race" as a feature when building models.But this itself is a huge assumption! It assumes that we can intelligently ascribe features to individuals that capture these notions, and that they are defined solely by the individual and not by context. The brilliant Medium article about the paper that claimed to predict criminality from facial features makes this point very well.But do we capture the entire history of environmental factors that make up the story of an individual. Of course not. We essentialize an individual into a collection of features that we decide captures all their relevant traits for the purpose of prediction, and then we build a model that rests on this extremely problematic idea.Much of the work I do on fairness can be reduced to "check your data, and check your algorithm". What we're also thinking about (and that directly speaks to this issue) is "check your features".It turns out that way back in 1921, Walter Lippman had something interesting to say about all of this. From a longer essay that he wrote on the importance of frames as mediating how we perceive the world (and it says something about fake news and "true facts" as well):And so before we involve ourselves in the jungle of obscurities about the innate differences of men, we shall do well to fix our attention upon the extraordinary differences in what men know of the world. I do not doubt that there are important biological differences. Since man is an animal it would be strange if there were not. But as rational beings it is worse than shallow to generalize at all about comparative behavior until there is a measurable similarity between the environments to which behavior is a response. [...] On free speech, gerrymandering and self-contradictory rule systems 2017-08-23T09:00:00.176-06:00 In the light of the wave of racist and neo-Nazi bile being slung around in Charlottesville and beyond, Karl Popper's Paradox of Tolerance has been doing the rounds. Paraphrased, it can be phrased as In a tolerant society, one must be tolerant of everything except intolerance. There's an interesting self-referential element there that's reminiscient of Gödel's incompleteness theorems. To wit, We can either have a system of rules that is consistent, but cannot account for all phenomena that are consistent with the rules, or have a system that is inconsistent. In other words, the principle of tolerance in society cannot incorporate its own negation and still survive. One nuance here is that Popper doesn't necessarily advocate for restricting intolerant speech as much as speech that leads to incitement, which is somewhat in line with 1st Amendment protections in the US. This reminds me of another situation where self-contradictory rule systems cause problems: gerrymandering. The Supreme Court is soon to hear arguments in a case of partisan gerrymandering from Wisconsin. Roughly speaking, a partisan gerrymander (as opposed to a racial gerrymander) is one in which districts are drawn to favor a certain political party (the "partisan" aspect") as opposed to favoring a certain race. While racial gerrymanders have been repeatedly struck down in court (the latest being a case from Texas), partisan gerrymanders have a much more mixed record, which is why many are watching this new case with great interest. One potential argument in favor of allowing partisan gerrymandering is that if a party wins, their victory should allow them the power to redraw districts -- the "elections have consequences" principle. But it seems to me that this is again an example of Popper's paradox of tolerance. That is to say, if we allow the party that wins an election to do partisan gerrymandering, then we are allowing through the democratic process an action that will serve to directly reduce the ability of the democractic process to represent the people. And for that reason a partisan gerrymander should not be allowed. I wonder if there are other settings where this principle might help clarify the limits of a permissive rule structure.(image) TheoryFest Day 3: Plenaries 2017-06-23T13:28:36.840-06:00 I was the chair of the plenary session on Wednesday, so was too focused on keeping track of time and such to pay full attention to the talks. Having said that, all the speakers we've had so far have done a bang-up job of keeping within their time window without much prompting at all. So I can only give my very brief thoughts on the talks. For more information, go here. Atri Rudra was up first with a neat way to generalize joins, inference in probabilistic models and even matrix multiplication all within a generic semi-ring framework, which allowed the authors to provide faster algorithms for join estimation and inference. In fact, these are being used right now to get SOTA join implementations that beat what Oracle et al have to offer. Neat! Vasilis Syrgkakis asked a very natural question: when players are playing a game and learning, what happens if we treat all players as learning agents, rather than analyzing each player's behavior with respect to an adversary? It turns out that you can show better bounds on convergence to equilibrium as well as approximations to optimal welfare (i.e the price of anarchy). There's more work to do here with more general learning frameworks (beyond bandits, for example). Chris Umans talked about how the resolution of the cap set conjecture implies bad news for all current attempts to prove that \omega = 2 for matrix multiplication. He also provided the "book proof" for the cap set conjecture that came out of the recent flurry of work by Croot, Lev, Pach, Ellenberg, Gijswijt and Tao (and cited Tao's blog post as well as the papers, which I thought was neat). I hope the slides will be up soon. If not for anything else, for Atri's explanation of graphical models in terms of "why is my child crying", which so many of us can relate to. (image) Minority Report and Predictive Policing 2017-06-23T13:16:48.908-06:00 Minority report (the movie) is 15 years old. Who knew! Well I certainly didn't, till I was called by a reporter from CNN who wanted to talk about the legacy of the movie. Here's the link to the story. It was a depressing conversation. We went over some of the main themes from the movie, and I realized to my horrow how many of them are now part of our reality. Precogs are now PredPol. Algorithms that claim to know where crime will happen. The companies building predictive policing software will often take umbrage at references to Minority Report because they say they're not targeting people. But all I say is "….yet". Predictions have errors. The very title of the movie telegraphs the idea of errors in the prediction system. And much of the movie is about a coverup of such a 'minority report'. And yet today we treat our algorithms (precogs) as infallible, and their predictions as truth. VERY personalized advertising. The main character is targeted by personalized advertising and a good section of the plot involves him trying to get a replacement eyeball so retina scans don't detect him. And then we have this. Feedback loops. The debate between Agatha (the minority precog) and Anderton about free will leads him to a decision to change his future, which then changes the prediction system. In other words, feedback loops! But feedback loops work both ways. Firstly, predictions are not set in stone: they can be changed by our actions. Secondly, if we don't realize that predictions can be affected by feedback from earlier decisions, our decision-making apparatus can spiral out of control, provably so (consider this a teaser: I'll have more to say in a few days). What's a little sad for me is because I wasn't sufficiently 'woke' when I first saw the movie, I thought that the coolest part of it was the ingenious visual interfaces on display. We're actually not too far from such systems with VR and AR. But that now seems like such a minor and insignificant part of the future the movie describes. (image) TheoryFest Day 3: Streaming symmetric norms. 2017-06-23T13:02:16.089-06:00 There's a weird phenomenon in the world of streaming norm estimation: For \ell_0, \ell_1, \ell_2 norm estimation, there are polylog (or less)-space streaming approximation algorithms. But once you get to \ell_p, p \ge 3, the required space suddenly jumps to polynomial in n. What's worse is that if you change norms you need a new algorithm and have to prove all your results all over again. This paper gives a universal algorithm for estimating a class of norms called "symmetric' (which basically means that the norm is invariant under coordinate permutation and sign flips - you can think of this as being invariant under a very special class of rotations and reflections if you like). This class includes the \ell_p norms as a special case, so this result generalizes (upto polylog factors) all the prior work. The result works in a very neat way. The key idea is to define a notion of concentration relative to a Euclidean ball. Specifically, Fix the unit \ell_2 ball in n dimensions, and look at the maximum value of your norm \ell over this call: call it b_\ell. Now look at the median value of your norm (with respect to a uniform distribution over the sphere): call it m(\ell). Define the modulus of concentration as$$ mc(\ell) = \frac{b_\ell}{m_\ell} 
Notice that this is 1 for $\ell_2$. For $\ell_1$, the maximum value is larger: it's $\sqrt{d}$. The median value as it turns out is also $\Theta(\sqrt{d})$, and so the modulus of concentration is constant. Interestingly, for $p > 2$, the modulus of concentration for $\ell_p$ is $d^{1/2(1 - 2/p)}$ which looks an awful lot like the bound of $d^{1-2/p}$ for sketching $\ell_p$.

As it turns out, this is precisely the point. The authors show that the streaming complexity of any norm $\ell$ can be expressed in terms of the square of $mc(\ell)$. There are some technical details - this is not exactly the theorem statement - but you can read the paper for more information.

Update: Symmetric norms show up in a later paper as well. Andoni, Nikolov, Razenshteyn and Waingarten show how to do approximate near neighbors with $\log \log n$ approximation in spaces endowed with a symmetric norm. It doesn't appear that they use the same ideas from this paper though.(image)

TheoryFest Day 3: "I forgot to talk about Kant"

2017-06-22T13:19:08.403-06:00

The above is an actual quote from Oded Goldreich in his hilarious speech accepting the Knuth Prize for 2017. This speech was highly anticipated, because most of us have spent years reading and marvelling at Oded's opinions (he addressed the elephant in the room very promptly)

As the title suggests, there was a heavy dose of philosophy, with quotes from Kant, Weber, Frankl, and MacIntyre. He also gave a brief technical exposition of new developments in interactive proofs starting from the "IP for Muggles" paper of Goldwasser, Kalai and Rothblum. This topic is quite dear to me right now, given that my student Samira Daruki just defended a thesis that is at least partly about interactive proofs where the verifier is extremely weak (aka streaming). I'll have more to say about this later.

One can only hope the video of Oded's talk will be available online soon: it's can't miss-TV :).

(I'm keeping these posts short in the hope that I'll actually write them. The danger in longer posts is that they never get written). (image)

TheoryFest Day 3: Panels and Lunches

2017-06-22T13:06:46.597-06:00

(for various reasons, I don't have wifi access at my Airbnb, so my posts are delayed. But it's not like you're hanging on my every word...... are you?)

Day 3 started off with a panel moderated by Anna Karlin, starring Cynthia Dwork, Russell Impagliazzo, Ankur Moitra, Tim Roughgarden, Dan Spielman and Andy Yao. I personally like panels to have a bit of an edge and controversy, and there wasn't that much of it here, but there was some good discussion about various aspects of TCS, as well as some good advice for the young'uns in the audience. Some of the highlights:

• Andy Yao thinks a working quantum computing device is imminent, but one that we could solve Rubik's cube puzzles on might take a while longer.
• Cynthia Dwork made a strong argument for the power of theoryCS to define problems clearly and rigorously even for messy concepts like privacy and fairness.
• Ankur Moitra made a point that I think should be made more often: that modeling and algorithm design are fundamentally different things even if they might be influenced by one another. And it's important not to confuse the two.
• Dan Spielman tried to emphasize that even if it might seem like senior people at a conference know what they're talking about, they don't really know. And that's it's ok to be in a constant state of befuddlement. As Anna Karlin put it I think, "We must learn to dance with chaos".
There was lots more that the panel discussed, but these are what stuck in my mind. One pity was that we ran out of time and couldn't have any audience questions.

As I had mentioned earlier, there were networking lunches for senior (aaaah, I'm a senior now :() and junior people. Never one to let of a chance to pontificate, I had lunch with Sima Hajiaghaei (UVic), Lalla Mouatadid (Toronto), Benjamin Priest (Dartmouth) (ed: you need a website!) and Shravas Rao (NYU) at a nice dumpling place in Chinatown. They won't like me saying this, but I will anyway :) - they all seemed so young and enthusiastic I felt a little old. But we had a great chat, and if you bump into any of them at a theory conference in a future make sure to say hi :).

(image)

TheoryFest Day 2: Directed Spectral methods

2017-06-21T08:44:42.867-06:00

There was an interesting talk on developing spectral tools for directed graphs. The magic of spectral methods for undirected graphs comes from the interpretation of the graph as a Markov chain, and then using spectral properties of the Laplacian (which is symmetric and positive semidefinite) to reason about a host of properties like conductance, mixing times, sparsification, and others.

But for directed graphs none of this works as stated. The Laplacian is no longer symmetric (goodbye real eigenvalues) and even if we symmetrize it it may not be positive definite (goodbye nonnegative eignenvalues). One of the key tricks that this paper exploits (and relates to work by Fan Chung on Cheeger inequalities for directed graphs) is the following:
Suppose our directed graph is Eulerian, which means that each vertex has the same incoming and outgoing degree. Then if we construct the associated Laplacian $L$ and symmetrize it as $L' = (L + L^T)/2$, then the resulting symmetric matrix $L'$ is positive definite!
This turns out to be a key ingredient of designing efficient algorithms for estimating spectral quantities on directed graphs, and is a trick worth remembering.

(image)

TheoryFest: Avi Wigderson's plenary talk.

2017-06-21T08:08:58.687-06:00

The plenary talk on Tuesday morning was by Avi Wigderson, on the nature of TCS.

Now if you haven't attended an Avi Wigderson plenary before, you should imagine yourself lying in a nice warm tub of water with soothing bath salts massaging your aching limbs. Avi's talks are balm for the poor persecuted theoretician who comes to STOC to remember that there people in the world who don't insist on demanding practical value for every darn theorem you prove.

His talk was not technical, and not even particularly informative for anyone who's been paying attention to TCS the last 40 years. What it was, was a love song to TCS, sung by a bard with +5 charisma who makes you feel like you have purpose and meaning in your life once again.

His slides will be up soon, and you can see them for yourself. But if there's anyone who can make the argument for the universality, depth and reach of TCS at its highest conceptual level, it's Avi. I'll link them when they're available and if you care at all about the field, you should take a look. (image)

TheoryFest: Attack of the talks

2017-06-21T08:07:00.657-06:00

Day 2 at TheoryFest, and people are still attending talks. Ok maybe I shouldn't be that surprised. But it's kind of nice to see anyway. The lounge area is deserted during the sessions and full during the breaks.

The TheoryFest organizing committee (as represented by Ryan Williams) has organized lunch time meetups for senior and junior people (where junior means grad students and postdocs, not assistant profs — sorry, assistant profs). The idea is that the senior people sign up and the junior people can email them to meet up. It's a nice idea, especially given the feeling that I've heard from people in the past that STOC is an unfriendly place for students, especially those not from high-powered theory places.

One interesting aspect of coming to STOC after a number of years is while I know many the older folk, I don't know most of the students, and so I float around relatively anonymous (it doesn't help that I haven't blogged much in recent months - I've been distracted, people!). It's almost like being at a brand new conference that I've never attended before.(image)

TheoryFest: Short Takes

2017-06-20T07:02:50.002-06:00

So far, the tutorials appear to have been well attended, The DL tutorial had a full house in a big room, but the other two tutorials did pretty well to. The plenary talks (the reason I'm here) start today and it will be interesting to see what kind of attendance we see.

The business meeting will reveal the official numbers: indications are that they will be quite good especially considering we're not in the US.

Montreal is a nice town. My AirBnB is right next to Chinatown and we're pretty much in the center of everything. The parts I've walked through have this kind of pacing with cul de sacs, pedestrian-only areas, main roads, and parks that feels like well-paced musical composition.
(image)

TheoryFest I: Deep Learning

2017-06-20T06:55:54.346-06:00

(ed: I'm guessing you never thought those words would appear together in the same phrase)Ruslan Salakhutdinov gave a tutorial on deep learning today. Now deep learning is a tricky topic for theory (more on that below), but I thought he did a nice job in his two hours of explaining the basics of how a neural net works and how it's trained, without getting too far into engineering weeds, but also being able to explain important ideas like drop out, SGD, batch normalization and momentum. He skillfully avoided the alphabet soup of architectures in a way that didn't really affect one's understanding (I think). He didn't get too much into RNNs, but I think that was a conscious and fair choice. Discussing the unsupervised element of DL - autoencoders, RBMs, DBMs, and GANs. Now I have a little bit of an advantage here because we're running a summer reading group on GANs, but I liked his framing here in terms of supervised and unsupervised, as well as the different kind of generative criteria (probabilistic/not, tractable/intractable, explicit/implicit) used to classify the different approaches. In a first at STOC (maybe?) the audience got to see a video of an RL system learning to play Doom. It was pretty neat. Having said that, I'm not exactly the right audience for this kind of talk, since I'm decently familiar with deep learning. What surprised me though was that when I polled people during the breaks. most of the people who attended the tutorial felt the same way. And the common refrain was "We've heard so many faculty canddiates talk about deep learning that we know the basics now"!So I almost wonder if Russ miscalbrated the level of the audience. There was also some minor grumbling about the lack of clear open problems. I actually don't fault him for that. I think it might have been useful to expose core ideas for which answers don't exist, and some of these came out in the Q&A. But let me make a more general observation. Deep learning is a tricky topic for theoreticians to negotiate, for a number of reasons. firstly, I don't think it's even useful to ask the most general form of "what does a neural net DO" questions. Neural nets are very very general (in particular a 2-level neural net can approximate any function). So asking general questions about them is like asking to characterize a Turing machine with no constraints. You can't say much beyond recursive and r.e. I think ther right questions are much more specific. DL right now is very much an engineering discipline, which is to say that the practice of DL is focused on trying out engineering hacks that appear to get improvements. And these improvements are significant enough that it really doesn't matter why they work. In other words, DL doesn't need theory… at least now. Even if you don't grant the previous two positions, there's another issue. Descriptions of DL systems feel a lot like experimental physics: "hey we did all of this and it worked this way. Now give us a theory". With the difference that there's no "there" there: there's no fixed Nature that we can design theoretical laws against. Only a gazillion-dimensional highly nonconvex landscape where we don't even try to find a provably high quality answer. So I think we're on our own if we want to (or care to) understand the computational power and expressivity of neural networks. It's very interesting, and we're seeing nice results begin to appear, but we should do it because there's interesting theory to be had here, rather than trying to hew to close to actual DL systems. [...]

Off to TheoryFest 2017

2017-06-18T06:34:38.514-06:00

I'm at the airport getting ready to board my flight for Montreal to attend TheoryFest 2017. And much to my amazement, I discover that STOC has its own event mobile app. Who knows, maybe this means that by next decade theory conferences will do double blind review? (ed: stop it, now that's crazy talk!)

Snark aside, I'm very excited to see how the new format for STOC works. This is an experiment, and like all experiments it will take some time to see how the idea plays out. But first impressions matter, and if this year's iteration goes well, that will be good news for the overall plan.

I'm of course excited for the plenary invited talks, but I'm also excited to see the tutorials and workshops. Even theory land is not safe from the invasion of the Huns deep learning and I'm sure there will be standing-room only space at Ruslan Salakhutdinov's tutorial on the foundations of deep learning.

My "coverage" will be a mix of tweeting and blogging. Stay tuned!(image)

Easy memory aides

2017-05-01T09:44:41.399-06:00

Certain memory aides are so ... well.. memorable that they stick in your mind exactly the way they should. Here are three that I've heard of:

• Keeping track of the Baltic states: I think I heard this from +Fernando Pereira - They are alphabetical in order from north to south. So it's Estonia, Latvia and then Lithuania.
• Converting between miles and kilometers: This is a particularly geek-friendly mnemonic - If the value in miles is (close to) a Fibonacci number, the value in kilometers is (close to) the next Fibonacci number. This works because the golden ratio is very close to the miles-to-km conversion factor.
• Keeping track of type-I/type-II errors: I heard this one on twitter the other day, and it goes like this. In the story of the boy who cried wolf, the errors appear in order.
Things I need memory aides for:
• Bernstein vs McDiarmid vs Azuma, or different Chernoff bounds in general.
• Jensen's inequality, so I don't have to constantly draw the picture of the convex function to derive it.
• Submodularity vs supermodularity: I know you add the diagonals and compare them, but which way is which???
Others?
(image)

TheoryFest at STOC 2017.

2017-04-19T00:51:25.368-06:00

(ed: I can't believe it's been four months since my last post. Granted, I've been posting over at algorithmicfairness.wordpress.com, but still. Time to turn in my theory card...)

One of the things I've been doing is helping out with the STOC TheoryFest this summer. Specifically, I've been on the plenary talks committee that was tasked with identifying interesting papers from other parts of CS and beyond that might be interesting for a STOC audience to hear about. After many months of deliberations, we're finally done! Boaz has more details over at his blog.

Do take a look at the papers. Boaz has a great summary for each paper, as well as links to the sources. You'll see a pretty decent coverage of different topics in CS, including even some theory B !!

But more importantly, please do register for STOC! This year, the organizers are making a real effort to restructure the conference to make it more accessible with a diversity of events like the plenary talks, tutorials, and workshops, and there should be something for everyone. Early registration ends on May 21, so click that link now.

In this day and age, I don't need to emphasize the importance of standing up and being counted. If you've ever grumbled about the old and stultifying format of STOC, this is your chance to vote with your feet. A big success for TheoryFest 2017 will surely encourage more along these lines in future years.(image)

A short course on FATML -- and a longer course on ethics in data science.

2016-12-12T04:44:49.711-07:00

I'm at IIIT Allahabad teaching a short course on fairness, accountability and transparency in machine learning. This is part of the GIAN program sponsored by the Government of India to enable Indian researchers from outside the country to come back and share their knowledge. The clever pun here is that GIAN -- which stands for Global Initiative of Academic Networks -- also means 'knowledge' in Hindi (ज्ञान).
Anyway, I'm excited to be teaching a course on this topic, and that has forced me to organize my thoughts on the material as well (which has already led to some interesting insights about how the literature fits together). I'm writing out my lecture notes, and one of the energetic students at the course has kindly offered to help me clean them out, so I hope to have a set of notes in less than a year or so.

Note that the syllabus on the linked page is evolving: it will change over the course of the next few days as I add more things in.

I'm also excited to to be teaching a (brand-new) undergraduate course on ethics and data science next fall. I'll have more to say about this as I prepare material, but any suggestions/materials are welcome.  (image)

Modeling thorny problems.

2016-10-20T08:00:17.417-06:00

Today, I got into a nice twitter discussion with former colleague and NLPer extraordinaire (even when he hates on algorithms) Hal Daumé. Here's what he said:
(image)

Axiomatic perspective on fairness, and the power of discussion

2016-09-26T15:49:38.827-06:00

Sorelle Friedler, Carlos Scheidegger and I just posted a new paper on the arxiv where we try to lay out a framework for talking about fairness in a more precise way.

The blog post I link to says more about the paper itself. But I wanted to comment here on the tortuous process that led to this paper. In some form or another, we've been thinking about this problem for two years: how do we "mathematize" discussions of fairness and bias?

It's been a long and difficult slog, more than any other topic in "CS meets X" that I've ever bumped into. The problem here is that the words are extremely slippery, and refract completely different meanings in different contexts. So coming up with notions that aren't overly complicated, and yet appear to capture the different ways in which people think about fairness was excruciatingly difficult.

We had a basic framework in mind a while ago, but then we started "testing" it in the wild, on papers, and on people. The more conversations we had, the more we refined and adapted our ideas, to the point where the paper we have today owes deep debts to many people that we spoke with and who provided nontrivial conceptual challenges that we had to address.

I still have no idea whether what we've written is any good. Time will tell. But it feels good to finally put out something, however half-baked. Because now hopefully the community can engage with it.
(image)

Congrats to John Moeller, Ph.D

2016-08-26T21:37:36.539-06:00

My student +John Moeller (moeller.fyi) just defended his Ph.D thesis today! and yes, there was a (rubber) snake-fighting element to the defense.

John's dissertation work is in machine learning, but his publications span a wider range. He started off with a rather hard problem: attempting to formulate a natural notion of range spaces in a negatively-curved space. And as if dealing with Riemannian geometry wasn't bad enough, he was also involved in finding approximate near neighbors in Bregman spaces. He's also been instrumental in my more recent work in algorithmic fairness.

But John's true interests lie in machine learning, specifically kernels. He came up with a nice geometric formulation of kernel learning by way of the multiplicative weight update method. He then took this formulation and extended it to localized kernel learning (where you don't need each kernel to work with all points - think of it like a soft clustering of kernels).

Most recently, he's also explored the interface between kernels and neural nets, as part of a larger effort to understand neural nets. This is also a way of doing kernel learning, in a "smoother" way via Bochner's theorem.

It's a great body of work that required mastery of a range of different mathematical constructs and algorithmic techniques.  Congratulations, John!(image)