Serendipitous reuse of data is good. Finality of data collection is good. Discuss.

I’m at the PrimeLife workshop on Open Data and Privacy. We’ve spent all morning just trying to frame the discussion.

Here’s my framing of the interesting space of the discussion:

  • Let’s posit that public datasets are likely to include personally identified or identifiable information.
  • Let’s posit that the datasets are available for re-use, and that there are overwhelming public policy and economic incentives for that to happen.
  • Let’s posit that the data are actually re-used in a way that involves identifying the individuals they are about.

Put differently, let’s assume that we have a hard clash between privacy principles and open data principles. What does a meaningful privacy conversation look like in this space?