One potentially negative externality of a “data tax” that we want to avoid is that the “most valuable” data companies hold onto is also the most personal data that we actually don’t want them to track. If Amazon wants to use my past Amazon purchases to recommend other things I might be interested in, that’s actually somewhat useful and not that creepy (I did buy it from them, after all). I’d have no objection if a local shop I frequent said to me, “hey, you got this thing last month, so I think you’d really love this new thing we just got in,” so why would we object to Amazon doing essentially the same thing? If, on the other hand, they start popping up ads like “we know you just got a tax refund of $XXX, here are some great things you can buy on that budget,” or “we know your kid just went on their first date, here are some parenting books you might like,” that’s a lot more creepy, invasive, and unwanted.

So coupled with the data tax, there should be some categories of data that are simply forbidden from being monetized (either by directly selling these “profiles” or by using/allowing this data to be used as part of targeting advertisements, recommendations, or personalizations). The categories I’ve chosen are based on the Universal Declaration of Human Rights, which lays out a number of categories in which people have a unique right to privacy, security, and autonomy. Here they are:
- Health Information, including not just formal health records but personal posts/statements about health matters and “inferences” about an individual’s health conditions based on other behaviors and statements.
- Financial/tax data and/or credit history. It’s one thing to put people into broad categories like “baby boomers who like to travel (and therefore have a large disposable income).” It’s another to say “based on the records we have, your net worth is precisely X and we recommend Y,” or “targeting people who paid X in taxes last year,” or “targeting individuals known/believed to be behind on X bill.”
- Information pertaining to an individual’s legal/criminal history. No ads specifically for ex-felons, individuals acquitted of such-and-such charge, or (likely) “illegal immigrants.”
- Information about the details of a person’s family, home, or sexual activities. You can target broad categories like “families with kids at home” or “people who live in Omaha.” You can’t get into specifics like “has a 12-year-old daughter” or “lives on Sullivan Street between 8th and 22nd” or “got laid in the last week.”
- Information posted with a reasonable expectation of privacy, like emails, direct messages, or “private” chats. Things that aren’t posted for public consumption can’t be used for targeting someone with ads or recommendations.
- Information that in a user’s home country or present location could be used to target them for discrimination or persecution. For example, in places where there are laws allowing LGBTQ+ individuals to be arrested (or worse), you can’t allow people to use LGBTQ+ status as a targeting parameter.
The goal here is to drive the data that is used for targeting, recommendations, etc. towards things like “your interests, hobbies, music likes, movie choices, and recent purchases.” While these things aren’t “public” information, they are more in the realm of things that people “expect” to share with businesses they interact with, and which people naturally understand to be part of those platforms “enhancing” or “personalizing” their experience. Things that people expect to be private or semi-private should be off limits to monetization and personalization algorithms. Hold onto (and pay taxes on) that data if you wish, but you can’t use it in your algorithms.
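To make the broad-vs-forbidden distinction concrete, here is a minimal sketch of how a platform might enforce it as a denylist check on campaign targeting parameters. Everything here (category names, the `validate_targeting` function, the example campaign) is purely illustrative, not any real ad platform’s API:

```python
# Hypothetical enforcement of the forbidden categories listed above.
# All names are illustrative assumptions.

# Categories forbidden from monetization/targeting, per the list above.
FORBIDDEN_CATEGORIES = {
    "health",            # health records, posts, or inferred conditions
    "finance",           # tax data, net worth, credit history
    "legal_history",     # criminal/legal records
    "family_detail",     # specifics of family, home, or sexual activity
    "private_content",   # emails, DMs, "private" chats
    "persecution_risk",  # attributes usable for discrimination/persecution
}

# Broad, interest-level categories that remain fair game.
ALLOWED_CATEGORIES = {"interests", "hobbies", "music", "movies", "purchases"}


def validate_targeting(params: dict) -> list:
    """Return the forbidden categories a campaign attempts to use."""
    return [category for category in params if category in FORBIDDEN_CATEGORIES]


# A campaign mixing an allowed category with a forbidden one:
campaign = {
    "interests": ["travel"],
    "finance": {"net_worth_min": 500_000},  # would be rejected
}
print(validate_targeting(campaign))  # ['finance']
```

The point of a category-level denylist (rather than auditing individual data points) is that it pushes the line-drawing problem to where regulators can actually see it: a campaign either uses a forbidden category as a targeting parameter or it doesn’t.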