Recently, entrepreneur and noted Twitter user Mark Cuban discovered that companies are collecting data about the activity of social media users, which was apparently a revelation to Inc. Magazine and its readership.
In the hilarious, fear-mongering advertorial, Cuban postulates that our digital histories will someday be used against us in court or in job interviews.
This is perhaps only a revelation to Cuban and Inc. The rest of the Internet-using public has been aware of this reality for more than a decade.
In fact, the well of data social media makes available to advertisers was one of the first concerns raised by observers of the then-nascent “social networking” phenomenon in the early 2000s. A quick search of databases turns up early studies on the topic; this quote, for example, comes from a 2005 report by the Annenberg Public Policy Center at the University of Pennsylvania:
“Most internet-using U.S. adults are aware that companies can follow their behavior online.” (Turow et al., 2005, p. 4)
That same study went on to reference the 2002 Tom Cruise blockbuster “Minority Report” (which Cuban also references in the Inc Magazine interview).
An even older Annenberg report (from 2003) detailed the pitch of the now-defunct Gator Corporation, which bundled tracking software with file-sharing applications like KaZaA (remember KaZaA?):
“Let’s say you sell baby food. We know which consumers are displaying behaviors relevant to the baby food category through their online behavior. Instead of targeting primarily by demographics, you can target consumers who are showing or have shown an interest in your category. … Gator offers several vehicles to display your ad or promotional message. You decide when and how your message is displayed to consumers exhibiting a behavior in your category.” (Turow, 2003, p. 6)
So it’s no revelation that social media data is mined and analyzed by marketers. What I do find revelatory is that Cuban thinks he has the power to do something about it.
There are two major problems with Cuban’s claims about his two upcoming apps, Cyber Dust (a Snapchat ripoff with a 30-second window) and Xpire:
- They can’t possibly hide or delete a user’s social media activity from advertisers.
- What a person DOESN’T do on social media can be just as valuable to marketers as what they consciously do.
Allow me to explain.
First, you can’t truly delete your social media activity to hide it from the marketers who use it to build an algorithmic profile of you.
You can delete a post from your timeline, sure, but that doesn’t mean it’s actually “deleted.” Since at least 2010, for example, it has been public knowledge that Facebook keeps a server-side copy of all of your content. To truly remove all of your posts and photos from the prying eyes of advertisers, you would need to hack into Facebook and delete them from the inside (which would be illegal).
Moreover, even if we discount the server-side caching that takes place on social media platforms, simply viewing a site like Facebook creates a trail of data that feeds the digital profile the platform builds for each of us. At the most basic level, Facebook tracks what you scroll past (counted as “impressions”), the time you spend on content, and what you search for.
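To make that mechanism concrete, here is a minimal sketch (in Python, with invented event and class names; real platforms run their own client-side telemetry, and the exact signals collected are an assumption here) of the kind of passive event log a feed can build while you merely scroll:

```python
from dataclasses import dataclass, field


@dataclass
class PassiveEvent:
    kind: str       # "impression", "dwell", or "search"
    item: str       # which post or query the event is about
    value: float = 0.0


@dataclass
class Tracker:
    events: list = field(default_factory=list)

    def impression(self, post_id: str):
        # Fired as soon as a post scrolls into the viewport.
        self.events.append(PassiveEvent("impression", post_id))

    def dwell(self, post_id: str, seconds: float):
        # How long the post stayed on screen, even with no click or like.
        self.events.append(PassiveEvent("dwell", post_id, seconds))

    def search(self, query: str):
        self.events.append(PassiveEvent("search", query))


tracker = Tracker()
tracker.impression("post_123")    # user scrolled past
tracker.dwell("post_123", 4.2)    # lingered about four seconds
tracker.search("baby strollers")  # never clicked or liked anything
print(len(tracker.events))  # 3 events from zero "conscious" actions
```

The point of the sketch: three profile-feeding events were logged even though the user never liked, shared, or commented on anything.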
Apps (like Snapchat or Cuban’s Cyber Dust) that purport to delete content within a certain time window are fatally flawed in concept because of the many touchpoints a message passes through on its way from one user to another. If you “snap” a compromising photo, that data can be accessed at many points between Person A and Person B; here are just a few:
- From the data cached on Person A’s phone (tracked by mobile phone carriers).
- Intercepted between the phone and whatever Internet connectivity point is used to send the message (be it wi-fi or cellular).
- From the server used to pass the content through to the app’s (Snapchat’s) servers.
- From the app’s (Snapchat’s) servers.
- From the server receiving the content from the app’s (Snapchat’s) servers.
- From the data cached on Person B’s phone (or by Person B if they decide to take a screen capture of the photo and publish it to the web, which has been the downfall of several Snapchat users recently).
Further, the above scenario assumes you don’t have one app integrated with another (which adds an additional layer of touchpoints upon which this data can reside).
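The chain above can be sketched as a toy relay (hop names are invented for illustration; this reflects no real Snapchat internals): every hop keeps its own copy, so the app “deleting” the message at the two endpoints erases only a fraction of the caches.

```python
def send_ephemeral(message: str, hops: list) -> dict:
    """Pass a message through each hop; every hop caches a copy."""
    caches = {}
    for hop in hops:
        caches[hop] = message  # carrier logs, relay servers, app servers...
    return caches


hops = [
    "sender_phone_cache",
    "carrier_or_wifi_link",
    "relay_server",
    "app_server",
    "receiving_server",
    "recipient_phone_cache",
]
caches = send_ephemeral("compromising photo", hops)

# The app "deletes" the message at the two endpoints after the window...
del caches["sender_phone_cache"]
del caches["recipient_phone_cache"]

# ...but the intermediate copies are untouched.
print(len(caches))  # 4 copies survive the "deletion"
```

An ephemeral-messaging app controls, at best, the first and last entries in that list; everything in between is someone else’s infrastructure.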
Second, the actions you DON’T take can be just as valuable to marketers as the actions you DO take. This reality plays out in a couple of different ways:
Facebook Caches Unposted Data: In 2013, the public became aware that Facebook tracks posts that users type out and then delete at the last minute without publishing. Re-read that sentence. Facebook’s client code registers the content you enter even if you decide not to publish it (per the published study, only metadata about the aborted post was recorded, not the text itself).
That data, analyzed by a Carnegie Mellon University PhD student and a Facebook researcher, was used to produce a report presented at a conference of the Association for the Advancement of Artificial Intelligence. Here’s one of the key findings:
“Our results indicate that 71% of users exhibited some level of last-minute self-censorship in the time period, and provide specific evidence supporting the theory that a user’s “perceived audience” lies at the heart of the issue: posts are censored more frequently than comments, with status updates and posts directed at groups censored most frequently of all sharing use cases investigated.” (Das and Kramer, 2013, p. 1)
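A toy composer widget shows how this kind of tracking can work (class and field names are invented; the 5-character threshold is an assumption loosely modeled on the study, which reported only metadata about aborted posts, not their text):

```python
class Composer:
    """Toy post-composer that reports abandoned drafts as metadata only."""

    def __init__(self):
        self.draft = ""
        self.reports = []  # what gets sent back to the platform

    def type(self, text: str):
        self.draft += text

    def publish(self):
        self.reports.append({"event": "published", "had_content": True})
        self.draft = ""

    def abandon(self):
        # The text itself is discarded, but the *fact* that something
        # was typed and then deleted is still recorded and reported.
        if len(self.draft) > 5:  # threshold: an assumption for this sketch
            self.reports.append({"event": "self_censored", "had_content": True})
        self.draft = ""


c = Composer()
c.type("My boss is an idiot and ")
c.abandon()       # the user thinks better of it
print(c.reports)  # [{'event': 'self_censored', 'had_content': True}]
```

Nothing the user chose to publish exists, yet the platform still learned something about them.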
“Escher Fish Theory”: I’m loath to coin a term, but there isn’t really an existing shorthand (that I’m aware of) for the value of observing the gaps in our social graphs. For example, whom we’re not connected to, interests we don’t list, posts we don’t like, and updates we don’t comment on can all be valuable insights now that we have the computing power to crunch those petabytes of data. The tessellations of M.C. Escher provide a good illustration of this concept: recognizable patterns exist in between other patterns.
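This gap analysis can be sketched in a few lines of set arithmetic (all names and interests below are invented for illustration): take the interests common in a user’s circle, subtract what the user actually engages with, and what remains is the negative space.

```python
# A user's "negative space": things their peers engage with that they
# conspicuously do not. All data here is fabricated for illustration.
peer_likes = {
    "alice": {"parenting", "minivans", "gardening"},
    "bob":   {"parenting", "minivans", "craft beer"},
    "carol": {"parenting", "gardening", "craft beer"},
}
user_likes = {"gardening", "craft beer"}

# Interests circulating in the user's circle...
circle_interests = set.union(*peer_likes.values())

# ...minus what the user actually engages with: the gaps.
gaps = circle_interests - user_likes
print(sorted(gaps))  # ['minivans', 'parenting']: inferred purely by absence
```

Here a marketer learns something about the user (no apparent interest in parenting, despite their circle) from likes that were never clicked.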
The only way to stop social media platforms from gathering this data would be to try to clog the data stream with phony likes, shares, and comments.
Cuban’s premise is flawed for another reason, namely the idea that out-of-context messages will be used to incriminate us. This is patently absurd because the same systems that cache all of this data also track revisions to that data, which would provide exculpatory evidence if someone modified a post to distort what we said.
Even if we were to assume that Cuban’s apps worked as intended (they won’t), they could conceivably produce the opposite of their intended result. A social media user with a completely sanitized history could actually arouse suspicion; a benign, mundane history of digital activity draws less attention than a blank page.
Our privacy is certainly going through dramatic changes – and so are our notions of privacy. The reason social media platforms continue to grow in both the number of monthly active users and the volume of content those users create is that they provide a benefit that transcends the loss of privacy we’re experiencing. No one has a comprehensive solution for balancing privacy with the utility derived from transparency, least of all Mark Cuban.
Das, S., & Kramer, A. (2013). Self-Censorship on Facebook. Association for the Advancement of Artificial Intelligence. Retrieved from http://bit.ly/1r9EZ6A
Turow, J. (2003). Americans & Online Privacy: The System is Broken. Annenberg Public Policy Center. Retrieved from http://bit.ly/1HfZOPV
Turow, J., Feldman, L., & Meltzer, K. (2005). Open to Exploitation: American Shoppers Online and Offline. Annenberg Public Policy Center. Retrieved from http://bit.ly/1r9Et8M