Why bother with Data Protection?

In the last post I started talking about your data and how we’re going to secure it. But why even bother? What is our motivation to protect our data? Facebook, Instagram and Foursquare are built around sharing private things in public. The entire world seems to revolve around the “likes” we get for sharing our story. So why secure our data and keep it private?

Well, there are several reasons I would like to discuss here.
When we create data, this data contains information about us. Post a funny status on Facebook? Data mining can combine this funny status with other information to get to know you better. Oh, your meal on Instagram looks quite high in fat and carbohydrates; if you keep eating like this, your health might be affected. Check in at a restaurant on Foursquare? Sure, no problem with that, but how come two hours passed since your last check-in, and your phone was switched off in between? Did you do something sketchy? This is data you would want to be secured, wouldn’t you?

This is data about your habits and what you do, but there’s also data that is of higher value to you. Be it your diploma or the video of your child’s first steps, you’re not going to be happy when that is gone. Maybe you even have data you monetize, which should be kept safe in some way as well.

Right now I would like to categorize your data into different classes. I suggest four classes here that are very different in importance and impact:

1. Why bother – category

Is it ironic that I am trying to convince you to save and secure your data, yet the first category I talk about is data with no value whatsoever? Well, no. This is the least important category, and it will serve as a fall-back option. However, I want to spark the idea that all your data can be valuable. Here are a couple of ideas why data you would initially sort into this category might not fit here after all:
Example one is my favorite, and it is quite absurd. There are some Twitter accounts out there dedicated to posting the owner’s bowel movements. This is of no value to the world whatsoever, but when you have an appointment with a proctologist, they might get some valuable hints from information like that.
Those funny posts on Facebook are used by Facebook to show you personalized advertisements. The problem with Facebook is that they don’t just give you personalized advertisements, they sell your data. One post alone isn’t all that bad, but many posts in combination get us to category two.
Even adding different friends and groups of friends on Facebook reveals something about you. Maybe category one is even the hardest category, because there can always be something that kicks data out of it.

2. Cheaper by the Dozen – category

Statistics today can do a lot with data. Take a quick look at Facebook, Google or any other big internet advertising platform. When I posted about London, I suddenly got some nice adverts for cheap London hotels. When I commented on something about Red Bull, I got ads for some nice Red Bull discounts. That much is pretty easy to comprehend. But data mining can also pick up patterns and structure in data, and those can tell a lot about you and what you like. Seriously, some guys did a proof of concept of a “gaydar” that checks friend networks on Facebook to infer the sexual orientation of a target (Gaydar – Proof of Concept). I have already talked about how data from category one can easily be aggregated and slip into category two. If the gaydar didn’t convince you, I’ll show you a real-life example. This is data from Facebook, my personal friend network. It took me ten minutes to create this, and the colours were created by an algorithm, not by me. And this is pretty much exactly how my friend network works.
Figure: Facebook Friend Network
A couple of isolated dots are people I know from somewhere but whose friends I don’t know. They are surrounded by big clouds of networks from university and hobbies, which overlap in a few places. If you think about it, this colouring was derived only from the interconnections of my friend network, and ten minutes of playing around yielded a good overview of what my friend network looks like and where the best-connected people hide. This way I actually found out that a friend from a hobby has connections to my school friends back home.
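
If you want to get a feel for how little information such a colouring needs, here is a minimal Python sketch. It is not the tool or algorithm that produced my picture; the names, the friendships and the grouping rule are all made up for illustration. The rule is a crude stand-in for real community detection: a friendship only counts as a strong tie if the two people share at least one mutual friend, and everyone reachable through strong ties ends up in the same group (the same "colour").

```python
from collections import defaultdict

# Hypothetical example data: each pair is a friendship between two of my friends.
FRIENDSHIPS = [
    ("anna", "ben"), ("anna", "cara"), ("ben", "cara"),  # university
    ("dan", "eve"), ("dan", "finn"), ("eve", "finn"),    # hobby
    ("cara", "dan"),                                     # the one overlap
]
LONERS = ["gus"]  # someone I know, but none of whose friends I know


def build_graph(friendships, loners=()):
    """Turn (person, person) pairs into a mapping person -> set of friends."""
    graph = defaultdict(set)
    for a, b in friendships:
        graph[a].add(b)
        graph[b].add(a)
    for person in loners:
        graph.setdefault(person, set())
    return dict(graph)


def find_groups(graph):
    """Group people by strong ties (assumed rule: at least one mutual friend)."""
    strong = {person: set() for person in graph}
    for a, friends in graph.items():
        for b in friends:
            if friends & graph[b]:  # a and b share a mutual friend
                strong[a].add(b)

    # Collect connected components over the strong ties only.
    groups, seen = [], set()
    for start in sorted(strong):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            person = stack.pop()
            if person not in group:
                group.add(person)
                stack.extend(strong[person] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


def best_connected(graph, top=2):
    """Return the people with the most friendships, ties broken by name."""
    return sorted(graph, key=lambda p: (-len(graph[p]), p))[:top]


graph = build_graph(FRIENDSHIPS, LONERS)
for colour, group in enumerate(find_groups(graph)):
    print(colour, group)
print("best connected:", best_connected(graph))
```

On this toy data the sketch separates the university group, the hobby group and the lone dot, and it points at cara and dan as the best-connected people, because they are the ones bridging the two groups. Nobody told the code who belongs where; it only looked at who is friends with whom.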

3. Keeper – category

These are your family pictures, your memories and your dreams. You would maybe show them to some people and even post them on the internet, but losing them would be a bummer. This is data worth securing. We will discuss different ways to back up your data and briefly touch on the physical security of data. I believe this category is pretty clear and easy: you can tell whether you want to keep this data by all means or not.

4. Don’t touch it – category

You might not want your diary to be read by anyone. This is the category of data that is very personal. You don’t want to lose it and you want to keep it to yourself. This will be interesting to handle! How do we keep something secure when we don’t want anyone else to have access to it?

If you only have data you really do not care about, you are right to ask: why bother? In any other case you should consider following this series to learn how your data can be protected and where you are vulnerable.
We will work out appropriate ways to make our data safer, both physically and electronically. Suggestions on what you would like me to write about are highly appreciated, since this field is so broad.

... is a geophysicist by heart. He works at the intersection of machine learning and geoscience. He is the founder of The Way of the Geophysicist and a deep learning enthusiast. Writing mostly about computational geoscience and interesting bits and pieces relevant to post-grad life.

Latest posts by Jesper Dramsch (see all)
