Foundations of Privacy and Data Protection

1. Foundations of Privacy and Data Protection

1.1. Introduction

Privacy has become one of the central concerns of the digital age. As societies rely increasingly on data-driven systems, the ways in which personal data is collected, processed, shared and protected deeply affect both individuals and institutions. While this chapter focuses on privacy, it is closely related to the broader field of information security and to the regulatory frameworks that govern how data may be used.

In this chapter, we introduce the key conceptual, ethical, legal and technical foundations of privacy and data protection, with particular emphasis on the European context and the General Data Protection Regulation (GDPR). Although many examples refer to health and biomedical data, the ideas discussed here apply broadly to any domain where personal information is handled.

1.2. Security and Privacy

Privacy is often conflated with security, but they are not the same. Security focuses on protecting systems and data against unauthorized access, modification or destruction. Privacy focuses on the rights and interests of individuals whose data is being processed. The two areas overlap, particularly around confidentiality, but each has its own goals and tools.

Information security is often summarized by the so-called CIA triad: confidentiality, integrity and availability. Confidentiality is about ensuring that only authorized individuals or systems can read particular data. It is usually implemented through authentication, access control, encryption and other technical safeguards. Integrity is about ensuring that only authorized parties can modify data; any unauthorized change, even if unintentional, is treated as a violation of integrity. Availability refers to keeping systems and data accessible to authorized users when needed, including in the face of hardware failures, accidents or natural disasters. Together, these three dimensions define what it means for a system to be secure.
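As a small illustration of the integrity dimension, the sketch below detects whether a record has changed by comparing cryptographic digests. It is only a minimal sketch: the record contents are invented, and it assumes the reference digest is stored somewhere an attacker cannot alter (otherwise a keyed construction such as an HMAC would be the more appropriate choice).

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, expected_digest: str) -> bool:
    """Compare digests in constant time; any change to data alters the digest."""
    return hmac.compare_digest(digest(data), expected_digest)


# Hypothetical record and a tampered copy of it.
record = b"patient_id=123;diagnosis=asthma"
stored_digest = digest(record)
tampered = b"patient_id=123;diagnosis=none"

print(is_unmodified(record, stored_digest))    # True
print(is_unmodified(tampered, stored_digest))  # False
```

Note that this check says nothing about confidentiality or availability: the record is still readable by anyone who can access it, and the check does not help if the system is down.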

Privacy, in contrast, is usually described as control over personal data. It is very much related to confidentiality, since restricting access is one way of protecting privacy, but it goes further. Privacy raises questions such as: Which data about a person are collected? For what purposes? Who can use these data and for how long? What rights does the individual have to access, correct or delete the data? As soon as we ask these questions, we move from purely technical concerns to ethical and legal ones.

1.3. Ethical Foundations of Privacy

The ethical foundations of privacy are especially visible in fields like medicine and biomedical research, but they apply broadly across society. A key principle is “do no harm”. Any activity involving personal data carries both potential benefits and potential harms. For instance, aggregating health records can improve medical research and care, but a leak of the same data may cause stigma, discrimination or emotional distress. Ethical practice requires us to weigh these risks and to take serious measures to prevent harm.

Privacy also relates closely to safety. Oversharing information about daily routines, locations or plans can put individuals at risk of burglary, harassment or violence. Social networks and other platforms make it easy to broadcast such information widely, often without users fully realizing the consequences. When combined with other sources of data, seemingly harmless pieces of information can create detailed profiles of a person’s habits and vulnerabilities.

Another important dimension is fairness and non-discrimination. If organizations collect and analyze large amounts of personal data, they may make decisions that affect employment, insurance, credit, access to services, or opportunities in ways that systematically discriminate against certain groups. Attributes such as gender, age, ethnicity, religion or sexual orientation can be used explicitly or implicitly in decision-making systems. Protecting privacy, especially for sensitive attributes, is one way to reduce these risks.

Finally, privacy is intimately linked to autonomy. When individuals feel that they are constantly being observed, whether by governments or companies, their freedom to act and express themselves may be reduced. This “chilling effect” is particularly serious in societies where freedom of speech and political dissent are already fragile. By limiting pervasive surveillance and giving people control over their data, privacy protections help safeguard the range of actions and choices that individuals can reasonably make.

1.4. Historical and Legal Evolution of Privacy

Modern notions of privacy as a legal right emerged in the late nineteenth and twentieth centuries. One famous early statement comes from the 1890 essay by Warren and Brandeis, which framed privacy as the right to be let alone. This arose at a time when photography and portable cameras allowed journalists and others to capture intimate images without the consent of the people involved. The concern was not only about property or physical harm, but about unwanted intrusion into personal life.

Warren II and Brandeis, 1890

The right to be let alone.

After the Second World War, privacy was recognized as a fundamental human right in international law. The Universal Declaration of Human Rights, adopted in 1948, states that no one should be subjected to arbitrary interference with their privacy, family, home or correspondence. Originally, this referred mainly to letters and telephone communications, but in contemporary interpretations it applies equally to email, messaging and other digital communications.

Article 12, Universal Declaration of Human Rights

No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.

In the European Union, the Charter of Fundamental Rights goes further by distinguishing between the right to private life and the right to protection of personal data. The latter explicitly requires that personal data be processed fairly, for specific purposes, on a legitimate basis (such as consent or legal obligation) and under the control of independent authorities. This evolution in legal thinking culminated in the GDPR, which is currently the central framework for data protection in Europe.

Article 7, Charter of Fundamental Rights of the European Union, 2000

Everyone has the right to respect for his or her private and family life, home and communications.

Article 8, Charter of Fundamental Rights of the European Union, 2000

Everyone has the right to the protection of personal data concerning him or her.
Such data must be processed fairly for specified purposes and on the basis of the consent of the person concerned or some other legitimate basis laid down by law. Everyone has the right of access to data which has been collected concerning him or her, and the right to have it rectified.
Compliance with these rules shall be subject to control by an independent authority.

1.5. What Counts as Personal Data?

The GDPR defines personal data as any information relating to an identified or identifiable natural person. In practice, such data can take many forms. It is useful to distinguish three broad categories: direct identifiers, indirect identifiers and sensitive data.

Direct identifiers

Attributes that uniquely identify a data subject, such as their name or social security number.

Indirect identifiers

Attributes that, when combined, can uniquely identify a data subject, such as age, gender, and address. Also known as quasi-identifiers.

Sensitive information

Attributes that reveal especially protected information about the data subject, such as a medical diagnosis, sexual orientation, or religious beliefs.

Direct identifiers are data items that uniquely identify a person on their own. Examples include full name, national identification number, social security number or phone number. Because they point directly to a specific individual, these fields are typically the first targets for protection measures such as removal, masking or encryption.
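The three protection measures mentioned above (removal, masking and replacement with a keyed pseudonym) can be sketched as follows. This is an illustrative sketch, not a prescribed method: the field names, the secret key and the helper functions are all hypothetical, and a keyed hash stands in for encryption here. Pseudonymized data of this kind generally still counts as personal data, since whoever holds the key can re-link it.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-real-secret"
# Hypothetical names of the fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "national_id", "phone"}


def pseudonymize(value: str) -> str:
    """Replace an identifier with a keyed hash (stable, but not reversible without the key)."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask_phone(phone: str) -> str:
    """Keep only the last two digits visible."""
    return "*" * (len(phone) - 2) + phone[-2:]


def protect(record: dict) -> dict:
    """Drop direct identifiers, then add a pseudonymous key and a masked phone number."""
    protected = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    protected["subject_key"] = pseudonymize(record["national_id"])
    protected["phone_masked"] = mask_phone(record["phone"])
    return protected


patient = {
    "name": "Alice Example",
    "national_id": "X1234567",
    "phone": "600123489",
    "diagnosis": "asthma",
}
print(protect(patient))
```

The same person always maps to the same `subject_key`, so records can still be joined for analysis without exposing the name or ID number.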

Indirect identifiers, also called quasi-identifiers, are data attributes that do not uniquely identify a person on their own but that can do so when combined with other information. Age, gender, address, occupation or IP address are typical examples. A single attribute might correspond to many people, but several attributes together may narrow down the possibilities to a single person. A classic illustration is that knowing someone’s date of birth, gender and postal code is often enough to re-identify them in many datasets. Another example discussed in class is a description like “American politician, former senator, member of the Democratic Party, President of the United States,” which leaves little doubt about whom we are referring to, even without stating a name.
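To make the narrowing-down effect concrete, the following sketch counts how many records in a toy dataset are unique on a given combination of quasi-identifiers. The records and field names are invented for illustration.

```python
from collections import Counter

QUASI_IDENTIFIERS = ("birth_date", "gender", "postal_code")

# Invented toy dataset with no direct identifiers.
records = [
    {"birth_date": "1980-05-17", "gender": "M", "postal_code": "08001", "diagnosis": "A"},
    {"birth_date": "1980-05-17", "gender": "M", "postal_code": "08001", "diagnosis": "B"},
    {"birth_date": "1975-02-11", "gender": "F", "postal_code": "28004", "diagnosis": "C"},
    {"birth_date": "1990-12-30", "gender": "F", "postal_code": "28004", "diagnosis": "A"},
]


def group_sizes(rows, keys):
    """Count how many rows share each combination of values for keys."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_rows(rows, keys):
    """Return the rows whose combination of values for keys appears exactly once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


singled_out = unique_rows(records, QUASI_IDENTIFIERS)
print(len(singled_out), "of", len(records), "records are unique on", QUASI_IDENTIFIERS)
print(len(unique_rows(records, ("gender",))), "records are unique on gender alone")
```

In this toy data no record is unique on gender alone, but two of the four become unique once date of birth and postal code are added.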

Sensitive personal data reveal particularly intimate aspects of a person’s life and can lead to significant harm if misused or disclosed. Health information, diagnoses, laboratory results, mental health status, sexual orientation, political opinions, religious or philosophical beliefs, genetic information and biometric data all fall into this category. The GDPR treats many of these as “special categories of personal data,” placing strict limits on how they may be processed and stored.

Direct identifiers	Indirect identifiers	Confidential information
Name	Age	Transactions (e.g. purchases)
Email address	Gender	Salary
Mobile phone number	Race	Credit ranking
National ID number	Date of birth	Insurance policy
Passport number	Address	Medical status
Account number	Postal code	Vaccination status
Social security number (SSN)	Job title	Sexual orientation
Social media name	Company name	Religious beliefs
	Marital status
	Height
	Weight
	IP address
	GPS location

These categories are not separate islands. Sensitive data may be linked to individuals via direct or indirect identifiers. Protecting privacy is therefore not only about hiding names but about understanding how combinations of attributes can reconstruct identity or reveal sensitive traits.
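The way indirect identifiers can tie sensitive data back to a named person can be sketched as a simple linkage between two datasets. Everything here is invented for illustration: a "de-identified" health release with names removed, and a public register that contains names alongside the same quasi-identifiers.

```python
KEYS = ("birth_date", "gender", "postal_code")

# Invented release: names removed, sensitive attribute kept.
health_release = [
    {"birth_date": "1975-02-11", "gender": "F", "postal_code": "28004", "diagnosis": "diabetes"},
    {"birth_date": "1980-05-17", "gender": "M", "postal_code": "08001", "diagnosis": "asthma"},
]

# Invented public register: names present, no sensitive attribute.
public_register = [
    {"name": "Alice Example", "birth_date": "1975-02-11", "gender": "F", "postal_code": "28004"},
    {"name": "Bob Example", "birth_date": "1980-05-17", "gender": "M", "postal_code": "08001"},
    {"name": "Carol Example", "birth_date": "1990-12-30", "gender": "F", "postal_code": "28004"},
]


def link(release, register, keys):
    """Attach a name to each released row whose quasi-identifiers match exactly one person."""
    index = {}
    for person in register:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = {}
    for row in release:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # an unambiguous match re-identifies the row
            matches[names[0]] = row["diagnosis"]
    return matches


print(link(health_release, public_register, KEYS))
```

Neither dataset contains a name next to a diagnosis, yet joining them on the shared attributes produces exactly that pairing.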

Example

Barack Hussein Obama II (born in Hawaii on August 4th, 1961) is an American lawyer and politician who served as the 44th President of the United States, from January 20th, 2009 until January 20th, 2017. He is a member of the Democratic Party, and was the first African American to serve as President. Previously, he had been a Senator for the State of Illinois.

1.6. Contemporary Threats to Privacy

The growth of digital infrastructures, networked services and data-hungry business models has produced a wide range of threats to privacy. Some originate from the state, others from private companies, and others from malicious actors or simple negligence.

State surveillance includes measures such as mandatory identity systems, extensive CCTV networks and targeted or mass interception of communications. The revelations about the NSA’s PRISM program, for example, showed how large volumes of data passing through major online service providers could be collected and analyzed by intelligence agencies. Even in democratic societies, there is an ongoing debate about how far such practices can go without violating fundamental rights.

Corporate surveillance and profiling are driven largely by economic incentives. Many online services are financed through targeted advertising, which requires detailed user profiles. Websites often include third-party trackers based on cookies or fingerprinting (analyzing browser and device characteristics) that record visits, clicks and other behaviors across multiple sites. Social media platforms analyze interactions, likes and shares to infer interests and characteristics.

Figure: First-party and third-party cookies.

Figure: Example of third-party cookies in digital newspapers (ElPais, ElDiario, ABC, ElMundo).

Services such as Am I Unique and the EFF’s Cover Your Tracks let users analyze their browser and system configuration to assess how vulnerable they are to fingerprinting.
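The idea behind fingerprinting can be sketched in a few lines: a set of browser and device characteristics is serialized and hashed into an identifier that stays stable across visits without any cookie being stored. The attribute names and values below are assumptions chosen for illustration; real tracking scripts typically combine many more signals.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a canonical serialization of the attributes into a short identifier."""
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical characteristics a script might read from a browser.
browser = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 12)",
    "screen": "1920x1080x24",
    "timezone": "Europe/Madrid",
    "language": "es-ES",
    "fonts": ["Arial", "DejaVu Sans", "Noto Serif"],
}

# The same browser on a later visit (attributes read in a different order).
same_browser_later = dict(reversed(list(browser.items())))
# A different browser that differs only in its installed fonts.
other_browser = {**browser, "fonts": ["Arial", "Noto Serif"]}

print(fingerprint(browser) == fingerprint(same_browser_later))  # True
print(fingerprint(browser) == fingerprint(other_browser))       # False
```

The more unusual the combination of attributes, the more reliably the resulting identifier singles out one device, which is what the services mentioned above try to measure.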

Recommendation algorithms on platforms such as YouTube, TikTok or e-commerce sites use this data to keep users engaged or to sell more products. Although these systems may provide convenience or entertainment, they also accumulate large amounts of behavioral data that can be misused or repurposed.

Smart devices and the Internet of Things extend data collection into physical spaces. Smartphones, wearables, home assistants, security cameras, robot vacuum cleaners and even household appliances gather and transmit data about locations, movements, daily routines and sometimes images or audio from inside private homes. In some reported cases, these devices have sent far more data than users might expect, raising questions about what is being collected and for what purpose.

Example: excessive data collection by a washing machine.

Data brokers are organizations that specialize in gathering, aggregating and trading personal data. They may collect information from public sources, commercial transactions, social media, location histories and other datasets, then combine them into detailed profiles that are sold to advertisers, insurers or other clients. In the European Union, the GDPR places significant constraints on such activities, but in other jurisdictions this market is still largely unregulated.

Finally, data breaches represent an ever-present risk. When organizations store large volumes of personal data, vulnerabilities in their systems or human error can lead to unauthorized access. Once data has been leaked and copied, it is practically impossible to pull it back. Online services such as Have I Been Pwned reflect the scale of this problem by allowing users to check whether their email addresses and passwords have appeared in publicly known breaches.
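Checking a password against breach data does not have to reveal the password itself. The sketch below illustrates the prefix-based lookup used by the Pwned Passwords range endpoint of Have I Been Pwned (`https://api.pwnedpasswords.com/range/<prefix>`): only the first five characters of the SHA-1 hash are sent, and the comparison with the returned suffixes happens locally. No network request is made here; the response text and its counts are made up, and the helper names are hypothetical.

```python
import hashlib


def split_sha1(password: str, prefix_len: int = 5):
    """Return (prefix, suffix) of the uppercase hex SHA-1 digest of password."""
    sha1_hex = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return sha1_hex[:prefix_len], sha1_hex[prefix_len:]


def breach_count(suffix: str, range_response: str) -> int:
    """Look up suffix in a response made of 'SUFFIX:COUNT' lines; 0 if absent."""
    for line in range_response.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")

# Only `prefix` would leave the machine. The response below is fabricated
# for illustration and stands in for what the service would return.
sample_response = f"{'0' * 35}:3\n{suffix}:42\n"

print("sent to the service:", prefix)
print("times seen (made-up data):", breach_count(suffix, sample_response))
```

Because many different passwords share the same five-character prefix, the service cannot tell which one was being checked.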
