We expect companies to commit to the principle of data minimization and to demonstrate how this principle shapes their practices regarding user information. If companies collect multiple types of information, we expect them to provide detail on how they handle each type of information.
Note: In some cases, laws or regulations might require companies to collect certain information or might prohibit or discourage the company from disclosing what user information they collect. Researchers will document situations where this is the case, but a company will still lose points if it fails to meet all elements. This represents a situation where the law causes companies to be uncompetitive, and we encourage companies to advocate for laws that enable them to fully respect users’ rights to freedom of expression and privacy.
Clearly disclose(s) – The company presents or explains its policies or practices in its public-facing materials in a way that is easy for users to find and understand.
Collect / Collection – All means by which a company may gather information about users. For example, a company may collect this information directly in a range of situations, including when users upload content for public sharing, submit phone numbers for account verification, transmit personal information in private conversation with one another, etc. A company may also collect this information indirectly, for example, by recording log data, account information, metadata, and other related information that describes users and/or documents their activities.
Data minimization – According to the European Data Protection Supervisor (EDPS), “The principle of ‘data minimization’ means that a data controller [“the institution or body that determines the purposes and means of the processing of personal data”] should limit the collection of personal information to what is directly relevant and necessary to accomplish a specified purpose. They should also retain the data only for as long as is necessary to fulfil that purpose. In other words, data controllers should collect only the personal data they really need, and should keep it only for as long as they need it.”
User information— Any data that is connected to an identifiable person, or may be connected to such a person by combining datasets or utilizing data-mining techniques. As further explanation, user Information is any data that documents a user’s characteristics and/or activities. This information may or may not be tied to a specific user account. This information includes, but is not limited to, personal correspondence, user-generated content, account preferences and settings, log and access data, data about a user’s activities or preferences collected from third parties either through behavioral tracking or purchasing of data, and all forms of metadata. User Information is never considered anonymous except when included solely as a basis to generate global measures (e.g. number of active monthly users). For example, the statement, ‘Our service has 1 million monthly active users,’ contains anonymous data, since it does not give enough information to know who those 1 million users are.
Anonymous data is “data that is in no way connected to another piece of information that could enable a user to be identified.”
This expansive view is necessary to reflect several facts. First, skilled analysts can de-anonymize large data sets. This renders nearly all promises of anonymization unattainable. In essence, any data tied to an “anonymous identifier” is not anonymous; rather, this is often pseudonymous data that may be tied back to the user’s offline identity. Second, metadata may be as or more revealing of a user’s associations and interests than content data, thus this data is of vital interest. Third, entities that have access to many sources of data, such as data brokers and governments, may be able to pair two or more data sources to reveal information about users. Thus, sophisticated actors can use data that seems anonymous to construct a larger picture of a user.
Company webpage or section on data protection or data collection