About the data
The company should clearly disclose what user information it collects and how.
Companies collect a wide range of personal information from users—from personal details and account profiles to a user’s activities and location.
Methodology
We expect companies to clearly disclose what types of user information (as RDR defines it) they collect.
Note: In some cases, laws or regulations might require companies to collect certain information, or might prohibit or discourage companies from disclosing what user information they collect. Researchers should document situations where this is the case, but a company will still lose points if it fails to meet all elements. This represents a situation where the law causes companies to be uncompetitive, and we encourage companies to advocate for laws that enable them to fully respect users’ rights to freedom of expression and privacy.
Definition(s):
Clearly disclose(s) – The company presents or explains its policies or practices in its public-facing materials in a way that is easy for users to find and understand.
Collect / Collection – All means by which a company may gather information about users. For example, a company may collect this information directly in a range of situations, including when users upload content for public sharing, submit phone numbers for account verification, transmit personal information in private conversation with one another, etc. A company may also collect this information indirectly, for example, by recording log data, account information, metadata, and other related information that describes users and/or documents their activities.
User information – Any data that is connected to an identifiable person, or may be connected to such a person by combining datasets or utilizing data-mining techniques. As further explanation, user information is any data that documents a user’s characteristics and/or activities. This information may or may not be tied to a specific user account. This information includes, but is not limited to, personal correspondence, user-generated content, account preferences and settings, log and access data, data about a user’s activities or preferences collected from third parties either through behavioral tracking or purchasing of data, and all forms of metadata. User information is never considered anonymous except when included solely as a basis to generate global measures (e.g. number of monthly active users). For example, the statement, ‘Our service has 1 million monthly active users,’ contains anonymous data, since it does not give enough information to know who those 1 million users are.
Anonymous data is “data that is in no way connected to another piece of information that could enable a user to be identified.”
This expansive view is necessary to reflect several facts. First, skilled analysts can de-anonymize large data sets, which renders nearly all promises of anonymization unattainable. In essence, any data tied to an “anonymous identifier” is not anonymous; rather, it is often pseudonymous data that may be tied back to the user’s offline identity. Second, metadata may be as revealing as, or more revealing than, content data about a user’s associations and interests, so this data is of vital interest. Third, entities that have access to many sources of data, such as data brokers and governments, may be able to pair two or more data sources to reveal information about users. Thus, sophisticated actors can use data that seems anonymous to construct a larger picture of a user.
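The pairing of data sources described above can be illustrated with a minimal sketch. Everything in it is hypothetical and made up for illustration: the records, the field names (device_id, zip, birth_year, visited), and the link_records helper are assumptions, not part of the methodology. It shows how a log keyed only by an "anonymous identifier" might be joined to a second, named data set on shared attributes.

```python
# Hypothetical illustration only: all records and field names are invented.

# Data set A: a "pseudonymous" activity log keyed by a device identifier.
activity_log = [
    {"device_id": "a91f", "zip": "02139", "birth_year": 1984, "visited": "clinic-finder"},
    {"device_id": "c27e", "zip": "94110", "birth_year": 1990, "visited": "job-board"},
]

# Data set B: a separate source (e.g. a purchased list) that carries names.
customer_list = [
    {"name": "Alex Example", "zip": "02139", "birth_year": 1984},
    {"name": "Sam Sample", "zip": "94110", "birth_year": 1990},
    {"name": "Pat Placeholder", "zip": "94110", "birth_year": 1975},
]


def link_records(pseudonymous, identified, keys):
    """Join two data sets on shared attributes (a simple linkage sketch)."""
    # Index the identified records by the combination of shared attributes.
    index = {}
    for row in identified:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    linked = []
    for row in pseudonymous:
        matches = index.get(tuple(row[k] for k in keys), [])
        # A unique match ties the pseudonymous record to a named person.
        if len(matches) == 1:
            linked.append({"name": matches[0]["name"], **row})
    return linked


reidentified = link_records(activity_log, customer_list, keys=("zip", "birth_year"))
for record in reidentified:
    print(record["name"], record["device_id"], record["visited"])
```

In this toy case neither data set alone connects a name to an activity, yet the combination does, which is why data tied to an identifier is treated here as user information and not as anonymous data.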
Potential sources:
Company privacy policy
Company webpage or section on data protection or data collection