
Briefly: Minimum Child Safety Measures for Online Platforms

Introduction to the issue and NCMEC's position

What is it?

Certain technology solutions for detecting CSAM have proven to be effective, and broader adoption of these solutions would create a set of minimum measures for online platforms to implement as part of their efforts to counter online CSAM offenses.

NCMEC's Position:

Services that allow for electronic storage, transmission, or creation of images and/or videos—including generative AI and livestreaming services—should use cryptographic hashing, perceptual hashing, and/or image classifiers to prevent, detect, disrupt, and report CSAM as appropriate.

Why does it matter?

Technological innovation has created opportunities to improve online safety by incorporating tools to better detect CSAM and facilitate the reporting of online child sexual exploitation incidents. Using such technologies is voluntary, and implementation across online platforms varies widely. More widespread and consistent adoption of CSAM detection technologies and strategies would help counter the proliferation of CSAM, improve identification of offenders and victims, and facilitate more robust reports relating to CSAM distribution. These outcomes would better protect both survivors who are revictimized when CSAM in which they are depicted is re-circulated online and victims whose abuse has not yet been discovered.

What context is relevant?

Cryptographic hashing is a process that uses a mathematical algorithm to create a fixed-length string of characters (the length depends on the particular algorithm used) based on the content of an image or video. The string, known as a hash or hash value, can be compared against other hash values to determine whether two files are identical (such as duplicate images or videos). There are several types of cryptographic hash algorithms, including MD5, SHA-1, and SHA-256, which are commonly used by online child protection organizations.
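As a simple illustration of how the comparison works, the sketch below computes a SHA-256 hash of a file and checks it against a set of previously recorded hash values. It uses Python's standard hashlib library; the file name and the entries in known_hashes are hypothetical placeholders, not actual hash data.

```python
# Minimal sketch of cryptographic hash matching, assuming a hypothetical
# "known_hashes" set and file name; real deployments compare against
# curated databases of hash values for previously confirmed CSAM.
import hashlib

def sha256_of_file(path: str) -> str:
    """Return the SHA-256 hash of a file's contents as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder hash values standing in for a real hash database.
known_hashes = {
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

if sha256_of_file("uploaded_image.jpg") in known_hashes:
    print("Exact match: file is byte-for-byte identical to a known file.")
else:
    print("No match: the file differs from every file in the database.")
```

Because a cryptographic hash changes completely if even one byte of the file changes, this approach only matches exact duplicates; the perceptual hashing described next addresses altered copies.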

Perceptual hashing is a process that uses a mathematical algorithm to assess the similarity of different images or videos based on their perceptual characteristics (how they look to a human observer). A perceptual hashing algorithm will generate similar hash values for visually similar images, even if the images are not identical. PhotoDNA is a well-known perceptual hashing algorithm widely used for CSAM detection; others include PDQ and pHash.
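For illustration, the same compare-and-threshold pattern can be sketched with the open-source pHash algorithm named above, via the Python imagehash library (PhotoDNA itself is not publicly distributed). The file names and the distance threshold are assumptions for the example, not values from any production system.

```python
# Minimal sketch of perceptual hash comparison using the open-source
# imagehash library's pHash implementation. File names and the threshold
# are illustrative assumptions.
from PIL import Image
import imagehash

hash_a = imagehash.phash(Image.open("original.jpg"))
hash_b = imagehash.phash(Image.open("resized_or_recompressed.jpg"))

# Subtracting two ImageHash objects yields the Hamming distance between them:
# small distances indicate visually similar images even when the files differ.
distance = hash_a - hash_b
THRESHOLD = 8  # example cutoff; real systems tune this to balance error rates

if distance <= THRESHOLD:
    print(f"Likely visual match (distance {distance})")
else:
    print(f"No visual match (distance {distance})")
```

Unlike a cryptographic hash, a perceptual hash survives common transformations such as resizing or recompression, which is why it can match altered copies of known imagery.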

Image classifiers are artificial intelligence applications capable of recognizing objects in an image or video. Classifiers can be implemented in various ways to examine still photos, recorded or live videos, and even live activity. This technology has rapidly been adopted for a variety of consumer purposes, including smile detection in digital cameras (to help capture the best group photo), visual search on online retail sites, and environmental awareness in self-driving cars. Specially trained image classifiers are useful for detecting suspected CSAM that evades cryptographic or perceptual hashing, usually because it has not previously been detected, confirmed, and added to relevant hash databases. In this way, image classification is uniquely capable of detecting CSAM that is newly produced or not widely circulated.
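Because purpose-built CSAM classifiers (such as Google's Content Safety API) are available only to approved partners, the sketch below illustrates the general classify-and-threshold pattern with an ordinary, publicly available object-recognition model from torchvision; the input file name and the choice of a general-purpose model are assumptions for the example.

```python
# Minimal sketch of image classification with a general-purpose pretrained
# model; it shows the pattern (model prediction plus confidence threshold),
# not an actual CSAM classifier.
import torch
from torchvision import models
from torchvision.models import ResNet50_Weights
from PIL import Image

weights = ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.eval()
preprocess = weights.transforms()  # resizing/normalization expected by the model

image = Image.open("photo.jpg").convert("RGB")  # hypothetical input file
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    probs = model(batch).softmax(dim=1)[0]

top_prob, top_class = probs.max(dim=0)
label = weights.meta["categories"][top_class.item()]
print(f"Predicted '{label}' with confidence {top_prob.item():.2f}")

# In a safety pipeline, predictions above a tuned confidence threshold are
# typically routed to human review rather than acted on automatically.
```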

To counter the spread of CSAM, some of the world’s largest online platforms have developed and deployed hashing and classifier tools. Microsoft co-developed PhotoDNA and makes it available for use by qualified organizations. Google’s Content Safety API classifier and CSAI Match (which can detect previously hashed CSAM imagery “amid a high volume of non-violative video content”) are both available to certain partners. Meta developed PDQ and its video equivalent, TMK+PDQF, and subsequently released both as open-source projects.

Not all companies make use of these resources, or others like them—even companies that have developed tools themselves. Apple developed NeuralHash, a perceptual hashing algorithm, but abandoned plans to implement it after meeting resistance from critics.

NCMEC hosts several different hash sharing initiatives to facilitate the voluntary sharing of suspected CSAM hash values from reporting hotlines to online platforms and among online platforms themselves. The Tech Coalition’s signal-sharing program, Project Lantern, also allows sharing of CSAM hash values and other signals among program participants.

In some jurisdictions, legislative and regulatory bodies are evaluating technology-neutral requirements for online platforms to detect, report, and remove CSAM, focusing on outcomes and deferring to online platforms regarding how to achieve them. This strategy seeks to ensure that whatever relevant laws or regulations are implemented remain applicable as new technologies are developed in the future and existing technologies become obsolete.

Cryptographic hashing, perceptual hashing, and image classification have proven to be effective measures to counter CSAM distribution. NCMEC supports continued innovation to develop other technological solutions and urges the broad adoption and deployment of these tools as minimum measures for online platforms to take as part of their efforts to counter image-based child sexual exploitation.

As with any other type of technology, no single solution is perfect. While hashing and classifier tools enable online platforms to detect, disrupt, and report suspected CSAM at scale, human review at some level remains an important consideration for online platforms and law enforcement.

What does the data reveal?

Online platforms that use hashing and image classifiers report tens of millions of suspected CSAM images and videos to NCMEC’s CyberTipline each year, largely as a result of these technologies. Online platforms do not typically disclose in their transparency reports how many CSAM images and videos were detected by each type of technology.

Google has estimated that about 90% of the imagery it reports to the CyberTipline is known or previously confirmed (through hashing) to be CSAM. Through its transparency reporting, Meta—which operates Facebook, Instagram, and other services—has disclosed that 90% to 99% of the violating content it “actioned” (removed, restricted, etc.) for child endangerment reasons related to child nudity and sexual exploitation was detected and actioned before any other users reported it. In the absence of detailed disclosure about how many images and videos were detected by hashing, classifiers, or other methods, the volume of imagery involved suggests that automated systems are significantly engaged in this work.

The Tech Coalition, in its first Lantern Transparency Report, disclosed that participating companies shared nearly 300,000 hashes as of December 2023, accounting for about 40% of the signal sharing through that project.

What have survivors said about it?

Survivors have expressed frustration about the failure of some online platforms to use technologies that have been proven to help detect, disrupt, and report CSAM distribution. Absent universal legal or regulatory requirements, survivors appeal to companies’ moral, social, ethical, and business responsibilities when advocating for minimum standards to combat CSAM on their platforms.


Proven technological tools should be utilized by the tech industry wherever the ability to create, disseminate and/or store child sexual abuse imagery exists. Utilizing effective and available technology to prioritize the lives and safety of child victims is a decision that all tech companies should want to implement for their services.

- Survivor

What drives opposing viewpoints?

Opposition to online platforms’ utilization of hashing and image classification technology as minimum child protection measures comes from multiple perspectives. Privacy advocates may be concerned about practices that subject online content to screening, even when the screening methods reveal no information about legal content. Some opposition also arises from the belief that the interests of user privacy override the safety benefits of technology tools, even those proven to protect children from being sexually exploited or revictimized. Some critics argue that these technologies are imperfect and may produce false positives, or that the hash lists or training data underlying these tools could be compromised, accidentally or deliberately, to detect and block content other than CSAM. The financial cost of implementing these solutions without a clear financial benefit may lead some companies to resist adoption. Others may suggest that setting specific minimum standards disincentivizes further innovation in alternative measures to combat online child sexual exploitation.