Content safety & abuse reporting
Effective April 24, 2026
Prohibited content
- Minors in any depiction. No sexual, suggestive, romantic, or otherwise intimate content involving anyone who is, could be perceived as, or is described as under 18 — including drawn, illustrated, animated, or AI-generated likenesses.
- Non-consensual likenesses of real people. No depictions of celebrities, public figures, private individuals, deceased persons, or anyone identifiable without their explicit written consent — and no intimate content of real people under any circumstances.
- Illegal content. Material that depicts, facilitates, or promotes criminal acts, including violence against identifiable individuals.
- Harassment & doxxing. Content created to harass, threaten, dox, or defame an identifiable person.
- Extremism, hate, self-harm. Content that promotes violent extremism, hatred against protected classes, or instructions for self-harm.
Four-layer safety framework
Four independent enforcement layers run automatically on every relevant request. If one layer fails, the others still block. Controls fail in the safer direction: if the prompt-analysis service is unreachable, the affected request is blocked rather than permitted. A minimal sketch of how these layers compose appears after the list below.
- Layer 1 — Prompt analysis. Every text prompt — across every generation tool, the REST API, and any future training flow — is scanned in three stages: celebrity pattern matching, deterministic minor-phrase screening, and an AI-powered classifier that reads context. Rejected prompts return a clear message before anything reaches the model. Server-side only; cannot be bypassed via direct API submission.
- Layer 2 — Image classifier. Every reference image uploaded to a character and every image returned from a generation call is run through a purpose-built minor-detection classifier. Flagged uploads are rejected before reaching providers or storage; credits for blocked outputs are refunded automatically.
- Layer 3 — Storage-layer CSAM scanning. All stored images — generated outputs and uploaded references — run through Cloudflare’s CSAM Scanning Tool, matching against fuzzy-hash databases maintained by NCMEC and partner child-safety organizations. Detections are reported to NCMEC’s CyberTipline per 18 U.S.C. § 2258A. Scanning operates independently from application servers.
- Layer 4 — Provider filters. Every upstream model provider applies its own filters before returning generations. These are treated as belt-and-suspenders protection — we never rely on provider filters as the primary safeguard.
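To make the fail-closed behavior concrete, here is a minimal sketch in TypeScript of how independent layers might compose. It is illustrative only; the type and function names are hypothetical and not part of any published API. What it captures is the rule that an unreachable or failing check resolves to a block, never a pass.

```typescript
// Hypothetical sketch of fail-closed layer composition; names are illustrative only.
type Verdict = "allowed" | "blocked";

// Each layer returns a verdict for a prompt; errors and timeouts surface as exceptions.
type Layer = (prompt: string) => Promise<Verdict>;

async function runLayerFailClosed(layer: Layer, prompt: string): Promise<Verdict> {
  try {
    return await layer(prompt);
  } catch {
    // Service unreachable or errored: fail in the safer direction.
    return "blocked";
  }
}

// Layers run independently; a single "blocked" verdict rejects the request
// before anything reaches the generation model.
async function moderatePrompt(prompt: string, layers: Layer[]): Promise<Verdict> {
  const verdicts = await Promise.all(
    layers.map((layer) => runLayerFailClosed(layer, prompt))
  );
  return verdicts.every((v) => v === "allowed") ? "allowed" : "blocked";
}
```

Because each layer is evaluated independently, an outage or error in one check cannot mask a block from another.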
Audit & observability
Every scan across every layer writes a ModerationEvent audit row with: scan type, triggering layer, verdict, target asset, flags, raw provider response, and timestamp. Records are retained for seven years to support compliance obligations under 18 U.S.C. § 2258A and law-enforcement preservation requests. An internal dashboard surfaces these events for real-time investigation and trend analysis.
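For illustration, a record carrying the fields listed above might look like the following sketch. The interface and field names are hypothetical; the production schema is internal and may differ.

```typescript
// Illustrative shape only; the production ModerationEvent schema is internal and may differ.
interface ModerationEvent {
  scanType: "prompt" | "image_upload" | "image_output" | "storage_csam";
  layer: 1 | 2 | 3 | 4;          // which enforcement layer produced the verdict
  verdict: "allowed" | "blocked";
  targetAssetId: string;          // prompt, upload, or generated-image identifier
  flags: string[];                // e.g. matched patterns or classifier labels
  rawProviderResponse: unknown;   // verbatim response from the scanning service
  createdAt: string;              // ISO 8601 timestamp; records retained for seven years
}
```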
How to report violating content
When submitting a report, include the URL or asset ID of the content, a description of why it violates policy, and any supporting evidence (screenshots, timestamps, account names). We target a five-business-day resolution for standard reviews and take immediate action on clear violations.
For suspected CSAM specifically, you may also report directly to NCMEC at report.cybertip.org. For active human trafficking or immediate danger, contact the U.S. National Human Trafficking Hotline at 1-888-373-7888 or text “HELP” to 233733.
Enforcement
- Violating prompts are blocked at the gate — no credits are deducted, no content is generated.
- Violating uploaded images are rejected at the reference step and are not served on the CDN.
- Confirmed violations may result in immediate account termination, a permanent IP ban, and reports to law enforcement where required by law. Accounts terminated for child exploitation content are ineligible for refunds.
See also our Content Removal Policy for the takedown workflow and appeal process.
Appeals
If you believe content was removed or an action was taken in error, submit an appeal to [email protected] with the subject line “Appeal” and include the asset ID, the action taken, and the reason you believe the decision was incorrect. Appeals are reviewed by someone other than the person who made the original decision.