
SafetyKit

AI agents for risk, compliance, and safety

SafetyKit's AI agents do the dirty work for big tech and finance companies so humans don't have to. Today, companies rely on armies of low-paid, outsourced reviewers to protect their platforms from fraud, harassment, and crime. It’s super important work, but the jobs are difficult and sometimes traumatizing, and it’s impossible to keep up. Our mission is to end this manual toil.

We learned this domain first-hand at Stripe and Airbnb, where we first built software for human reviewers and then built models to replace them. We’re a group of engineers from Stripe, Meta, Netflix, MIT, and Princeton who love language models and unironically love B2B SaaS. We know the economy is going to be rearranged over the next 4 years and we want to be in the driver’s seat.

We build vertical AI that works and have some of the world’s largest companies as paying customers. Patreon, Eventbrite, Upwork, Character.ai, Substack, Faire, a $10B+ marketplace, a $100B+ payments company, and more rely on SafetyKit in production.
Active Founders

David Graunke, Founder

David led engineering for risk reviews at Stripe for fraud, credit, content moderation, and financial crimes. He built the policy and workflow engine that scaled Stripe from internal reviewers to thousands of outsourced vendor agents.

Steven Guichard, Founder

Steven is a cofounder of SafetyKit, working to replace human risk reviewers with language models. Previously he worked at Carbic as the first software engineer and later as the CEO. There he helped build ultrasonic flow sensors which were installed on oil pipelines and offshore rigs around the world. Steven also cofounded Thomas Street, a software design and engineering consultancy, where he worked with companies like Cisco, Roche, and DirecTV.
Jobs at SafetyKit
San Francisco, CA, US
$150K - $200K
Any
SafetyKit
Founded: 2023
Batch: Summer 2023
Team Size: 10
Status: Active
Location: San Francisco
Primary Partner: Gustaf Alstromer
Company Launches
SafetyKit — AI-powered Trust and Safety automation

The problem

Trust and Safety teams at large companies spend tens of millions of dollars on human reviewers. These reviewers decide what is or isn’t allowed on the platform. This includes content moderation, but also things like checking Airbnb listings for discriminatory language or reviewing Stripe accounts for sanctions violations. The reviewers are outsourced agents following prescriptive workflows.

Managing this workforce is painful and expensive. Workflow changes take months to deploy, quality monitoring is inaccurate and ad-hoc, and tooling improvements and automation require very scarce engineering resources.

Companies use human reviewers because they’re flexible: ML and traditional automation are expensive and rigid, and require engineering resources that T&S teams don’t have. But humans aren’t particularly accurate reviewers (accuracy is frequently around 70%).

Our solution

We use GPT-4 and other language models to directly interpret and apply the workflows that would otherwise be performed by humans. GPT-4 performs well at these tasks, but letting T&S teams run it on thousands or millions of pieces of content safely and confidently requires work.

We’ve built a policy manager/editor that makes T&S teams feel like they’re editing a policy or workflow in Google Docs. We never want our T&S users to feel like they’re prompt engineering!

Users can add policy definitions, pick out the content signals that matter to them, and build up automated rules based on those signals.
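To make the idea of rules built on signals concrete, here is a minimal sketch. It is not SafetyKit's actual data model: the `Rule` type, the `decide` function, and the signal names are all hypothetical, and the signals are assumed to arrive as simple true/false flags.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical types for illustration only; not SafetyKit's real data model.
Signals = dict[str, bool]


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[Signals], bool]  # inspects the extracted signals
    action: str                           # e.g. "reject" or "escalate"


def decide(signals: Signals, rules: list[Rule], default: str = "approve") -> tuple[str, str]:
    """Return (action, reason) for the first rule whose condition matches."""
    for rule in rules:
        if rule.condition(signals):
            return rule.action, rule.name
    return default, "no rule matched"


# Signal names below are made up for the example.
rules = [
    Rule("discriminatory language", lambda s: s.get("discriminatory_language", False), "reject"),
    Rule("possible sanctions match", lambda s: s.get("sanctions_match", False), "escalate"),
]

decision = decide({"discriminatory_language": False, "sanctions_match": True}, rules)
```

Rules are checked in order and the first match wins, so the returned reason names the rule that fired.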

We slice and dice the input document into a series of prompts that we run through a suite of LLMs and image models. Users can then run their policy across examples to see how it performs.
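A rough sketch of what slicing a document into prompts and scoring a policy against labeled examples could look like. Everything here is an assumption made for illustration: the paragraph-based chunking, the prompt wording, and the `ask_model` callback, which stands in for a real model call and is faked with a keyword check.

```python
from typing import Callable


def build_prompts(document: str, definitions: list[str], max_chars: int = 500) -> list[str]:
    """Group paragraphs into chunks of roughly max_chars, then pair each chunk
    with each policy definition. A single oversized paragraph stays whole."""
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in document.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return [
        f"Policy definition: {d}\n\nContent:\n{c}\n\nDoes the content match? Answer yes or no."
        for c in chunks
        for d in definitions
    ]


def violates(document: str, definitions: list[str], ask_model: Callable[[str], bool]) -> bool:
    """Flag the document if any (chunk, definition) prompt comes back positive."""
    return any(ask_model(p) for p in build_prompts(document, definitions))


def accuracy(examples: list[tuple[str, bool]], definitions: list[str],
             ask_model: Callable[[str], bool]) -> float:
    """Fraction of labeled examples where the policy decision matches the label."""
    correct = sum(violates(doc, definitions, ask_model) == label for doc, label in examples)
    return correct / len(examples)


def fake_model(prompt: str) -> bool:
    # Stand-in for a real LLM call: a simple keyword check.
    return "wire me money" in prompt.lower()


definitions = ["Asks the buyer to pay outside the platform"]
examples = [
    ("Lovely flat near the park.", False),
    ("Great deal!\n\nWire me money off-platform.", True),
]
score = accuracy(examples, definitions, fake_model)
```

Swapping `fake_model` for a real model client would leave the rest of the sketch unchanged.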

Explainability and decision-making paper trails are super important to our users. Traditional ML and human review fall pretty short here. SafetyKit gives our users clear reasoning for each decision.

We think that this transparency, plus built-in quality monitoring and a much faster feedback loop, will make SafetyKit more reliable and precise than human reviewers.

We provide a simple API for evaluating content against SafetyKit policies, along with prebuilt integrations for Salesforce and Zendesk.

Right now we evaluate policies over text and image content and we’re working on audio and video support.

Who are we?

We’re Steven, David, and Alex.

David and Alex worked together at Stripe on a platform that breaks big, complicated policies into small steps that human reviewers are good at.

Alex did the same thing at Airbnb before joining Stripe, focusing on marketplace risk and offline safety.

Now we’re using AI to give every company the same scale and precision.

How you can help!

We want to talk to Trust and Safety teams! Please email us at founders@getsafetykit.com! Beyond T&S, if you have repetitive human decisions you want to automate in Customer Service, Legal Ops, or another back-of-house function, we’d love to talk!