In December 2025, I joined the Product Security group at Sansan as an intern. Our team is responsible for the security posture of Sansan's entire multi-product ecosystem. This includes our sales digital transformation solution Sansan and our accounting AX solution Bill One.
To maintain a high security bar without slowing down development, we focus heavily on "Security Design Reviews" (SDRs for short). SDRs are a proactive approach to identifying vulnerabilities before a single line of code is written. In this post, I’ll share how I built Hayami, an AI agent designed to automate this process.
The Role of SDRs in the SDLC
The primary goal of an SDR is to shift left. Identifying architectural flaws and security risks during the design phase is far cheaper than discovering the same flaw during implementation and having to backtrack to the design phase.
In a traditional Software Development Life Cycle (SDLC), this review sits between requirements definition and implementation.

Our review process is split into two parts:
1. Mechanical compliance: verifying the design against our 160+ internal security guidelines and industry standards (such as RFCs).
2. Contextual analysis: using reviewer intuition to identify complex logic flaws. This is the security equivalent of sniffing out "code smells."
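As a rough illustration, the mechanical compliance pass can be thought of as matching a design document against a set of structured guideline records. The `Guideline` shape and the keyword matching below are simplifying assumptions for this sketch, not Hayami's actual data model:

```python
from dataclasses import dataclass

@dataclass
class Guideline:
    guideline_id: str    # hypothetical internal ID, e.g. "SEC-001"
    keywords: list[str]  # topics that make this guideline applicable
    requirement: str     # what the design must satisfy

def applicable_guidelines(design_text: str, guidelines: list[Guideline]) -> list[Guideline]:
    """Return the guidelines whose topics appear in the design document."""
    text = design_text.lower()
    return [g for g in guidelines if any(k in text for k in g.keywords)]

# Toy guideline set and design text, for illustration only
guidelines = [
    Guideline("SEC-001", ["authentication", "login"], "Require MFA for admin access"),
    Guideline("SEC-002", ["s3", "object storage"], "Block public bucket access"),
]
design = "The service stores invoices in S3 and exposes a login page."
hits = applicable_guidelines(design, guidelines)
```

A real compliance check is of course richer than keyword matching, but the shape is the same: narrow 160+ items down to the ones a given design must be verified against.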
Security as a Bottleneck
As Sansan scales, our Product Security team has faced two challenges:
- Coverage gap: manually auditing 160+ guideline items for every design document is operationally intensive for a lean security team. This often led to a "best-effort" approach where reviewers relied on intuition to remember which guidelines to check.
- Velocity problem: with the rise of AI-assisted development, engineering output is accelerating. If security reviews don't scale at the same pace, they become a bottleneck that limits the organization's overall delivery.
To solve this, we needed a way to ensure both quality (no missed items) and quantity (high throughput) without adding headcount.
Evaluating Existing Solutions
Before building a custom tool, we evaluated the AWS Security Agent (Preview). While the tool provides impressive out-of-the-box features, it fell short of our specific needs*1:
- Operational overhead: mapping our 160+ evolving internal guidelines into AWS’s specific requirement format was a significant maintenance burden.
- Workflow integration: we needed a tool that could handle triage, using our separate internal guidelines to determine whether a design even requires a full review. We also wanted the system to integrate directly with our existing Slack-based request flow.
Hayami
We developed Hayami to bridge the gap between static guidelines and human intuition. Below is the basic composition of Hayami.

How it works:
- A developer initiates a review via Slack.
- Hayami retrieves the design document and relevant internal guidelines.
- The LLM processes the data through a context engineering layer.
- The analysis is delivered to the Product Security team for verification.
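The four steps above can be sketched end to end. Every function here is a stub standing in for a real integration (the Slack handler, document store, guideline lookup, and model API); the names and payloads are illustrative, not Hayami's real interfaces:

```python
# Hypothetical sketch of the review flow; all functions are placeholder stubs.

def fetch_design_document(doc_url: str) -> str:
    # Stub: a real version would call the document store's API.
    return f"[contents of {doc_url}]"

def fetch_relevant_guidelines(design: str) -> list[str]:
    # Stub: a real version would select from the 160+ internal guidelines.
    return ["Require MFA for admin access", "Encrypt data at rest"]

def build_context(design: str, guidelines: list[str]) -> str:
    # Context engineering layer: assemble the document and the applicable
    # guidelines into one structured prompt for the LLM.
    rules = "\n".join(f"- {g}" for g in guidelines)
    return f"Review this design against the guidelines:\n{rules}\n\n{design}"

def call_llm(prompt: str) -> str:
    # Stub: a real version would call the model provider's API.
    return "Finding: admin console lacks MFA (guideline violation)."

def run_review(slack_payload: dict) -> str:
    """End-to-end review triggered by a Slack request."""
    design = fetch_design_document(slack_payload["doc_url"])
    guidelines = fetch_relevant_guidelines(design)
    findings = call_llm(build_context(design, guidelines))
    return findings  # delivered to Product Security for human verification

report = run_review({"doc_url": "https://docs.example.com/design-123"})
```

The key design choice is the last step: the agent's output is always routed to the Product Security team for verification rather than sent straight back to developers.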
Benchmark Results
We benchmarked Hayami against actual design documents, using human expert reviews as the ground truth. The results demonstrated that Hayami outperformed generic agents in our specific environment:
| Metric | Hayami | AWS Security Agent |
|---|---|---|
| Accuracy | 95.8% | 72.1% |
| False Negatives (Misses) | 0% | 4.8% |
| False Positives (Noise) | 4.2% | 23.0% |
In security, false negatives are the critical failure mode. Hayami was deliberately tuned to be over-cautious: it is slightly noisy, but extra noise is far better than an overlooked security requirement.
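For readers who want to reproduce this kind of scoring, here is one way to compute the three metrics from per-item labels, comparing the agent's findings against the expert ground truth. The data below is a toy example, not our benchmark set; note that, as in the table, the false-negative and false-positive rates are taken over all reviewed items:

```python
def review_metrics(predicted: list[bool], actual: list[bool]) -> dict:
    """predicted[i]: agent flagged item i; actual[i]: expert flagged item i."""
    n = len(actual)
    correct = sum(p == a for p, a in zip(predicted, actual))
    false_neg = sum((not p) and a for p, a in zip(predicted, actual))  # missed issues
    false_pos = sum(p and (not a) for p, a in zip(predicted, actual))  # noise
    return {
        "accuracy": correct / n,
        "false_negative_rate": false_neg / n,
        "false_positive_rate": false_pos / n,
    }

# Toy example: 8 reviewed items, one spurious flag, no misses
predicted = [True, True, False, True, False, False, True, False]
actual    = [True, True, False, False, False, False, True, False]
m = review_metrics(predicted, actual)
```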
Additionally, by automating the initial triage and feedback loop (identifying missing information or obvious guideline violations), we estimate a reduction in total review time of up to 18.76 business hours, based on the average initial-response time for SDRs between July and November 2025.
Future Plans
Building Hayami was just the first step. Our roadmap for scaling security at Sansan includes:
- Automated continuous benchmarking to ensure LLM outputs remain accurate as guidelines evolve and different models are used.
- Codifying expert intuition into prompts so Hayami can detect more sophisticated architectural flaws.
- Shifting further left: with design-phase reviews automated, our team now has the capacity to work alongside product teams during the requirements phase.
Key Takeaways
The most important lesson from this project was that even with AI, the basics matter more than ever. Hayami's performance seemed to correlate with the quality of its input: both humans and Hayami perform better on well-written design documents with clear structure and an appropriate level of context.
By leveraging our internal SDR playbooks and codifying them into Hayami, we've successfully turned security from a manual gatekeeper into a scalable, automated enabler. The goal is to continue this automation effort so we can focus on the next frontier of product security.
*1: This evaluation was performed in December 2025 (Dec 3–24); some of the limitations noted may have been addressed since then.