Overview
Ramp’s engineering team shares lessons from building AI agents for finance automation, covering their journey from multiple specialized agents to a unified framework. The key insight is transitioning from building a thousand agents to creating one agent with a thousand skills, which requires significant infrastructure and cultural changes to successfully deploy AI products at scale.
Key Takeaways
- Start small and iterate - Begin with constrained problems like coffee expense approvals rather than trying to automate all of finance at once, allowing you to learn what context and capabilities are truly needed
- Build comprehensive evaluation systems early - Create ground truth datasets through cross-functional labeling sessions and maintain both offline and online evaluation metrics to catch regressions when adding new capabilities
- Users are often incorrect - Don’t assume human decisions are the gold standard; finance teams may approve expenses incorrectly due to laziness, lack of policy knowledge, or trust, requiring you to define your own correctness criteria
- Infrastructure abstractions enable speed - Creating internal tooling for model switching, batch processing, and cost tracking allows product teams to focus on user value rather than technical implementation details
- Cultural shift toward judgment over coding - As AI handles more implementation work, engineering success depends increasingly on understanding users, making good design decisions with incomplete information, and maintaining momentum through complex projects
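The evaluation takeaway above can be made concrete with a small offline regression check. This is only a minimal sketch under stated assumptions: the names `LabeledExpense`, `evaluate`, `toy_policy_agent`, and `check_no_regression` are hypothetical, the dataset is made-up illustrative data, and none of it is taken from Ramp's actual systems. It shows the general idea of scoring an agent against labels that a team defined itself, rather than against whatever users happened to approve.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class LabeledExpense:
    """One ground-truth example (hypothetical schema for illustration)."""
    merchant: str
    amount: float
    label: str  # "approve" or "reject", agreed on in a labeling session


def evaluate(agent: Callable[[LabeledExpense], str],
             dataset: Sequence[LabeledExpense]) -> float:
    """Return the fraction of examples where the agent matches the label."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    correct = sum(1 for example in dataset if agent(example) == example.label)
    return correct / len(dataset)


def check_no_regression(new_accuracy: float,
                        baseline_accuracy: float,
                        tolerance: float = 0.0) -> bool:
    """True if the new accuracy is within `tolerance` of the baseline."""
    return new_accuracy >= baseline_accuracy - tolerance


def toy_policy_agent(expense: LabeledExpense) -> str:
    """Stand-in for a real agent: approve anything at or under 25.00."""
    return "approve" if expense.amount <= 25.0 else "reject"


# Made-up examples; a real dataset would come from labeling sessions.
GROUND_TRUTH = [
    LabeledExpense("Coffee shop", 4.50, "approve"),
    LabeledExpense("Coffee shop", 80.00, "reject"),
    LabeledExpense("Steakhouse", 20.00, "approve"),
    LabeledExpense("Airline", 400.00, "approve"),
]

accuracy = evaluate(toy_policy_agent, GROUND_TRUTH)
```

Running the same `evaluate` call before and after adding a new capability, and gating the change on `check_no_regression`, is one simple way to catch the regressions the talk warns about. Online metrics would need separate instrumentation that this sketch does not cover.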
Topics Covered
- 0:00 - Introduction to Ramp and AI Strategy: Overview of Ramp as a finance platform and introduction to their agent-building approach, including the paradigm shift from multiple agents to a unified framework
- 1:30 - Simple Expense Use Case: Walking through a coffee purchase example showing how Ramp automates transaction processing from card tap to receipt classification
- 3:30 - Paradigm Shift in AI Architecture: Lesson learned about moving from building a thousand agents to building a single agent with a thousand skills, which requires simplifying the stack
- 8:00 - Policy Agent Deep Dive: Detailed look at their most popular agent that reviews expenses against company policies, including real examples of approvals and rejections
- 12:30 - Building Process and Iterations: How they started with simple coffee expenses and gradually added complexity, tools, and context through iterative development
- 16:00 - Evaluation and Ground Truth: Importance of defining correctness criteria, building evaluation datasets, and conducting cross-functional labeling sessions
- 20:30 - Key Learnings and User Behavior: Insights about model changes, user trust building, and how customers moved from reviewing suggestions to enabling auto-approvals
- 23:00 - AI Infrastructure at Scale: Technical infrastructure including their AI service layer, tool catalog with hundreds of shared tools, and cost tracking across teams
- 28:00 - Ramp Inspect - Internal Coding Agent: Their background coding agent that handles 50% of production PRs, with multiplayer capabilities and full development environment access
- 32:00 - Engineering Culture Evolution: How AI changes engineering roles from coding focus to judgment, context, and user understanding, with predictions for future team dynamics
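The "one agent with a thousand skills" idea and the shared tool catalog mentioned in the topics above can be pictured as a single registry that every skill is added to. The following is a rough sketch, not Ramp's implementation: `ToolCatalog`, `lookup_policy_limit`, `check_expense`, and the policy limits are all hypothetical, and the per-tool call counter is only a simplified stand-in for real cost tracking.

```python
from typing import Callable, Dict


class ToolCatalog:
    """Hypothetical shared registry: one place where all skills live."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., object]] = {}
        self.call_counts: Dict[str, int] = {}  # crude proxy for usage/cost

    def register(self, name: str) -> Callable:
        """Decorator that adds a function to the catalog under `name`."""
        def decorator(fn: Callable[..., object]) -> Callable[..., object]:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = fn
            self.call_counts[name] = 0
            return fn
        return decorator

    def call(self, name: str, **kwargs: object) -> object:
        """Invoke a registered tool by name and record the call."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        self.call_counts[name] += 1
        return self._tools[name](**kwargs)


catalog = ToolCatalog()


@catalog.register("lookup_policy_limit")
def lookup_policy_limit(category: str) -> float:
    # Made-up limits for illustration only.
    limits = {"coffee": 10.0, "travel": 500.0}
    return limits.get(category, 0.0)


@catalog.register("check_expense")
def check_expense(category: str, amount: float) -> str:
    # A skill can reuse other shared tools through the same catalog.
    limit = catalog.call("lookup_policy_limit", category=category)
    return "approve" if amount <= limit else "reject"


decision = catalog.call("check_expense", category="coffee", amount=4.50)
```

Adding a new capability in this design means registering one more function rather than standing up a separate agent, which is the simplification the talk describes.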