Deep dive into Depth
Foreword: The writings below may not be objectively true; they are simply my opinions, formed through my experience building Depth.
Background (+ How I met my co-founder)
Back in 2019, I was a freshman at Cal Poly SLO, and in my first quarter I met Shehbaj. We were both from India, had a very similar outlook on our careers, and clicked. Over the next four years we stayed in touch, collaborating on hackathons and helping each other prep for internship interviews. That was enough to keep each other on our radars for the future.
Come October 2023, Shehbaj was in town and wanted to catch up. At the time I was at Tesla, working on the nitty-gritty firmware for energy systems. The work was interesting, but I felt a bit underutilized, and the big questions of where and how my career was evolving were ever-present. The details of the conversation are murky now, but I remember coming out of it with a stronger drive to figure out what was next.
That conversation eventually led us to a Bluesky hackathon at the YC building in SF in February 2024. It was a spontaneous event where Shehbaj and I built a Bluesky bot (named dreambot) that generated images with DALL-E and let people reprompt it to generate more images via the tweets (or, more aptly, skeets) on Bluesky. That experience really cemented how well we worked together and solidified our confidence that we could build together.
This was also around the time LLMs were getting good at interpreting images, and to an extent videos as well. One thing led to another, and we realized there was a gap in how session replay was set up in most analytics tools. They handled the collection and retrieval of sessions well, but fell short on delivering the last-mile insight to the end user.
Tesla Exodus (April 15th, 2024): this was a pivotal moment for me and for Depth. Previously I could only put in nights and weekends, so the prospect of putting in more time was exciting. We put our heads down and rushed to build an MVP.
But before diving deeper... what the hell was our MVP? Looking at an old archive, we didn't have a tagline, but our initial take was a website builder that could "improve", backed by real user data. What we actually had was a tool that could run a video through an LLM that understood user behavior and gave back insights for improvement.
This initial unlock was incredibly powerful, since processing user sessions with an LLM was brand new. No analytics product on the market, not Posthog, Heap.io, Amplitude, FullStory, or HotJar, had anything like it. Shehbaj had experienced this firsthand at an earlier startup, realizing how obtuse these tools can be when all an SMB wants is actionable insights it can shape its product with. Noticing all this, we set out to address it by building Depth.
(Note: a user session in this context can be understood as, essentially, a screen recording of a user navigating the website. This information gets recorded with a tool called rrweb, which takes snapshots of the browser's DOM and stitches them together with each action/mutation made on the website, ending up with a raw JSON dump that can be converted into a video.)
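To make that concrete, here's a minimal sketch of the kind of client snippet rrweb enables. The `/api/ingest` endpoint and the flush interval are my own illustrative assumptions, not Depth's actual SDK:

```typescript
import { record } from 'rrweb';

// Buffer of rrweb events: a full DOM snapshot followed by incremental
// mutations, scrolls, clicks, inputs, etc.
const buffer: unknown[] = [];

// Start recording; the returned function stops the recorder if needed.
const stopRecording = record({
  emit(event) {
    buffer.push(event);
  },
});

// Periodically flush the raw JSON events to an ingestion endpoint.
// The endpoint path and 10-second interval are illustrative assumptions.
setInterval(() => {
  if (buffer.length === 0) return;
  const batch = JSON.stringify(buffer.splice(0, buffer.length));
  navigator.sendBeacon('/api/ingest', batch);
}, 10_000);
```

On the other end, rrweb also ships a Replayer that turns that same JSON back into a visual replay, which is what ultimately gets analyzed.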
YC, Pear, and incubators alike (May 2024)
While not directly about Depth, I think it's valuable to document the thoughts that run through a founder's head when it comes to institutions like YC. Above any benefit an incubator can provide, at a very base level it establishes legitimacy, which is what most first-time founders crave. However, the more you run around chasing prestige, the quicker you realize it's just one of the many distractions that exist while building a startup.
That said, not everything is black and white: a benefit of applying to startup incubators is that it forces first-time founders (and engineers alike) to take a step back and drill down on the vision, business, strategy, and everything else that separates a fancy tool from a company.
Regardless, coming back to our experience: we got to interview with YC, where we spoke with Garry Tan and Aaron Epstein. They focused on our pricing model during the call, which, in complete transparency, wasn't something we had rigorously thought about at the time.
"ARE WE BLIND OR WE HAVE CONVICTION" (May 31st 2024):
This was an event on our calendar meant as a check-in to see what progress we had made on both the product and business fronts. We set it early on so we could decide whether it was worth it for either of us to keep putting in more time and energy. The way we evaluated it at the time wasn't the best, but interpreting the rejection calls from VCs still served as a weird positive signal, since we weren't even expecting the calls in the first place :p

In hindsight, I wish I had put more emphasis on making this a regular check-in on the business (not just the product). As a side note, for future projects I want to try a more radical approach I read about in this blog post: every 1-2 weeks you decide whether to extend the life of the project by +3, +7, +14, or +21 days, essentially forcing yourself to constantly evaluate your progress and be more time-conscious early on.
Technical learnings
This project was probably the most impactful one for improving my skills as a software engineer. Since we started from scratch, every technical decision we made early on had cascading effects later. The thoughts below might be a bit scattered, so buckle up!
Saga of scaling:
This is probably the first thing founders run into when they productionize their product. When we first started, our architecture didn't take into consideration the sheer number of user sessions we were processing, so everything was synchronous and very brittle.
While it served us well in our initial testing, the second we onboarded Taro, things broke immediately. Taro brought in a lot of traffic since it was a user-heavy website with around a thousand unique visitors every single day. We quickly realized that we needed an event-driven architecture and also needed to split up points of failure to ensure we didn't break the overall pipeline of processing sessions. This meant separating our ingestion from our processing of user sessions. (For context, ingestion is the collection of user sessions from the underlying customers of our users.)
Event Ingestion: Client-Side Event Collection → Kafka buffering/chunking of these events → Store in S3 for replay reconstruction & store in ClickHouse for structured analysis. We implemented chunking because a user might be on a page for any number of minutes, so we had to make sure we didn't prematurely stop ingesting or wait too long before terminating a session. We heuristically chose 5 minutes of inactivity as our buffer time before ending the session. In parallel, we had a PostgreSQL database that handled the source of truth for all the relational data (session state, but not individual events).
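For a sense of how the inactivity-based chunking could be wired up, here's a simplified sketch using kafkajs. The topic names, broker address, and in-memory last-seen map are assumptions for illustration, not our production setup (which also handled the S3/ClickHouse writers and restart-safe state):

```typescript
import { Kafka } from 'kafkajs';

const kafka = new Kafka({ clientId: 'depth-ingest', brokers: ['localhost:9092'] });
const producer = kafka.producer();

// Last-activity timestamp per session. Kept in memory here purely for
// illustration; real state would need to survive process restarts.
const lastSeen = new Map<string, number>();
const INACTIVITY_MS = 5 * 60 * 1000; // 5 minutes of silence ends a session

export async function start() {
  await producer.connect();

  // Sweep periodically: any session quiet for longer than the buffer
  // window is marked complete and queued for downstream processing.
  setInterval(async () => {
    const now = Date.now();
    for (const [sessionId, ts] of lastSeen) {
      if (now - ts > INACTIVITY_MS) {
        lastSeen.delete(sessionId);
        await producer.send({
          topic: 'session-complete',
          messages: [{ key: sessionId, value: sessionId }],
        });
      }
    }
  }, 30_000);
}

// Called for each batch of rrweb events arriving from the client SDK.
export async function ingest(sessionId: string, events: unknown[]) {
  lastSeen.set(sessionId, Date.now());
  await producer.send({
    topic: 'session-events', // consumed by the S3 and ClickHouse writers downstream
    messages: [{ key: sessionId, value: JSON.stringify(events) }],
  });
}
```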
Event Processing: User Session Complete → Use Kafka to queue up the session for AI processing → run LLMs → store embeddings of each session's analysis in Weaviate (vector DB) for future retrieval.
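Here's a rough sketch of what that processing consumer looked like conceptually. The model names, topic names, and the `loadSessionSummary` / `storeInsight` helpers are placeholders I'm assuming for illustration; the actual Weaviate write is stubbed out:

```typescript
import { Kafka } from 'kafkajs';
import OpenAI from 'openai';

const kafka = new Kafka({ clientId: 'depth-processor', brokers: ['localhost:9092'] });
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

export async function runProcessor() {
  const consumer = kafka.consumer({ groupId: 'session-analysis' });
  await consumer.connect();
  await consumer.subscribe({ topic: 'session-complete', fromBeginning: false });

  await consumer.run({
    eachMessage: async ({ message }) => {
      const sessionId = message.key?.toString();
      if (!sessionId) return;

      // 1. Load a reconstructed summary of the session from storage.
      const sessionSummary = await loadSessionSummary(sessionId);

      // 2. Ask an LLM for behavioral insights (model choice is illustrative).
      const analysis = await openai.chat.completions.create({
        model: 'gpt-4o',
        messages: [
          { role: 'system', content: 'You analyze user session replays and surface UX insights.' },
          { role: 'user', content: sessionSummary },
        ],
      });
      const insight = analysis.choices[0].message.content ?? '';

      // 3. Embed the analysis so similar sessions can be retrieved later.
      const embedding = await openai.embeddings.create({
        model: 'text-embedding-3-small',
        input: insight,
      });

      // 4. Persist the insight + vector (Weaviate in Depth's case).
      await storeInsight(sessionId, insight, embedding.data[0].embedding);
    },
  });
}

// Hypothetical helpers, stubbed for the sketch.
async function loadSessionSummary(sessionId: string): Promise<string> {
  return `Session ${sessionId}: <reconstructed event summary goes here>`;
}

async function storeInsight(id: string, text: string, vector: number[]): Promise<void> {
  console.log(`storing insight for ${id} (${vector.length} dims): ${text.slice(0, 80)}...`);
}
```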
Here's an old diagram I made when thinking about the system:
Session Ingestion Limits: How We Control LLM Processing Costs
We quickly realized that processing every session through LLMs would be financially unsustainable. With thousands of sessions per day (for a single customer), even at $0.01 per session, costs would climb to hundreds of dollars a month. Besides, processing 100 high-quality sessions provides more valuable insight than processing 1,000 random ones. All of this culminated in us needing a filtering system to process only the most valuable sessions.
Session Quality Scoring - Before any session reached the LLM, we evaluated its potential value (a rough sketch of such a scorer follows the list below):
- Error Indicators (Highest Priority)
  - Sessions with JavaScript errors
  - Sessions with network failures
  - Sessions with console warnings
  - Sessions with 404/500 errors
- User Behavior Patterns
  - Sessions with form submissions
  - Sessions with checkout processes
  - Sessions with high interaction density
  - Sessions with multiple page views
- Technical Metrics
  - Session duration (longer = more valuable)
  - Number of unique interactions
  - Page diversity
- Business Context
  - Sessions from paid users (higher priority)
  - Sessions from new users (higher priority)
  - Sessions during peak business hours
  - Time spent on high-value pages (determined from user input in the project setup flow)
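Here is that rough sketch of a heuristic scorer. The weights, thresholds, and field names are illustrative assumptions rather than Depth's actual values:

```typescript
// Metadata extracted per session before any LLM call (fields are illustrative).
interface SessionMetadata {
  jsErrorCount: number;
  networkFailureCount: number;
  consoleWarningCount: number;
  httpErrorCount: number; // 404s / 500s observed in the session
  hasFormSubmission: boolean;
  hasCheckout: boolean;
  interactionCount: number;
  uniquePageCount: number;
  durationSeconds: number;
  isPaidUser: boolean;
  isNewUser: boolean;
  duringPeakHours: boolean;
  timeOnHighValuePagesSeconds: number;
}

function scoreSession(s: SessionMetadata): number {
  let score = 0;

  // Error indicators carry the highest priority.
  score += s.jsErrorCount * 10;
  score += s.networkFailureCount * 8;
  score += s.httpErrorCount * 8;
  score += s.consoleWarningCount * 2;

  // User behavior patterns.
  if (s.hasFormSubmission) score += 5;
  if (s.hasCheckout) score += 7;
  score += Math.min(s.interactionCount, 50) * 0.2;
  score += s.uniquePageCount;

  // Technical metrics: longer, more diverse sessions tend to carry more signal.
  score += Math.min(s.durationSeconds / 60, 15);

  // Business context.
  if (s.isPaidUser) score += 5;
  if (s.isNewUser) score += 3;
  if (s.duringPeakHours) score += 2;
  score += Math.min(s.timeOnHighValuePagesSeconds / 60, 10);

  return score;
}

// Only the top N sessions per day get sent on to the LLM pipeline.
function selectForLLM(sessions: SessionMetadata[], budget = 100): SessionMetadata[] {
  return [...sessions]
    .sort((a, b) => scoreSession(b) - scoreSession(a))
    .slice(0, budget);
}
```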
To really nail down how impactful this was on our operations, here's an estimate of our savings:
| | Before Optimization | After Optimization |
| --- | --- | --- |
| Cost | 1,000 sessions/day × $0.01/session = $10/day ≈ $300/month | 100 high-quality sessions/day × $0.01/session = $1/day ≈ $30/month |
| Quality | Many low-value sessions processed | Higher-quality insights from better sessions |
"What is your x factor?":
Our x-factor was straightforward initially: we were going to be the first product that used LLMs in conjunction with session replay. While we acknowledged this, the reality is that it's difficult to execute on without getting biased by how the industry operates, or simply chasing feature parity with incumbents because it makes sales easier.
For example, a common request we received was to implement the capability to track what exact paths on a website a user had traveled. Instead of pushing ourselves to come up with a more AI-native/robust system that would allow many different identifiers to be indexed, we went with the path of least resistance, simply building a feature that manually tracked URLs much like any other platform. By playing it safe in the interim, we ultimately diminished our x-factor and increased our overhead in what we actively maintained.
This also meant I should have questioned harder which engineering x-factors I actually wanted to solve. Was it pertinent that we ran our own Kafka deployment on DigitalOcean instances, or would it have been better to just offload that work to a serverless cloud solution? While the argument can be made either way, depending on where you draw the line and what bottlenecks your company is facing, it fundamentally comes down to where you choose to spend your engineering effort.
At an early stage, though, effort that makes the actual product incredible is 100x more valuable than the cost optimizations or control you get from owning the scaffolding that holds everything in place.
Closing thoughts:
While there were plenty more examples and moments along the way that ultimately led to the end of Depth, I wanted to highlight the ones I look back on the most. Regardless, from being exposed to the nitty-gritty challenges to actually hearing from our users, the experience of building Depth was incredibly rewarding. I'm grateful to our users for taking a chance on us and to Shehbaj for building with me! Moving forward, I hope to keep building and tinkering with tech no matter where it leads me, while making sure I don't forget the lessons I learned with Depth :)
Fun links
- Product Hunt Launch https://www.producthunt.com/posts/depth-4?utm_source=other&utm_medium=social