The Two-Way Highway Of Drift’s Data Engineering And Data Analytics Teams with Arun Venkateswaran and Kyle Thelemann

This is a podcast episode titled, The Two-Way Highway Of Drift’s Data Engineering And Data Analytics Teams with Arun Venkateswaran and Kyle Thelemann. The summary for this episode is: Operators are often referred to as unsung heroes. But there's a group of unsung heroes within the unsung heroes: the Data Team. For companies in hypergrowth, the volume of data requests and the variety of data sources to access can be daunting. So we've turned to Drift's own Data team for help in navigating the world of big data. Together, Arun Venkateswaran, Senior Data Engineer, and Kyle Thelemann, Senior Manager of Business Intelligence, form the two-lane highway of Drift’s data team. In our conversation, we talk about how they’ve thoughtfully crafted their partnership, learn about the four V’s of big data, and talk through a real-life example of how they translated a tough business problem from something very technical to something simple and digestible. Like this episode? Be sure to leave a ⭐️⭐️⭐️⭐️⭐️ review and share the pod with your friends! You can connect with Sean on Twitter @Seany_Biz @HYPERGROWTH_pod

Sean Lane: Hey everyone, welcome to Operations, the show where we look under the hood of companies in hypergrowth. My name is Sean Lane. You often see articles about operations, or posts on LinkedIn that talk about operations folks. They talk about ops people as the unsung heroes in an organization. But I've got to tell you, I think that there's a group of unsung heroes within the unsung heroes. And those people are the members of the data team. For companies that are in hypergrowth, the sheer volume of data that's available and the number of requests coming in to access that data can be daunting. And from the outside looking in, it can seem pretty simple: the data exists, I want it, go get it. But from the inside, from under the hood, it can be a lot more complicated. Which is why I am super excited about our two guests today from Drift's own data team, Senior Data Engineer Arun Venkateswaran and Senior Manager of Business Intelligence Kyle Thelemann. Together, Arun and Kyle form this two-lane highway of Drift's data team. In our conversation today, we're going to talk to them about how they thoughtfully crafted their partnership. We're going to learn about the four V's of big data, and we're going to give you a real-life example of how they translated a tough business problem from something pretty technical to something so digestible that even I can understand it. And for bonus points, I'm going to learn how big a terabyte of data is. But to start, let's look at Arun and Kyle's roles on the team, Data Engineer and Business Intelligence. What exactly do those mean, and what exactly do they do?

Arun Venkateswaran: So when you say Data Engineer, you're basically looking at a persona that does software engineering work, but is not building features that are directly exposed to your customer. You are kind of the internal engineering team that's enabling other teams to be efficient in how they monitor their work, how they push their data out to you, and how you get that data into a central place where they can easily access their own data and monitor how their performance goes. Within the data engineering space, you need to not only have the product knowledge... know the product's ins and outs. You ought to know how the other departments correlate too. How does Customer Success measure their performance? How does Marketing measure their performance? How does Sales measure their performance? You end up being the central engineering department that caters to all these departments. On top of that, as a Data Engineer you should also have a little bit of background in Business Intelligence and Data Science work. If you're interfacing with BI and Data Science teams, you want to be able to speak their language, and also understand how to get their requirements satisfied too. It is a mix of business and engineering disciplines to be a Data Engineer. It is definitely a valuable field in this current day and age.

Sean Lane: Yeah, and also, I feel, a pretty rare combination of skill sets too. We can talk more about that. Kyle, would love to hear more from you on what your specific role means inside of the Ops team at Drift, and how you think about your role in terms of partnering with what Arun does.

Kyle Thelemann: Arun said a lot of things that are similar to how I think about the BI and Analytics role at Drift, both being a combination of an engineering and a business type of role. I would say BI and Analytics leans a little bit more towards the business side. Arun can correct me if I'm wrong, but Data Engineering leans a little bit more toward the engineering side. Ultimately there's this Venn diagram, a central part that Arun and I overlap within, which is... And Arun said this, a centralized data warehouse that we both kind of play within. So my role is taking all the hard work and data processing that Arun and his team do to get data from the product and flow it into a data warehouse, and then making that data readily accessible for our business teams, like Marketing, Customer Success, Sales, and Product, so they can action off of that data. So it's a little bit more forward-leaning toward the business side, but nonetheless, I still need to be able to speak Arun's language from a SQL standpoint, and Arun can speak my language from a Business Intelligence tools standpoint. So there's quite a bit of overlap, and Arun and I do end up working on things together and having some of that overlap. It's just that the end-user goals are slightly different, with my end-users and my core customers being the business teams I interact with directly.

Sean Lane: I definitely want to come back to some of that end-user stuff, because I think that's going to be super important for people to understand. Before we do that, I want to go back to the center of the Venn diagram that you mentioned, Kyle. If I'm out there listening to this, and I'm either on the Data Engineering side or on the Analytics and BI side, how do you guys think about the center of that Venn diagram, and how you work best together? Because you guys have been working together for a little while now, and you've kind of felt it out. But I would imagine, as you were first getting started, you would have to figure out exactly where that crossover point begins and ends in that Venn diagram. Is that fair?

Arun Venkateswaran: I would agree with that. I think the place in the Venn diagram where the intersection happens is the data warehouse and our front-facing reporting tools. Yeah, I do look at the back-end engineering, getting the data to its place, to the data warehouse. But from there, these are all raw data streams; there's no way to make sense of these raw data streams unless we build other tables, or other layers, on top of this raw data that make sense of it. This is where I would say Kyle and I work hand in hand. I deal with org facts, or conversation facts, where we roll up each and every raw piece of information by day and customer, or by day and conversation. I'll let Kyle add more on the specific details that we've built up within these tables too.
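The rollup Arun describes, collapsing a raw event stream into by-day, by-customer rows, can be pictured as a simple group-by. A minimal sketch in Python (the real tables live in Snowflake and are built in SQL; the event fields and metric names here are hypothetical, not Drift's actual schema):

```python
from collections import defaultdict

def rollup_by_day_and_customer(raw_events):
    """Collapse a raw event stream into one row per (day, customer) --
    the shape of a simple conversation-facts table."""
    facts = defaultdict(lambda: {"conversations": 0, "meetings_booked": 0})
    for event in raw_events:
        # One rolled-up row per unique (date, customer) pair.
        row = facts[(event["date"], event["customer_id"])]
        row["conversations"] += 1
        row["meetings_booked"] += 1 if event.get("meeting_booked") else 0
    return dict(facts)
```

Downstream layers, and people, then read these rolled-up rows instead of the raw stream.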

Kyle Thelemann: Totally agree with what Arun just said. I think even a practical example helps a little bit here. Sean, I mentioned this to you earlier: this two-lane highway concept of myself working with other folks within the Operations team, as Arun does too. They're gathering requirements from the actual business teams on reporting, data needs, insights that they're hoping to gather from data. They're bringing that to myself in a lot of cases, and asking for me to service that data, whether that be in our BI tool, or whether it just be in a raw form. I'm going back, like Arun said, to our central location, which for us is our data warehouse, and once I've seen what's there, I'm trying to make sense of it, I'm trying to understand the data model to try to surface that data and satisfy that need. A lot of times on a daily basis, either I don't know where that data exists, or we don't have it yet, or I need it in a different format, a different structure. That's when Arun and I, and Arun's team, that's when we really come together, where it's, "Okay, we have a business need over here. We don't have it readily available right off hand to satisfy that." We have to come together and make that data available, in the right structure, in the right place, so that we can efficiently give it back to the business teams, so that they can start using those insights and making those business decisions with that data.

Sean Lane: Okay, pay particular attention to the two-lane highway analogy that Kyle is outlining here. This highway, this partnership between data engineering and data analytics, is really the focal point of our entire conversation. When Kyle talks about getting the data into the hands of the end-users, I'm one of the consumers of that data within Drift. I'm at the end of the highway. But for me, and for all of us in Ops, it's important to understand what's happening along that highway, and to be able to understand the core components of how data and analytics teams think about their function. In my preparation for our conversation, I learned about something called the four V's of big data. Really, those four V's are the core components. They're the building blocks for these teams. Luckily, you don't have to listen to me try to explain it; Arun is here for that.

Arun Venkateswaran: So I could start with the first V, that is volume. When I started at Drift around two years ago, I actually measured our total warehouse size; we were at one terabyte. So if you put together all of our conversations, all of our Salesforce data, everything that happened at Drift at that time, it was one terabyte. Now, within these two years, I just checked right before we started talking, we're at 29 terabytes in our data warehouse. In terms of hypergrowth, we're talking around 1,400% year-over-year growth. In terms of our data, that's exploding, and how much each team is tracking, each team has their own SaaS tool. Customer Support has their own satisfaction tool that they use. How would we get that data? So all these contribute to the volume, and that's how much explosive growth we had. Now the next is the variety. Again, as I said, each and every team has their own types of different data that they bring into the system. Salesforce is more transactional, whereas our Drift chatbot is essentially conversation text. Whatever happens in the Drift chat is essentially a blob of text. You want to see, "Oh, how many times has an end-user said pricing, or talked about pricing in their chat, or in their conversation?" To be able to analyze all of this, this indicates the variety of types of data that we deal with, from very highly structured within the Salesforce domain, to very loosely, semi-structured data within what happens in the chatbot itself. The next one would be veracity. Do we also want to check, are we getting the right data? Are we making sure we're double-checking the [inaudible] that comes into the system? So we've built a lot of checkpoints where we ensure that if we're getting this piece of data, we're running a bunch of these checks to see if there's an old data load. Are there pieces of information that are outside of the average or median values we look at? We do put in checks for that.
This enables trust within the data team; we can also go ahead and say, "Okay, this is what we believe at this point in time is the truth." The last one would be velocity. I would say, like I said, each and every chat that happens anywhere in the world, we're getting that data [inaudible] into our back-end with pretty much a two- or three-minute latency delay. This enables us to analyze or look at computations almost in real time. Again, as an engineer you want to be able to know, "Hey, is the chatbot working properly? Are there any issues in terms of transactions? Are we getting the right level of transactions coming in?" So these are the four V's that we've kind of embraced and have gone towards in building this data warehouse and the infrastructure that supports the warehouse.
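A veracity checkpoint of the kind Arun describes, comparing a new load against the median of recent loads before anyone builds on top of it, might look something like this minimal sketch. The 50% tolerance is an illustrative threshold, not Drift's actual rule:

```python
from statistics import median

def veracity_check(recent_daily_counts, todays_count, tolerance=0.5):
    """Flag a daily load whose row count falls far outside the median
    of recent loads -- a stale or broken load, not a trend."""
    baseline = median(recent_daily_counts)
    if abs(todays_count - baseline) > tolerance * baseline:
        return f"volume anomaly: {todays_count} rows vs recent median {baseline}"
    return None  # load looks normal
```

Running a check like this on every ingestion is what lets the team say "this is what we believe, at this point in time, is the truth."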

Sean Lane: That's incredible. And Kyle, I want to dig into veracity specifically with you, because if we're going from one to 29 terabytes of data, certainly a lot of Drift conversations are contributing to that. But the variety that Arun is talking about makes veracity harder, too. As we have increased our variety, one of the things that I can imagine, if I extend the highway metaphor a little bit, is that where you and Arun meet, or where any of us inside of the organization meet, is on the definitions of some of these things. That to me is where this stuff can break so quickly, because you could end up with 17 different versions of what is actually the same metric, but nobody really knows the true definitions of what's going on. So as you've come onto the team, and then ultimately had to learn it yourself, and then also expand upon what was already there when you got here: how have you kept track of all of the different definitions, so that you can feel as confident as Arun was saying about the veracity of the data we're producing?

Kyle Thelemann: Yeah, and that's a very real example of a problem that you can face when you have the growth that Arun spoke to. A lot of definitions for a lot of the key metrics that we look at, they're all very close, but they're all very different at the same time, or can be interpreted really differently. So the unsexy thing that helps cut through this is documentation. Since coming onboard here at Drift, I've implemented some data dictionaries, user guides, and other forms of documentation that break down the different definitions that our business teams are using. Whether that be data that we're piping into Salesforce so that our sales and customer success teams can use it and look it up on the fly, or within a Looker dashboard, if we're aggregating data, making sure that there's documentation both from a business-user standpoint, so that they know what a conversation really is, but then also from a technical standpoint too. So that if somebody like an Arun, or down the line somebody else in the Analytics or BI role, were to come in, they could actually see the SQL behind it that powers that. So that we could both troubleshoot and also help clarify the questions we get all the time: what is this metric actually showing? What does average over the last week mean? Is it seven days? Is it six days? Is it a grouping by week? You can slice and dice it a million different ways. So having that documentation there, and having it up to date, is really critical to keeping all of these things organized. So it's the documentation as a data catalog, it's all of those things that aren't, like I said, the sexiest things to do. It's really, really important to keeping your head straight.
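The data-dictionary documentation Kyle describes can be as simple as one structured record per metric, pairing the business definition with the SQL that powers it. A hypothetical entry as a sketch; the metric name, grain, owner, and query are illustrative, not Drift's actual definitions:

```python
# One data-dictionary entry: business definition and technical definition
# side by side, so both a marketer and an analyst can answer the question
# "what does this metric actually show?"
data_dictionary = {
    "conversations_last_7_days": {
        "business_definition": (
            "Distinct chat conversations in the trailing seven calendar "
            "days -- a rolling window, not a calendar-week grouping."
        ),
        "grain": "one row per customer per day",
        "owner": "BI & Analytics",
        "sql": (
            "SELECT customer_id, COUNT(DISTINCT conversation_id) AS conversations\n"
            "FROM conversation_facts\n"
            "WHERE conversation_date >= CURRENT_DATE - 7\n"
            "GROUP BY customer_id"
        ),
    },
}
```

The point is that "average over the last week" stops being ambiguous once the window type and grain are written down next to the query.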

Sean Lane: Okay, let us recap the four V's: volume, variety, velocity, and veracity. Before we go on though, quick side note. Arun talked about the size of Drift's data warehouse exploding from one terabyte to 29 terabytes. Data sizes this big are hard to imagine. But I did a little bit of research, and here's some context. Every single book scanned into Google Books, combined, is about 40 terabytes of data. The entire Library of Congress is 74 terabytes of data. So 29 terabytes, put quite simply, is a lot. Okay, anyways, back to our conversation. To keep up with this hypergrowth that Arun and Kyle are talking about, and all that data, veracity was the four V's component that really jumped out at me as the one that everyone on the team can play a role in maintaining. Maintaining that veracity can often mean the unsexy documentation and definition work that Kyle alluded to, because here's what's inevitably going to happen. Someone is going to come to your data or analytics team with a new ask. There are really just two options at that point. Option one, the data that they're looking for already exists, and someone has done the work to pull it in the past. Or option two, it doesn't, and it's a net-new ask. Without good documentation, your team likely won't know the answer to that question, and could spend a bunch of time duplicating work that someone else has already done. So I wanted to understand how Kyle reacts when this exact scenario comes up. What's his order of operations for looking at what is already available to him, and deciding the best, most efficient course of action from there?

Kyle Thelemann: I'd say the first thing I'd do, we probably have, or at least from my point of view, about ten key tables that really house all of the core data points that probably 90% of our reporting is built off of. So I know those tables like the back of my hand, over the last nine months or so of being here, being able to dive into those and use them every day. I kind of know what's there, or know where to go within that small group of tables to get that. But every once in a while there's a request that comes up, I've never heard of the metric, I've never even thought of it, or seen it, or what have you. To be totally honest, almost 99% of the time at that point I go to Arun. I say, "Arun, I just got asked to see how many conversations our automation bot is routing. I think that probably exists here, but have you ever seen that? Have you heard of that? Have you moved that data into Snowflake in any form?" And I'm assuming that Arun, well, he usually knows the answer, but if he doesn't, he's going back even further to the product teams, back upstream, to either find that answer or get it to me. And that's where we're coming full circle onto the two-lane highway, Venn diagram analogy, of him going back and getting that data, and surfacing it within our data warehouse so that I can use it and build it out in reporting.

Sean Lane: Arun, those core tables that Kyle is talking about. You have to educate me here: if I'm thinking about the core tables that you have built, how much of that is you anticipating the questions that are going to come in, and how much of it is adapting to the newer questions that are coming in and figuring out whether or not the tables you've got help solve that, or whether you have to spin something else up that's new? How do you manage that so that Kyle can go back and say, "You know what? 90% of the most critical stuff I need is coming from these core tables"?

Arun Venkateswaran: Right, I would say it's a mix of both. As I architect these tables to look the way they are, I know that, "Hey, these are for a conversation. We want to see if there's a meeting booked." Even before anyone asks that, you want to know, for a given conversation ID, are we able to capture an email or a meeting through that. Now given that information, obviously these tables are living and breathing things. I would not say that they're static in time; these are always evolving. As we build more features in Drift, we keep adding more things to the table. So when we built the Zoom integration for our Drift app, to be able to go from a chat conversation directly into a Zoom call, we want to see, "Oh, was there a Zoom call involved?" So all of these things keep getting added to this table. We always see these tables as ever-growing, living documents of how we measure and define things in Drift. Now this again, as I said, it's going to be us anticipating, and also, as the product releases new features, making sure that we have the capacity to add these into the tables.

Sean Lane: I'm honestly in awe of the foresight and the true architectural work that Arun has done with these tables. He is both reacting to the current needs of the business and proactively planning ahead for what he thinks will be necessary in the future. As he put it, the tables are living, breathing things that are constantly evolving. In my opinion, Arun and Kyle, along with folks who work in similar roles to them, are these special unicorns who have both a specific technical expertise and the clear business context needed to leverage their technical acumen. So I wanted to go deeper into a specific example of a problem that the business brought to them, so that we can all see a really practical example of how their partnership works in real life. We're going to start with something that is pretty technical, and end with a deliverable that is digestible and valuable to somebody like me. The example that they told me about has to do with measuring customer health.

Arun Venkateswaran: I think one of the best examples, which was a huge ask from the Customer Success department, was to measure our customer health score. Now this is a multi-disciplinary approach. It involves Engineering, it involves a lot of Business Intelligence, it involves a lot of Data Science. It is a three-way intersection of how we look at a customer health score; we call it an at-risk score within the company. It definitely is the, I would say, highest level of collaboration between data engineering and data science at Drift. Now, how we went about it was a three-step approach. We first looked at, what do we need in terms of gathering information on how a customer is using Drift? Is it the number of conversations they had? Is it the number of playbooks they have created? Is it the number of meetings they booked, in terms of ratios? And all these metrics that we look at, what are the relevant inputs? It requires a first level of business understanding, and a little bit of data understanding in the first place. As we iterate through, I feed this data into the data science team, and they look at it and they're saying, "Hey, we want more data." So we go back to the drawing board. What other things can we provide to the data science team? As this model keeps getting iterated and built on top of, we put this into... So this is where I get a little bit more into the technical depths of how we built it. All our tech stack is on AWS, and we use Airflow to do a bunch of our data orchestration and execution on the back-end. So anything that gets written to Snowflake, our data warehouse, is essentially getting routed through Airflow. Once our data science team is like, "Okay, hey, we're good to go. We can push this model to the testing and production phase," we deploy that into Airflow. Airflow basically runs this model every day on the back-end, gathers all the data, does a bunch of heavy computations, and spits out a score into Snowflake.
So this is how the step works: it gets the data from Snowflake, computes it, and then puts it back in Snowflake. Now the thing is, as a Customer Success Manager, you're not going to look directly into Snowflake; it's a database. No one knows how to get to it.
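Stripped of the Airflow and Snowflake specifics, the nightly step Arun walks through, read usage features from the warehouse, score them, write the scores back, reduces to something like the sketch below. The feature names, weights, and I/O callables are hypothetical stand-ins for the real data-science model and warehouse connections:

```python
def score_customer(features):
    """Toy 0-100 health score; the real model uses many more inputs."""
    weights = {"conversations": 0.5, "playbooks": 0.3, "meetings_booked": 0.2}
    raw = sum(weight * features.get(name, 0) for name, weight in weights.items())
    return max(0, min(100, round(raw)))  # clamp to the 0-100 scale

def nightly_scoring_job(read_features, write_scores):
    """The daily, scheduler-orchestrated loop: pull each customer's usage
    from the warehouse, compute a score, and write it back."""
    scores = {cust: score_customer(f) for cust, f in read_features().items()}
    write_scores(scores)
    return scores
```

In production, a scheduler like Airflow triggers this loop daily, with the read and write callables pointing at Snowflake.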

Sean Lane: And to be clear, Arun, this score that's popping back out, is this a score that's going to measure whether someone is at risk of potentially leaving Drift? Or does it measure the efficacy of their product usage? What exactly is this number that's being spit back out indicating to me as a CSM?

Arun Venkateswaran: As a CSM, so the score essentially is graded from 0 to 100, and the lower you are on that scale indicates, "Hey, the customer is not using Drift very well, and there's a high risk of them leaving us as a customer." If one of our customers has a score of 98, 99, you know what? We don't need to worry about them until expansion, or the next renewal. They know how to use Drift, and they're utilizing Drift to the maximum capacity. Again, there are so many inputs that go into this model, it pretty much is very accurate in deducing whether the customer is going to stay or not. Given the data is in Snowflake, we need to be able to get it to a transactional system like Salesforce. We use an internal tool called Tray.io, pretty effective in getting data right from Snowflake and pushing it back into Salesforce. So this is a nightly job that runs a bunch of computations on the back-end, gets a score, and then pushes it right into Salesforce. So a Customer Success Manager logs in at 7:00 AM to look at all their meetings and look through the customers in our Salesforce, for instance. They can say, "Oh, I know how they're doing as of today. When I go into this meeting, I'm prepared on what I can talk about and what I don't need to talk about."
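On the 0-100 scale Arun describes, the number becomes actionable for a CSM once it maps to an at-a-glance label. A minimal sketch of that mapping; the bucket boundaries here are illustrative, not Drift's actual cutoffs:

```python
def risk_bucket(score):
    """Translate a 0-100 customer health score into a label a CSM can act on."""
    if not 0 <= score <= 100:
        raise ValueError("score must be on the 0-100 scale")
    if score >= 90:
        return "healthy"   # the 98-99 customers who don't need attention until renewal
    if score >= 60:
        return "monitor"
    return "at risk"       # low usage, high risk of churn
```

A field like this, synced nightly into Salesforce, is what the CSM actually reads at 7:00 AM.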

Sean Lane: I think, first of all, you don't give yourself enough credit for the beginning of that first step in the process that you mentioned. Just the fact that that data was even available to be given to the data scientists in the first place. We take that very much for granted, I think, at Drift. Just having that available, based off of everything we talked about before, about the way you have constructed the tables and pulled in the wide variety of sources, just having that, I think, is an amazing foundational start. And then, I feel like, Kyle, we've reached the part of the highway where you need to take over. If you've got a simpleton like me on the other end of this score, we've got to do a significant amount of enablement around what this score means, and then how I can use it. So how do you then take a score like this, in this particular example? Who do you need to talk to? Who do we need to get this in front of? Who do we need to train on how to use it? What does that look like?

Kyle Thelemann: As you mentioned, Sean, Arun's downplaying the work it takes to pipe that in. So for me, I would say it's pretty much easy money once it's in Snowflake, for me to manipulate and use and build into reporting. So once all that hard work is done by Arun, and by Mickey, our data scientist, of building the score, or finding the score with all these different data points, they're just putting a number back into Snowflake. Once it's there, I can either build raw-format reporting, think Excel or Google Sheets, and be able to dump that out from some SQL queries. But I can also do things that are more automated, like build it into a Looker dashboard, build it into already existing reports that our teams, who would be the consumers of this new score, can have readily available at all times. Arun also mentioned putting it into Salesforce, which is great because it's right there. That's a key system that our teams are using. But also, something like Looker is going to allow our users to see how that actually trends over time, and to be able to identify any pockets of concern, or any improvements that are encouraging. Looker is going to be able to do that at scale, and really efficiently.

Sean Lane: Kyle, real quick on that. You mentioned that concept of the trends in Looker. Is that your main guide about where you want to present that information? I would imagine for every single one of these examples that we could talk about, at the end of that highway, we need to decide where we want someone to see this, and where we want someone to take action. So is there a specific way that you think about whether or not, "Okay, this thing should live in Salesforce, this thing should live in Looker," based on the different teams that you're servicing?

Kyle Thelemann: Yeah, I think, Sean, you know this better than anyone. First of all, with Salesforce, we don't want to clutter Salesforce, which maybe we already are doing. There's a million data points there; we don't just want to keep adding, adding, and adding. So if we know it's something that's of value, and needs to be readily available all the time for everyone at their fingertips, I think that's when it makes sense to add it into Salesforce. Especially for our users that are speaking with customers, because they're going to be going into Salesforce on the fly, while they're on calls, and just want to look at it right there. Other types of projects that don't require going to Salesforce or Looker are things that are ad hoc, or one-time requests. These come in all the time: I want to be able to see, for our top ten percentile of customers, how they're chatting, or how their chats are converting into meetings and into leads. That might not be something that we want to run every single day; it might just be once a quarter, or it's just a one-time request. That doesn't need to have the overhead of putting it into a Business Intelligence tool until we have the repeatability where Looker's then going to unlock that speed and scale. But as I'm already alluding to, if it's something that needs to be looked at across the masses, across a bunch of people, and it's something that's going to be looked at frequently, the quicker you can put that into a tool like Looker, and be able to see those trends over time, be able to run it, adjust timeframes, adjust which customers you're looking at, and make it self-serve in that setting, that's going to save me a ton of time, and it's going to save the users a ton of time. That's when BI really starts to unlock our potential of providing data and insights back to the business teams.

Sean Lane: I really, I'm not just saying this, I really genuinely believe that both of you guys are this weird, unique unicorn. Because you have this very technical skill set, and at the same time you cannot do your job, at least you cannot do your job in the most effective way possible, without understanding the business context of everything that you're doing on the technical side as well. This could be for either of you, or both: what have you done, and what do you continue to do, to sharpen both sets of skills? Both on the technical side, while making sure that you're not losing context, I think is the most important word, on the business side?

Arun Venkateswaran: I can start with the data engineering space. As we said, the data engineering team is an engineering team that's not building features; the customers for the data engineering team are the partners, or the internal co-workers, themselves. So it makes sense for the data engineering team to remove friction for our own co-workers and partners in the company, so they can get to their metrics and get to their data as fast as possible. One of the things that I look at is how we can reduce our... Oh, there's a new data ingestion. So the Rigid team wants to send this new data to me; how can I make it faster for them to be able to send that data into Snowflake, and be able to start running, or start doing, their metrics as fast as possible? So the time to get to that data is always what I measure, and I would say, with our fast pace at Drift, we definitely have a turnaround time that's less than four to five hours to get to that data as fast as possible. Secondly, we want to look at the quality of the data. I don't normally do that much manual quality checking. As long as the pipelines are working, and there's some data flow coming through, I can say, "Okay, we are fine." If there's no data, I do get alerted: "Seems like there's an issue. I haven't seen any new data for the past two hours. Do you mind checking?" So these are alerts that I put in place so that the data engineering team catches the issue before our downstream partners even figure it out.
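The alert Arun mentions, "I haven't seen any new data for the past two hours," is essentially a freshness check on each pipeline. A minimal sketch; the two-hour window mirrors his example, and the function name is hypothetical:

```python
from datetime import datetime, timedelta, timezone

def freshness_alert(last_record_time, now=None, max_gap=timedelta(hours=2)):
    """Return an alert message if no new data has landed within the expected
    window, so the data team hears about problems before downstream partners
    do; return None if the pipeline looks healthy."""
    now = now or datetime.now(timezone.utc)
    gap = now - last_record_time
    if gap > max_gap:
        hours = gap.total_seconds() / 3600
        return f"No new data for the past {hours:.1f} hours -- check the pipeline."
    return None
```

A check like this runs on a schedule per data stream, and the returned message is what gets routed to the on-call data engineer.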

Sean Lane: Kyle, anything to add?

Kyle Thelemann: One thing that's been really helpful for me, considering my core customers... Well, for the most part it's still an Operations team member. So it'd be the Marketing Operations team, or the Customer Success Operations team. They essentially, for me, serve as liaisons into the business. One thing that's been helpful over the last two quarters, as I've worked on two bigger projects, both with Marketing and with Customer Success, is really being attached at the hip with those other Operations folks. Whether that be Marketing Ops, or whether that be CS Ops, going really, really deep with them, and having constant back-and-forth communication with them. Not just satisfying the requirements of the build, but really trying to go deeper and ask those questions about what would be valuable, and what the marketer actually cares about, so we can get into that next level of requirement. Where we might start unlocking things that aren't just on the surface of what I'm providing, but could actually drive even more value for them. Inevitably, too, for the teams where I'm building something large, or as we call it, a "big rock" type of build, I'm also trying to go to the metrics meetings that those teams are having, so I can hear directly from the leadership what they're asking for, or what they would be asking my Operations team members for. So that we don't have to play this constant game of telephone, and I can hear it right then and there, and try to get that context as well. So, really trying to immerse myself within the teams that I'm spending a lot of time building things for. It's an investment for myself, but it's also an investment for those other Operations team members as well, to babysit me along the way and get me up to speed on what the teams that they support really care about.
That way I can start to see around the corner a little bit, and hopefully start to fend for myself after a few weeks of hearing the different conversations that are happening within the business.

Sean Lane: Yeah I mean that's what I was going to say too. I feel like one, that curiosity you're showing is so, so important. But also, at a certain point you're going to be able to identify and come up with ideas that CS Ops, or Marketing Ops, or a Sales Ops person might not have thought of. Just because you have better context and more depth of knowledge there on technically what's possible.

Arun Venkateswaran: I have to add that Kyle acts as a great buffer for the data engineering team, because sometimes a Sales team or a CS team is still trying to materialize what they even want to measure, and Kyle acts as this buffer where he can formulate what they're trying to look at, then come to me and say, "Oh, this is kind of what they want to do." It's easier for me to parse that out than to go all the way up to the customer or the partner and parse it out myself. So as I said, it's definitely a two-way highway where Kyle and I integrate well. I'm able to easily parse out requirements coming through Kyle, rather than facing each and every business user and trying to figure out what the requirement actually is.

Sean Lane: Before we go, at the end of each episode we're going to ask each guest the same lightning round of questions. Ready? Here we go. Best book you've read in the last six months.

Kyle Thelemann: Shoe Dog.

Sean Lane: Arun, do you have one?

Arun Venkateswaran: I have read Shoe Dog, but it's been more than six months. I would have to say 'Homegoing', but (inaudible). It's a book about Ghanaian and American history, and slavery through the years. Since I am from Ghana, it was pretty cool to see both histories play out over time.

Sean Lane: That's really cool. Kyle, favorite part about working in Ops?

Kyle Thelemann: I would say it's the exposure to all the different business units. So us being a centralized Operations team, I have exposure to all these teams that Arun and I have talked about, CS, Marketing, Sales, Product. We're not siloed so I can constantly see how a business is truly run from the core, and I think that's really, really, really interesting for me.

Sean Lane: Arun, least favorite about working in Ops?

Arun Venkateswaran: Least favorite part would be the amount of knowledge that you need to... It's basically like a fire hose sometimes. I can be in CS world, and then I need to jump into Marketing Ops, and their world. It's like I need to start from scratch. So it is a lot of things to take in at one point.

Sean Lane: Kyle, someone who impacted you getting the job you have today.

Kyle Thelemann: There are a lot, but I would say a good friend of mine that I worked with at my last company, Pat Brown. He helped mentor me through the process of figuring out what I wanted, what I wanted to find, and the type of job that would be a good fit for me, coached me that way, helped me think outside of my own head, and pushed me to really look at companies like Drift which are on the up and up. So he was a huge help for me in approaching this job here at Drift.

Sean Lane: That's awesome. I'm going to ask both of you to answer this last one; Arun, we'll have you go first. One piece of advice for people who want to have your job some day.

Arun Venkateswaran: I actually get a lot of people asking how to get into the data engineering space. I would say: have the curiosity to learn to code. There are so many online courses now that can get you up to speed on writing SQL and writing some Python code, and so many ways to self-learn, with courses on everything including machine learning. Just register, and give yourself a time commitment to finish the course by a certain point. It's definitely gotten easier to do that in the COVID era, although it's unfortunate to be in this situation. You're not going anywhere; you're at work, and then you're at home after that, or doing something else at home. It's easy to dedicate that time to learning something new, and learning to code is probably the first step to get into data engineering.

Kyle Thelemann: Yeah, I would say something similar from a Business Intelligence and Analytics standpoint. I didn't study Computer Science, I wasn't a Math major, I was basically just a Business major. So I got started as a Business Analyst, and I just followed the trend and stayed curious about what was happening in the market, what the companies and people I looked up to were doing, and the types of things I thought were interesting. I always steered toward data, and just kept going deeper and deeper. I was a big data user, then I wanted to get my hands even more around it, so I started to build things. It started out as Excel, and now Business Intelligence is sort of my entire end-to-end. Most of that has been self-taught, out of curiosity and wanting jobs like this, working back from that and knowing what I needed to learn and do so I could get into a role like this someday. And just like Arun said, all those resources are available to you. Everything is open-source now; you can watch YouTube videos for days on how to write code. To double down on what Arun said, learning to code in any language is huge. Whether it's going into Excel and writing VBA, or just downloading some super easy SQL query guide that can help you learn the structure. Just start talking that language, because it's not rocket science, but it is something you have to get comfortable with, learning the structure and the formatting. Then it just takes off like wildfire. But it all starts with being curious, taking that first step, and doing some self-teaching.

Sean Lane: Huge thank you to Arun and Kyle for coming on this week's episode of Operations. I recognize how lucky I am to work with both of them, and hopefully all of you have folks on your team like them, or this was a blueprint for you to start a team like the one Arun and Kyle work on here at Drift. If you liked what you heard from those two today, please make sure you're subscribed to the Operations podcast so that you get a new episode in your feed every other Friday. If you're really enjoying the show, please leave us a six star review on Apple Podcasts. Six star reviews only. By the way, if you haven't heard, the originator of the six star review, our CEO David Cancel, has restarted the original podcast from Drift, Seeking Wisdom. If you're a fan of this show, or any of the Drift podcasts, I promise you'll also be a fan of Seeking Wisdom. Check that out on your feed as well. That's going to do it for me, thank you so much for listening. We'll see you next time.

