Console

Why engineering sucks

S04 E07

2023-06-08

Why engineering sucks - a devtools discussion with Eli Schleifer (Trunk). Console Devtools Podcast: Episode 7 (Season 4).

Episode notes

In this episode, we speak with Eli Schleifer, Co-CEO of Trunk. We discuss why engineering sucks, what developers can learn from how software gets built at Google and Uber, how individual developers can improve their coding experience, and why Git commit messages are useless.

About Eli Schleifer

Eli Schleifer is the founder and co-CEO of Trunk, an all-in-one solution for scalably checking, testing, merging, and monitoring code. It helps developers write more secure code and ship faster to redefine software engineering at scale. He was previously a technical lead manager and a systems architect at Uber ATG, where he led the architecture and engineering of its self-driving platform. He also led a team of engineers and technical leads in the development of multiple products under the YouTube Director umbrella and was a lead senior software development engineer at Microsoft.

Highlights

Eli Schleifer: We should trust our engineers and also understand that code is constantly – it's a living document. It's changing all the time. If something gets in that's imperfect but not terrible, that's also okay. So if you have an engineer put up a pull request, you have feedback, leave that feedback and stamp the pull request. Assuming there's trust, then the engineer is going to follow up, fix up your comments, and then land that. There's no additional cycle. If you don't stamp it, that means you're going to— you’re basically saying to this person, “I'm going to hold up your work until you show me that you can actually follow through on the things I'm asking about.” That's a level of distrust that, I think, is not good in a highly collaborative working environment.

Eli Schleifer: I think this is the biggest difference between a smaller startup and a giant tech company: At a giant tech company, at the end of the year, the giant tech company comes to the employee and is like, "Tell me what you did this year and why you have this job. Tell me all the good stuff you did for us." At a smaller company, management knows what all the people are actually doing. There's clear visibility into what those engineers are adding and contributing to the company's efforts. I think the biggest thing to focus on when it gets to 200 engineers or 2,000 is: what are these people actually working on? Who's making sure that there's a director of engineering for each of these smaller groups of 30, 40 people to make sure they're actually pushing towards something that matters, that matters to the company, that's going to move the needle? And that those engineers can still feel pride in it and feel like they have impact?

David Mytton [00:00:04]: Welcome to another episode of the Console DevTools Podcast. I'm David Mytton, CEO of Console.dev, a free weekly email digest of the best tools and beta releases for experienced developers.

Jean Yang [00:00:15]: And I'm Jean Yang, CEO of Akita Software, the fastest and easiest way to understand your APIs.

David Mytton [00:00:22]: In this episode, Jean and I speak with Eli Schleifer, Co-CEO of Trunk. We discuss why engineering sucks, what developers can learn from how software gets built at Google and Uber, how individual developers can improve their coding experience, and why Git commit messages are useless. As a disclosure, I'm an investor in Trunk. We're keeping this to 30 minutes, so let's get started!

David Mytton [00:00:45]: We're here with Eli Schleifer, Co-CEO of Trunk. Let's start with a brief background. Tell us a little bit about what you're currently doing and how you got here.

Eli Schleifer [00:00:54]: So Trunk is building a developer experience toolkit. Our goal is really to partner with engineering companies across the planet and help engineering move faster. We basically see a major problem in building software at scale as quickly as possible. Engineering is a very hard discipline to actually do at scale. As soon as you start working with more than one person, you run into trouble. What we're building is a solution to really accelerate engineering. So we get right alongside your source code. We have a bunch of products out there, at the code, test, and merge layers of that flow, and we try to accelerate engineering anywhere possible.

Jean Yang [00:01:30]: So, Eli, when we were planning this episode, you said you wanted to talk about why software engineering sucks. What did you mean by that? Let’s dig in.

Eli Schleifer [00:01:39]: It brought to mind this old comic strip from the eighties called Family Circus, where there was this little character who would be trying to go from point A to point B. There was a perfectly straight line available, but instead, the path he took was super meandering: he'd run all over the neighbor's yard and trip over a dog, whatever it might be.

I think that engineering is like that. If you actually look at your ticket in Linear or Jira, it says, "Go build that." The engineer gets it in their mind: "This is a very straightforward path. I can go build this thing. I think it will take me a couple of hours and be done with it." But the actual process of doing that at scale, with other engineers around you, working together, means instead of that process taking an hour or two, it actually takes all day. You look back on your week, and you're like, "How did I get so little accomplished?"

I think that, historically, this has been an issue of communication, right? If you look back at even the earliest discussions of software engineering efficiency, like the book The Mythical Man-Month, there's this discussion of how you can't just throw bodies at the problem. I think back then the problem was really "How do you coordinate all these people working at their individual console workstations to try to build something complicated?" There was so little tooling at the time. So much of it had to basically be done like, "Let's go to meetings and work through things and figure that out." That doesn't scale, which is why we had that problem in the past.

Now, in modern engineering, it's a different problem. I think that you have amazing communication tools, right? We have instant communication over Slack, email. Then we have tools to track work. We have cloud-based repositories. I think that all of that makes it really quick to actually build stuff and communicate. But there's all this giant layer of software underneath every single engineering project today, right?

So now, we're building on top of stacks and stacks: the operating system, your development tools, all the different libraries you're working with. Building against those things becomes complicated. Then every single project you're approaching has all these different dependencies you're trying to manage with other people. That whole ball of wax becomes very complicated, and making progress against it has now become our new modern problem.

If you look at the problem, it would seem intractable. But at Google's scale, they have literally tens of thousands of engineers able to land code really quickly all day long without problem. Then you look at a company with 300, 400 engineers, and they basically can't get anything done. So it's clearly not intrinsic that you can't build software at scale. Google runs a monorepo; every single engineer is working in the same exact giant body of code, but they can work really efficiently. When it comes to other companies, they can't. What's the difference?

It's really about tooling, right? When you're an engineer with good tooling, the tooling keeps you on rails and lets you work fast and clean. When you don't have that in place, you're really out in the wild, and you can very easily get distracted by all the different things we discussed. I like to talk about modern engineering as death by 1,000 paper cuts. There are all of these different things that will trip you up on the path to actually landing your code. What we try to do at Trunk is solve those problems along the way.

David Mytton [00:04:45]: Do you think there's a difference between being a software developer and a software engineer? Does that get into what you're talking about? Because there's no qualification, or you can do software engineering as a degree, I suppose. But a lot of good coders are self-taught. How do you think about that?

Eli Schleifer [00:05:01]: That's a great question. My background in computer science comes from a liberal arts school. I think that the writing of software is really a problem of collaborative writing; it's less about science. I think it'd be great if, actually, more of the people learning computer engineering or programming for a normal software engineering job were backed by coursework in logic, rhetoric, writing, and reading, versus a background in chemistry or physics, right?

I think that because computer science has a history of coming out of engineering departments, the coursework ends up focusing on hardcore algorithmic work, on hardcore understanding of computability. But the truth is computability, whether something is Turing-complete, whether you can quickly solve this crazy B-tree algorithm, is not really what most engineering is about. That's not the problem that most software engineers face day in, day out. Most engineers aren't working on the kernel. They're not working on things that are super low-level. What they're trying to do is basically put systems together. They're trying to build a metaphor for the world, they're trying to represent that in software and code, and then they work with other people to make that metaphor make sense.

I don't think the distinction between engineer and programmer actually exists; I think that's mostly semantics. But I think you're right, David, that the focus on what really matters is making things understandable. Let's make things small enough that other people can digest them.

David Mytton [00:06:27]: What's your take on that, Jean? You've come from a long period in academia and a more formal approach to building programming languages.

Jean Yang [00:06:34]: When I was teaching at Carnegie Mellon, there was actually a course on software engineering, but it was highly controversial because it was unclear whether what we were teaching in the course really simulated the real world. Very much, it was group projects, it was a projects course, it was over the course of a semester, but there were often debates over whether that was the right thing.

I really liked what Eli said about rhetoric and writing being the thing. I think the main thing that was taught in the course was collaboration, which Eli also touched upon. But, yeah, there are a lot of human aspects to software engineering that make it very different from simply writing code. I'm really curious to hear from Eli about the engineering practices you've seen at places like Uber, Google, and BitTorrent. You mentioned that at least Google is doing some things right. So I'm curious what you see as practices that work from places like that?

Eli Schleifer [00:07:31]: Yeah, I think that the practices that work are to basically automate everything as much as possible and to protect people from themselves. I'd say I've seen this work successfully at Google. I've seen it unsuccessfully while trying to build self-driving cars at Uber.

On the unsuccessful side, we had a landing process at Uber ATG that was two pages of a Word document: "Here are the steps you should go through when you're actually trying to approve this pull request for landing." I gave a lot of pushback to the people who were driving this process out of the autonomy team. I was like, "We cannot have processes that require humans to follow paper checklists, whether they're in a Word doc or written down somewhere." Humans are terrible at following these steps, and what you end up doing is introducing a ton of friction to the process. You're going to end up with people missing steps, and things will be broken.

The only way to actually protect engineering time and efficiency is to automate things, right? So, the canonical example: our first tool out of the gate at Trunk was Trunk Check. It's our universal meta-linter. The idea is that right alongside where the engineer is working, we're running all the right static analyzers, formatters, and security tools where the engineers are building their code, before it goes to CI. That means the engineer gets insights into what they should be doing right away. They can get feedback, they can fix those things up, and they never have to push code and find out from a tool in the cloud or, even worse, during code review that they missed this stylistic issue or that they didn't follow the idiomatic practice for this particular group.
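
To make that concrete, here is a minimal sketch of what that local flow can look like with the Trunk CLI. Treat it as illustrative; exact commands and defaults may differ by version.

    # Run checks locally, before pushing, so issues surface pre-CI.
    trunk init    # inspect the repo and enable linters matching its languages
    trunk check   # run the enabled static analyzers and security tools on your changes
    trunk fmt     # apply any autofixable formatting issues in place

Because the checks run against your local changes, the feedback loop is seconds rather than a full CI round trip.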

Same thing for coding guidelines. I think every style guide that's ever been written down should be thrown out and replaced by tooling. There is no way humans look at a 40-page doc on how to write proper C++ and then go incorporate that into their work practice. Tools can enforce the rules that are represented in those 40 pages of prose, and enforce them properly. You can fix up the code right in line and keep working. So that was really at the code level.

The next phase of development, obviously, is testing. I think Google had amazing tooling around test analytics and test prediction. Obviously, Google's monorepo would expose any small change to tons of testing throughput; you'd have to run a lot of tests. All this is automated for you. If your test failed in CI, you'd basically be told the reliability of that test: "Has this test recently been flaky?" You'd have instant information about "Is this my problem that I just introduced? Or is this something that actually might be systemic, where I need to help someone else fix up this systemic problem?" So the more the tooling can tell you, "Is this your fault? Are you introducing this problem or not?", the better.

Just the other day, we had an engineer on my team, a junior engineer just starting out, and I was asking how things were going. He said, "Well, I've been fighting against this problem. I'm vacillating between thinking it's my fault versus the system's fault." So is there a problem inherent in our current service implementation? Or is it just the engineer making mistakes along the pathway to landing code? If you lose faith in your system, then often you might be like, "I think it's just the system's fault." I see this all the time as engineers are putting up pull requests. This was a massive problem at Uber. It's been a massive problem at a lot of companies, a lot of the customers we're talking to.

Someone pushes up a pull request. That's going to kick off 10, 20 jobs in CI, and one of those CI jobs will randomly fail. What is the engineer's first reaction? They're going to click the retry button. The engineer basically has no faith, in general, in the system's ability to test their code, so first they make sure it's not the system's fault, right? In a system you trust, the initial reaction instead would be, "I must have broken something. Let me go fix this thing and make it work."

But in a system you don't trust, you basically have that quick, knee-jerk click of the retry button, which is going to kick off a whole bunch more jobs and delay your actual investigation of the problem. So the closer we can get to the system being honest about what's working and what's not working, the quicker the engineer will actually have the right follow-through to go fix up their work correctly and move forward.

David Mytton [00:11:34]: Why are these tools so bad? Why did this not come out of Google sooner?

Eli Schleifer [00:11:38]: Well, Google has built everything internally for themselves, right? Everything is so tightly wrapped up with their internal implementation that it'd be basically impossible for them to productize it. But there are a lot of companies out there pursuing better development experience, better development tooling in the testing and coding space. I think the industry has woken up to the fact that these tools should be available, they should be easy to set up, and the value returned to the engineering organization as a whole is tremendous.

I think engineers, at the end of the day, like to write code. They want to push things forward. I think that if you have a poor development experience at your company, the engineers will leave because they aren't able to do the thing they like to do, which is write code. They don't want to be bashing their heads against the wall over and over again to get their work done.

Jean Yang [00:12:26]: So here's a maybe spicier question. What engineering practices at the places you've been are wrong and that you would like to fix?

Eli Schleifer [00:12:37]: So I'd say, at the heart of it, the practices we employed at Uber ATG — I can’t fix them because that business was sold off — but really, the issue is having any type of process where someone has to manually step in and check a list before they can land code. For broader insight into how to build code better, I don't want to see code review practices where engineers are having religious battles. That is the place where collaboration goes to die, and it's especially where senior engineers can really get into trouble.

So a senior engineer might write a nit on a pull request, right? If someone's written a bunch of code, the senior engineer is going to leave a nit note: "This could be done a little better this way or that way." They're not recognizing the cost of writing that word "nit" down on that pull request. A junior engineer is going to be like, "Well, the senior engineer said I should go fix this nit thing. Even though they prefaced it with nit, I'm going to go fix it up, right?"

So that means they're going to go spend, even if it's only 30 seconds, to fix that issue up. What actually happens is they do that 30 seconds of work and commit that code. Then they push it up to the cloud, and that runs all of CI all over again, which could be 20, 30 minutes. Then they're ready to go and try to land it again, right? Maybe then they get the stamp. We don't think about the small time cost: if I write a nit, it might take me two seconds to write it down, but it's going to cost the junior engineer 30 minutes of their day. Maybe that means they're not going to get to the next thing on their list that day.

This is just that death by 1,000 paper cuts. There are these small things that really add up to a larger problem. There's no burning fire I could point to and say, "This one problem, if the company stopped doing this, it'd be better." It's more that all these small interactions at the engineering level are important to think about, right? What is the point of a code review? Let's make sure that the code is legible, that we understand what it's doing. It's not a place for writing nits. You can go say that nit to the person after the thing merges, maybe. Then they won't be tempted to go fix it up upfront, because those things really don't matter in the end. What matters, in the end, is "Will the customer's success with your product move forward or backward? Are you making progress in the direction the software is supposed to be going?" Nits generally aren't about that. It's more of a zero-value game.

David Mytton [00:15:02]: To what extent do you think that's to do with tooling versus the culture around how engineering is done at the company?

Eli Schleifer [00:15:09]: I think that is less an issue of tooling and definitely an issue of culture. This is a thing we talk about at Trunk internally. We just had a brown bag on "What is a code review about? What should we be focusing on? What should we not be focusing on?" It doesn't take much to just remind people, "Please don't write these kinds of comments inside a code review."

The other thing I'd say is the request changes button inside GitHub with this giant red plus icon that shows up under code review. No one wants to see that. I think we try to dissuade people from using that, unless, really, there's something dangerous inside of it. If you don't want something to land, don't stamp it.

I think the other piece of advice we have just in this piece of the flow is: We should trust our engineers and also understand that code is constantly – it's a living document. It's changing all the time. If something gets in that's imperfect but not terrible, that's also okay. So if you have an engineer put up a pull request, you have feedback, leave that feedback and stamp the pull request. Assuming there's trust, then the engineer is going to follow up, fix up your comments, and then land that. There's no additional cycle.

If you don't stamp it, that means you're going to— you’re basically saying to this person, “I'm going to hold up your work until you show me that you can actually follow through on the things I'm asking about.” That's a level of distrust that, I think, is not good in a highly collaborative working environment.

David Mytton [00:16:28]: What do you think AI will do here to the development cycle? GitHub's just released a load of functionality to add to Copilot: IDE chat, pull request summaries, and docs Q&A. Do you think there's a role for AI to help even before it gets to the PR?

Eli Schleifer [00:16:45]: Yeah, I think that, overall, there's always opportunity for better automation. AI is just one level of automation that we can introduce into the system. Those PR summaries are great if they get it right. It's basically cribbing a little bit of notes for you, and you can go fix it up. If someone basically starts ignoring it and just sees it as noise, then it's counter-valuable, right? So I guess it really depends on what is the real purpose of this.

At the end of the day, when it comes down to engineering, I think the code is what matters. The comments can easily get out of sync with it. Same with the description of a pull request. All these things are not the truth. The code is the only thing that's the truth. That's the only thing the computer is actually going to execute, and everything else is a lie of some sort.

Jean Yang [00:17:28]: Eli, that's interesting you say that. I have the opposite view. I think code is a lie because people think it's the truth. Really, what runs in prod is the truth. So I'm curious how you reconcile the fact that there's so much code for people to read and clean up these days, and that code is playing with other code, some of which people can see, some of which people can't see. So what is the truth, and how do you think people should be engaging there?

Eli Schleifer [00:17:55]: If you basically live at head, then your code and your prod are almost identical. So in most ways, yeah, whatever is in production should be what is at the head of your repository. We try to live pretty close to head, following the Google style of "Live at head, make sure everything's always working, don't run weird branches that make it unclear what's actually happening in the system."

At the end of the day, honestly, these systems are so complicated that there's no way you can read all the code. There's no way you can even hold the full model of this stuff in your head. When people create elaborate flow diagrams of their systems, unless they're actually automatically generated live, those things are also going to become incorrect very quickly. That's why I'd say, just like comments inside code can be lies, product specifications and upfront engineering documents are the greatest quasi-liars. They're an opportunity, basically, to say, "If the system could be built perfectly, this might be how I'd build it."

But then reality meets the engineer.

An engineer discovers a bunch of things about the system that they didn't think about in advance. Then the code diverges from the actual specification, which is why, in general, at Trunk we try not to write lengthy engineering documents. What we try to say is, "Where is the YAML? Where's the Protobuf specification for how the service is going to operate?" Those are the most important things for getting the shape of what we're going to be building. Once that spec gets published, once that product gets published, that's what it's actually going to be doing. Everything else you write down is a best guess at what actually gets implemented.

David Mytton [00:19:27]: Do you think that changes depending on where in the stack you are? On an earlier episode, we had Russ Cox, who leads the Go team, and he was saying one of the advantages of Go was that it had a spec, versus other languages which didn't. I suppose building a programming language is a very different challenge from building an application using that language. Is that how you also think about it?

Eli Schleifer [00:19:50]: Yes. I would say that when it comes down to something very specific, like defining a language or an algorithm for doing a computation, that's a case where you can actually be that specific. You want to be exhaustive in your unit testing, right? If you're writing a string parsing library, or you're writing a regex parser, you want to write a ton of unit tests that are highly localized to that tool, because the way it operates has to be clear, documented, and testable.

When you get further away from those kinds of core algorithms, to things like "How's the general system going to operate? How is the front end going to react to this request?", those things really should not have a ton of documentation and engineering reviews written upfront to describe how they're going to work. You should go build them and adapt them, because they're going to change really quickly. Otherwise you can spend a lot of time writing documents.

I think if you had some kind of magical tool and could assess how many pages of Google Docs of engineering specs have been written versus how much of it has actually been read, it'd be mind-boggling how much wasted time was put in by really senior staff engineers at companies, doing a lot of good work, thinking about things, writing it down in prose, and then nobody reading it.

I thought the same thing when I started out at Microsoft as a program manager. I would write really lengthy documents, and the engineering team would never read them. Then the engineers would write similarly long documents about how they were going to build against that program specification. I was like, "No one is reading any of this." At the end of the day, the engineer just starts writing code. If they have a question, they go and ask someone else anyway. They never look back at the doc. So much wasted effort there. I think we should move further and further away from that as much as possible.

David Mytton [00:21:29]: That makes sense. So what should individual engineers do then to improve their own coding experience?

Eli Schleifer [00:21:35]: Having tooling you can work with quickly, a language you're familiar with, and testing at the right level, locally, as much as possible: that's the biggest win you can have, right?

I cut my teeth programming in Visual Basic a long time ago. That language was so beautiful because you could literally edit and run the code at the same time. It would recompile on the fly, just like we can do today with JavaScript. So you could be in a program and be like, "Oh, that was a mistake," quickly change it, and then continue, even change variables and content on the fly, and just make the whole thing work. It was a really cool system. I think we have languages that allow us to do that today as well; you can do similar things in Python and other non-compiled languages. But in general, for an engineer, being able to write code, compile quickly, execute that code, test it, verify it, and understand how it's working is the quickest path to building something quickly.

In-house, we have a lot of tooling to help engineers working on the front end quickly visualize all the changes that are going up. You can see it in code review: "Oh, this change is what it actually looks like in the implementation." All those previews are autogenerated; you can see them inside the pull request, there's a little autogenerated comment, and you can go look at the deployments. All those little steps make the whole process so much faster. I think anything you can do to help someone get context really quickly and move through your code is a win.

Jean Yang [00:22:57]: So, Eli, what does developer experience mean to you?

Eli Schleifer [00:23:01]: Developer experience is everything that I know I have to go through to write my code and merge it into the main branch of the system, right? It's my IDE, it's the tools that underlie the IDE, it's Git, it's GitLab or GitHub, it's my code review process, it's my testing infrastructure, it's my ability to see my CI jobs, it's all the things that I need to do to actually get my job done. Or it might be Copilot if you're using an AI helper.

Developer experience is what it's actually like to do my day-to-day work. In our case, I guess, it's also a lot of Linear and Slack and other tooling to make sure we know we're working collaboratively together.

David Mytton [00:23:41]: What are the common failure points in that flow? I suppose when should a company or a new startup start putting effort into improving all these things?

Eli Schleifer [00:23:49]: I think, in general, the sooner, the better, right? Because it'll get away from you very quickly. The first thing that a small startup should invest in is good integration testing, because they're baking into their system broad ideas of "when this happens, this other thing should come out the other end." If you write an integration test that enforces that, then you can keep changing the box in the middle. You can change the black box, you can totally throw it out and replace it, and those tests will still function correctly and make sure you're not breaking anything. Because the success path is basically: you get some customers, and those customers are going to ask for new things. If you don't have those integration tests in place, then as you start to add new features, you're going to break the existing stuff. That means you're going to be chasing your tail, and things will fall apart. So early on, integration testing is super important. Don't over-index on unit testing; I think that's a misadventure, because your actual implementation is going to change, and unit tests are really about testing the actual implementation rather than the broad thing you're trying to solve for.
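
As a rough illustration of that black-box idea, an integration test can assert only on externally visible behavior, leaving the implementation free to change. The endpoint and payload below are hypothetical:

    #!/usr/bin/env bash
    # Black-box integration test: exercise the public surface only.
    set -euo pipefail

    # Hypothetical staging endpoint; substitute your own service URL.
    response=$(curl -sf -X POST "https://staging.example.com/api/orders" \
      -H "Content-Type: application/json" \
      -d '{"sku": "widget-1", "qty": 2}')

    # Assert on the contract: an accepted order reports status "accepted".
    # Everything behind the endpoint can be rewritten without touching this test.
    echo "$response" | grep -q '"status": "accepted"' \
      || { echo "integration test failed: $response" >&2; exit 1; }

    echo "integration test passed"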

Then beyond that, once you get to larger teams, make sure you have reasonably scoped stand-up groups. Make sure you have good issue tracking, so you actually understand what's being built or not. Then add good tooling, like Trunk Check, CI analytics, and a merge queue, as you're scaling up, to make sure you can work collaboratively without slowing each other down. And make sure that during the code review process, the things you're actually reviewing for make sense: Is it actually solving the problem? Do we have tests in place to make sure this thing is working?

David Mytton [00:25:16]: Does that change at all when you're going from an individual developer building side projects, through to hiring that first person or adding someone to a project, through to 20, 200, 2,000 people? What are the common failure points you've seen there?

Eli Schleifer [00:25:30]: Early on, you can work with two or three people really efficiently, really quickly. It's kind of a beautiful thing how fast you can build software that way. Then as you get to the next factor up, to 10, 20 people, you really need to start breaking back down into smaller teams, because those people should not all be in a single standup together. The important thing to always remember as we scale is that the most efficient engineer is the engineer working on a solo project. As soon as you add people, you start to slow down. So as we scale up, we want to keep breaking our teams down into logical units that make sense to work together.

We actually just implemented something that we're not labeling yet, but it's kind of like a micro-team structure, where we have three to four people at most really working together on a product, on a feature area, and they're doing a standup together. The standups are no longer eight people; they're really three to four people. There's generally a TL there who's going to be driving the work board, making sure everyone's working together. They're going to be doing code reviews for each other.

When you work at that smaller scale, basically no one is zoning out during the standup, right? When you have eight people, you're going to have people typing away at their computers, people dozing off, because whatever is being shared is not actually relevant to what they're doing. If your standups are really relevant to everyone present, you will have a better engineering flow. So I'd say as you get to 10, 20, make sure you're actually breaking up your standups into smaller units.

Once you get to a scale of 200, now you have multiple projects, right? This starts to be the point where you probably have a lot of products out there, and there's probably a lot of engineering inefficiency happening, because you've hired beyond what you actually need. If you look at what's been happening in the industry of late, there are companies that scaled engineering like crazy without actually having a need for those particular engineers, or the value proposition is being lost in what they're actually working on.

I think this is the biggest difference between a smaller startup and a giant tech company: at a giant tech company, at the end of the year, the company comes to the employee and is like, "Tell me what you did this year and why you have this job. Tell me all the good stuff you did for us." At a smaller company, management knows what all the people are actually doing. There's clear visibility into what those engineers are adding and contributing to the company's efforts.

I think the biggest thing to focus on when it gets to 200 engineers, or 2,000, is: what are these people actually working on? Who's making sure there's a director of engineering for each of these smaller groups of 30, 40 people, to make sure they're actually pushing towards something that matters, that matters to the company, that's going to move the needle, and that those engineers can still feel pride in and feel like they have impact?

This always brings me back to Marx's concept of alienation from labor. There's no one more alienated from their labor than an engineer at Oracle or Microsoft or Google. They basically do a bunch of work, and their ability to actually impact the whole, to impact what's actually happening, is so remote and so obfuscated that it's very hard to feel an attachment to what they're actually doing.

I worked at an innovation lab inside Microsoft before I started my first company. It was always about, well, is this idea a billion-dollar business? I was like, "Oh, I don't know if it's a billion-dollar business. Maybe it's worth $100 million." But a $100 million business to Microsoft was not interesting. A billion-dollar business at Google is maybe a little interesting. The scale is so crazy that it's very hard to innovate, right? Because it's very rare to be like, "Oh, yes. Obviously, there's a billion-dollar idea right here lying on the ground. Let me just pick it up and go execute against that."

I think that that's why so much innovation comes from the startup world because you can say, “Okay, listen. We don't need to get to a billion right away. Let’s get to a billion in 5 to 10 years.”

But upfront, you don't have to prove that that pie is all there for you.

David Mytton [00:29:22]: Yes, definitely. It's interesting that you used the example earlier of Google and the tools allowing them to ship code really quickly. But one of the criticisms recently is about how slowly they've been able to ship products to customers. I suppose it’s a different thing from getting code into production versus getting a product into the hands of users.

Eli Schleifer [00:29:42]: Yes, for sure. Writing code at Google is really fast. Shipping product onto Borg and getting it deployed, that launch process, that's really where the regulatory frameworks were insane. There was a lot of paperwork. If you wanted to launch something at Google, you had to file a lot of paperwork. Then teams would show up and review that stuff. You'd have to get on this launch calendar. There was so much very Google-specific stuff to get anything out the door.

Whereas at a startup, if you want to launch something, you can just open up a route in Route 53, and it'll start showing up. The Internet can see it, and you're off to the races. That's why startups can innovate so quickly: they don't have all that red tape. These larger companies are really built with great levels of redundancy to protect their giant money-making machines, which makes sense for them. But it also comes with the side effect that it's very hard for them to build things quickly.

Jean Yang [00:30:34]: You have a really interesting position on Git commit messages. In fact, you think they're useless. Based on my understanding, this seems to be at odds with other things you've said. So I'd love for you to reconcile this and explain why.

Eli Schleifer [00:30:48]: So you might be referencing this blog post I wrote about how Git commit messages are useless. This is really about the local Git commit messages you write while you're working in your own private branch. An engineer's first action when starting work is something like git checkout -b eli-branch. Then I'm going to do some work, and I'm going to save it: I basically say git commit -a.

Now, Git, by default, has this silly requirement that I also provide a message on top of that work. I think of this as the equivalent of having to leave a comment about what I thought of each checkpoint every time I saved or auto-saved a file inside Google Docs, or Microsoft Word in the old world. That's what that Git commit message is. I don't think you should have to leave little meta tags about your work while you're saving files, which is, essentially, what git commit is doing.

We work in trunk-based development. We create a branch locally on our machine, do a bunch of work that's going to contain five commits, 10 commits, whatever it is, push that up to the cloud, run a bunch of CI, and follow up on feedback; you might even do a bunch more work that adds three, four more commits, and push that up. All of those 10, 14 commits that might live inside a pull request don't need documentation. The documentation is the code itself. I'm making some changes, and the changes are clear. If you wanted to, you could throw an AI system behind this to autogenerate those messages, but I think that's a waste of energy. The better thing to do is just read the code.

Because at the end of the day, in trunk-based development, what's going to happen is that branch, that pull request, is going to get merged onto main. We're going to squash-merge it. We're going to take all those commits and smush them together, because all that matters is the final result of what I'm trying to change. When that thing gets squash-merged onto main, absolutely, you need a good description. There should be a title and a description, which is what you'll get from GitHub or GitLab. But all the stuff that went on in the middle, I don't need that. I don't need to document it. It actually doesn't really matter. How I got from point A to point B is not worth documenting. All that matters is that I got there and what it looks like when I land that code.
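
For readers who want the mechanics spelled out, here is a rough sketch of the trunk-based cycle described above. The branch name and final message are illustrative, and in practice the squash usually happens via the GitHub or GitLab merge button:

    # Short-lived private branch; local commit messages are throwaway checkpoints.
    git checkout -b eli/retry-fix
    git commit -am "checkpoint"           # save work; the message carries no meaning
    git commit -am "checkpoint"
    git push -u origin eli/retry-fix      # open a pull request, let CI run

    # On merge, the checkpoints are squashed into one commit on main,
    # and the real documentation lives in that commit's title and description.
    git checkout main
    git merge --squash eli/retry-fix
    git commit -m "Retry flaky uploads with exponential backoff"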

It did light up Hacker News a little bit. Everyone got a little upset. I think if you read the post all the way through, it's clear it's not saying that all Git messages are worthless, just the ones where we spend a lot of time writing "block" or "fix" or "checkpoint." It's just another moment where the engineer wants to move forward and has to remember, "Oh, yes. Let me quickly write some pithy little thing so I can actually get my work done."

David Mytton [00:33:15]: Good example of a clickbait title there, getting people annoyed on Hacker News without actually reading the article.

Eli Schleifer [00:33:22]: Yes, absolutely.

David Mytton [00:33:24]: Well, before we wrap up, I have two lightning questions for you. So the first is what interesting DevTools or tools, generally, are you playing around with at the moment?

Eli Schleifer [00:33:34]: At the console level, I've been using Warp a lot recently. That's my OSX terminal, better than the built-in one. So I like using that; it's clean. In the cloud world, we spend a lot of time in Slack and Linear, because that's how I'm communicating and working with the team.

On the rare occasion I actually get to write some code, everything happens in Visual Studio Code. I'm glad everything has come back to Visual Studio, because I started programming in Visual Studio back in the Visual Basic days. I still get to use Visual Studio, though it's no longer on Windows. I don't think I've actually touched a Windows machine in 5, 10 years. I'm very happy not to be working on that stack anymore.

David Mytton [00:34:14]: That leads into the second question, which is what is your current tech setup? What software and hardware do you use?

Eli Schleifer [00:34:19]: Yes. I have a MacBook Pro M1, mostly because I needed the extra RAM: I end up opening a million tabs inside Chrome, which is basically where I live. I was too lazy to close all my Chrome tabs, so I needed the machine with more memory. I really like that box running OSX.

Then when I do a little bit of coding, it's all in Visual Studio. All of our coding, generally, happens on remote machines. We have a local server farm of Linux boxes that are basically used for engineering. I think having that split of a remote box and a laptop is great, because your laptop can be used for research, comms, and other work, while your engineering box can dedicate all of its RAM and CPU to compilation and running tests.

David Mytton [00:35:03]: You've got that in your office, right? You're not using Codespaces or anything like that?

Eli Schleifer [00:35:07]: We have that in our office. The Codespaces-style solutions are cool but super expensive. At the end of the day, they're basically putting a large markup on top of commodity hardware. We actually have a bunch of Dell machines. Those Dell machines have fixed costs, and we can just leave them on and have no worries about the cost piece. I think we looked into it at some point, and Codespaces or similar products were going to be 3x, 4x the price. We also couldn't control all the hardware that way, right?

Here we can control the hardware piece: make sure we have enough RAM for the engineers doing a bunch of C++ work, while the engineers doing frontend work have less RAM in their boxes. It's just a clean solution. The costs are definitely much better if you have a place to put all your machines.

David Mytton [00:35:54]: Unfortunately, that's all we've got time for. Thanks for joining us, Eli.

Eli Schleifer [00:35:58]: Thanks so much for having me. This is my favorite topic to geek out about.

David Mytton [00:36:03]: Thanks for listening to the Console DevTools Podcast. Please let us know what you think on Twitter. I'm @davidmytton and you can follow @consoledotdev. Don't forget to subscribe and rate us in your podcast player and if you're playing around with or building any interesting DevTools, please get in touch. Our email is in the show notes. See you next time.

[END]

David Mytton
About the author

David Mytton is Co-founder & CEO of Console. In 2009, he founded and was CEO of Server Density, a SaaS cloud monitoring startup acquired in 2018 by edge compute and cyber security company, StackPath. He is also researching sustainable computing in the Department of Engineering Science at the University of Oxford, and has been a developer for 15+ years.

About Console

Console is the place developers go to find the best tools. Our weekly newsletter picks out the most interesting tools and new releases. We keep track of everything - dev tools, devops, cloud, and APIs - so you don't have to.