Everyone wants AI. Founders rethink their business with generative AI. Engineers rethink the infrastructure underneath the application. Each layer of the tech stack is volatile as many people rush to build what they think is missing. Not enough time has passed for things to settle and for standard tooling and a typical workflow to emerge. This situation carries a unique potential for burnout. We want AI, but we don’t want burnout. I invited a top ML engineer to talk with the builders I invested in to help them prevent burnout. Yaroslav is ex-Google Brain, ex-OpenAI, ex-PyTorch. He had to scale up infrastructure for distributed training four times, sometimes out of necessity rather than desire. He shared the lessons and techniques for staying happy and productive when working with uncertain timelines, complexity, and technical support staff that hates supporting you.
Each of us can prevent burnout. Awareness is all you need. Burnout happens once work becomes routine and is no longer rewarding. You must set aside time to evaluate your workflow and add the reward back if it’s missing. This blog shares hacks that everybody can use to make their work more enjoyable.
Main points
Longer timelines and integration work can lead to burnout. When you summit a mountain, you must wear appropriate hiking boots. But when you fit together a system with components made by different people (e.g., setting up ML infrastructure), you don’t think you need “brain boots.”
Each of us has “brain boots” - mental virtuous loops - something triggers a behavior, you do something, and then you get a reward. Sometimes, this loop gets broken. Some people cut out the reward from this loop, get triggered, and do a routine. This 2-node cycle is unsustainable and will eventually cause you to become unhappy and unproductive. People dealing with volatility or integration work are especially at risk.
Self-awareness is the solution to prevent burnout. You can become self-aware by taking time off to reflect on whether you are getting the reward or by asking a friend for help. Once you know the reward is missing, you must get it back into your work. You can do that by 1) shifting to curiosity - “How terrible is this? What infra can I get in these conditions?”, 2) structuring your job to have shorter-term milestones in a visible place to get a sense of accomplishment, ideally daily, and 3) finding another personal goal that synergizes with your work and is immediately rewarding - “the more bugs I fix, the more I become the Chuck Norris of ML, and I can write a book on setting up infra”.
Yaroslav is the Chuck Norris of Machine Learning in my eyes. Even Chuck Norris sometimes has to do work he’s not excited about, like building ML infrastructure. While setting up ML infrastructure from scratch, Yaroslav observed that some of the engineers started to burn out: “I realized if they get burned out, I might get sucked into doing infrastructure myself. So I'm like, no, I have to stop this at the root. People must be happy and productive, so I don't have to do it” 😂. He started researching and found the standard diagram of reward-based learning. Something triggers the behavior. You do it, and then you get the reward. You define what the reward is; it's subjective. And sometimes, this loop gets broken. Some people have cut out the reward from the loop. This kind of 2-node cycle is unsustainable and will eventually cause you to become unhappy and unproductive.
People doing integration work are especially at risk. A typical work scenario for an engineer is to build something new or to fix a new problem. So you come in and need to “pull out the snake from the box,” but you don't know what's in the box. Maybe it's the tail of an elephant. So you pull, and it's a herd of elephants extending as far as the eye can see, which is a possibility you had not considered. Your first instinct is to start caring for the elephants to reach your original goal. But this means putting off the reward. And if you put off the reward for too long, you end up in the broken trigger-routine loop. If you stay in that cycle long enough, you will forget about the reward. A typical response of burned-out people is, “I don't know what brings me joy anymore.” Awareness is all you need. You must know when to get the reward and perform self-intervention to return to a healthy state. The time you can stay productive without a reward is limited to a few months.
While at Google Brain, Yaroslav took on a project to develop infrastructure to make all his teammates more efficient. However, this turned out to be more complicated than expected because of a long tail of bugs. His manager gently tried to discourage him from this work. He gave him an essay to read called “Let it Break”. It's a story about a hero engineer pulling nights and weekends to ensure that some service stays up, which leads to burnout. The correct action would have been to just go home and let it come crashing down. One of two things would have happened at this point: 1) the service goes down and nobody notices because it's not that important, or 2) the service goes down and everybody notices. In the first case, the engineer should not have been working on the service. In the second case, this would force management to examine the issue and realize that a critical service was hinging on a single burnt-out engineer. At this point, they could staff a whole team to maintain it and promote the engineer to team leader.
When Yaroslav got this advice, he ignored it because he was stubborn. His thinking was that his manager was not as invested as he was: “He doesn't realize how much work I've put in and how close I am to the finish line! But in reality, I was not close at all.” Dealing with this turned him into a mindless bug-fixing machine. Every day, he would go to the office, fix a couple of bugs, discover even more bugs, and then go home, not feeling very good. Yaroslav was both unhappy and unproductive. The reward was missing, and a trigger-routine loop captured Yaroslav, which led to several months of inefficient work.
Whose fault was it? Yaroslav lacked self-awareness. But then, how can you correct your lack of self-awareness when you don't have self-awareness in the first place? It may be his manager's fault. Perhaps he should have pushed harder. However, the manager had too many direct reports, so he might have been going through a mini burnout himself. It may be the director's fault because the director oversees the management structure and the number of reports managers have. But then the director cannot materialize managers out of thin air. They have to work with the resources they have. It may be the fault of higher management. However, higher management has to deal with changing business conditions and market forces and sometimes takes unplanned actions.
The person responsible is the person who has any control over the situation - you. So, the lesson here is always to maintain self-awareness. You must recognize in time that you can't sustain working without joy. You must always ask yourself if you are happy. Are you annoyed? Are you stressed? Where do you see this trajectory going in the long term? What can you do to improve it? If you have a higher-level goal you're trying to achieve, which requires deferring your reward, try to be realistic about how long you can go without receiving some kind of reward. If your goal is to launch something and it's a year away, you may not be able to defer your reward for a year without burnout. You might end up miserable and also fail at achieving your goal. You have to structure your work so that you're happy in the meantime. There are three kinds of reward hacks:
1. Curiosity-driven rewards
2. Short-term milestones
3. Synergy with another direction that provides immediate reward
Curiosity-driven reward
After Yaroslav left Google, he joined a new company: “First, I felt lost because I didn't have access to this amazing infrastructure I had at Google. But then I got curious. I probably couldn't get this amazing Google-level infrastructure. But I thought, what kind of infrastructure could I get?” He decided to build it from scratch, using AWS. Whenever he encountered issues, he filed a ticket in the AWS support queue. They would tell him how to do it, and he would fix it. The work was enjoyable because there was no deadline, and he was mainly motivated by curiosity. He just wanted to see what was possible outside of Google. After a while, the tickets dried up. At that point, the infrastructure tool was complete. He was surprised he had infrastructure much better than at Google because everything wasn't Python.
During this complex integration work, he built his awareness by deliberately setting aside time for it. One easy way to do this is to walk to the office and back. You can grow your awareness a lot during this time if you spend an hour just thinking about how you feel. Set aside time and tell yourself, “I will allocate 40 hours to working on my awareness next quarter.” To learn more, read the books of Judson Brewer of Brown University. Google trains its engineers for awareness. You can add the reward back to the loop by getting curious about what's happening. The act of being curious is rewarding in itself. So you can use it when you are working on infrastructure and things are terrible. Instead of being annoyed that things are awful, you could get curious about precisely how terrible they are. Is it the most horrible thing you've ever seen or not? You can motivate yourself this way.
Short-term milestones as a reward
At another company, Yaroslav had to set up infrastructure once again. This time, he could not rely on his AWS experience because the company had signed a contract with another cloud provider, whose support was not as good. Instead of being 50 tickets away from a working solution, he was 200 away, which meant more knowledge and more lessons to gain. The problem did not become harder; its scope had just changed. Then Yaroslav was thrown yet another curveball: the founder decided to no longer pay even for the less good support from this cloud provider, so that the startup could save some money. Yaroslav didn't let this upset him because attachment brings suffering, and attachment to your workflows is just another kind of attachment. Instead, he reorganized his workflow around Google Docs. He would run benchmarks, find bottlenecks, and share his benchmarking code. Then another engineer would look into these issues and respond in Google Doc comments.
It was weird to go outside of the standard ticketing queue. But this approach appeared to work. Since Yaroslav didn't have a ticket queue, he kept track of issues in a Google Doc and, over time, saw the list of resolved issues grow, which was rewarding. The leadership at the time did not understand his approach. At one point, the CTO called him and said, “Yaroslav, I am not happy about your execution. You're spending all your time in useless meetings and making docs”. Yaroslav didn't let that upset him because he wasn't writing docs for them. He was writing docs for himself and for the other infrastructure person in the company, who appreciated them. There is a more general lesson to keep in mind here. Don't do something just because somebody wants you to, since you have no control over their wishes. Maybe tomorrow they'll change their mind. Or they'll just forget what they wanted. Or perhaps you misunderstood what they wanted. Instead, let the person convince you to want to do it; from that point on, you will continue doing it because you want to, not because somebody asked you.
Also, if you're acting as a manager, you don't want your reports to do things only because you told them to, because they will do a lousy job, and you will regret asking them. Ensure you only do what you want to do, and ensure your reports do what they want to do. So you have to convince them to want to do those things. You can experience this when dealing with infrastructure issues. When asking for support, you act as a manager and assign work to the support team. If the support team is not motivated to help you, they will find an interpretation of your request that generates minimal work for them. You say things don't work, and they answer, “I don't know. I tried. It works.”
The lesson from this experience is that you need some intermediate goals to keep track of your progress and feel good about this progress. Then, one day, you want to be able to look back at what you did and feel you've accomplished something. You may not have gotten the thing working yet, but you eliminated yet another way in which it is not working. Integration work is like opening a door with many locks. The door only moves once all the locks are unlocked. So, instead of keeping track of how many times you opened the door, you could keep track of how many locks you have unlocked.
Synergy with another direction that provides immediate reward
Synergy is when you combine your eventually rewarding work with some immediately satisfying direction. There is a trick they teach managers: put two people on a less desirable task. The reason is that teamwork is inherently rewarding, unless they hate each other.
During his career, Yaroslav worked with another accomplished engineer, Stas Bekman. When Stas started encountering unexpected problems, instead of seeing them as obstacles, he saw them as sources of content for his book. The more bugs you see, the better, because it improves the book. You can find the book here; it’s a great resource with hard-earned lessons for any ML engineer: https://github.com/stas00/ml-engineering. Yaroslav keeps a collection of documents describing how he solved the issues he ran into, and over time, he cleans up those docs and makes them better. Looking at his work this way, he can feel like he’s becoming more knowledgeable with every issue he encounters.
In summary, you must maintain self-awareness and do something when your work stops being rewarding. If you want to keep going toward the goal without burning out, you could use some reward hacks: shifting to curiosity, structuring your work to have some shorter-term milestones, ideally daily, and finding another direction that has synergy with your work, like writing a book or training someone to do your job better than you.
Make work rewarding every day. source
Thanks to Yaroslav for coming to chat with us and share his experience. Full recording below.