10 things new software grads can learn at Big Tech

Bhanu Agarwal
5 min read · Jun 6, 2018

As I was graduating in December 2016, my friends, seniors, and professors repeatedly emphasized the importance of gaining experiences that maximize learning after graduation. They recommended that I find a role where I could quickly develop tangible skills that would remain valuable for years to come. Startups are typically considered some of the best places for super-fast skill development, while big corporations are traditionally seen as laid-back places, so I seemingly ignored this advice when I joined one of the world’s largest companies, Microsoft, as a Software Engineer. It turned out that I learned some very important things about building large-scale products that I hadn’t planned for.

I thought I’d share my 10 biggest learnings as a recently graduated engineer working at a big tech company.

1. Whatever can break will break in prod.

This is the fundamental truth about real-world software. Downstream services will time out or be too busy, third-party APIs will have downtime, and factors beyond your control, such as infrastructure problems, framework bugs, and upgrades, will lead to errors. One can’t practically test for all of these things in test environments, so a feature without active alerting and monitoring is doomed to failure. To safeguard against such unforeseen failures, appropriate retry and exception handling logic needs to be put in place at the time of feature design. The design has to account for failure, and incident mitigation plans need to be worked out beforehand. The art of defensive programming comes in handy when building production-ready software with a large user base.
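
To make the retry idea concrete, here is a minimal Python sketch of retrying a flaky call with exponential backoff and jitter. The names TransientError and call_with_retry are made up for this illustration, and the attempt count and delays are arbitrary assumptions rather than recommended values.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error standing in for a timeout or a 'too busy' response."""


def call_with_retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: re-raise so monitoring and alerting can see it.
                raise
            # The base delay doubles on every attempt; random jitter keeps many
            # clients from retrying in lockstep against a struggling service.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
```

Only errors that are plausibly temporary should be retried this way; retrying a request that is invalid to begin with just adds load.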

2. Edge cases become base cases at scale.

You could say that a 0.1% failure rate is small, but 0.1% of 1 million users is still 1,000 users. Even rare, edge-case failures have great significance in production. If a design decision rests on the claim that an edge case can be ignored because no one would be ridiculous enough to do something so obscure, think again. Customers always come up with a use case. Page size limits, header size limits, throttling limits, and the like are often hit because some users will always have a legitimate scenario that pushes things beyond the limits of the designed system.
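
The arithmetic is easy to check. The short Python sketch below applies the same 0.1% failure rate to a few user counts; the helper name affected_users and the counts other than 1 million are chosen only for illustration.

```python
def affected_users(total_users, failure_rate):
    """Return how many users hit a failure at the given rate, rounded to a whole user."""
    return round(total_users * failure_rate)


# The same 0.1% rate at three hypothetical scales.
for users in (10_000, 1_000_000, 100_000_000):
    print(f"{users:>11,} users -> {affected_users(users, 0.001):,} affected")
```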

3. Without a migration story, a feature is incomplete.

Let’s say a product has a feature X that needs to be replaced with a new feature Y. There has to be a well-thought-out migration plan for this: going from State X to State Y requires a state transition path, and that path needs to be part of the original design. The absence of a migration story often leads to downtime and breaking changes during a new release.

4. Distributed, asynchronous systems can lead to dangerous race conditions.

Features involving distributed asynchronous execution with overlapping state changes, usually built on a message bus, background tasks, or a similar paradigm, are tricky to get right. Creating timing diagrams and enumerating all the possible orders of execution is useful for discovering and weeding out Heisenbugs beforehand.
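
Enumerating execution orders can also be done mechanically for small cases. The Python sketch below is a toy model, not a real distributed system: two hypothetical workers each increment a shared counter using a separate read step and write step, and the code tries every interleaving to find the ones where an update is lost.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Simulate the schedule and return the final value of the shared counter."""
    shared = 0
    local = {}  # the value each worker last read
    for worker, step in schedule:
        if step == "read":
            local[worker] = shared
        else:  # "write": store the value read earlier, plus one
            shared = local[worker] + 1
    return shared


worker_a = [("A", "read"), ("A", "write")]
worker_b = [("B", "read"), ("B", "write")]

schedules = list(interleavings(worker_a, worker_b))
# Two increments should leave the counter at 2; anything else is a lost update.
lost_updates = [s for s in schedules if run(s) != 2]
print(f"{len(lost_updates)} of {len(schedules)} orderings lose an update")
```

In this toy model only the two orderings where one worker finishes before the other starts are safe. Real systems have far more steps, so exhaustive enumeration is practical only for the small critical section you are worried about.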

5. Customers know your feature better than you.

Compare a team of engineers, PMs, and designers with the millions of customers using your feature. Just think about it: who would know better? This means that when your customers report something, they are likely pointing out a real problem that needs to be proactively investigated and fixed.

6. Last-minute feature changes are the biggest source of hacky code.

An important part of engineering involves successfully balancing the trade-off between velocity and quality. Products need to meet rigorous standards of quality and be delivered quickly. Unsurprisingly, then, the biggest reason for most hacks in design and implementation is an impractical timeline. When feature behavior is changed at the last minute but the release timeline is not extended, code quality often gets compromised. Engineers are forced to fit the existing solution to the new behavior rather than redesigning it with a future-proof architecture for the new functionality. This leads to short-term success but adds to the pile of long-term tech debt. It is important to spend time describing and agreeing upon the feature behavior in detail early on, in order to avoid last-minute overhauls.

7. You won’t remember your own code after a few weeks.

Software engineers work on numerous projects one after another. It is impractical, if not impossible, to remember every line of code one writes. It is only a matter of time before you find yourself scratching your head about what a weird-looking method does and cursing the author, only to realize later that it was your own doing. This is perfectly normal, so don’t be surprised when it happens.

8. Simple versioning can get complex very quickly.

Data model versioning, although it sounds simple, can get very complex quickly, especially in the world of microservices. A small oversight, such as a missing version on an innocuous parameter, can lead to unexpected behavior or deserialization errors. Appropriate versioning is what makes backward compatibility work, and it takes meticulousness and patience to get right. It becomes important to build a test matrix to verify the many combinations of UI and service version mismatches before the code makes it to production.
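
As a rough illustration of how a version field guards deserialization, here is a minimal Python sketch. Everything in it is hypothetical: the payload shape, the assumption that a missing version means version 1, and the invented change where version 2 splits a single name field into first_name and last_name.

```python
import json

CURRENT_VERSION = 2  # newest payload version this reader understands


def parse_profile(raw):
    """Parse a JSON profile payload and upgrade older versions to the current shape."""
    data = json.loads(raw)
    # Assumption: payloads written before versioning existed carry no version field.
    version = data.get("version", 1)
    if version > CURRENT_VERSION:
        # Fail loudly instead of guessing at fields this reader has never seen.
        raise ValueError(f"unsupported profile version {version}")
    if version == 1:
        # Hypothetical v1 shape: one "name" field, split here for the v2 shape.
        first, _, last = data["name"].partition(" ")
        data = {"version": 2, "first_name": first, "last_name": last}
    return data
```

A test matrix for this reader would feed it every version the service has ever emitted, plus one from the future, for example `parse_profile('{"name": "Ada Lovelace"}')` for the unversioned case.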

9. Parallel processing is great but often impractical.

In principle, parallelization is a great concept: one can typically get incredible latency gains by breaking down a large task into smaller subtasks executed in parallel. However, with finite resources, the pattern can sometimes be dangerous. On a single machine, consuming more threads to process one request means fewer threads are available to process another. When the goal is to make things faster, this can be counterproductive, since every request competes for the same fixed pool of resources. Further, over-parallelization can sometimes lead to thread pool exhaustion and starvation, which can be a pain to debug.
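
One common way to keep parallelism in check is a single bounded pool shared by all requests, so one request cannot spawn an unlimited number of threads. The Python sketch below uses the standard library's ThreadPoolExecutor; the pool size of 4 and the handle_request helper are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

# One pool shared by every request. The cap is an arbitrary example value;
# a real limit would come from measuring the workload.
POOL = ThreadPoolExecutor(max_workers=4)


def handle_request(items, work):
    """Fan the items out to the shared pool and return results in input order."""
    return list(POOL.map(work, items))


print(handle_request([1, 2, 3], lambda x: x * x))
```

The cap trades peak speed for predictability: a large request waits for free workers instead of starving everyone else. Note that if `work` itself submits tasks to the same pool and waits for them, all workers can end up blocked on tasks that never get a thread, which is exactly the kind of starvation described above.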

10. Keep it simple, stupid.

Simplicity is the ultimate sophistication. Software development really comes down to writing clean interfaces that build up the many layers of abstraction, and frequent refactoring to clean up code pays off in the long run. Maintaining software becomes much easier if things are kept simple to begin with.

I hope you can relate to some of these learnings as a recent tech graduate. Please share your own opinions and experiences as you get that first tech job out of college.
