10 Things I Learned About Web Development
I thought I knew everything, but I knew nothing. Here’s what I learned doing Node, React and web development in general.
1. Everything starts with URLs.
Whatever’s in the URL should be reflected on the page, and vice versa. Are you updating the state? Update the URL first, and derive the new state from there. You’ll get lots of things for free: browser navigation, reproducible errors, easier reporting.
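Here’s a minimal sketch of what "URL first, state second" can look like. The state shape, the parameter names (q, page, sort) and the example.com URL are all made up for illustration; the only real API used is the standard URL class.

```typescript
// Hypothetical state for a listing page; the field names are illustrative.
interface ListingState {
  query: string;
  page: number;
  sort: "relevance" | "price";
}

// Derive the state from the URL, falling back to defaults for missing or invalid values.
function stateFromUrl(url: string): ListingState {
  const params = new URL(url).searchParams;
  const page = Number.parseInt(params.get("page") ?? "1", 10);
  const sort = params.get("sort");
  return {
    query: params.get("q") ?? "",
    page: Number.isInteger(page) && page > 0 ? page : 1,
    sort: sort === "price" ? "price" : "relevance",
  };
}

// Serialize a state back into a URL, leaving defaults out so equal states give equal URLs.
function urlFromState(base: string, state: ListingState): string {
  const url = new URL(base);
  if (state.query) url.searchParams.set("q", state.query);
  if (state.page > 1) url.searchParams.set("page", String(state.page));
  if (state.sort !== "relevance") url.searchParams.set("sort", state.sort);
  return url.toString();
}

// To change state: build the next URL first, then derive the state from it.
const nextUrl = urlFromState("https://example.com/search", {
  query: "red shoes",
  page: 2,
  sort: "price",
});
const nextState = stateFromUrl(nextUrl);
```

Because the state is always read back from the URL, a pasted link, a reload and the back button all end up in the same place as a click.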
We had data-based URLs, and it was a nightmare. We didn’t really know what kind of page we were displaying until we called an API, and only then could we start fetching the appropriate data and assets. That means we had a blocking request sitting in front of every page load. Things like analytics and monitoring get much harder as well, because you cannot do any kind of static analysis on the URLs.
2. Caching is hard
We had a CDN in front of our website that would cache the HTML we sent down the wire for a bit. Then we disabled that cache because we started doing server-side A/B testing (with cookies). Or at least we thought we did, because setting Cache-Control: private, max-age=0 does not mean that a user never gets a cached response. This very well could have affected our A/B test results, but we’ll never know by how much.
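For what it’s worth, here is a sketch of the headers I would reach for today on a response that differs per A/B variant. It is not what we shipped. The cookie name ab_variant and the 30-day lifetime are assumptions, and whether your CDN honors these headers is something to verify in its documentation. The key difference is no-store: private, max-age=0 still lets the browser keep a copy, while no-store asks caches not to store the response at all.

```typescript
type Variant = "A" | "B";

// Read the variant from a Cookie header; the cookie name "ab_variant" is hypothetical.
function variantFromCookie(cookieHeader: string | undefined): Variant | undefined {
  const match = /(?:^|;\s*)ab_variant=(A|B)(?:;|$)/.exec(cookieHeader ?? "");
  return match ? (match[1] as Variant) : undefined;
}

// Headers for a response whose HTML depends on the visitor's variant.
function abTestHeaders(variant: Variant): Record<string, string> {
  return {
    // no-store asks every cache (browser, proxy, CDN) not to keep a copy.
    "Cache-Control": "no-store",
    // Vary signals to caches that the body depends on the Cookie request header.
    "Vary": "Cookie",
    // Pin the visitor to the variant for 30 days (an arbitrary choice).
    "Set-Cookie": `ab_variant=${variant}; Path=/; Max-Age=2592000; SameSite=Lax`,
  };
}
```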
3. Caching is really hard
We then opted for a lower-level cache with Redis sitting in front of our backend APIs. Turns out that did not really work either; we have millions and millions of pages and we’d only see benefits from the cache with a huge Redis instance, which was costing way more than it was worth. Additionally, we started seeing weird errors, like redirect loops, because the cache works on-demand, and it will cache API requests at various points in time, leading to invalid states. As an added bonus, the Redis cache eviction policies broke our login functionality, because its eviction strategy seemingly works by sampling keys across databases, and our sessions were being dropped instantly because of the high eviction rate. (PS: I love the word eviction.)
4. No one actually knows how SEO works
I learned a lot about (technical) SEO, but I also didn’t learn much. It’s basically a black box. Google is intentionally (and understandably) mum on how the ranking algorithm works, and it leaves everybody guessing. And when you stop ranking, that guesswork leads to weird hacks like using buttons for links, hidden (but rendered) content on the page, and the aforementioned keyword-based URL structure.
It’s not easy to test your assumptions either; you usually have to wait a couple weeks to see the effect of any change, and there are so many things at play (seasonality, data changes, algorithm updates) that it is impossible to properly isolate the effects of any possible improvement that you make. We’ve implemented so many changes for SEO, but we know very little about what did and did not work, so it at least feels like a waste of time.
🎁 Bonus war story: last year I went to visit Yoast, and it was an eye-opener. They mentioned the fact that usually at least 20–30% of traffic comes from bots. When I got home I added some logging, and discovered that around 90% of our traffic was bots. Turns out that we had a huge issue with a broken noindex directive, and millions and millions of (virtually identical) pages were continuously being crawled by Googlebot. (To fix this issue, we added 1400 programmatically generated URL patterns to robots.txt.)
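Generating robots.txt from code can be as small as the sketch below. The parameter names and the resulting patterns are invented for illustration and are not our actual 1400 rules. Keep in mind that Disallow only stops crawling; it is not a replacement for a working noindex.

```typescript
// Build robots.txt content from a list of path patterns that crawlers should skip.
function buildRobotsTxt(disallowPatterns: string[], userAgent = "*"): string {
  // Drop duplicate patterns while keeping the original order.
  const unique = Array.from(new Set(disallowPatterns));
  const lines = [`User-agent: ${userAgent}`, ...unique.map((p) => `Disallow: ${p}`)];
  return lines.join("\n") + "\n";
}

// Hypothetical generator: one pattern per query parameter that creates near-duplicate pages.
const duplicateParams = ["sort", "view", "sessionid"];
const patterns = duplicateParams.map((name) => `/*?*${name}=`);
const robotsTxt = buildRobotsTxt(patterns);
```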
5. A fast website, client-side rendering and server-side rendering. Pick two.
We have a server-side rendered, re-hydrated single-page application. It is the worst of both worlds. In terms of code, you’re stuck with the lowest common denominator. You can’t use async/await because it leads to bundle bloat. Can’t use a great Intl API because it’s only available in the browser. You have a fast time-to-first-paint, but a slow time-to-interactive. You run into weird bugs where server-side navigation leads to a different state than a client-side transition. You have to render all content immediately because of SEO reasons, and because React does not support partial rehydration (yet), you cannot code split or lazy load either.
Do you have static content? Use SSR with progressive enhancement. Do you have a super interactive app and is SEO not a concern? Use client-side rendering (bonus for embedding data on the initial UI-less render). Try to avoid doing both — the world is just not ready yet.
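The "embed data on the initial UI-less render" bonus could look roughly like this. The element ids and the /app.js path are placeholders. The one thing that matters is escaping "<" so the data can never close the script tag early.

```typescript
// Serialize data for embedding in HTML: escape "<" so the payload can never contain "</script>".
function serializeForHtml(data: unknown): string {
  // JSON.stringify(undefined) is not a string, so fall back to null.
  return JSON.stringify(data ?? null).replace(/</g, "\\u003c");
}

// Render a UI-less shell that carries the initial data; ids and paths are hypothetical.
function renderShell(data: unknown): string {
  return [
    "<!doctype html>",
    '<div id="root"></div>',
    `<script id="initial-data" type="application/json">${serializeForHtml(data)}</script>`,
    '<script src="/app.js"></script>',
  ].join("\n");
}
```

On the client, the app can read the text content of the initial-data element and pass it to JSON.parse before the first render, which saves a round trip to the API.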
6. Errors need your attention
If you want to know what’s going on on your platform (you do), you’ll need some kind of error monitoring tool, like NewRelic, Sentry or Rollbar. All provide a drop-in script snippet to place on your website, and any error will automatically be tracked. Please do this! And then resign yourself to a life of anxiety and rushed bug fixes.
Usable error reporting requires discipline. Commit to hygiene by fixing smaller innocuous bugs, too. Always throw Error objects, and reject promises with them, so you have stack traces. Exclude all errors that are not from your domain if you have third-party scripts. Use static error messages so errors are properly grouped, and store metadata like IDs separately (your tool will have an API for this). On the server, add some request or response data, but not too much. Logging the entire request and response objects brought down our entire website any time the error rate was slightly above normal. That was fun to figure out.
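A sketch of those rules in code. AppError, isFirstParty and toReport are names I made up, and the report shape is a stand-in for whatever your monitoring tool’s API expects.

```typescript
type Metadata = Record<string, string | number>;

// An Error subclass with a static message; variable data lives in `metadata`.
class AppError extends Error {
  readonly metadata: Metadata;

  constructor(message: string, metadata: Metadata = {}) {
    super(message);
    this.name = "AppError";
    this.metadata = metadata;
    // Keeps instanceof working when compiling to ES5.
    Object.setPrototypeOf(this, AppError.prototype);
  }
}

// Decide whether an error came from our own scripts, based on the script URL.
function isFirstParty(scriptUrl: string, ownOrigin: string): boolean {
  try {
    return new URL(scriptUrl).origin === ownOrigin;
  } catch {
    return false; // empty or unparsable URL: treat as third-party
  }
}

// Hypothetical report shape: the message is used for grouping, metadata is sent alongside.
function toReport(error: AppError): { message: string; metadata: Metadata; stack?: string } {
  return { message: error.message, metadata: error.metadata, stack: error.stack };
}

// Good: static message, ID in metadata. Bad: new Error(`Product ${id} failed to load`).
const example = new AppError("Product failed to load", { productId: 42 });
```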
7. Own your server
One huge benefit of doing server-side rendering is that you have a server that executes JavaScript. Why is that important? It means that you can execute code in a single environment that you control. It means that you don’t have to deal with the huge number of combinations of device types, browsers and screen sizes. It means that you can escape from whatever limits your browser matrix imposes on you. Here’s how you can put your server to good use:
- Use the strangler pattern to incrementally migrate away from a legacy stack
- Build aggregation services that gather data from external and internal APIs
- Execute all that expensive computational stuff on the server instead of the browser
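As an example of the aggregation idea, here is a small sketch that calls several sources in parallel and keeps going when one of them fails. The aggregate function and the source names in the usage note are hypothetical; in a real service each source would wrap an HTTP call to an internal or external API.

```typescript
type Source = () => Promise<unknown>;

interface Aggregated {
  data: Record<string, unknown>;
  failed: string[];
}

// Call every source in parallel; collect successful results and the names of failed sources.
async function aggregate(sources: Record<string, Source>): Promise<Aggregated> {
  const settled = await Promise.all(
    Object.entries(sources).map(([name, load]) =>
      load().then(
        (value) => ({ name, ok: true, value }),
        () => ({ name, ok: false, value: undefined }),
      ),
    ),
  );

  const result: Aggregated = { data: {}, failed: [] };
  for (const item of settled) {
    if (item.ok) result.data[item.name] = item.value;
    else result.failed.push(item.name);
  }
  return result;
}
```

A call like aggregate({ product: loadProduct, reviews: loadReviews }) would then give the page one response to render from, with failed telling you which parts to leave out.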
8. Elasticsearch is sooooooooooooooooooooooooo cool
You might know Elasticsearch as a powerful full-text search engine, but I know Elasticsearch as the coolest thing in the world. Filters, aggregations, post-filters, logging, metrics, full-on application monitoring, alerting, whatever you can think of: Elasticsearch can take care of it. The query DSL is super flexible and Kibana is an amazing companion. I’ve used its full-text search capabilities for our search box, and its amazing aggregations for our faceted navigation. I use it for logging/metrics as well — it’s so easy to upload a bunch of events and metrics without thinking about the schema too much. It’s everything I always miss in REST APIs. It’s perhaps the biggest productivity boost I discovered in the last two years.
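To give a feel for the query DSL, here is a sketch of a request body for a search box with one facet. The field names title and brand are assumptions about the index mapping (brand would need to be a keyword field). The post_filter narrows the hits after the aggregations have been computed, so the brand counts still cover every brand that matches the text.

```typescript
interface SearchBody {
  query: { match: Record<string, string> };
  aggs: Record<string, { terms: { field: string; size: number } }>;
  post_filter?: { term: Record<string, string> };
}

// Build a search body: full-text query, facet counts per brand, optional brand selection.
function buildSearchBody(text: string, selectedBrand?: string): SearchBody {
  const body: SearchBody = {
    // Full-text match on the (hypothetical) title field.
    query: { match: { title: text } },
    // Facet counts: up to 20 brands among the documents matching the query.
    aggs: { brands: { terms: { field: "brand", size: 20 } } },
  };
  if (selectedBrand) {
    // Applied to the hits only, after the aggregation has been computed.
    body.post_filter = { term: { brand: selectedBrand } };
  }
  return body;
}
```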
9. Tag managers will kill you
If you don’t know what a tag manager is, here’s the executive summary: it’s what marketing teams use when they don’t want to (or can’t) talk to developers for new features. It’s a system that allows non-engineers to include third-party scripts on a page. Does that sound bad? Well, it is. It’s just too easy to use. Unfettered access to scripts without any performance assessment will destroy any dreams you have of keeping that Lighthouse score over 50 — even if your tag manager claims to have an optimized loading strategy.
10. Everything is better than unit testing
Here are things that work better than unit testing for a website:
- End-to-end testing
- A type system
- Visual regression testing
- Manual testing
- Smoke testing
Admittedly, I’m biased by my laziness. And I won’t spend too much time convincing unit test aficionados that they are wrong. But, here’s my two cents: there’s a very small surface where unit testing a website code base makes sense. If you have a module or function that has a lot of different branches, and it’s impossible to guarantee safety via a type system, and if it’s too costly to run a huge number of E2E tests, yes, by all means, unit test the heck out of it. But, in most cases, it just creates a barrier to change.