(Editor’s note: transcripts don’t do talks justice. This transcript is useful for searching and reference, but we recommend watching the video rather than reading the transcript alone! For a reader of typical speed, reading this will take 15% less time than watching the video, but you’ll miss out on body language and the speaker’s slides!)
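Note: the transcript text is handy for searching and quoting, but reading it alone without watching the original video is not recommended. Reading only the transcript saves about 15% of the time, but you miss the speaker’s body language and slides. (This version also drops some of the jokes made during the talk. Translator’s note.)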
And it’s a code base that I worked on a long time ago. And in the code base, there were two different modules, two files. And my colleague and friend was working on a new feature in one of those files. And they noticed that actually that feature, something very similar was already implemented in another file. So they thought, well, why don’t I just copy and paste that code because it’s pretty much the same thing?
And they asked me to review the code. And I had just read all the books about the best practices. Pragmatic Programmer, Clean Coder, Well Groomed Coder, and I knew that I needed to-- you’re not supposed to copy and paste code because it creates a maintenance burden, it’s pretty hard to work with. I had just learned this acronym DRY, which stands for don’t repeat yourself. And I was like, this looks like a copy paste, so can we DRY it up a little bit?
After copying and pasting, they asked me to do a code review. Around that time I had just read a pile of books on programming best practices, and I had learned that you shouldn’t copy and paste code because it makes later maintenance harder. I had just picked up a term, DRY, don’t repeat yourself, so when I saw my colleague’s code I thought: why not DRY up this copy-pasted code?
And so my colleague was like, yeah, sure, I can totally extract that code to a separate module and make those two files depend on that new code. And so an abstraction was born. OK. So when I say abstraction, I mean it doesn’t matter which language you’re using. It could be a function or a class, a module, a package, something reusable that you can use from different places in your code base.
And so it seems like, this is great. And they live happily ever after. So let’s see how that abstraction evolved. So the next thing that happened, we hadn’t looked at that code for a while but then we were working on a new feature and it actually needed something very similar. So let’s say that the original abstraction was asynchronous, but we needed something that had pretty much the same exact shape, except it was synchronous.
So we couldn’t directly reuse that code anymore, but it also felt really bad to copy and paste it because it’s pretty much exactly the same code except it’s slightly different. And, well, it looks like we shouldn’t repeat ourselves so let’s just unify those two parts and make our abstraction a bit fancier so that it can handle the case as well. And we felt really good about it. It is a bit unorthodox, but that’s what happens when code meets real life, right? You make some compromises, and at least we didn’t have to duplicate the code, because that would be bad, right?
So what happened next is we found out that actually, this new code, this new feature, had a bug in it, and that bug was because we thought that it needed exactly the same code as we had. But actually it needed something slightly different. But we can fix that bug, of course, by adding a special case. So in our abstraction, we can have an if statement. If it’s this particular case, then do something slightly different. Sure. Ship it. Because that happens to every abstraction, right?
Later, however, we found a bug in the code for this new feature! The cause was that we had wrongly assumed the two features were implemented identically, and they were not: they actually differed in small ways. Still, we could fix the bug just by adding special-case handling to the abstraction, which simply means adding an if statement. OK, done. After all, every abstraction runs into this kind of trouble, and we knew the drill.
And so as we were working with that code, we actually noticed that the original code also had a bug. So those two cases that we thought were the same, they were also slightly different, we just didn’t realize it at the time. And so we added another special case. And at this point, this abstraction looks a bit weird and intimidating. So maybe let’s make it more generic. Why do we have all those special cases in the abstraction?
However, while making this change, we found a similar problem in the original code! Fine, add another if statement. At this point the abstraction already looks a bit odd: it handles a lot of concrete cases, which is not good.
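To make this stage of the story concrete, here is a minimal sketch of a shared helper after two rounds of "just add a special case". The function name, the flags, and the data shape are all hypothetical and are not from the talk's slides; they only illustrate how the if statements pile up inside the abstraction.

```python
def format_user_label(
    user: dict,
    *,
    for_invoice: bool = False,
    legacy_export: bool = False,
) -> str:
    """Hypothetical shared helper that has grown two special cases."""
    name = user["name"].strip()

    if for_invoice:
        # Special case 1 (added for the new feature): surname first.
        first, _, last = name.partition(" ")
        name = f"{last}, {first}" if last else first

    if legacy_export:
        # Special case 2 (added for the original caller): upper-case names.
        name = name.upper()

    return f"{name} <{user['email']}>"
```

Every caller now has to know which flags apply to it, and every change to the helper has to be checked against each combination of flags.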
Let’s pull them out from the abstraction where they belong in our concrete use cases. So looks like this. So now our abstraction doesn’t know about any concrete cases. It is very generic, very beautiful. Nobody really understands what it represents anymore. Oh, by the way, we need to add, now that it’s parametrized from different places, we need to make sure that all call sites are parametrized.
So let’s move the handling of these concrete cases out of the abstraction. OK, now the abstraction no longer handles any concrete case. It is very generic, very beautiful, so generic that most people can no longer tell what it is for. By the way, we have to add a parameter to the abstraction, one that indicates where it is being called from.
But it was such a gradual progression that at each step it makes sense to the people writing and reviewing the code, so we just left it at that. And some time passed. And so, during that time, some people have left the team, some people have joined the team. There were many fixes. Somebody needed to just do this one small fix here. I don’t really know what this thing is supposed to be doing but just fix it up a little bit, add this new feature, improve the metrics. So we ended up with something like this, right?
And again, each of those individual steps kind of made sense. But if you lose track of what you were trying to do originally, you don’t really know that you have a cyclical dependency or this weird thing that is growing somewhere to the side just because you don’t see the whole picture anymore. And, of course, in real life, that’s actually where the story ends because nobody wanted to touch the part of the code base and it just was stagnant for a long time and then somebody rewrote it. And maybe got a promotion. I don’t know.
But if we could go back in time, because it’s a talk, it’s not real life, if we had a time machine we could go back and fix it, right? So I want to go back to the point where the abstraction still made sense. But then we had this third case and we really didn’t want to duplicate that code even though it needed something slightly different. And we were like, yeah, sure, let’s compromise on our abstraction. Make it fancier. So if the me of today had been there, what I would’ve told myself is, please inline this abstraction.
So, suppose we had a time machine and could go back to the beginning; then we could rescue the situation. Let’s go back to the point where the abstraction’s meaning was still clear. When the third case showed up, I would tell my past self: compromise, just “inline” that abstraction and be done with it!
And so what I mean by inline, I mean literally take that code and just copy and paste it back to the places that use it. And that creates some duplication but that destroys that potential monster we were in the process of creating. And of course duplication isn’t perfect in the long term, but the wrong abstraction is also not perfect in the long term. So we need to balance these two problems. And so the way this helps us is that now if we have a bug here and we realize actually this thing is supposed to do something different, we can just change it. And it doesn’t affect any of the other places because it’s isolated. And similarly, maybe we get a different bug here and we also change it.
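As a rough illustration of what inlining buys you, the sketch below keeps two deliberately duplicated copies of some formatting logic, one per feature. The function names and the data shape are hypothetical and only serve the example; the point is that a fix in one copy cannot change the behavior of the other.

```python
def profile_label(user: dict) -> str:
    # Copy 1: the profile page wants a plain "Name <email>".
    name = user["name"].strip()
    return f"{name} <{user['email']}>"


def invoice_label(user: dict) -> str:
    # Copy 2: invoices want the surname first. Editing this function
    # cannot affect profile_label, because nothing is shared.
    name = user["name"].strip()
    first, _, last = name.partition(" ")
    ordered = f"{last}, {first}" if last else first
    return f"{ordered} <{user['email']}>"
```

The first two lines of each function are duplicated on purpose. If they later prove to be stable and truly the same concern, they can be extracted again, possibly in a different shape than first imagined.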
And I’m not suggesting that you should always copy paste things. In longer term, maybe you realize that these pieces really stabilized and they make sense. And maybe you pull something out and it might not be the thing that you originally thought was a good abstraction. Might be something different. And a thing like this is as good as it gets in practice. And if I heard this when I was a sweet summer child, I would have said that that’s not what they tell us. I heard that copy pasting is really bad.
And I think it’s actually a self-perpetuating loop. So what happens is that developers learn best practices from the previous generation and they try to follow them. Because there were concrete problems and concrete solutions that were born out of experience. And so the next generation tries to pass them on. But it’s hard to explain all this context and all this trade off, so they just get flattened into these ideas of best practices and anti-patterns.
And so they get taught to the new generation. But if the new generation doesn’t understand the trade offs and the reasons they came to these conclusions, they don’t have the context to decide when it’s actually a bad idea and how far can you stretch this. So they run into their own problems from trying to take these best practices and anti-patterns to extreme. And so they teach the next generation. And maybe this is just you can’t break out of this loop and it’s just bound to happen over and over again, which is maybe fine.
I think one way to try to break this loop is just when we teach something to the next generation, we shouldn’t just be two-dimensional and say here’s best practices and anti-patterns. But we should try to explain what is it that you’re actually trading away. What are the benefits and what are the costs of this idea? And so when we talk about the benefits of abstraction, of course it has benefits. The whole computer is a huge stack of abstractions. And I think concrete benefits are-- abstractions let you focus on a specific intent, right? So if you have this thing, you don’t have to keep it all in your head.
But it’s actually really nice to be able to focus on a specific layer. Maybe you have several places of code where you send an email and you don’t want to know how an email is-- I don’t know how emails are being sent. It’s a mystery to me that they even arrive. But I can call a function called send email and well, it works most of the time. And it’s really nice to be able to focus on it. And of course another benefit is just being able to reuse code written by you or other people and not remember how it actually works.
So if we need something, exactly the same thing that we already use from different places, it’s very nice to be able to reuse it. So that’s a benefit of abstraction. And abstraction also helps us avoid some bugs. So in the example where we have a bug, maybe we copy pasted something. And that’s an argument against copy paste, is we copy pasted something and then we found the bug in one version and we fix it, but then the other version stays broken because we forgot about the copy paste. So that’s a good argument for why you’d want to extract something and pull it away.
So if we need to reuse the same functionality in different places in the code, abstraction is very useful. Abstraction can also help us reduce some bugs: if we only ever copy and paste, then when we fix a bug in one place, the same bug is still sitting in the other places. That is the positive reason why we sometimes want to “pull something out”.
But when we talk about benefits we should also talk about costs. And so one of these costs is that abstraction creates accidental coupling. And what I mean by that is, so we have these two modules using some abstraction, and then we realize that one of them has a bug. And we have to fix it in the abstraction because that’s literally where the code is. But now it’s your responsibility to consider all of the other call sites of this abstraction and whether your fix might have actually introduced a bug in another part of the code base. So that’s one cost. Maybe you can live with it. Most of us live with it. But it’s a real cost.
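A small sketch of accidental coupling, under made-up assumptions: the helper `normalize_title` and its two callers are hypothetical. A "fix" made in the shared helper on behalf of one caller silently changes what the other caller returns.

```python
def normalize_title(title: str) -> str:
    # Shared helper. The .lower() call was added as a "fix" for search_key,
    # which needs case-insensitive matching.
    return title.strip().lower()


def search_key(title: str) -> str:
    # The caller the fix was meant for.
    return normalize_title(title)


def page_heading(title: str) -> str:
    # This caller never asked for lower-casing, but it now gets it anyway,
    # because the fix lives in the code both callers share.
    return normalize_title(title)
```

Whoever changes `normalize_title` is responsible for checking every call site, including ones they may not know how to test.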
And I think an even more dangerous cost is the extra indirection an abstraction can create. So what I mean by that is that the promise was that I would just be able to focus on this specific layer in my code and not actually care about all the layers. Is that really what happens? I’m sure most of you probably had this bug where you started one layer, oh, it goes here. And it’s like, well, actually, no. You need to understand this layer and this other layer because the bug, it goes across all of those layers. And we have a very limited stack in our heads.
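Another cost is even more dangerous: abstraction brings extra indirection (the translator kept the original English term here). What this means is that abstraction usually promises us that we only need to care about our own layer and that everything else can hide behind the abstraction. But is that true? I’m sure you have all run into this: you find a bug in one layer, look more closely, and it turns out the root cause is in another layer. In other words, the bug forces you to understand both the current layer and another one, and you may well end up having to understand all of them. Yet the capacity of our brains is, in the end, limited.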
And so what happens is you just get a stack overflow, which is probably why the site was called that way. And so what I see happen a lot is that we try so hard to avoid the spaghetti code that we create this lasagna code where there are so many layers that you don’t know what’s going on anymore at all. So that’s extra indirection. And all of them wouldn’t be that bad if they didn’t entrench themselves.
Something I see a lot is that, to avoid spaghetti code, we introduce lasagna code: the code base has so many layers that we can no longer tell what is going on. That is extra indirection. Of course, if these layers were easy to change, it would not be quite so bad.
So abstraction also creates inertia in your code base. And that’s a social factor more than technical. What I’ve seen happen many times is you start with an abstraction that looks really promising and makes sense to you. And then with time it gets more and more complex. But nobody really has time to refactor or unwind this abstraction, especially if you’re a new person on the team. You might think that it would be easier to copy and paste it, but first you don’t really know how to do that anymore because you’re not familiar with that code. And second you don’t want to be the person who just suggests worst practices. Who wants to be the person who says, let’s use copy paste here? How long do you think you’re going to be on that team?
So you just accept the reality for what it is and keep doing it and hope that this code is not going to be your responsibility anymore soon. And the problem is that even if your team actually agrees that the abstraction is bad and it should be inlined, it might just be too late. So what might happen is that you’re familiar with just this concrete usage and you know how to test it. If you unwind the abstraction, you can understand how to verify that change didn’t break anything. But maybe there is another team who uses it here and another team who uses it there, and maybe this team has been reorged so there is no team that maintains that code, and you don’t really know how to test it anymore. So you just can’t make that change even if you want to.
So I really like this tweet. It’s a bit hard to read. Easy-to-replace systems tend to get replaced with hard-to-replace systems, which is kind of like the Peter Principle. There’s this Peter Principle that everybody in the organization continues rising until they become incompetent and then they can’t rise anymore. And it’s similar that if something is easy to replace, it will probably get replaced. And then at some point you hit the limit where it’s just a mess and nobody understands how it works.
So I’m not saying that you shouldn’t create abstractions. That would be a very two-dimensional or one-dimensional takeaway. I’m saying that there are things that, we’re going to make mistakes. So how can we actually try to mitigate or reduce the risks from those mistakes? And so one of them that I learned on the React team in particular is to test code that has concrete business value. So what I mean by that is, say we have this a little bit wonky abstraction, but we finally got some time to write some proper tests, because we fixed some bugs and we have a gap before the new half of the year starts and we can fix some things.
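To stress it again, I am not saying we shouldn’t write abstractions; that would be far too dogmatic. I am saying that (while writing abstractions. Translator’s note) we are very likely to make mistakes. So how do we mitigate or reduce the risk those mistakes carry? One thing I learned on the React team is to test code that has concrete business value (rather than the abstract modules. Translator’s note). As for abstractions that seem a bit shaky, we only add some tests for them after they have been through a few bug fixes, in the gap before the new half of the year starts.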
So we want to write some unit test coverage for that part. And intuitively, where I would put unit tests is, well, here’s the abstraction where the complex code lies. So let’s put unit tests to cover that code. And that’s actually a bad idea in my opinion, because what happens is that if later you decide that this abstraction was bad and you try to turn it into copy paste, well, guess what happens to your tests? They all fail. And now you’re like, well, I guess I’ll have to revert that because I don’t want to rewrite all my tests. And I don’t want to be the person who suggested to decrease the code coverage. So you don’t do that.
But if you have a time machine you can go back and you can write your unit tests or integration tests or whatever you want to call them, fad of the day tests, against the code that we actually care about, that this code works against concrete features. And then these tests don’t care about your abstraction. So you can inline the abstraction back. You can create five layers of abstraction. The tests will tell you whether this code works. So actually they will guide you to refactor it because they can tell you that your refactoring is in fact a correct one. So testing concrete code is a good strategy.
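One way to picture this, as a sketch with hypothetical names rather than anything from the React code base: the tests below exercise only the two concrete features, never the private helper `_join_name`. If that helper is later inlined into both features, or wrapped in more layers, the same tests still apply unchanged.

```python
USER = {"first": "Ada", "last": "Lovelace", "city": "London"}


def _join_name(user: dict) -> str:
    # Hypothetical shared helper. It is an implementation detail and
    # has no tests of its own, so it is free to be inlined or reshaped.
    return f"{user['first']} {user['last']}"


def greeting(user: dict) -> str:
    # Concrete feature 1.
    return f"Hello, {_join_name(user)}!"


def mailing_label(user: dict) -> str:
    # Concrete feature 2.
    return f"{_join_name(user).upper()}\n{user['city']}"


def test_greeting() -> None:
    assert greeting(USER) == "Hello, Ada Lovelace!"


def test_mailing_label() -> None:
    assert mailing_label(USER) == "ADA LOVELACE\nLondon"
```

Had the tests targeted `_join_name` directly, removing the helper would break them even though both features still worked.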
Another one is just to restrain yourself. You see this pull request. You get this itch, like, this looks duplicate. And you’re like, no, take a walk. Because if you have this, you might have a high school crush and they are really into the same obscure bands on Last.fm that you’re into. That doesn’t mean that you have a lot in common and they’re going to be a good life partner. So maybe you shouldn’t do the same to the code. Just because the structure of these two snippets looks similar, it might just mean that you don’t really understand the problem yet. And give it some time to actually show that this is the same problem and not just accidentally similar code.
Another key point is restraining yourself. When you see a pull request, you may feel an itch: ah, this is a lot like some earlier feature, let’s extract it! At that moment, stop and go take a walk to cool off. The feeling is a lot like a high school crush: you and they merely like the same rock band, and it seems as if you were made to be life partners! Don’t make that mistake with your code. Two snippets having a similar structure proves nothing; it may only mean you haven’t yet understood the problem. Let time show whether the two snippets really solve the same problem rather than being similar by accident.
And finally, I think it’s just important that if that happens, if you make a mistake, it should be part of your team culture to be OK with, this abstraction is bad. We need to get rid of it. You should not only add abstractions, but you should also delete them as part of your healthy development process. So that means that it should be OK to leave a comment like this and say, hey, this is getting out of control. Let’s spend some time to copy and paste this and later we’ll figure out what to do with it.
But there is also a technical component to this. So if your dependency tree looks like this, it might actually be really challenging to inline anything because you’re like, well, I have this thing I want to inline but, OK, I can copy it, but there’s some mutable shared state that is now being duplicated. And I need to figure out how to rewire all of those dependencies together. And it might not even be feasible. So you just give up. And I don’t really have a good solution for this. What I’ve noticed is that, for some code, you can’t really avoid it. For example, in the source code of React itself, we do have a problem like this. Because we try to mutate things for you so you don’t have to mutate them. So we have all these interdependencies between modules that can be a bit difficult to think about.
Then inlining anything is very hard, or even impossible, and in the end you just have to give up. I don’t have a good answer for this situation either; all I can say is that there really are some scenarios (for example React’s internal implementation) where this can’t be avoided.
But then what’s cool about React, in my opinion, is that it lets you write apps with dependency trees that are more like this. So you have a button component that’s used from form, and that form is used from app. And so on like this. And it follows this tree shape. And we have these constraints where data flows only in one direction. So you don’t really expect things to get weirdly circular. And what it means is that you’re going to make mistakes, you’re going to create bad abstractions, but does your technology make it easier for you to get rid of them?
One great thing about React is that the React project sacrifices itself so that users’ projects don’t end up in the situation pictured above. React has one-way data flow, so circular references are unlikely. The point is: you will make mistakes and you will create bad abstractions, but does your technology make it easier to back out of them?
Because I think with React components and some other constrained forms of dependency management, you have this nice property where it’s usually a matter of copy and pasting things in order to inline them. And so even if you make a bad decision, you can actually undo it before it gets too late. So this is something to consider in both the social and the technology part of it. So don’t repeat yourself. DRY is just one of those principles that are probably pretty good ideas.
I think React components, and some of the constraints React places on you, give your project a nice property: copy and paste usually works. So even if you created a bad abstraction, you can recover before things get worse.
And there are many good ideas that you might hear about as a developer and entering this industry. Or even as somebody who’s been doing it for 15 years and then stepping outside for a few months. And we see a lot of evangelism around those things. And that is fine. But I think it’s important that when we try to explain what those things do or why they’re a good idea, we should always explain what exactly are you trading away and which things led us to that principle or idea. And what is the expiration date for those problems? Because sometimes there is some context that is assumed and that context actually changes but you don’t realize that. And so the next generation needs to understand what exactly was traded off and why.
And so my challenge for you is to pick some best practices and anti-patterns that you strongly believe are true, whether from your experience or because somebody told you or because you came up with them, and really try to break them down and deconstruct why you believe these things and what exactly is being traded away. And if you found this talk interesting, you might like these other talks. So All the Little Things by Sandi Metz is an amazing talk that goes into way more detail on these ideas and many others. Minimal API Surface Area is a talk by my colleague, Sebastian, who I learned all of this stuff from. And On the Spectrum of Abstraction is an interesting talk by Cheng Lou, who goes into how abstractions help us trade the power and expressiveness for constraints and how those constraints can actually limit us, but let us do things we wouldn’t be able to do otherwise. It’s a good talk. And thank you for having me. That’s all I have.
- All the Little Things by Sandi Metz
- Minimal API Surface Area by Sebastian
- On the Spectrum of Abstraction by Cheng Lou