What to do about an 11000-line C++ source file?
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/3615789/
Asked by Martin Ba
So we have this huge (is 11000 lines huge?) mainmodule.cpp source file in our project and every time I have to touch it I cringe.
As this file is so central and large, it keeps accumulating more and more code and I can't think of a good way to make it actually start to shrink.
The file is used and actively changed in several (> 10) maintenance versions of our product, so it is really hard to refactor. If I were to "simply" split it up, say for a start, into 3 files, then merging back changes from maintenance versions would become a nightmare. Also, if you split up a file with such a long and rich history, tracking and checking old changes in the SCC history suddenly becomes a lot harder.
The file basically contains the "main class" (main internal work dispatching and coordination) of our program, so every time a feature is added, it also affects this file and every time it grows. :-(
What would you do in this situation? Any ideas on how to move new features to a separate source file without messing up the SCC workflow?
(Note on the tools: We use C++ with Visual Studio; we use AccuRev as SCC, but I think the type of SCC doesn't really matter here; we use Araxis Merge to do the actual comparison and merging of files.)
Accepted answer by Steve Jessop
Find some code in the file which is relatively stable (not changing fast, and doesn't vary much between branches) and could stand as an independent unit. Move this into its own file, and for that matter into its own class, in all branches. Because it's stable, this won't cause (many) "awkward" merges that have to be applied to a different file from the one they were originally made on, when you merge the change from one branch to another. Repeat.
Find some code in the file which basically only applies to a small number of branches, and could stand alone. Doesn't matter whether it's changing fast or not, because of the small number of branches. Move this into its own classes and files. Repeat.
So, we've got rid of the code that's the same everywhere, and the code that's specific to certain branches.
This leaves you with a nucleus of badly-managed code - it's needed everywhere, but it's different in every branch (and/or it changes constantly so that some branches are running behind others), and yet it's in a single file that you're unsuccessfully trying to merge between branches. Stop doing that. Branch the file permanently, perhaps by renaming it in each branch. It's not "main" any more, it's "main for configuration X". OK, so you lose the ability to apply the same change to multiple branches by merging, but this is in any case the core of code where merging doesn't work very well. If you're having to manually manage the merges anyway to deal with conflicts, then it's no loss to manually apply them independently on each branch.
I think you're wrong to say that the kind of SCC doesn't matter, because for example git's merging abilities are probably better than the merge tool you're using. So the core problem, "merging is difficult" occurs at different times for different SCCs. However, you're unlikely to be able to change SCCs, so the issue is probably irrelevant.
Answer by Kirill V. Lyadvinsky
Merging now will not be as big a nightmare as it will be when you have a 30000 LOC file in the future. So:
- Stop adding more code to that file.
- Split it.
If you can't just stop coding during the refactoring process, you could leave this big file as is for a while, at least without adding more code to it: since it contains one "main class", you could inherit from it and keep the inherited class(es) with overloaded functions in several new, small and well-designed files.
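As a rough sketch of that idea, assuming "overloaded" here means overriding virtual member functions, and assuming the relevant member functions of the main class are (or can be made) virtual: all names below (`MainModule`, `ReportingModule`, `handleRequest`) are hypothetical and not taken from the question.

```cpp
#include <string>

// Stand-in for the existing class in mainmodule.cpp (hypothetical interface).
class MainModule {
public:
    virtual ~MainModule() = default;

    // Assumption: the entry point is virtual so derived classes can extend it.
    virtual std::string handleRequest(const std::string& request) {
        return "legacy:" + request;
    }
};

// New feature kept in its own small file, for example reporting_module.cpp.
class ReportingModule : public MainModule {
public:
    std::string handleRequest(const std::string& request) override {
        if (request == "report") {
            return "report:generated";  // new behavior lives outside the big file
        }
        // Everything else still goes through the untouched legacy code.
        return MainModule::handleRequest(request);
    }
};
```

The big file stops growing because new behavior is added in the derived class, while old requests fall through to the existing implementation.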
Answer by Brian Rasmussen
It sounds to me like you're facing a number of code smells here. First of all the main class appears to violate the open/closed principle. It also sounds like it is handling too many responsibilities. Due to this I would assume the code to be more brittle than it needs to be.
While I can understand your concerns regarding traceability following a refactoring, I would expect that this class is rather hard to maintain and enhance and that any changes you do make are likely to cause side effects. I would assume that the cost of these outweighs the cost of refactoring the class.
In any case, since the code smells will only get worse with time, at least at some point the cost of these will outweigh the cost of refactoring. From your description I would assume that you're past the tipping point.
Refactoring this should be done in small steps. If possible, add automated tests to verify current behavior before refactoring anything. Then pick out small areas of isolated functionality and extract these as types in order to delegate the responsibility.
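A minimal sketch of one such step, under the assumption that some small piece of logic (here a hypothetical line counter) can be isolated. `LineCounter`, `countLines` and `behaviorUnchanged` are illustrative names only, and the check function stands in for whatever test framework the project actually uses.

```cpp
#include <string>

// Extracted type: one small, isolated responsibility (hypothetical example).
class LineCounter {
public:
    int count(const std::string& text) const {
        if (text.empty()) return 0;
        int lines = 1;
        for (char c : text) {
            if (c == '\n') ++lines;
        }
        return lines;
    }
};

// The main class keeps its public interface and delegates the work.
class MainModule {
public:
    int countLines(const std::string& text) const { return counter_.count(text); }

private:
    LineCounter counter_;
};

// Characterization check: written against the old code first, then re-run
// after the extraction to confirm that observable behavior did not change.
bool behaviorUnchanged() {
    MainModule m;
    return m.countLines("") == 0 &&
           m.countLines("a") == 1 &&
           m.countLines("a\nb") == 2;
}
```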
In any case, it sounds like a major project, so good luck :)
Answer by Benoît
The only solution I have ever imagined to such problems follows. The real benefit of the method described is that the changes are gradual. No revolutions here, otherwise you'll be in trouble very fast.
Insert a new cpp class above the original main class. For now, it would basically redirect all calls to the current main class, but aim at making the API of this new class as clear and succinct as possible.
Once this has been done, you get the possibility to add new functionalities in new classes.
As for existing functionalities, you have to progressively move them into new classes as they become stable enough. You will lose SCC help for this piece of code, but there is not much that can be done about that. Just pick the right timing.
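The three steps above can be sketched roughly as follows. This is only an illustration under assumed names: `LegacyMainModule`, `Application`, `VersionInfo` and their member functions are hypothetical, and the real main class would of course have a much larger interface.

```cpp
#include <string>

// Stand-in for the existing 11000-line class (hypothetical interface).
class LegacyMainModule {
public:
    std::string doEverything(int opcode, const std::string& arg) {
        return opcode == 1 ? "opened:" + arg : "closed:" + arg;
    }
};

// Step 2: new functionality goes into new classes, not into the legacy file.
class VersionInfo {
public:
    std::string text() const { return "version 2.0"; }
};

// Step 1: a new class "above" the legacy one. For now it only forwards
// calls, but it exposes a smaller and clearer API.
class Application {
public:
    std::string open(const std::string& name)  { return legacy_.doEverything(1, name); }
    std::string close(const std::string& name) { return legacy_.doEverything(2, name); }
    std::string version() const                { return versionInfo_.text(); }

private:
    LegacyMainModule legacy_;   // step 3: shrink this over time
    VersionInfo versionInfo_;
};
```

Callers that switch to `Application` no longer depend on the legacy interface, so functionality can later be moved out of `LegacyMainModule` without touching them.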
I know this is not perfect, though I hope it can help, and the process must be adapted to your needs!
Additional information
Note that Git is an SCC that can follow pieces of code from one file to another. I have heard good things about it, so it could help while you are progressively moving your work.
Git is constructed around the notion of blobs which, if I understand correctly, represent pieces of code files. Move these pieces around in different files and Git will find them, even if you modify them. Apart from the video from Linus Torvalds mentioned in comments below, I have not been able to find something clear about this.
Answer by fdasfasdfdas
Confucius say: "first step to getting out of hole is to stop digging hole."
Answer by Ian
Let me guess: Ten clients with divergent feature sets and a sales manager that promotes "customization"? I've worked on products like that before. We had essentially the same problem.
You recognize that having an enormous file is trouble, but even more trouble is ten versions that you have to keep "current". That's multiple maintenance. SCC can make that easier, but it can't make it right.
Before you try to break the file into parts, you need to bring the ten branches back in sync with each other so that you can see and shape all the code at once. You can do this one branch at a time, testing both branches against the same main code file. To enforce the custom behavior, you can use #ifdef and friends, but it's better as much as possible to use ordinary if/else against defined constants. This way, your compiler will verify all types and most probably eliminate "dead" object code anyway. (You may want to turn off the warning about dead code, though.)
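To illustrate the "ordinary if/else against defined constants" suggestion, here is a small sketch. The macro `CUSTOMER_ID`, the constants and `invoiceHeader` are all made up for the example; the assumption is that each former branch becomes one value of a build-time setting.

```cpp
#include <string>

// Hypothetical per-customer switch. In a real build this would be supplied
// by the build configuration, one value per former branch.
#ifndef CUSTOMER_ID
#define CUSTOMER_ID 2
#endif

constexpr int kCustomerAcme = 1;
constexpr int kCustomerGlobex = 2;
constexpr int kCustomer = CUSTOMER_ID;

std::string invoiceHeader() {
    // Ordinary if/else: both branches are parsed and type-checked in every
    // build, unlike code hidden behind #ifdef. The optimizer will usually
    // drop the branch that can never run.
    if (kCustomer == kCustomerAcme) {
        return "ACME invoice";
    } else {
        return "Standard invoice";
    }
}
```

Because the condition is a compile-time constant, some compilers warn about it, which is the warning the answer suggests you may want to turn off.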
Once there's only one version of that file shared implicitly by all branches, then it's rather easier to begin traditional refactoring methods.
The #ifdefs are primarily better for sections where the affected code only makes sense in the context of other per-branch customizations. One may argue that these also present an opportunity for the same branch-merging scheme, but don't go hog-wild. One colossal project at a time, please.
In the short run, the file will appear to grow. This is OK. What you're doing is bringing things together that need to be together. Afterwards, you'll begin to see areas that are clearly the same regardless of version; these can be left alone or refactored at will. Other areas will clearly differ depending on the version. You have a number of options in this case. One method is to delegate the differences to per-version strategy objects. Another is to derive client versions from a common abstract class. But none of these transformations are possible as long as you have ten "tips" of development in different branches.
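One way the "per-version strategy objects" option might look, as a sketch only: `VersionStrategy`, `Version1Strategy`, `Version2Strategy` and `exportData` are hypothetical names, and the csv/xml difference is an invented stand-in for whatever actually differs between versions.

```cpp
#include <memory>
#include <string>
#include <utility>

// Everything that differs between product versions goes behind this interface.
class VersionStrategy {
public:
    virtual ~VersionStrategy() = default;
    virtual std::string exportFormat() const = 0;
};

class Version1Strategy : public VersionStrategy {
public:
    std::string exportFormat() const override { return "csv"; }
};

class Version2Strategy : public VersionStrategy {
public:
    std::string exportFormat() const override { return "xml"; }
};

// The shared main class holds the common code and asks the strategy
// whenever behavior depends on the version.
class MainModule {
public:
    explicit MainModule(std::unique_ptr<VersionStrategy> strategy)
        : strategy_(std::move(strategy)) {}

    std::string exportData(const std::string& payload) const {
        return strategy_->exportFormat() + ":" + payload;
    }

private:
    std::unique_ptr<VersionStrategy> strategy_;
};
```

With this shape there is one `MainModule` for all versions, and each version only supplies its own small strategy class.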
Answer by Robin
I don't know if this solves your problem, but what I guess you want to do is migrate the content of the file to smaller files independent of each other (summed up). What I also get is that you have about 10 different versions of the software floating around and you need to support them all without messing things up.
First of all, there is just no way that this is easy and will solve itself in a few minutes of brainstorming. The functions linked in your file are all vital to your application, and simply cutting them off and migrating them to other files won't solve your problem.
I think you only have these options:
Don't migrate and stay with what you have. Possibly quit your job and start working on serious software with good design in addition. Extreme programming is not always the best solution if you are working on a long time project with enough funds to survive a crash or two.
Work out a layout of how you would love your file to look once it's split up. Create the necessary files and integrate them in your application. Rename the functions or overload them to take an additional parameter (maybe just a simple boolean?). Once you have to work on your code, migrate the functions you need to work on to the new file and map the function calls of the old functions to the new functions. You should still have your main-file this way, and still be able to see the changes that were made to it, once it comes to a specific function you know exactly when it was outsourced and so on.
Try to convince your co-workers with some good cake that workflow is overrated and that you need to rewrite some parts of the application in order to do serious business.
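The second option, mapping old function calls to new functions, could be sketched like this. The namespace `pricing`, the function `applyDiscountCents` and the `MainModule` member are hypothetical; the point is only that the old entry point stays in the main file as a one-line forwarder.

```cpp
// New file, for example pricing.cpp (hypothetical): the migrated function.
namespace pricing {

// Returns the price in cents after subtracting a whole-number percentage.
long applyDiscountCents(long cents, int percent) {
    return cents - cents * percent / 100;
}

}  // namespace pricing

// Still in mainmodule.cpp: the old function remains, so existing callers
// and the file's history stay intact, but it only forwards to the new code.
class MainModule {
public:
    long applyDiscountCents(long cents, int percent) {
        return pricing::applyDiscountCents(cents, percent);
    }
};
```

The forwarding line in the old file also records, in its SCC history, exactly when the function was moved out.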
Answer by Patrick
Exactly this problem is handled in one of the chapters of the book "Working Effectively with Legacy Code" (http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052).
Answer by ocodo
I think you would be best off creating a set of command classes that map to the API points of the mainmodule.cpp.
Once they are in place, you will need to refactor the existing code base to access these API points via the command classes. Once that's done, you are free to refactor each command's implementation into a new class structure.
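A minimal sketch of what such command classes might look like, assuming the main class exposes operations such as saving and printing. `LegacyMainModule`, `Command`, `SaveCommand` and `PrintCommand` are invented for the illustration.

```cpp
#include <string>
#include <utility>

// Stand-in for the class in mainmodule.cpp (hypothetical API points).
class LegacyMainModule {
public:
    std::string saveDocument(const std::string& name)  { return "saved:" + name; }
    std::string printDocument(const std::string& name) { return "printed:" + name; }
};

// Common interface for all commands.
class Command {
public:
    virtual ~Command() = default;
    virtual std::string execute() = 0;
};

// One small class per API point. At first each one just calls the legacy
// code; later its implementation can be moved elsewhere without callers noticing.
class SaveCommand : public Command {
public:
    SaveCommand(LegacyMainModule& module, std::string name)
        : module_(module), name_(std::move(name)) {}
    std::string execute() override { return module_.saveDocument(name_); }

private:
    LegacyMainModule& module_;
    std::string name_;
};

class PrintCommand : public Command {
public:
    PrintCommand(LegacyMainModule& module, std::string name)
        : module_(module), name_(std::move(name)) {}
    std::string execute() override { return module_.printDocument(name_); }

private:
    LegacyMainModule& module_;
    std::string name_;
};
```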
Of course, with a single class of 11 KLOC the code in there is probably highly coupled and brittle, but creating individual command classes will help much more than any other proxy/facade strategy.
I don't envy the task, but as time goes on this problem will only get worse if it's not tackled.
Update
I'd suggest that the Command pattern is preferable to a Facade.
Maintaining/organizing a lot of different Command classes over a (relatively) monolithic Facade is preferable. Mapping a single Facade onto a 11 KLOC file will probably need to be broken up into a few different groups itself.
Why bother trying to figure out these facade groups? With the Command pattern you will be able to group and organise these small classes organically, so you have a lot more flexibility.
Of course, both options are better than the single 11 KLOC, and growing, file.
Answer by Michael Stum
One important piece of advice: Do not mix refactoring and bugfixes. What you want is a version of your program that is identical to the previous version, except that the source code is different.
One way could be to start splitting the smallest function or part into its own file and then include it with a header. This turns main.cpp into a list of #includes, which sounds like a code smell in itself (I'm not a C++ guru, though), but at least it's now split into files.
You could then try to switch all maintenance releases over to the "new" main.cpp or whatever your structure is. Again: No other changes or Bugfixes because tracking those is confusing as hell.
Another thing: As much as you may desire making one big pass at refactoring the whole thing in one go, you might bite off more than you can chew. Maybe just pick one or two "parts", get them into all the releases, then add some more value for your customer (after all, Refactoring does not add direct value so it is a cost that has to be justified) and then pick another one or two parts.
Obviously that requires some discipline in the team to actually use the split files and not just add new stuff to the main.cpp all the time, but again, trying to do one massive refactor may not be the best course of action.