Copy and paste programming

Copy and paste programming is a pejorative term for highly repetitive computer programming code apparently produced by copy and paste operations. It is frequently symptomatic of a lack of programming competence or of an insufficiently expressive development environment, since subroutines or libraries would normally be used instead.

Forms of copy and paste programming

Plagiarism

Copying and pasting is often done by inexperienced or student programmers, who find writing code from scratch difficult and prefer to search for a pre-written full or partial solution that they can use as a basis for their own problem solving.[1] (See also Cargo cult programming.)

Duplication

As a way of applying library code

Copying and pasting is also done by experienced programmers, who often keep their own libraries of well-tested, ready-to-use code snippets and generic algorithms that are easily adapted to specific tasks.[2]

As a way of branching code

Branching code is a normal part of large-team software development, allowing parallel development on both branches and hence shorter development cycles. Classical branching has the following qualities:

  • It is managed by a version control system that supports branching.
  • Branches are re-merged once parallel development is completed.

Copy and paste is a less formal alternative to classical branching, often used when it is foreseen that the branches will diverge more and more over time, as when a new product is being spun off from an existing product.

As an approach to repetitive tasks

One of the most harmful forms of copy-and-paste programming occurs in code that performs a repetitive task. Each repetition is copied from above and pasted in again, with minor modifications. Harmful effects are discussed below.

Deliberate design choice

The use of programming idioms and design patterns is distinct from copy and paste programming, as idioms and patterns are expected to be recalled from the programmer's mind rather than retrieved from a code bank.

Research aimed at "decriminalizing" cut and paste has produced the Subtext programming language. Under this model, cut and paste is the primary mode of interaction and hence not an anti-pattern.

Effects

Specific to plagiarized code

  • Inexperienced programmers who copy code often do not fully understand the pre-written code they are taking. The problem therefore arises more from their inexperience and lack of courage than from the act of copying and pasting per se. The code often comes from disparate sources such as friends' or co-workers' code, Internet forums, code provided by the student's professors or teaching assistants, or computer science textbooks. The result risks being a disjointed clash of styles and may contain superfluous code that tackles problems for which solutions are no longer required.
  • Bugs can also easily be introduced by assumptions and design choices made in the separate sources that no longer apply when placed in a new environment.
  • Such code may also, in effect, be unintentionally obfuscated, as the names of variables, classes, functions, etc., are normally left unchanged, even though their purpose may be completely different in the new context than it was in the original context.[1]
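
The last two points can be illustrated with a minimal Python sketch. All function and variable names here are hypothetical and chosen only for illustration: a routine written to average exam scores is pasted unchanged into code that processes temperature readings, so its names mislead and its unstated assumption (a non-empty input) travels with it.

```python
# Hypothetical snippet, originally written to average exam scores and
# pasted unchanged into code that handles temperature readings.
# The names no longer describe the data, and an empty input raises
# ZeroDivisionError because the original context never produced one.
def average_score(students):
    total = 0
    for student in students:
        total += student
    return total / len(students)


# The same logic after being adapted to its new context: names match
# the data, and the non-empty assumption is stated and enforced.
def mean_temperature(readings):
    if not readings:
        raise ValueError("readings must not be empty")
    return sum(readings) / len(readings)
```

Both functions compute the same value for valid input; the difference lies only in how readable the code is and how clearly it fails.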

Specific to duplicated code

As a way of applying library code

  • Being a form of code duplication, copy and paste programming has some intrinsic problems, which are exacerbated if the code does not preserve any semantic link between the source text and the copies. In this case, if changes are needed, time is wasted hunting for all the duplicate locations. (This can be partially mitigated if the original code and/or the copy are properly commented; even then, however, the same edits must still be made multiple times. Also, because code maintenance often omits updating the comments,[3] comments describing where to find remote pieces of code are notorious for going out of date.)
  • Adherents of object-oriented methodologies further object to the "code library" use of copy and paste. Instead of making multiple mutated copies of a generic algorithm, an object-oriented approach would abstract the algorithm into a reusable encapsulated class. The class is written flexibly, with full support for inheritance and overloading, so that all calling code can use the generic code directly rather than mutating the original.[4] As additional functionality is required, the library is extended (while retaining backward compatibility). This way, if the original algorithm has a bug to fix or can be improved, all software using it stands to benefit.
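
A minimal Python sketch of the object-oriented alternative follows. The class names `Exporter` and `CsvExporter` are hypothetical and do not come from any particular library: the generic algorithm lives in one base class, and a variant overrides only the step that differs instead of mutating a pasted copy.

```python
class Exporter:
    """Generic algorithm: render each row on its own line."""

    def format_row(self, row):
        # Default formatting; subclasses override only this step.
        return " ".join(str(value) for value in row)

    def export(self, rows):
        # Shared logic: a fix made here reaches every subclass.
        return "\n".join(self.format_row(row) for row in rows)


class CsvExporter(Exporter):
    """Variant that changes the row format and reuses export() as is."""

    def format_row(self, row):
        return ",".join(str(value) for value in row)
```

With this arrangement, `Exporter().export(rows)` and `CsvExporter().export(rows)` share a single implementation of `export`, so an improvement to it benefits both callers.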

As a way of branching code

As a way of spinning off a new product, copy and paste programming has some advantages. Because the new development initiative does not touch the code of the existing product:

  • There is no need to regression test the existing product, saving on QA time associated with the new product launch, and reducing time to market.
  • There is no risk of introducing bugs into the existing product, which might upset the installed user base.

The downsides are:

  • If the new product does not diverge as much as anticipated from the existing product, the team can end up supporting two code bases (at twice the cost) for what is essentially one product. This can lead to expensive refactoring and manual merging down the line.
  • The duplicate code base doubles the time required to implement changes that are wanted in both products; this increases time to market for such changes and may in fact wipe out any time gained by branching the code in the first place.

As above, the alternative to a copy-and-paste approach is a modularized approach:

  • Start by factoring out code to be shared by both products into libraries.
  • Use those libraries (rather than a second copy of the code base) as the foundation for development of the new product.
  • If an additional third, fourth, or fifth version of the product is envisaged down the line, this approach is far stronger, because the ready-made code libraries dramatically shorten the development life cycle for any additional products after the second.[5]
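
The modularized approach can be sketched in Python as follows. The pricing rule and every name in it are hypothetical, and the three parts are shown in one listing only for brevity; in practice the shared function would live in its own library module that both products import.

```python
# --- shared library (hypothetical module, e.g. pricing.py) ---
def total_with_tax(net_cents, tax_percent):
    """Return the gross amount in cents, using integer arithmetic."""
    return net_cents * (100 + tax_percent) // 100


# --- existing product: renders the total as text ---
def desktop_invoice_line(net_cents):
    gross = total_with_tax(net_cents, 20)
    return f"Total: {gross / 100:.2f}"


# --- new spin-off product: returns the total as structured data ---
def web_invoice_payload(net_cents):
    return {"total_cents": total_with_tax(net_cents, 20)}
```

Each product keeps its own presentation code, while a correction to `total_with_tax` is made once and reaches both.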

As an approach to repetitive tasks

  • For repetitive tasks, the copy and paste approach often leads to large methods (a code smell).
  • Each repetition creates a code duplicate, with all the problems discussed in prior sections, but with a much greater scope. Scores of duplications are common; hundreds are possible. Bug fixes, in particular, become very difficult and costly in such code.[6]
  • Such code also suffers from significant readability issues, due to the difficulty of discerning exactly what differs between each repetition. This has a direct impact on the risks and costs of revising the code.
  • The procedural programming model strongly discourages the copy-and-paste approach to repetitive tasks. Under a procedural model, the preferred approach is to create a function or subroutine that performs a single pass through the task; the parent routine then calls this subroutine, either repeatedly or, better yet, within some form of looping structure. Such code is termed "well decomposed" and is recommended as being easier to read and more readily extensible.[7]
  • The general rule of thumb applicable to this case is "don't repeat yourself".
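
The contrast can be shown with a short Python sketch. The form-validation task and all names are hypothetical: the first function repeats a pasted block once per field, while the decomposed version performs a single pass in a helper function and calls it from a loop.

```python
# Copy-and-paste style: one pasted block per field, each with a minor
# modification that is easy to get wrong and tedious to compare.
def validate_pasted(form):
    errors = []
    if not form.get("name", "").strip():
        errors.append("name is required")
    if not form.get("email", "").strip():
        errors.append("email is required")
    if not form.get("phone", "").strip():
        errors.append("phone is required")
    return errors


# Well-decomposed style: the single pass lives in one helper, and the
# parent routine calls it in a loop over the field names.
REQUIRED_FIELDS = ("name", "email", "phone")


def check_required(form, field):
    """Return an error message if the field is missing or blank."""
    if not form.get(field, "").strip():
        return f"{field} is required"
    return None


def validate(form):
    errors = []
    for field in REQUIRED_FIELDS:
        message = check_required(form, field)
        if message is not None:
            errors.append(message)
    return errors
```

Adding a fourth required field means adding one entry to `REQUIRED_FIELDS` in the decomposed version, compared with pasting and editing another block in the first.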


Wikimedia Foundation. 2010.
