diff is one of the oldest UNIX commands (it was included in UNIX around 1976). It compares the contents of two files, a source file and a target file (the modified version), and produces a "delta": the lines that are changed or absent in either file. It was written by Hunt and McIlroy and is based on the file comparison algorithm that they created (see J. W. Hunt and M. D. McIlroy, An algorithm for differential file comparison, Bell Telephone Laboratories CSTR #41 (1976)). Although it was the first tool of its kind, it is still one of the best.
Both files in a classic UNIX diff are assumed to be text files. Because file difference is an elusive concept, the simplest approach is to treat each line as indivisible and compute the so-called "longest common subsequence" (LCS) of lines. Anything not in this LCS is declared to belong to the difference set: the minimal set of lines that must be changed to transform the source into the target.
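To make the LCS idea concrete, here is a minimal sketch that computes the length of the longest common subsequence of lines with the textbook quadratic table. It is not the Hunt and McIlroy algorithm itself, and the helper name lcs_length is hypothetical. Every line outside the LCS must be deleted from the old file or added from the new one, which is what the counts taken from diff's output show.

```shell
#!/bin/sh
# lcs_length is a hypothetical helper: it prints the number of lines in the
# longest common subsequence of two text files (naive O(n*m) table).
# Assumption: the first file is not empty.
lcs_length() {
    awk '
        FNR == NR { a[++n] = $0; next }   # lines of the first file
                  { b[++m] = $0 }         # lines of the second file
        END {
            for (i = 1; i <= n; i++) {
                for (j = 1; j <= m; j++) {
                    if ((a[i] "") == (b[j] "")) {   # force string comparison
                        L[i, j] = L[i-1, j-1] + 1
                    } else {
                        L[i, j] = (L[i-1, j] > L[i, j-1]) ? L[i-1, j] : L[i, j-1]
                    }
                }
            }
            print L[n, m] + 0
        }
    ' "$1" "$2"
}

tmp=$(mktemp -d)
printf 'a\nb\nc\nd\n' > "$tmp/old"
printf 'a\nc\ne\nd\n' > "$tmp/new"

lcs=$(lcs_length "$tmp/old" "$tmp/new")                    # common lines: a, c, d
deleted=$(diff "$tmp/old" "$tmp/new" | grep -c '^<' || true)
added=$(diff "$tmp/old" "$tmp/new" | grep -c '^>' || true)
echo "lcs=$lcs deleted=$deleted added=$added"
rm -rf "$tmp"
```

In this example the old file has four lines and the LCS has three, so exactly one line is deleted; the same holds for the added side.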
Both character-based diff and word-based diff can be obtained easily by running appropriate filters before applying diff. String diff is essentially the problem studied by all algorithms for text file comparison, since any text file can be trivially converted to a string in some alphabet, with each line represented by one letter of this alphabet.
Difference by word is also a trivial modification of the basic program. Essentially, you first convert the text into "one word per line" format and then use line-based diff. Several such modifications exist (wdiff, dwdiff, see below), or one can easily be written in any scripting language.
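The "one word per line" conversion can be sketched with tr followed by an ordinary line-based diff. The helper name words_per_line and the sample files are assumptions made for illustration; real tools such as wdiff also reassemble the output into readable text, which this sketch does not attempt.

```shell
#!/bin/sh
# words_per_line is a hypothetical helper, not a standard tool.
words_per_line() {
    # Turn every run of whitespace into a single newline: one word per line.
    tr -s '[:space:]' '\n' < "$1"
}

tmp=$(mktemp -d)
printf 'the quick brown fox\n' > "$tmp/old.txt"
printf 'the quick red fox\n'   > "$tmp/new.txt"

words_per_line "$tmp/old.txt" > "$tmp/old.words"
words_per_line "$tmp/new.txt" > "$tmp/new.words"

# diff exits with status 1 when the files differ, so guard it with || true.
word_diff=$(diff "$tmp/old.words" "$tmp/new.words" || true)
printf '%s\n' "$word_diff"
rm -rf "$tmp"
```

The numbers in the output now refer to word positions rather than line numbers.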
diff has proved useful in so many cases that it is difficult to enumerate them. First and foremost, this tool is used to discover differences between versions of a text file. In this role it is useful for keeping track of the evolution of a document or a program. For example, a programmer often needs to debug a program whose codebase contains several thousand (or even several hundred thousand) lines. If the problem is present only in the current version and not in older versions, you can diff the sources and use code browsers on the difference set to try to pinpoint the source of the problem. This approach can dramatically narrow the slice of the program that probably contains the problematic code.
It is also used as a file compression method, since many versions of a (long) file can be represented by storing one (long) version of it and many (short) scripts that transform the older version into newer versions of the same file. Another application is so-called approximate string matching, used, for instance, for the detection of misspelled words.
If we use diff for program understanding (which is what comparing two versions of a program is usually about), then along with diff tools, powerful tools for source code browsing and, especially, slicers are also necessary. Some are built into IDEs and some are standalone. The granddaddy of all slicers is Xedit, which has had built-in slicing capabilities since the 1970s (the famous "all" command). Among code browsers, cscope for C/C++, created by AT&T, is probably one of the first useful implementations that address the problem.
diff also led to the creation of a new class of programs that generate the difference set as part of their operation. The most popular among those is patch, written by Larry Wall. Older version control systems were little more than diff and several shell scripts.
The external diff command compares two text files for differences. It determines which lines must be changed to make the two files identical. The diff command scans the two files and indicates editing changes that must be made to the first file to make it identical to the second file. The changes can be saved for use as an ed command script to change the first file. It can also compare directories. System V provides a second command to compare directories; refer to the dircmp command.
diff [ -lrs ] [ -S name ] [ -cefhn ] [ -biwt ] dir1 dir2
diff [ -cefhn ] [ -biwt ] file1 file2
diff [ -cefhn ] [ -biwt ] - file2
diff [ -cefhn ] [ -biwt ] file1 -
diff [ -D string ] [ -biw ] file1 file2
diff [ -D string ] [ -biw ] - file2
diff [ -D string ] [ -biw ] file1 -
diff [ -C string ] [ -biw ] file1 file2
diff [ -C string ] [ -biw ] - file2
diff [ -C string ] [ -biw ] file1 -
You may want to think of file1 as the old file and file2 as the new file.
The -e option of diff produces an editing script usable with either ex or ed. It consists of a sequence of ed commands necessary to re-create file2 from file1. By editing the script produced by diff, you can come up with some useful changes to the file that bring it "half-way" to the new version.
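As an illustration of what such a script looks like, the sketch below runs diff -e on two small files invented for this example. Note that the commands come out last change first, so that applying an earlier command does not shift the line numbers used by the later ones. Applying the script is shown only as a comment, on the assumption that ed may not be installed everywhere.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n'        > "$tmp/file1"
printf 'alpha\nBETA\ngamma\ndelta\n' > "$tmp/file2"

# diff exits with status 1 when the files differ, so guard it with || true.
ed_script=$(diff -e "$tmp/file1" "$tmp/file2" || true)
printf '%s\n' "$ed_script"

# To turn file1 into file2 (assuming ed is available), append a w command:
#   (printf '%s\n' "$ed_script"; echo w) | ed -s "$tmp/file1"
rm -rf "$tmp"
```

Deleting or editing individual commands in the script before feeding it to ed is how the "half-way" versions mentioned above can be produced.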
Options
The following list describes the options and their arguments that may be used to control how diff functions.
Comparison control options:
-b | Causes blanks (spaces and tabs) to compare equally even if an unequal number of blanks exist. All trailing blanks are ignored. For example, if the first line below was in file1 and the second line was in file2, they would compare as equal even though the runs of blanks between the words differ in length. |
file1: A sample   line of   text here
file2: A sample line of text    here
-i | Causes the case of letters to be ignored. For example, |
"THE BIG dog ran fast." and "The big dog ran fast." match as equal lines.
-t | Expands tabs on input to spaces on output. Normal output adds additional characters to the front of each line. This may change the indentation of the original text, making it difficult to read. This option preserves the original text indentation. The -c option also adds additional characters, causing indentation problems. |
-w | Causes all white space (blanks and tabs) to be ignored. For example, |
"if ( x == y )" and "if(x==y)" compare as equal.
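The difference between -b, -w and -i is easiest to see by running them against small files. The following sketch uses invented sample files and a hypothetical helper named verdict that reports whether diff considers its arguments equal. It also shows a detail that is easy to miss: -b only equates runs of blanks of different length, so it does not treat "if ( x == y )" and "if(x==y)" as equal, whereas -w does.

```shell
#!/bin/sh
# verdict is a hypothetical helper: prints "same" when diff exits with 0.
verdict() {
    if diff "$@" > /dev/null; then echo same; else echo differ; fi
}

tmp=$(mktemp -d)
printf 'if ( x == y )\n'         > "$tmp/a.c"
printf 'if(x==y)\n'              > "$tmp/b.c"
printf 'A sample   line\n'       > "$tmp/c.txt"
printf 'A sample line  \n'       > "$tmp/d.txt"
printf 'THE BIG dog ran fast.\n' > "$tmp/e.txt"
printf 'The big dog ran fast.\n' > "$tmp/f.txt"

plain=$(verdict "$tmp/a.c" "$tmp/b.c")          # no options
w_opt=$(verdict -w "$tmp/a.c" "$tmp/b.c")       # all white space ignored
b_strict=$(verdict -b "$tmp/a.c" "$tmp/b.c")    # blank versus no blank still counts
b_opt=$(verdict -b "$tmp/c.txt" "$tmp/d.txt")   # only the amount of blanks differs
i_opt=$(verdict -i "$tmp/e.txt" "$tmp/f.txt")   # case ignored

echo "plain=$plain -w=$w_opt -b(a,b)=$b_strict -b(c,d)=$b_opt -i=$i_opt"
rm -rf "$tmp"
```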
Directory comparison options:
-l | Display long output listing. The output is piped through pr for pagination. Other differences are saved and summarized after all text file differences are displayed. |
-r | Recursively descends through subdirectories. |
-s | Display files that are the same. Normally, identical files are not displayed. |
-S name | Begin the directory comparison with file name. Normally, all files in the directory are compared. |
Mutually exclusive options:
-D string | Creates a merged version of file1 and file2 on the standard output. C preprocessor controls are included in the output. If string is not defined, then compiling the output (with cc) yields the same program as compiling file1. If string is defined, then compiling the output yields the same program as compiling file2. |
-c[n] | Displays a comparison with n lines of context. The default for n is 3. The output begins with the name and modification time of each file. Each change is introduced by a row of asterisks (*). Lines removed from file1 are preceded by a hyphen (-). Lines added in file2 are preceded by a plus (+). Lines changed between file1 and file2 are preceded by an exclamation mark (!). System V does not support the [n]. See the -C[n] option for variable context sizes. |
-C[n] | System V only. Same as the -c[n] option on BSD. |
-e | Produces an ed script consisting of the a (append), c (change), and d (delete) commands. These commands can be used as input to ed to change file1 to match file2. See the following section on Version Control. |
-f | Produces a script similar to that produced by -e, but in the opposite order. These commands are not usable with ed. |
-h | Does not attempt to find the most efficient way to edit the changes. It is fast, but not thorough. The changes must be short and well separated. It does work on files of unlimited size. The -e and -f options are disabled if -h is specified. |
-n | Produces a script similar to the -e option. The order is reversed. Each insert or delete command contains a count of changed lines. This format is used by rcsdiff. |
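The -D option is perhaps the least familiar of these formats, so a small sketch may help. The file names and the macro name NEW are assumptions chosen for this example. Lines that exist only in the second file are wrapped in an #ifdef block; the exact wording of the comments that diff places after #else and #endif may vary between implementations, so the sketch does not rely on it.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'int a;\n'         > "$tmp/old.c"
printf 'int a;\nint b;\n' > "$tmp/new.c"

# Merge both versions into one file guarded by the macro NEW.
# diff exits with status 1 when the files differ, so guard it with || true.
merged=$(diff -D NEW "$tmp/old.c" "$tmp/new.c" || true)
printf '%s\n' "$merged"

# Dropping the preprocessor lines leaves every source line of both versions;
# here that happens to be exactly the content of new.c.
code_only=$(printf '%s\n' "$merged" | grep -v '^#')
rm -rf "$tmp"
```

Compiling the merged output with -DNEW would then select the second version, and compiling without it the first.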
Two arguments may be passed to the diff command. They can be either files or directories:
file1 | The first input file used in the comparison. If file1 is a directory name, the file named file2 in directory file1 is used for comparison. For example, if you specify |
diff adir afile
diff uses adir/afile and afile as the two files. |
file2 | The second input file used in the comparison. If file2 is a directory, the second file is set to file2/file1. |
- | A hyphen may be used in place of either file1 or file2 to represent the standard input. This allows you to pipe input to diff, redirect input from a file, or type input from your keyboard for comparison. |
dir1 | The first directory containing files used for comparison. |
dir2 | The second directory containing files used for comparison. |
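Two of the argument forms above are sketched below with invented file names: a hyphen standing for standard input, and a directory paired with a plain file, in which case diff looks for a file of the same name inside the directory.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/saved.txt"

# A hyphen as file1: the piped text is compared against saved.txt.
stdin_diff=$(printf 'one\n2\nthree\n' | diff - "$tmp/saved.txt" || true)
printf '%s\n' "$stdin_diff"

# A directory as file1: diff compares adir/afile with afile.
mkdir "$tmp/adir"
printf 'old\n' > "$tmp/adir/afile"
printf 'new\n' > "$tmp/afile"
dir_diff=$(cd "$tmp" && diff adir afile || true)
printf '%s\n' "$dir_diff"
rm -rf "$tmp"
```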
Aug 27, 2017 | stackoverflow.com
diff -r dir1/ dir2/ | sed '/Binary\ files\ /d' >outputfile
This recursively compares dir1 to dir2; sed removes the lines for binary files (those beginning with "Binary files "), and the result is redirected to outputfile.
Feb 04, 2017 | www.cyberciti.biz
The diff command compares files line by line. It can also compare two directories:
# Compare two folders using diff ##
diff /etc /tmp/etc_old
Rafal Matczak September 29, 2015, 7:36 am
§ Quickly find differences between two directories
And quicker:
diff -y <(ls -l ${DIR1}) <(ls -l ${DIR2})
ColorDiff is a wrapper for diff. It produces the same output as diff, but with coloured syntax highlighting at the commandline to improve readability. The output is similar to how a diff-generated patch might appear in Vim or Emacs with the appropriate syntax highlighting options enabled. The colour schemes can be read from a central configuration file or from a local user ~/.colordiffrc file.
The diff command displays different versions of lines that are found when comparing two files. It prints a message that uses ed-like notation (a for append, c for change, and d for delete) to describe how a set of lines has changed. This is followed by the lines themselves. The < character precedes lines from the first file and > precedes lines from the second file.
The output of diff -e shows compact formats with just the differences between the files. But, in many cases, context diff listings are more useful. Context diffs show the changed lines and the lines around them. (This can be a headache if you're trying to read the listing on a terminal and there are many changed lines fairly close to one another: the context will make a huge "before" section, with the "after" section several screenfuls ahead. In that case, the more compact diff formats can be useful.)
On many versions of diff (including the GNU version used on Linux), the -c option shows context around each change. By itself, -c shows three lines above and below each change. Here's an example of a C++ file before and after some edits; the -c2 option shows two lines of context:
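A context listing can be reproduced with a few lines of shell. The sketch below is not the C++ example referred to above; it uses an invented seven-line file with one changed line, and it writes the option as -C 2, the spelling that takes the context size as a separate argument. The first two lines of the output name the files and carry their modification times, so they are stripped here to keep the result reproducible.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'line1\nline2\nline3\nline4\nline5\nline6\nline7\n' > "$tmp/old.txt"
printf 'line1\nline2\nline3\nLINE4\nline5\nline6\nline7\n' > "$tmp/new.txt"

# Two lines of context around each change; sed drops the two header lines.
# The pipeline ends in sed, so the non-zero exit status of diff is not an issue.
hunk=$(diff -C 2 "$tmp/old.txt" "$tmp/new.txt" | sed '1,2d')
printf '%s\n' "$hunk"
rm -rf "$tmp"
```

The "before" block shows lines 2 through 6 of the old file with the changed line marked by an exclamation mark, followed by the matching "after" block from the new file.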
The -e option of diff produces an editing script usable with either ex or ed, instead of the usual output. This script consists of a sequence of a (add), c (change), and d (delete) commands necessary to re-create file2 from file1 (the first and second files specified on the diff command line).
Obviously there is no need to completely re-create the second file from the first, because you could do that easily with cp. However, by editing the script produced by diff, you can come up with some desired combination of the two versions.
Diff - Wikipedia, the free encyclopedia
diff - Linux Command - Unix Command
diff [options] from-file to-file
In the simplest case, diff compares the contents of the two files from-file and to-file. A file name of - stands for text read from the standard input. As a special case, diff - - compares a copy of standard input to itself.
If from-file is a directory and to-file is not, diff compares the file in from-file whose file name is that of to-file, and vice versa. The non-directory file must not be -.
If both from-file and to-file are directories, diff compares corresponding files in both directories, in alphabetical order; this comparison is not recursive unless the -r or --recursive option is given. diff never compares the actual contents of a directory as if it were a file. The file that is fully specified may not be standard input, because standard input is nameless and the notion of ``file with the same name'' does not apply.
diff options begin with -, so normally from-file and to-file may not begin with -. However, -- as an argument by itself treats the remaining arguments as file names even if they begin with -.
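The effect of -- can be seen with two files whose names start with a hyphen. The names -old and -new are invented for this sketch; without the -- separator diff would try to parse them as options.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'x\n' > "$tmp/-old"
printf 'y\n' > "$tmp/-new"

# Everything after -- is treated as a file name, even if it begins with -.
dash_diff=$(cd "$tmp" && diff -- -old -new || true)
printf '%s\n' "$dash_diff"
rm -rf "$tmp"
```

Writing the names as ./-old and ./-new is another common way to get the same result.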
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: October 25, 2020