I think I'd appreciate some sort of "semantic grouping" of individual changes more than drawing some arbitrary line and classifying all changes below it as "trivial".
The problem is that even a lot of the changes that normally constitute clutter can become relevant in certain situations or even introduce bugs.
One example would be ordering of Python imports: Changing the order of imports should have no effect on program behaviour if all your packages are well-behaved - and in 99.99% of cases it indeed hasn't. But the fact remains that imports are statements that are executed and can have side-effects. If a package does something nontrivial during load, changing the import order can have effects. Hiding such a change could mask introduction of a bug.
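The same hazard exists in any language where loading a module runs code. A minimal sketch in JavaScript, with the two "modules" simulated as plain functions rather than real imports (`runProgram`, `loadLogging`, `loadPlugin` and `registry` are hypothetical names used only for illustration):

```javascript
// Simulates module load order. Each "module" has a side effect when loaded,
// the way top-level code in an imported package would.
function runProgram(loadOrder) {
  const registry = { handler: "default" }; // shared global state

  const modules = {
    // Replaces whatever handler is installed.
    loadLogging: () => {
      registry.handler = "logging";
    },
    // Wraps whatever handler exists at the moment it is loaded.
    loadPlugin: () => {
      registry.handler = `plugin(${registry.handler})`;
    },
  };

  for (const name of loadOrder) modules[name]();
  return registry.handler;
}

console.log(runProgram(["loadLogging", "loadPlugin"])); // plugin(logging)
console.log(runProgram(["loadPlugin", "loadLogging"])); // logging
```

Swapping the two entries is a pure reordering in the diff, yet the final handler differs.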
Hiding changes can also lead to confusion if you are trying to understand a series of changes that are based on each other, or if all changes of a commit are hidden. I've had the latter situation with IntelliJ, where the working tree was shown as "unclean" but the diff was completely empty. Solution: The diff wasn't actually empty, IntelliJ was just set to hide the changes.
I think a more interesting solution would be to build a sort of "tree of changes": At the bottom, you'd have the individual changes in the file; one level up, the changes would be grouped into higher-level operations, such as "change formatting", "rename identifier", "remove field", "move function", etc.
If possible, those could be grouped into even higher-level changes, such as "implement new class" or "extract expression into function", etc.
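One way such a "tree of changes" might be represented, as a rough sketch only: leaves are raw edits, inner nodes are the higher-level operations, and the viewer chooses how deep to expand. The node shape and the `render` and `countLeaves` helpers are assumptions made up for this example, not any existing tool's format.

```javascript
// A change tree: leaves are raw edits, inner nodes are higher-level operations.
const changeTree = {
  label: "extract expression into function",
  children: [
    { label: "add function", children: [{ label: "+ lines 10-14" }] },
    { label: "replace expression with call", children: [{ label: "~ line 42" }] },
    { label: "change formatting", children: [{ label: "~ line 43 (whitespace)" }] },
  ],
};

// Counts the raw edits underneath a node.
function countLeaves(node) {
  const kids = node.children ?? [];
  return kids.length === 0 ? 1 : kids.reduce((n, k) => n + countLeaves(k), 0);
}

// Renders the tree down to maxDepth; deeper nodes are summarised, not dropped.
function render(node, maxDepth, depth = 0) {
  const indent = "  ".repeat(depth);
  const kids = node.children ?? [];
  if (depth === maxDepth && kids.length > 0) {
    return [`${indent}${node.label} (${countLeaves(node)} raw changes)`];
  }
  return [`${indent}${node.label}`, ...kids.flatMap((k) => render(k, maxDepth, depth + 1))];
}

console.log(render(changeTree, 1).join("\n"));
```

Nothing is hidden in this model: collapsing a node only replaces its children with a count, and expanding one more level brings the raw edits back.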
Agreed, I don't think the value of a semantic diff would be in hiding changes. Instead, the value should be in generating more useful diffs.
Normal diff often gets "confused" compared to how you'd logically identify the code. For example, if you extract a piece of a larger function as a smaller function, instead of showing that a piece of code was moved, it will show that you changed a header, deleted some lines, added others below, etc. A semantic diff should be able to refine these diffs in a better way, but shouldn't hide them. Even for the whitespace changes, I'd like it to show the diff, but the overlay to explain that only whitespace is different, so I know I don't need to look at it carefully.
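The "show it, but label it" idea for whitespace could look roughly like this. It is a deliberately naive sketch: `classifyHunk` is a hypothetical helper, and stripping all whitespace would mislabel changes inside string literals, between tokens, or in indentation-sensitive languages, so a real tool would compare token streams instead.

```javascript
// Labels a changed hunk instead of hiding it.
// Naive: removes all whitespace before comparing (see caveats above).
function classifyHunk(oldText, newText) {
  const strip = (s) => s.replace(/\s+/g, "");
  if (oldText === newText) return "unchanged";
  if (strip(oldText) === strip(newText)) return "whitespace only";
  return "content changed";
}

console.log(classifyHunk("if (x) {\n  y();\n}", "if (x) { y(); }")); // whitespace only
console.log(classifyHunk("a = 1", "a = 2")); // content changed
```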
I think the problem you'll eventually run into is figuring out intent from the diff. It seems like an easier version of reverse compiling.
When it comes down to semantic diffs I'm more interested in something like the Semantic Patch Language by Coccinelle. Being able to represent mundane refactorings across an entire codebase in a few lines seems great. And it unifies intent with the diff.
Personally, I really don't care about these cases. What really grinds my gears is when a diff plucks out a weird line in the middle of a block of code that only has a closing curly brace and that's the line that it thinks is the same, and everything around it is a diff.
If you're going to call yourself a semantic diff-ing company, fix that before you worry about the order of my imports.
I think I have heard of their product before, and reading the blog post intrigued me, so I wanted to try it, but... VS Code Integration? GitHub Integration? No standalone version which you could actually use as a diff tool for git locally? Ok, I guess only having a "cloud" version makes licensing easier, and you can call me old fashioned, but seeing an eminently "offline" task such as diff being turned into "online-only" seems a bit strange to me.
The VS Code extension works offline. The diff calculation is performed on the host where the VS Code GUI is running (makes a difference in case of SSH/Docker/WSL).
More specifically, the `function` keyword version of an anonymous function gets its own `this` binding (determined by how it is called), whilst the arrow syntax anonymous function does not and instead uses the `this` of the enclosing scope. Arrow functions also cannot use the `yield` keyword nor be used as constructors.
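A short sketch of both differences, using a hypothetical `holder` object. What `this` is inside the arrow function depends on where the snippet runs, but it is never `holder`:

```javascript
const holder = {
  // `function` gets its own `this`, set by how it is called (here: holder).
  viaFunction: function () {
    return this === holder;
  },
  // An arrow function has no `this` of its own; it uses the enclosing scope's.
  viaArrow: () => {
    return this === holder;
  },
};

console.log(holder.viaFunction()); // true
console.log(holder.viaArrow()); // false

// Arrow functions cannot be used as constructors.
const Arrow = () => {};
let constructed = true;
try {
  new Arrow();
} catch (err) {
  constructed = false; // a TypeError is thrown
}
console.log(constructed); // false
```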
(function () {
  var x = 0;
  const foo1 = function(a, b) { var x = 2; }
  const foo2 = (a, b) => { var x = 3; }
  foo1(); console.log(x); // prints 0
  foo2(); console.log(x); // prints 0
})()
However, there is the difference in how the implicit semicolons are inserted:
const foo1 = function(a, b) { return a + b; }
(2, 3)
console.log(foo1) // prints 5
const foo2 = (a, b) => { return a + b; }
(2, 3)
console.log(foo2) // prints [Function: foo2]
haha, oh ASI - I was very confused by your example as I read this as if it was
> const foo1 = function(a, b) { return a + b; }
(2, 3)
> console.log(foo1) // prints 5
> const foo2 = (a, b) => { return a + b; }
(2, 3)
> console.log(foo2) // prints [Function: foo2]
Even though it makes no sense for (2, 3) to be a result in those cases, that was just how I ended up reading it, and I was exceptionally confused about how the printed output could possibly happen.
A super nice example of how subtle differences can really change things though.
As a side note, ASI for JS is actually super easy to implement and the rules are really simple (leaving aside whether the feature itself is good :D ): it's just "these specific statements can have a new line instead of a semicolon", so in the parser, instead of consume(semicolon), you can just do "semicolon or newline". (You can check the logic in JSC in https://github.com/WebKit/WebKit/blob/main/Source/JavaScript... - just look for autoSemicolon() or autoSemi(); I can't recall the exact name off the top of my head.)
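A rough sketch of that "semicolon or newline" check, to make the idea concrete. This is not JSC's actual code. The token shape and the `consumeStatementEnd` name are assumptions invented for the example, and it leaves out the restricted productions such as `return` followed by a newline.

```javascript
// Hypothetical token shape: { value: string, newlineBefore: boolean }.
// Returns the index just past the statement terminator, or throws.
function consumeStatementEnd(tokens, pos) {
  const tok = tokens[pos];
  if (tok === undefined) return pos;      // end of input ends the statement
  if (tok.value === ";") return pos + 1;  // explicit semicolon: consume it
  if (tok.value === "}") return pos;      // a closing brace ends it too
  if (tok.newlineBefore) return pos;      // a newline stands in for ";"
  throw new SyntaxError(`Unexpected token ${tok.value}`);
}

const nextLine = [{ value: "console", newlineBefore: true }];
console.log(consumeStatementEnd(nextLine, 0)); // 0
```

The check only runs once the expression parser has decided the statement cannot continue. That is why the two examples above differ: after a `function` expression, a `(` on the next line is still a valid call, so the parser keeps going and never reaches this check; after an arrow function body a call is not valid, so the parser stops and the newline is accepted as the terminator.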
My guess would be that quite a large portion of changes we'd expect at a glance to be identical aren't, especially for inputs that would not be expected. I'd also guess this is much more likely in languages in which valid code commonly produces undefined behavior.
If the tool could show you, for example, "this change is functionally identical except for when the sum of the two inputs overflows a UInt64", that'd be pretty cool.
That would be neat, although I suspect most compilers/linters should already be able to warn you about potential overflows.
If you want to boil down what devs are looking for in a diff tool to one thing, it would be "which change(s) between these two versions of code result in a different binary (or AST/opcodes/bytecode, depending on the language)?" All other changes, while certainly sometimes useful to know about, are just syntactic sugar.
Literally every time you add/subtract/multiply two variables there is a potential overflow. In relatively rare cases, the compiler might be able to prove that they can't overflow, but in the general case it can't, and I doubt any actually do.
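To make the "identical except on overflow" case concrete, here is a small sketch that emulates unsigned 64-bit arithmetic with `BigInt.asUintN`. The two midpoint functions are illustrative names, not taken from the article, and both assume a <= b.

```javascript
// Wrap a BigInt the way an unsigned 64-bit integer would.
const U64 = (n) => BigInt.asUintN(64, n);

// Two ways to compute a midpoint (assuming a <= b).
const midBefore = (a, b) => U64(a + b) / 2n;           // the sum can wrap around
const midAfter = (a, b) => U64(a + U64(b - a) / 2n);   // no intermediate overflow

// Identical for ordinary inputs...
console.log(midBefore(2n, 6n) === midAfter(2n, 6n)); // true

// ...but not once a + b exceeds 2^64 - 1.
const big = 2n ** 63n;
console.log(midBefore(big, big + 2n)); // 1n
console.log(midAfter(big, big + 2n)); // 9223372036854775809n
```

A refactoring from one form to the other looks equivalent at a glance, which is exactly the kind of change where a note like "differs only when the sum overflows" would be useful.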
Depends on how technical the audience of this post should be.
Putting an accurate example which is only understandable to someone with years of experience might make newbies think it's a made up concern.
Yeah, that was a bit of an unfortunate example for a blogvertising post. Even I as a non-frontend developer knew to watch out for that one. A company working with semantic diffs should really know better and such mistakes do not inspire confidence for me!
I haven't actually checked the source, but I've heard that clang-format works by assigning "badness" weights to each choice of whitespace between tokens, and then running Dijkstra's algorithm (or some other DP) to find the least bad set of choices. A recent Tom7 video said that Knuth did the same thing for text justification.
How about we do a similar thing for ASTs? Like a peephole optimizer looking for runs of instructions that could be substituted for simpler alternatives, a tree diff could identify diff patterns that "might be trivial." You have a whole catalog of these patterns, and assign to each a weight. Then the displayed diff is the optimal set of choices "consider different or not?"
You would need some additional ingredient, though; some boundary condition. Otherwise "everything is the same" would always minimize badness.
Not far. Just show all changes. Like the blog article already states, for many projects you already have code formatters, so changes in format usually don’t happen a lot - and if they do there might be a reason you don’t want to hide (like… you change your rules of code formatting).
For all the other examples, I don’t see why you would want to hide them either. If you don’t want to see commas added in a list, make it a rule that the comma always has to be appended after the last element. Most languages allow that. Semantic equivalence? The JS example isn’t even equivalent because "this" may have a different context.
I’d prefer to have a "dumb" diff that simply shows all the changes instead of adding this kind of complexity. Just keep your MRs small and there’s no real issue.
The healthy workflow is: notice formatting discrepancies -> reformat -> reopen the diff, now containing only intentional, substantial changes.
Of course the edited source files should have been reformatted automatically, on save or on build, before someone opens a diff: this should never happen except as a symptom of inadequate reformatting (e.g. I decide to adopt redundant commas at the end of comma-separated lists) or abnormal operations (e.g. non-reformatted code was accidentally committed to version control).
Interesting idea. I've just tried it with a couple of languages:
- TS with Vue: SFC are not really working (it's showing a style change as if the whole stylesheet were replaced with a mostly-identical stylesheet).
- Rust: It doesn't seem semantic at all. It's showing a lot of character-level insertions and deletions that seem worse than how git-diff or GitHub would break down the changes.
It doesn't seem ready yet for what I'd like to use it for.
I'm sorry you didn't have a good experience testing the tool. If it doesn't work / makes things worse than a standard diff, that's definitely considered a bug. It is probably something specific to your code and not a general issue. It would therefore be great if you could open an issue [1] or support ticket [2], ideally with some sample code, so we can take a look. Thanks in advance!
As author of SemanticDiff, I am obviously a bit biased. But Wilfred, the author of difftastic, found the analysis to be "pretty even-handed" [1], so I think it should be somewhat fair.
In theory semantic diff is useful, but based on my code review experience, it hardly matters. A developer fluent in a language like Python or JavaScript doesn't really pay much attention to these things anyway, just like you don't normally pay much attention to commas and periods in a sentence unless they cause confusion. Personally I wouldn't pay $5/month out of pocket for this functionality.
There is also diffsitter. I was testing it a month ago, it works fine. Not sure what language-aware diffing exactly means, but diffsitter uses tree-sitter and it is comparing ASTs and CSTs of the files.
tree-sitter is very generic and supports so many languages, it is really great. The first use case of the article, "Level 1: Irrelevant Whitespace" is covered by diffsitter.
At some point I wanted to diff Rust source files while ignoring comments. I wrote a small program to remove the two different comment nodes the Rust grammar defines: line_comment and block_comment. Then I diffed the resulting uncommented code using diffsitter.
From start to finish, writing the program and testing it on many different files took 5 hours.
Whether or not this makes a semantic difference is language implementation dependent. I think that is why this kind of tool is not especially appealing to me. I would have to have almost complete knowledge of the compiler and the diff tool to truly trust that there is no semantic difference. Moreover, I would like to know why text changes that have no semantic effect are being mixed with those that do.
For me, text is king and that is the level at which I want to evaluate diffs 99% of the time, but I do recognize that others have different goals and preferences.
I was expecting this to refer to different ways to represent the same diff. (For example, you could represent a change from `console.log("hello")` to `console.log('hello')` as +'-" … +'-" or as +'hello'-"hello".)
I don’t have a specific example in mind, but it seems reasonable that different languages could benefit from different ways of representing the same diff.
Hopefully this tool gives a dev ready control of what kinds of differences to hide/show.
I'm actually not convinced of the concept of semantic diff (not talking just about this tool specifically)... when we talk about code that is different but equivalent, I think we're talking about elements of style.
It seems to me that it would pretty much always be better to normalize the elements of style considered insignificant, rather than hide them just in the diff tool. That covers diffing as well as viewing/reading the code.
If you don't care about a particular element of style then either it shouldn't be coming up much or I think you'd be better off using some kind of enforcing/fixing linter.
As far as possible. Git's line-based diffing is ridiculously primitive and gets in the way of software development. I wonder how many bugs are introduced because of Git's diffing system.
It's a very interesting question. One idea I've toyed with over the years is a language specifically designed to facilitate effective diffs.
Anyway, it seems the "Level 3: semantic diff" actually could be divided into different levels. But "Level 4: Mostly identical" seems quite problematic.
I think this question has already largely been answered by automatic style (etc) tools. Such tools generally should not make semantic changes to programs, so they (implicitly) define what are meaningful semantic changes and what are meaningless changes.
Fortunately the article already says that hiding all semantically identical changes is "probably going too far", so they can just not try and solve the halting problem.