Interface Dispatch (2018)
82 points by matthewwarren 6 years ago | 36 comments

- barrkel 6 years ago: Some JVMs use fat pointers, where the pointer embeds the dynamic type of the instance rather than its statically declared type. This permits inline caches to verify the type of the instance without any indirections.
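As a rough illustration (hypothetical names and layout, not any particular JVM's): the reference carries the dynamic type alongside the object pointer, so an inline cache can compare against the cached class without touching the object itself.

    #include <stdio.h>

    /* Per-class metadata; a real VM would hang method tables off this. */
    typedef struct Class {
        const char *name;
        void (*to_string)(void *obj);
    } Class;

    /* Fat reference: the dynamic type travels alongside the object pointer. */
    typedef struct Ref {
        void *obj;
        const Class *klass;
    } Ref;

    /* Inline cache at one call site: a single compare on the fast path,
     * no load through the object header. */
    typedef struct CallSite {
        const Class *cached_klass;
        void (*cached_target)(void *obj);
    } CallSite;

    static void call_to_string(CallSite *site, Ref r) {
        if (r.klass == site->cached_klass) {
            site->cached_target(r.obj);            /* cache hit */
        } else {
            site->cached_klass = r.klass;          /* miss: refill the cache */
            site->cached_target = r.klass->to_string;
            site->cached_target(r.obj);
        }
    }

    static void point_to_string(void *obj) { (void)obj; puts("Point"); }
    static const Class point_class = { "Point", point_to_string };

    int main(void) {
        int payload = 0;
        Ref r = { &payload, &point_class };
        CallSite site = { NULL, NULL };
        call_to_string(&site, r);   /* first call fills the cache */
        call_to_string(&site, r);   /* second call takes the fast path */
        return 0;
    }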
Delphi implemented interfaces like the C++ approach, and in order to be compatible with COM it didn't have a huge amount of flexibility: interfaces are mandated by the COM ABI to be called like this:

    reference->vtable->method(reference)

This pretty much forces the vtable references to be stored in the object instance, one per implemented interface. Casting an instance to an interface means adding the offset of that interface's vtable pointer within the object data to the instance pointer.
The vtables for every interface point at stubs which subtract the same offset and then jump to the implementation, so that object data is where it's expected for a normal instance method call.
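A rough C sketch of that arrangement (illustrative names, not Delphi's actual runtime structures): the object embeds one vtable pointer per implemented interface, and every interface vtable slot points at a thunk that subtracts the embedding offset before calling the real method.

    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical single-method interface, COM-style: every call goes
     * reference->vtable->method(reference). */
    typedef struct IGreeterVtbl {
        void (*Greet)(void *iface_ref);
    } IGreeterVtbl;

    /* The object embeds one vtable pointer per implemented interface. */
    typedef struct TGreeter {
        const char *name;                  /* ordinary instance data */
        const IGreeterVtbl *greeter_vtbl;  /* embedded IGreeter vtable pointer */
    } TGreeter;

    /* The real method expects a pointer to the whole object. */
    static void TGreeter_Greet(TGreeter *self) {
        printf("hello from %s\n", self->name);
    }

    /* Thunk: the interface reference points at the embedded vtable pointer,
     * so subtract that field's offset to recover the object, then jump. */
    static void IGreeter_Greet_Thunk(void *iface_ref) {
        TGreeter *self = (TGreeter *)((char *)iface_ref
                                      - offsetof(TGreeter, greeter_vtbl));
        TGreeter_Greet(self);
    }

    static const IGreeterVtbl greeter_vtbl = { IGreeter_Greet_Thunk };

    int main(void) {
        TGreeter g = { "Delphi", &greeter_vtbl };

        /* Casting the instance to the interface: instance pointer plus the
         * offset of the embedded vtable pointer. */
        void *iface = (char *)&g + offsetof(TGreeter, greeter_vtbl);

        /* reference->vtable->method(reference) */
        (*(const IGreeterVtbl **)iface)->Greet(iface);
        return 0;
    }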
COM influenced the internal design of Delphi interfaces too - the first three slots were reserved for COM's lifetime management and typecasting methods. I'm not sure anything practical ever came of that, though.
I reused the interface calling mechanism when I implemented anonymous methods in Delphi. Anonymous methods can capture variables from their declaring scope; the captured state needs lifetime management in the absence of GC, so I piggy-backed on COM's lifetime management. This reduced the effort to interop with C++ too.
- KyleJ61782 6 years ago: I concur with the description of the Delphi implementation, as I've navigated it numerous times myself. I guess I don't understand the author's negative tone with regard to the C++/Delphi implementation. There's going to be overhead with any choice made, and C++ and Delphi made the choice to sacrifice memory usage for speed. Furthermore, the only part that really scales with the number of objects is the extra vtable pointer for each implemented interface in each newly constructed object. Hardly a significant overhead in today's memory-prolific computing environment. Also, pointer fixup thunks and unconditional jumps are easy to speculatively handle.
- latk 6 years ago: The author is interested in PL design, and thinks that while C++ style object implementations are great at offering compatibility with certain systems or interfaces, other approaches allow for far nicer semantics in a greenfield programming language.
For example, fat pointers make it possible to implement interfaces for existing types. I have experienced that as game-changing in Haskell, Go, and Rust.
I also disagree that any memory usage can be handwaved as “no longer relevant today”. Electron, or garbage collection: perhaps. But object size affects cache pressure, and cache utilization is critical to performance. C++ is typically good at keeping objects small, except when it comes to multiple inheritance.
- KyleJ61782 6 years ago: Honest question: do you have a real-world example that indicates that the extra pointers added due to multiple inheritance actually result in performance degradation due to extra cache pressure?
- userbinator 6 years ago: With the C++ approach, there is also an optimisation which can be performed: put the method pointers in the object itself, instead of in a vtable. This is beneficial when there aren't many objects or methods, since it eliminates one indirection. Instead of obj->table.method() it turns into obj->method(). When I write "OOP" code in C, I usually do this instead of using a separate vtable.
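A quick C sketch of what that looks like (illustrative types, not anyone's production code): the function pointer sits in the struct itself, trading a slightly larger object for one fewer indirection per call.

    #include <stdio.h>

    typedef struct Shape Shape;

    /* The method pointer lives directly in the object; no separate vtable. */
    struct Shape {
        double (*area)(const Shape *self);
        double width, height;
    };

    static double rect_area(const Shape *s) { return s->width * s->height; }
    static double tri_area(const Shape *s)  { return 0.5 * s->width * s->height; }

    int main(void) {
        Shape rect = { rect_area, 3.0, 4.0 };
        Shape tri  = { tri_area,  3.0, 4.0 };

        /* obj->method(obj): one function-pointer load, no vtable hop. */
        printf("%g %g\n", rect.area(&rect), tri.area(&tri));
        return 0;
    }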
As an aside, I noticed the word "klass", then looked up at the domain name and confirmed my suspicions. ;-)
- microtherion 6 years ago: "klass" is a common convention in OOP code when referring to classes, because "class" and "Class" are often reserved already.
- mikepurvis 6 years ago: Python typically goes with `cls`, particularly in conjunction with the `@classmethod` decorator.
- qwsxyh 6 years ago: Python code uses both.
- chaosite 6 years ago: Java seems to prefer `clazz`.
- MaxBarraclough 6 years ago: "klass" seems to somewhat outnumber "clazz" in OpenJDK, 1541 to 1021. Wonder if there's a reason they haven't settled on one.
- noblethrasher 6 years ago: C# prefers `@class`.
- shereadsthenews 6 years ago: Doing this prevents a large class of important compiler optimizations. Pitching a DIY vtable as an optimization seems to require a lot of hard evidence.
- nwellnhof 6 years ago: Here's another interesting approach for Java:
https://dl.acm.org/citation.cfm?id=504291
Interface methods are looked up in a hash table containing function pointers (the table index can be computed at compile time). If there's a collision, a small stub is generated that dispatches to the right method.
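A simplified C sketch of the idea (illustrative structures, not the paper's actual implementation): each class carries a small fixed-size table indexed by a hash of the interface method's id, and a slot shared by several methods points at a conflict stub that picks the right one.

    #include <stdio.h>

    #define ITABLE_SIZE 8   /* small fixed-size per-class table */

    /* Every entry takes the receiver plus a method id used only by
     * conflict stubs. */
    typedef void (*Method)(void *self, int method_id);

    typedef struct Class {
        Method itable[ITABLE_SIZE];
    } Class;

    /* The slot for an interface method is a compile-time constant. */
    #define SLOT(method_id) ((method_id) % ITABLE_SIZE)

    /* Two interface methods that happen to hash to the same slot. */
    enum { M_DRAW = 3, M_RESIZE = 11 };   /* 3 % 8 == 11 % 8 == 3 */

    static void widget_draw(void *self, int id)   { (void)self; (void)id; puts("draw"); }
    static void widget_resize(void *self, int id) { (void)self; (void)id; puts("resize"); }

    /* Conflict stub: generated only for slots shared by several methods;
     * it inspects the method id and dispatches to the right target. */
    static void widget_conflict_stub(void *self, int method_id) {
        if (method_id == M_DRAW) widget_draw(self, method_id);
        else                     widget_resize(self, method_id);
    }

    int main(void) {
        Class widget = { .itable = { [SLOT(M_DRAW)] = widget_conflict_stub } };

        /* Call sites: constant index, one load, one indirect call. */
        widget.itable[SLOT(M_DRAW)](NULL, M_DRAW);
        widget.itable[SLOT(M_RESIZE)](NULL, M_RESIZE);
        return 0;
    }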
- chrisseaton 6 years ago: > C# generally follows the same approach as Java, but tries to perform more aggressive optimizations.
I don't understand this, given that it doesn't have deoptimization.
The Java example has one condition on the fast path, while the C# one has two nested loops and a condition inside them!
- latk 6 years ago: Author here. Both the JVM and CLR do some pretty aggressive optimizations even for normal interface calls. In the Java section I also show inline caching, a standard optimization approach. The CLR calls its approach to caching “virtual stub dispatch”. Speculative optimizations and inlining would be more advanced optimizations that I did not cover.
But what about the worst case – the method lookup when the method is not in the cache? There, the CLR introduces an additional level of indirection for calls through interface types, which has benefits for memory usage and possibly dispatch time. But since this is the worst case, the complexity doesn't matter that much as long as the important cases (which can be cached) work well.
- chrisseaton 6 years ago: > In the Java section I also show inline caching, a standard optimization approach.
Thanks. Does the CLR do inline caching for interface calls which cannot be devirtualised without speculation? Or not because of the lack of deoptimisation? Or does it have self-modifying code instead?
- pjmlp 6 years ago: It depends very much on which .NET runtime we're talking about (NGEN, Pre-RyuJIT, RyuJIT, IL2CPP, Mono, MDIL, .NET Native, .NET Core, Burst), and which version of it.
- latk 6 years ago: The CLR routes interface calls through an indirection cell or pad that can be quickly rewritten to point to a different dispatch stub. Some stubs perform caching, whereas the generic resolver behaves as explained by the pseudocode in my article.
The “book of the runtime” has a graph visualizing these possibilities: https://github.com/dotnet/coreclr/blob/master/Documentation/...
So this is a self-modifying code approach.
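In rough C (a sketch of the general shape, not the CLR's actual code), the call site always goes through a cell that the runtime can repoint: first at a resolver, then at a dispatch stub that checks one cached type before falling back.

    #include <stdio.h>

    typedef struct MethodTable MethodTable;
    typedef void (*Target)(void *obj);

    struct MethodTable {             /* stand-in for the receiver's type */
        const char *name;
        Target interface_impl;       /* implementation behind the interface slot */
    };

    typedef struct Object { const MethodTable *mt; } Object;

    /* Indirection cell: the one word the runtime rewrites ("self-modifying"). */
    typedef struct IndirectionCell {
        void (*stub)(struct IndirectionCell *cell, Object *obj);
        const MethodTable *expected;  /* type cached by the dispatch stub */
    } IndirectionCell;

    static void resolve_stub(IndirectionCell *cell, Object *obj);

    /* Dispatch stub: one type check, then straight to the cached target. */
    static void dispatch_stub(IndirectionCell *cell, Object *obj) {
        if (obj->mt == cell->expected)
            cell->expected->interface_impl(obj);   /* fast path */
        else
            resolve_stub(cell, obj);               /* miss: generic lookup */
    }

    /* Resolver: slow lookup, then patch the cell so the next call from
     * this site goes through the dispatch stub. */
    static void resolve_stub(IndirectionCell *cell, Object *obj) {
        Target t = obj->mt->interface_impl;        /* "generic" resolution */
        cell->expected = obj->mt;
        cell->stub = dispatch_stub;
        t(obj);
    }

    static void greet(void *obj) { puts(((Object *)obj)->mt->name); }

    int main(void) {
        MethodTable mt = { "SomeClass", greet };
        Object o = { &mt };
        IndirectionCell cell = { resolve_stub, NULL };

        cell.stub(&cell, &o);   /* first call: resolver patches the cell */
        cell.stub(&cell, &o);   /* later calls: dispatch stub, one compare */
        return 0;
    }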
- lozenge 6 years ago: Perhaps it's a memory optimisation? Especially considering all the instantiations of generic types that Java wouldn't have.
- carapace 6 years ago: If you liked this you might also like the COLA system: https://en.wikipedia.org/wiki/COLA_(software_architecture)
> COLA is a system for experimenting with software design currently being investigated by the Viewpoints Research Institute. "COLA" stands for "Combined Object Lambda Architecture". A COLA is a self-describing language in two parts, an object system which is implemented in terms of objects, and a functional language to describe the computation to perform.
- gumby 6 years ago: Fat pointer cost can be mitigated by doing pointer pre-unpacking in hardware. That kind of support fell out of favor when RISC started beating CISC architectures, but if dynamic dispatch became a significant performance issue I could see it going into certain processors. We have happily returned to an era of processor experimentation.
- Insanity 6 years ago: Speaking of interfaces in Go, can anyone explain this particular result? We've done a benchmark recently at work to test which way we would best represent a set (Linux, go1.12):

    map[string]bool
    map[string]struct{}
    map[string]interface{} // where we put an empty struct in it.

results:

    BenchmarkBoolSet-4        5886572    223 ns/op     39 B/op    0 allocs/op
    BenchmarkStructSet-4      6716685    206 ns/op     32 B/op    0 allocs/op
    BenchmarkInterfaceSet-4   4299072    313 ns/op    115 B/op    0 allocs/op

(I'm aware this is a bit of a tangent to TFA.)
- bouk 6 years ago: Can’t really say what’s going on here without seeing some code.
- Insanity 6 years ago: Well, the code is the same for all of them (but when using a bool map we assign true instead of an empty struct). So we run this code inside a benchmarking function:

    m := map[int]interface{}{} // or m := map[int]struct{}{}
    for i := 0; i < t.N; i++ {
        m[i] = struct{}{}
    }

edit: fix code.
- wyufro 6 years ago: I'm speculating, but sizeof(bool) == 1, sizeof(struct{}) == 0 and sizeof(interface{}) == 16 on 64-bit, so it might be as simple as that.
- abecedarius 6 years ago: I'm curious about the history of the fat-pointer approach. The first exposition I personally ran across was http://erights.org/enative/fatpointers.html in the 90s. Can anyone point out others?
- cross_wiber 6 years ago: The earliest one I know of is in the Emerald programming language. There is a brief description of the technique in http://www.emeraldprogramminglanguage.org/OOPSLA-86-paper.pd... from 1986.
See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.103... for more history and a description of the implementation. From that paper: "The run-time system constructed an AbCon for each <type, implementation> pair that it encountered. An object reference consisted not of a single pointer, but of a pair of pointers: a pointer to the object itself, and a pointer to the appropriate AbCon". (Called an AbCon because it "mapped Abstract operations to Concrete implementations".)
- abecedarius 6 years ago: Great find! Thanks.
- Anonymous4C54D6 6 years ago: The ASCII art is broken.
- latk 6 years ago: (Author) I am aware of the problem. For me it works on desktop but not on mobile browsers – apparently the Unicode box drawing characters are not really fixed-width on all systems?
One day I'll write a Pandoc filter that pipes these diagrams through a Shaky or Ditaa type tool first…
- munificent 6 years ago: > For me it works on desktop but not on mobile browsers
Looks weird for me on Mac desktop Chrome. It looks like the default monospace font, Courier, lacks box drawing characters. So it probably falls back to some other monospace font which likely has different metrics.
A good CSS font stack for monospace fonts is something like:

    font-family: Consolas, Monaco, monospace;

The first catches almost all Windows users, the second almost all Mac users, and the last falls back to the default.
- gumby 6 years ago: Not in Mac Safari.
- signa11 6 years ago: Not on Firefox/Chromium.