Okay, more understanding. TypeWrappers currently encapsulate two kinds of information that can be split: reflection-only metadata and runtime metadata. They are used at runtime to generate dynamic types and to implement reflective capabilities, but they are also used by the static compiler and exporter to represent type information without execution. This deserves to be split. TypeWrappers can become two things: JavaTypeInfo and JavaType. The first is metadata-only, suitable for use by the compilers at any stage; runtime information lives in JavaType. This should give us the basis for separating the importer/exporter requirements into the first but not the second: a demarcation line to help split things.

ClassLoaderWrapper needs to go away at importer/exporter time. ClassLoaders are a runtime concept; the static compiler sort of fakes them, even though there's no backing java.lang.ClassLoader. So this part of the logic can move out into something very un-Java-like, maybe JavaTypeInfoContext. The idea here is to be a bit recursive: the JavaTypeInfoContext delivers JavaTypeInfo instances and can provide lookups by name, and it can be composed with various parent search paths, like a ClassLoader, but not really. For instance, AssemblyJavaTypeInfoResolver or FileJavaTypeInfoResolver. And at runtime we can provide a different resolver. At static compile time, we're interested in setting up a context that can load types from System.Reflection.Metadata and friends as well as a FileJavaTypeInfoResolver. At runtime, we're more interested in providing a ClassLoaderJavaTypeInfoResolver, which actually consults the runtime class loader hierarchy. This provides a nice interface split.
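A minimal sketch of how that resolver split could look. JavaTypeInfo's shape, the delegation order, and all member names here are assumptions drawn from the description above, not existing IKVM code:

```csharp
using System.Collections.Generic;

// Placeholder for the metadata-only type model described above.
sealed record JavaTypeInfo(string Name);

// Each resolver knows one source of JavaTypeInfo: an assembly, class files
// on disk, or (at runtime) the live class loader hierarchy.
interface IJavaTypeInfoResolver
{
    JavaTypeInfo? Resolve(string className);
}

// Like a ClassLoader's delegation chain, but with no java.lang.ClassLoader behind it.
sealed class JavaTypeInfoContext : IJavaTypeInfoResolver
{
    readonly IReadOnlyList<IJavaTypeInfoResolver> resolvers;

    public JavaTypeInfoContext(params IJavaTypeInfoResolver[] resolvers) =>
        this.resolvers = resolvers;

    public JavaTypeInfo? Resolve(string className)
    {
        foreach (var resolver in resolvers)
            if (resolver.Resolve(className) is JavaTypeInfo info)
                return info;
        return null;
    }
}

// Static compile time: assemblies read via System.Reflection.Metadata plus class files.
//   new JavaTypeInfoContext(assemblyResolver, fileResolver);
// Runtime: a resolver that consults the actual class loader hierarchy.
//   new JavaTypeInfoContext(classLoaderResolver);
```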
Okay, some new ideas. I've been playing around with some test designs. First off, I'm going to abstract away the managed type provider and the Java type provider. These are the sources of raw .NET types and raw Java byte code, so the rest of the code base can depend on them instead of on System.Type or System.Reflection.Metadata, etc. So: our own ManagedType, ManagedField, ManagedMethod classes. These don't need to be interfaces wrapping anything; we'll just do a full load of the source. We take a System.Type and copy everything into ManagedType. We should be careful here about allocations and such: make good use of structures, etc., much like System.Reflection.Metadata. So we populate one of these structures, optimizing for certain patterns, such as the average number of methods a type has (store on the type instead of on a shared list, etc.). The same applies on the Java side: ByteCodeClass, ByteCodeType, etc. This is also the layer at which the ConstantPool overrides take place. ConstantPool overrides won't be much more than intercepting the load and rewriting information in the fake class info.

This layer doesn't need to loop back to resolve types. ManagedType needs to be self-contained: any references to other types should be stored as some sort of reference. This layer isn't concerned with resolving anything, just with describing a type and how it relates to other types. This is in accordance with SRM, which uses ref structs to represent handles to other types. It's not in accordance with System.Reflection, where things contain links to actual System.Types. We need to poke through this layer a bit for System.Reflection: our handles can optionally carry some quicker way to access the referenced type, so resolution of a type reference can be assisted by a System.Type. Runtime dynamic compilation will need this for speed, and to maintain real runtime integrity with System.Type.

The next layer is something like a view of how things look from Java: JavaView, JavaType, not exactly sure. This layer is fed a bucket of ByteCodeTypes and ManagedTypes and lets you look up a JavaType from it. The lookup happens within some sort of unit (a compilation unit). The compilation unit isn't necessarily representative of a single assembly or a single class loader, but it could be. You could pile in ManagedTypes obtained from multiple assemblies, or ByteCodeTypes from multiple class files across JARs. You can query this unit for Java types by name. For instance, you could query it for cli.System.Object: if the unit contains the ManagedTypes derived from mscorlib.dll (or wherever Object is these days), you'd get back a JavaType that represents that class as seen from Java, with higher-level properties describing the shape of the type from the Java side.

The next layer is the actual compiler/builder. The goal here is to take the JavaTypes and push them out to some sort of assembly builder, as the .NET types that implement them. Basically, we convert back: we know what the Java type should look like, so we can ask a Java type to emit itself to some sort of builder, and what it emits is a .NET assembly that implements the Java type. This gets a bit funky, in that you could pass a .NET type through it to see what it looks like from Java, and then ask the resulting Java type to emit itself back as .NET. And it might work, up until the method bodies, where we have no byte code to convert. So in practice this won't be done.
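A rough sketch of the self-contained model, assuming the handle-with-optional-fast-path idea described above; all names and fields are illustrative:

```csharp
using System;

// A reference to another type is stored by name only; resolution belongs to a
// later layer. When the source was System.Reflection, the handle can optionally
// carry the live System.Type as a fast path for runtime dynamic compilation.
readonly struct ManagedTypeHandle
{
    public readonly string FullName;
    public readonly Type? RuntimeType;   // null when loaded from System.Reflection.Metadata

    public ManagedTypeHandle(string fullName, Type? runtimeType = null) =>
        (FullName, RuntimeType) = (fullName, runtimeType);
}

// Structures keep per-member allocation down, in the spirit of SRM.
readonly struct ManagedMethod
{
    public readonly string Name;
    public readonly ManagedTypeHandle ReturnType;

    public ManagedMethod(string name, ManagedTypeHandle returnType) =>
        (Name, ReturnType) = (name, returnType);
}

// A full copy of the source type: self-contained, no loop back to a resolver.
sealed class ManagedType
{
    public string FullName = "";
    public ManagedTypeHandle BaseType;
    public ManagedMethod[] Methods = Array.Empty<ManagedMethod>();  // sized once at load
}
```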
Users will only emit types that were originally Java, probably marked by a flag on the type of some kind. The static compiler can use this infrastructure to load .NET information from SRM and dump it into a compilation unit, load Java byte code and dump it into compilation units, then create a new MetadataWriter and loop over the types in the compilation unit, having them emit into the MetadataWriter. That's static compilation. The dynamic runtime compiler can do the same thing, but based on System.Reflection.Emit's AssemblyBuilder: linking a compilation unit to the associated ClassLoader and asking it to emit managed types into that AssemblyBuilder as needed.
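That flow, in outline. Every type below is invented for illustration except MetadataBuilder, which is the real System.Reflection.Metadata builder:

```csharp
using System.Collections.Generic;
using System.Reflection.Metadata.Ecma335;

// Illustrative compilation unit: a bag of types queryable by Java name. It need
// not correspond to a single assembly or class loader.
sealed class CompilationUnit
{
    readonly Dictionary<string, JavaType> types = new();

    public void Add(JavaType type) => types[type.Name] = type;

    public JavaType? Lookup(string name) =>
        types.TryGetValue(name, out var t) ? t : null;

    public IEnumerable<JavaType> JavaTypes => types.Values;
}

sealed class JavaType
{
    public string Name = "";
    public bool IsJavaOrigin;   // the "flag on the type" deciding what gets emitted

    // Emit the .NET implementation of this Java type into a metadata builder.
    public void EmitTo(MetadataBuilder builder) { /* conversion back to .NET */ }
}

// Static compilation: fill the unit, then emit every Java-origin type.
static class StaticCompilerSketch
{
    public static void Compile(CompilationUnit unit, MetadataBuilder builder)
    {
        foreach (var type in unit.JavaTypes)
            if (type.IsJavaOrigin)
                type.EmitTo(builder);
    }
}
```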
Some of this work has been completed as of now. IKVM.Runtime's RuntimeJavaType hierarchy no longer relies on typeof() in any path intended to convert byte code; it remains in paths intended for dynamic runtime generation (where the actual types are the runtime types). So it's a step closer to splitting out of IKVM.Runtime, but not there yet. Most of the static access in Runtime is gone. Instead, a RuntimeContext class is passed around, which is similar to a DI container in that it holds instances of the other types that were previously static. This gets us again closer to splitting it off.
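The pattern, in a purely illustrative form; the real RuntimeContext in IKVM.Runtime has a different and much larger surface, and the service types below are stand-ins:

```csharp
// Formerly static singletons become instance members on a context object that
// is threaded through calls. These member types are placeholders, not IKVM's
// real classes.
sealed class ClassLoaderService { }
sealed class TypeResolverService { }

sealed class RuntimeContext
{
    public ClassLoaderService ClassLoaders { get; }
    public TypeResolverService Types { get; }

    public RuntimeContext(ClassLoaderService classLoaders, TypeResolverService types) =>
        (ClassLoaders, Types) = (classLoaders, types);
}

// Call sites that once read a static now receive the context:
//   void Convert(RuntimeContext context, ...) => context.Types. ...
```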
So I've been thinking for a while about how to approach rebuilding the IKVM compiler. Right now it's very much tied into the Runtime: it relies on many static variables and instances, loads core types using typeof() in many places, and requires TypeWrappers and ClassLoaderWrappers to be loaded. This means it's very much tied to the classes of the current .NET runtime. You can't really run the compiler on Core 3 and generate .NET 5 assemblies. It's why the IKVM.Runtime source files are included by ikvmc, with IFDEFs, instead of just referenced as a library.
So my thinking is this needs to change. It needs to be remodeled into a set of classes which can produce assemblies without regard for the current execution environment. We can derive a lot of lessons from Microsoft.CodeAnalysis (Roslyn), and maybe even reuse some classes in it.
Another thing would be to dump the usage of System.Reflection.Emit. Right now this exists in two forms: actual System.Reflection.Emit, for generation of dynamic assemblies at runtime; and IKVM.Reflection.Emit, which is subbed out for ikvmc. I would remodel the new IL generation around System.Reflection.Metadata. No more hard references to runtime types.
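For a sense of what that looks like, here's a minimal, real-API example of producing an (empty) assembly with System.Reflection.Metadata, with no runtime types involved; the assembly name is made up:

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Reflection.Metadata;
using System.Reflection.Metadata.Ecma335;
using System.Reflection.PortableExecutable;

// Build metadata rows directly; nothing here touches the executing runtime's types.
var metadata = new MetadataBuilder();
metadata.AddModule(
    0,
    metadata.GetOrAddString("HelloJava.dll"),
    metadata.GetOrAddGuid(Guid.NewGuid()),
    default, default);
metadata.AddAssembly(
    metadata.GetOrAddString("HelloJava"),
    new Version(1, 0, 0, 0),
    default, default,
    0,
    AssemblyHashAlgorithm.None);

// Serialize to a PE image; type and method rows would be added above in a real emit.
var ilStream = new BlobBuilder();
var peBuilder = new ManagedPEBuilder(
    PEHeaderBuilder.CreateLibraryHeader(),
    new MetadataRootBuilder(metadata),
    ilStream);
var peBlob = new BlobBuilder();
peBuilder.Serialize(peBlob);
File.WriteAllBytes("HelloJava.dll", peBlob.ToArray());
```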
This has a downside: the only way to generate dynamic assemblies is through System.Reflection.Emit, and dynamic assemblies are how we get unload support. Now, Core can unload assemblies differently, using AssemblyLoadContext, and that could be a good path forward for Core. ClassLoaders really are an AssemblyLoadContext. There's some way to put this together for Core where we simply no longer use System.Reflection.Emit, but instead load assemblies in an isolated load context. But this would be a very different architecture than Framework, where we'd have to build a translation layer from regular assemblies to dynamic ones.
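A minimal sketch of that Core path, using the real AssemblyLoadContext API; the context name and the stream variable are placeholders:

```csharp
using System.Reflection;
using System.Runtime.Loader;

// A collectible AssemblyLoadContext standing in for a Java ClassLoader.
// Unloading the context (and dropping all references into it) lets the GC
// reclaim the assembly, replacing what RunAndCollect dynamic assemblies
// provide on Framework.
sealed class ClassLoaderLoadContext : AssemblyLoadContext
{
    public ClassLoaderLoadContext(string name)
        : base(name, isCollectible: true) { }

    // Returning null defers unresolved loads to the default context.
    protected override Assembly? Load(AssemblyName assemblyName) => null;
}

// Usage, assuming 'emittedPe' is a Stream holding a PE image produced as above:
//   var alc = new ClassLoaderLoadContext("some-class-loader");
//   var assembly = alc.LoadFromStream(emittedPe);
//   ...
//   alc.Unload();  // collectible once nothing references its types
```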
So, a translation layer. What would that look like? The main compilation path would emit assemblies with System.Reflection.Metadata, and those assemblies would then be reparsed and re-emitted 1:1 to System.Reflection.Emit. There's some history of people working on this at https://github.com/Lokad/ILPack, but in the opposite direction: taking assemblies built by Emit and rewriting them toward MetadataBuilder. Our converter would be almost the exact same thing, but backwards. It's a lot of work, but straightforward: mapping opcodes to opcodes, etc.
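The core of that replay, in a deliberately tiny form; only a few operand-free single-byte opcodes are shown, and a real mapper would also cover the 0xFE-prefixed opcodes, inline operands, branch fixups, locals, and exception regions:

```csharp
using System.Collections.Generic;
using System.Reflection.Emit;

static class IlReplaySketch
{
    // A few entries of the byte-to-OpCode table; the full table covers ECMA-335.
    static readonly Dictionary<byte, OpCode> Simple = new()
    {
        [0x00] = OpCodes.Nop,
        [0x02] = OpCodes.Ldarg_0,
        [0x2A] = OpCodes.Ret,
    };

    // Valid only for bodies made of operand-free opcodes, per the caveat above.
    public static void Replay(byte[] il, ILGenerator generator)
    {
        foreach (var b in il)
            generator.Emit(Simple[b]);
    }
}
```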
Another option is to write all assemblies to some interface that looks like MetadataBuilder, but that calls either MetadataBuilder or Reflection.Emit dynamically. I'm not super fond of this. It seems like the ability to take a static assembly and re-emit it into AssemblyBuilder might have uses outside IKVM, and thus attract more interest.
So there's a speed sacrifice here. We'd be writing assemblies to a stream, then rereading them and emitting them back into AssemblyBuilder. Is this significant? Maybe. Does it matter? Maybe.
It does open the door to a stage two for Core, though: dropping dynamic assemblies completely and using static assemblies with AssemblyLoadContext. No translation required there.
One stance to take might be that we should simply always target the latest methodology, write adapters for the earlier down-level platforms, and take the hit. And optimize where we can.
Roslyn builds on top of MetadataBuilder with Compilations. These are more about the 'context' surrounding building a bunch of code: how assembly references are located, the relationships between assemblies, and the options that are available to customize the output. We can probably mirror much of this: IkvmCompilation instead of CSharpCompilation. Roslyn's stack (Microsoft.CodeAnalysis) provides many tools we can make use of: assembly identity comparison, signing, reference resolving, etc. Though we can't make use of any of the actual code generation pieces. There's a lot of this stuff in IKVM, especially assembly identity comparison, which would be nice to dump and replace with a more rigorous implementation.
So here's what I sort of imagine: a new IkvmCompilation class that operates standalone. It duplicates most of the API surface of CSharpCompilation but adds some other things. It has options to emit static assemblies or dynamic assemblies (they do differ in IL), or at least options that target the specifics of those two. It will take over most of the role of Universe, StaticCompiler, and the TypeWrappers as it relates to generating assemblies. All of the emit code will be pulled out of the TypeWrappers and moved into this new library.
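A speculative surface for that class, loosely mirroring CSharpCompilation.Create. Everything here is a design sketch, with only MetadataReference borrowed from Microsoft.CodeAnalysis:

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.CodeAnalysis;   // for MetadataReference, one of the reusable pieces

// Static and dynamic output differ slightly in IL, so the distinction is an
// option; the compilation itself stays unaware of who is asking.
sealed class IkvmCompilationOptions
{
    public bool EmitDynamicAssembly { get; init; }
}

sealed class IkvmCompilation
{
    readonly string assemblyName;
    readonly IReadOnlyList<MetadataReference> references;
    readonly IkvmCompilationOptions options;

    IkvmCompilation(string name, IReadOnlyList<MetadataReference> refs, IkvmCompilationOptions opts) =>
        (assemblyName, references, options) = (name, refs, opts);

    public static IkvmCompilation Create(
        string assemblyName,
        IEnumerable<MetadataReference> references,
        IkvmCompilationOptions options) =>
        new(assemblyName, new List<MetadataReference>(references), options);

    // MetadataBuilder-based emission of the Java types would live here.
    public void Emit(Stream peStream) { }
}
```

The static and dynamic paths would then differ only in the options and resolvers they pass in, which matches the split described next.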
TypeWrappers can then invoke this new IkvmCompilation class to obtain references to the assemblies as they need them, with the dynamic path passing different options than the static path. Resolvers will differ between the two: the dynamic path passes resolvers that take class loaders into consideration, omit resources, etc., while the static path can generate slightly different assemblies. The IkvmCompilation code itself should be unaware of this distinction, just accepting different options for each.
I do think this should be a separate project with a separate assembly. It can be independently tested and used outside of IKVM.Runtime or the static compiler.
I wonder if it makes sense to call it IKVM.CodeAnalysis?