This changes how the JIT matches method names and signatures for method
sets (e.g. JitDisasm). It also starts printing method instantiations for full method names
and makes references to types consistent in generic instantiations and the signature.
In addition, it starts supporting generic instantiations in release builds as well.
To do this, most of the type printing is moved to the JIT, which also aligns the output
between crossgen2 and the VM (there were subtle differences here, like spaces between generic type arguments).
More importantly, we (for the most part) stop relying on JIT-EE methods that are documented to only be for debug purposes.
The new behavior of the matching is the following:
* The matching behavior is always string based.
* The JitDisasm string is a space-separated list of patterns. Patterns can arbitrarily
contain both '*' (match any characters) and '?' (match any 1 character).
* The string matched against depends on characters in the pattern:
+ If the pattern contains a ':' character, the string matched against is prefixed by the class name and a colon
+ If the pattern contains a '(' character, the string matched against is suffixed by the signature
+ If the class name (part before colon) contains a '[', the class contains its generic instantiation
+ If the method name (part between colon and '(') contains a '[', the method contains its generic instantiation
For example, consider
```
using System.Runtime.CompilerServices;
using MyNamespace;

new C<sbyte, string>().M<int, object>(default, default, default, default); // compilation 1
new C<int, int>().M<int, int>(default, default, default, default); // compilation 2

namespace MyNamespace
{
    public class C<T1, T2>
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public void M<T3, T4>(T1 arg1, T2 arg2, T3 arg3, T4 arg4)
        {
        }
    }
}
```
The full strings are:
Before the change:
```
MyNamespace.C`2[SByte,__Canon][System.SByte,System.__Canon]:M(byte,System.__Canon,int,System.__Canon)
MyNamespace.C`2[Int32,Int32][System.Int32,System.Int32]:M(int,int,int,int)
```
Notice the missing method instantiation and the doubled class instantiation, which seems like an EE bug. Also, two different names are used for sbyte: System.SByte and byte.
After the change the strings are:
```
MyNamespace.C`2[byte,System.__Canon]:M[int,System.__Canon](byte,System.__Canon,int,System.__Canon)
MyNamespace.C`2[int,int]:M[int,int](int,int,int,int)
```
The following strings will match both compilations:
```
M
*C`2:M
*C`2[*]:M[*](*)
MyNamespace.C`2:M
```
The following will match only the first one:
```
M[int,*Canon]
MyNamespace.C`2[byte,*]:M
M(*Canon)
```
There is one significant change in behavior here: I have removed the special case that allows matching class names without namespaces. In particular, today Console:WriteLine matches all overloads of System.Console.WriteLine, while after this change it will not match. However, with generalized wildcards the replacement is simple: *Console:WriteLine.
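The matching rules described above can be sketched in a few lines of C++. This is only an illustration of the described behavior, not the JIT's actual code: `MethodNameParts`, `BuildMatchString`, `GlobMatch`, `MatchesPattern` and `MatchesAny` are hypothetical names, and the sketch assumes a whole-string match against a candidate string whose shape depends on the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical decomposition of a method's full name; not a JIT type.
struct MethodNameParts
{
    std::string className;  // e.g. "MyNamespace.C`2"
    std::string classInst;  // e.g. "[byte,System.__Canon]"
    std::string methodName; // e.g. "M"
    std::string methodInst; // e.g. "[int,System.__Canon]"
    std::string signature;  // e.g. "(byte,System.__Canon,int,System.__Canon)"
};

// Whole-string glob match: '*' matches any run of characters, '?' exactly one.
bool GlobMatch(const std::string& pattern, const std::string& text)
{
    std::size_t p = 0, t = 0;
    std::size_t starP = std::string::npos; // position of the most recent '*'
    std::size_t starT = 0;                 // text position where that '*' started
    while (t < text.size())
    {
        if (p < pattern.size() && pattern[p] == '*')
        {
            starP = p++;
            starT = t;
        }
        else if (p < pattern.size() && (pattern[p] == '?' || pattern[p] == text[t]))
        {
            p++;
            t++;
        }
        else if (starP != std::string::npos)
        {
            // Backtrack: let the most recent '*' absorb one more character.
            p = starP + 1;
            t = ++starT;
        }
        else
        {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*')
        p++;
    return p == pattern.size();
}

// Build the string to match against, based on which characters the pattern contains.
std::string BuildMatchString(const std::string& pattern, const MethodNameParts& m)
{
    const std::size_t npos        = std::string::npos;
    const std::size_t colon       = pattern.find(':');
    const std::size_t methodStart = (colon == npos) ? 0 : colon + 1;
    const std::size_t paren       = pattern.find('(', methodStart);
    const std::size_t methodEnd   = (paren == npos) ? pattern.size() : paren;

    std::string s;
    if (colon != npos)
    {
        s += m.className;
        if (pattern.find('[') < colon) // '[' in the class part
            s += m.classInst;
        s += ":";
    }
    s += m.methodName;
    if (pattern.find('[', methodStart) < methodEnd) // '[' in the method part
        s += m.methodInst;
    if (paren != npos)
        s += m.signature;
    return s;
}

bool MatchesPattern(const std::string& pattern, const MethodNameParts& m)
{
    return GlobMatch(pattern, BuildMatchString(pattern, m));
}

// A method set is a space-separated list of patterns; any match counts.
bool MatchesAny(const std::string& patternList, const MethodNameParts& m)
{
    std::istringstream in(patternList);
    std::string pattern;
    while (in >> pattern)
    {
        if (MatchesPattern(pattern, m))
            return true;
    }
    return false;
}
```

Under these assumptions, `Console:WriteLine` is compared against `System.Console:WriteLine` and fails, while `*Console:WriteLine` succeeds, which mirrors the behavior change described above.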
Add --crashreportonly command line option that doesn't generate a dump.
Add matching DOTNET_EnableCrashReportOnly env var.
Add LoadModule error logging for MacOS.
Make DAC validate EEClass and MethodDesc functions more robust on Linux/MacOS so SOS's eestack and dumpstack commands don't segfault.
Update createdump doc.
* Describe the validity of null managed pointers
- Declare that it is valid to have a null managed pointer, but declare it invalid to actually read from such a pointer
- In practice this has always been legal, as it has been legal to have managed pointer locals for years, and they are included in the list of values that are zero-initialized on method start
- Also clarify the rules to permit a managed pointer to the location directly following a managed object.
- This is a new capability in the spec that will likely be useful for accessing fixed size data buffers held in objects of the GC heap. However, the GC has been able to tolerate this behavior for many years, so there is no code change necessary.
Fixes #69690
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Aaron Robinson <arobins@microsoft.com>
The initial implementation of this did not handle the fact that retbuf
can point to GC heap during reflection invoke. It was fixed in #39815,
but the way it was fixed was by copying it into a local. This changes
the fix so that we simply report the return value pointer as a byref
throughout the mechanism, which simplifies the JIT's handling and is a
perf improvement as well.
* Fix typos
* Cleanup trailing whitespaces in committed files
* Revert a macro for win32 compat
* Disambiguate test data method
* Revert XMLPath test which relies on external assets
* Revert whitespace change in Xml tests
* Revert ClrEtwAl and ILLink.Shared
* Revert crossgen2 props/targets and *.wxl
- Refactor Module into ModuleBase and Module
- The goal is to allow a subset version of Module which can only hold refs; this is to be used by the manifest module in an R2R image to allow for version-resilient cross-module references.
- Update handling of ModuleBase so that it is used everywhere that tokens are parsed from R2R
- Remove ENCODE_MODULE_ID_FOR_STATICS and ENCODE_ACTIVE_DEPENDENCY
- These were only used for NGEN, and conflict with an easy implementation of the ModuleBase concept
- Remove locking in ENCODE_STRING_HANDLE processing, and tweak comments. Comments applied to the removed ngen based code, and the lock was only necessary for the old ngen thing.
- Adjust ComputeLoaderModuleWorker for locating loader module
- Follow comment more accurately, to avoid putting every generic into its definition module. This will make R2R function lookup able to find compiled instantiations in some cases. This may be what we want long term, it may not.
- Remove MemberRefToDesc map and replace with LookupMap like the other token types. We no longer make use of the hot table, so this is more efficient
- Also reduces complexity of implementation of ModuleBase
- Build fixup to describe a single method as a standalone blob of data
- There are parallel implementations in Crossgen2 and in the runtime
- They produce binary identical output
- Basic R2RDump support for new fixup
- Adjust module indices used within the R2R format to support a module index which refers to the R2R manifest metadata. This requires bumping the R2R version to 6.2
- Add a module index between the set of assembly refs in the index 0 module and the set of assembly refs in the R2R manifest metadata
- Adjust compilation dependency rules to include a few critical AsyncStateMachineBox methods
- Remove PEImage handling of native metadata which was duplicative
- Do not enable any more devirtualization than was already in use, even in the cross module compilation scenario. In particular, do not enable devirtualization of methods where the decl method isn't within the version bubble, even if the decl method could be represented with a cross module reference token. (This could be fixed, but is out of scope for this initial investigation)
- Make the compilation deterministic in this new model, even though we are generating new tokens on demand
- Implement this by detecting when we need new tokens during a compile, and recompiling with new tokens when necessary
- This may result in compiling the same code as much as twice
- Compile the right set of methods with cross module inlining enabled
- Add support for compiling the called virtual methods on generic types
- This catches the List<T> and Dictionary<TKey,TValue> scenarios
- Support input of PGO data to control the set of methods
- Enable new `READYTORUN_FLAG_UNRELATED_R2R_CODE` flag on R2R images which is used to indicate which modules may have generic code not directly related to the metadata of the image
- Lookup R2R methods in an `alternate` location as well as the metadata defining module. This allows for many generics to be embedded without needing to use the new `READYTORUN_FLAG_UNRELATED_R2R_CODE` flag, which has global effects on performance.
- Add command line switches to enable/disable the new behavior
- Enhance the version resilience test to cover this new behavior
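The "detect when we need new tokens, then recompile" rule can be sketched as follows. This is a hedged illustration only: `TokenTable`, `CompileOnce` and `CompileWithRetry` are hypothetical names, the "code" is just the list of referenced tokens, and assigning tokens in sorted order is an assumption made here to show one way the result could stay independent of discovery order. Crossgen2's real mechanism may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical token table: maps a referenced entity to a token value.
class TokenTable
{
    std::map<std::string, uint32_t> m_tokens;

public:
    bool TryGet(const std::string& entity, uint32_t* token) const
    {
        auto it = m_tokens.find(entity);
        if (it == m_tokens.end())
            return false;
        *token = it->second;
        return true;
    }

    // std::set iterates in sorted order, so token values do not depend on
    // the order in which the compile attempt discovered the entities.
    void AddAll(const std::set<std::string>& entities)
    {
        for (const std::string& entity : entities)
        {
            if (m_tokens.find(entity) == m_tokens.end())
            {
                uint32_t next = static_cast<uint32_t>(m_tokens.size()) + 1;
                m_tokens.emplace(entity, next);
            }
        }
    }
};

struct CompileResult
{
    bool                  succeeded = false;
    std::vector<uint32_t> code;    // stand-in for emitted code
    std::set<std::string> missing; // entities that had no token yet
};

// One compile attempt; it never creates tokens itself, it only reports what is missing.
CompileResult CompileOnce(const std::vector<std::string>& refs, const TokenTable& tokens)
{
    CompileResult result;
    for (const std::string& ref : refs)
    {
        uint32_t token = 0;
        if (tokens.TryGet(ref, &token))
            result.code.push_back(token);
        else
            result.missing.insert(ref);
    }
    result.succeeded = result.missing.empty();
    if (!result.succeeded)
        result.code.clear(); // discard the partial output of a failed attempt
    return result;
}

// Compile, and if new tokens were needed, add them and compile once more.
CompileResult CompileWithRetry(const std::vector<std::string>& refs, TokenTable& tokens, int* attempts)
{
    *attempts = 1;
    CompileResult result = CompileOnce(refs, tokens);
    if (!result.succeeded)
    {
        tokens.AddAll(result.missing);
        *attempts = 2;
        result = CompileOnce(refs, tokens);
    }
    return result;
}
```

In this sketch a method is compiled at most twice: once to discover the missing tokens and once with them available.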
Basic stateless linear collection marshalling for blittable elements
Not handled:
- caller-allocated buffer
- guaranteed unmarshal
- pinnable reference
- non-blittable element marshalling
- element scenarios on custom marshallers
* Add support for cross module inlining and cross module generic compilation to Crossgen2
- Refactor Module into ModuleBase and Module
- The goal is to allow a subset version of Module which can only hold refs; this is to be used by the manifest module in an R2R image to allow for version-resilient cross-module references.
- Update handling of ModuleBase so that it is used everywhere that tokens are parsed from R2R
- Remove ENCODE_MODULE_ID_FOR_STATICS and ENCODE_ACTIVE_DEPENDENCY
- These were only used for NGEN, and conflict with an easy implementation of the ModuleBase concept
- Remove locking in ENCODE_STRING_HANDLE processing, and tweak comments. Comments applied to the removed ngen based code, and the lock was only necessary for the old ngen thing.
- Adjust ComputeLoaderModuleWorker for locating loader module
- Follow comment more accurately, to avoid putting every generic into its definition module. This will make R2R function lookup able to find compiled instantiations in some cases. This may be what we want long term, it may not.
- Remove MemberRefToDesc map and replace with LookupMap like the other token types. We no longer make use of the hot table, so this is more efficient
- Also reduces complexity of implementation of ModuleBase
- Build fixup to describe a single method as a standalone blob of data
- There are parallel implementations in Crossgen2 and in the runtime
- They produce binary identical output
- Basic R2RDump support for new fixup
- Adjust module indices used within the R2R format to support a module index which refers to the R2R manifest metadata. This requires bumping the R2R version to 6.2
- Add a module index between the set of assembly refs in the index 0 module and the set of assembly refs in the R2R manifest metadata
- Adjust compilation dependency rules to include a few critical AsyncStateMachineBox methods
- Remove PEImage handling of native metadata which was duplicative
- Do not enable any more devirtualization than was already in use, even in the cross module compilation scenario. In particular, do not enable devirtualization of methods where the decl method isn't within the version bubble, even if the decl method could be represented with a cross-module reference token. (This could be fixed, but is out of scope for this initial investigation)
- Make the compilation deterministic in this new model, even though we are generating new tokens on demand
- Implement this by detecting when we need new tokens during a compile, and recompiling with new tokens when necessary
- This may result in compiling the same code as much as twice
- Compile the right set of methods with cross module inlining enabled
- Add support for compiling the called virtual methods on generic types
- This catches the List<T> and Dictionary<TKey,TValue> scenarios
- Add command line switches to enable/disable the new behavior
- By default the new behavior is not enabled
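The adjusted module index scheme can be pictured with a small decoder. This is a sketch under stated assumptions: `ModuleRefKind`, `ModuleRef` and `DecodeModuleIndex` are hypothetical names, and the exact numbering (1-based rows, the manifest metadata occupying the single index right after the module's own assembly refs) is inferred from the description above rather than taken from the runtime sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of what a module index refers to.
enum class ModuleRefKind
{
    CurrentModule,       // index 0
    ModuleAssemblyRef,   // assembly ref row in the index 0 module
    ManifestMetadata,    // the R2R manifest metadata itself (the new index)
    ManifestAssemblyRef, // assembly ref row in the R2R manifest metadata
};

struct ModuleRef
{
    ModuleRefKind kind;
    uint32_t      row; // 1-based row where applicable, otherwise 0
};

// Assumed layout: 0 | 1..count | count+1 | count+2...
ModuleRef DecodeModuleIndex(uint32_t index, uint32_t moduleAssemblyRefCount)
{
    if (index == 0)
        return {ModuleRefKind::CurrentModule, 0};
    if (index <= moduleAssemblyRefCount)
        return {ModuleRefKind::ModuleAssemblyRef, index};
    if (index == moduleAssemblyRefCount + 1)
        return {ModuleRefKind::ManifestMetadata, 0};
    return {ModuleRefKind::ManifestAssemblyRef, index - moduleAssemblyRefCount - 1};
}
```

Because inserting an index shifts every manifest assembly ref by one, images produced with the old numbering would decode differently, which is consistent with the R2R version bump mentioned above.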
* Do not set NO_CSE on ARR_ADDRs
It is effectively no-CSE already because of how "optIsCSECandidate" works.
* Delete GT_INDEX
Instead:
1) For "ldelem", import "IND/OBJ(INDEX_ADDR)".
2) For "ldelema", import "INDEX_ADDR".
This deletes two usages of "ADDR":
1) "OBJ(ADDR(INDEX))" from "ldelem<struct>".
2) "ADDR(INDEX)" from "ldelema".
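The new import shapes can be illustrated with a toy tree. This is not JIT code: `Node`, `MakeLeaf`, `MakeNode`, `ImportLdelem`, `ImportLdelema` and `Dump` are hypothetical stand-ins that only mirror the operator names used above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy IR node with up to two operands; a stand-in for the JIT's tree nodes.
struct Node
{
    std::string           op;
    std::unique_ptr<Node> op1;
    std::unique_ptr<Node> op2;
};

std::unique_ptr<Node> MakeLeaf(std::string name)
{
    auto node = std::make_unique<Node>();
    node->op  = std::move(name);
    return node;
}

std::unique_ptr<Node> MakeNode(std::string op, std::unique_ptr<Node> a, std::unique_ptr<Node> b = nullptr)
{
    auto node = std::make_unique<Node>();
    node->op  = std::move(op);
    node->op1 = std::move(a);
    node->op2 = std::move(b);
    return node;
}

// "ldelema": the element address is INDEX_ADDR itself, with no ADDR wrapper.
std::unique_ptr<Node> ImportLdelema(std::unique_ptr<Node> arr, std::unique_ptr<Node> idx)
{
    return MakeNode("INDEX_ADDR", std::move(arr), std::move(idx));
}

// "ldelem": an indirection (OBJ for structs, IND otherwise) over INDEX_ADDR.
std::unique_ptr<Node> ImportLdelem(std::unique_ptr<Node> arr, std::unique_ptr<Node> idx, bool isStruct)
{
    return MakeNode(isStruct ? "OBJ" : "IND", ImportLdelema(std::move(arr), std::move(idx)));
}

// Print a tree as OP(operand, operand).
std::string Dump(const Node& node)
{
    if (!node.op1)
        return node.op;
    std::string text = node.op + "(" + Dump(*node.op1);
    if (node.op2)
        text += ", " + Dump(*node.op2);
    return text + ")";
}
```

For a struct element, `Dump(*ImportLdelem(MakeLeaf("arr"), MakeLeaf("idx"), true))` yields `OBJ(INDEX_ADDR(arr, idx))`, which replaces the older `OBJ(ADDR(INDEX))` shape.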
* Add a zero-diff quirk
* Update the first class structs document
Remove references to things that no longer exist.
This adds support for EnC on arm64. A couple of notes on the
implementation compared to x64:
- On x64 we get the fixed stack size from unwind info. However, for the
frames we set up on arm64 for EnC it is not possible to extract the
frame size from there because their prologs generally look like
    stp fp, lr, [sp,#-16]!
    mov fp, sp
    sub sp, sp, #144
with unwind codes like the following:
    set_fp; mov fp, sp
    save_fplr_x #1 (0x01); stp fp, lr, [sp, #-16]!
As can be seen, it is not possible to get the fixed stack size from
unwind info in this case. Instead we pass it through the GC info that
already has a section for EnC data.
- On arm64 the JIT is required to place the PSPSym at the same offset
from caller-SP for both the main function and for funclets. Due to
this we try to allocate the PSPSym as early as possible in the main
function and we must take some care in funclets. However, this
conflicts with the EnC frame header that the JIT uses to place values
that must be preserved on EnC transitions. This is currently
callee-saved registers and the MonitorAcquired boolean.
Before this change we were allocating PSPSym above (before) the
monitor acquired boolean, but we now have to allocate MonitorAcquired
first, particularly because the size of the preserved header cannot
change on EnC transitions, while the PSPSym can disappear or appear.
This changes frame allocation slightly for synchronized functions.
Bring the main document up to date with current implementation. Remove
some obsolete sections (EnC, complex epilog).
Co-authored-by: Jakob Botsch Nielsen <Jakob.botsch.nielsen@gmail.com>
These do not serve much purpose today -- instead just use null and add a
helper function to iterate non-null early args, which is somewhat
common.
In addition to saving some TP and memory, teaching the backend about
null early nodes will also be beneficial because I am planning to change
rationalization to null out non-values in the early arg list so that all
nodes have only values as their operands in LIR.
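A helper for iterating non-null early args could look roughly like the following. The types here (`Node`, `Arg`, `NonNullEarlyNodes`, `CollectEarlyIds`) are hypothetical simplifications for illustration, not the JIT's definitions.

```cpp
#include <cassert>
#include <vector>

struct Node
{
    int id;
};

// Simplified argument entry: the early node may be null.
struct Arg
{
    Node* earlyNode;
    Node* lateNode;
    Arg*  next;
};

// Range over the early nodes of an argument list, skipping null entries.
class NonNullEarlyNodes
{
    Arg* m_head;

public:
    explicit NonNullEarlyNodes(Arg* head) : m_head(head) {}

    class Iterator
    {
        Arg* m_arg;

        void SkipNulls()
        {
            while (m_arg != nullptr && m_arg->earlyNode == nullptr)
                m_arg = m_arg->next;
        }

    public:
        explicit Iterator(Arg* arg) : m_arg(arg) { SkipNulls(); }

        Node* operator*() const
        {
            assert(m_arg != nullptr); // dereferencing end() is a caller bug
            return m_arg->earlyNode;
        }

        Iterator& operator++()
        {
            m_arg = m_arg->next;
            SkipNulls();
            return *this;
        }

        bool operator!=(const Iterator& other) const { return m_arg != other.m_arg; }
    };

    Iterator begin() const { return Iterator(m_head); }
    Iterator end() const { return Iterator(nullptr); }
};

// Example consumer: gather the ids of all non-null early nodes in order.
std::vector<int> CollectEarlyIds(Arg* head)
{
    std::vector<int> ids;
    for (Node* node : NonNullEarlyNodes(head))
        ids.push_back(node->id);
    return ids;
}
```

Callers that only care about early values can then use a range-based `for` without repeating the null check at every site.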
Throughput diff:
```
Collection Base # instructions Diff # instructions PDIFF
aspnet.run.windows.x64.checked.mch 69,717,468,395 69,206,312,087 -0.73%
benchmarks.run.windows.x64.checked.mch 54,695,846,729 54,294,078,768 -0.73%
coreclr_tests.pmi.windows.x64.checked.mch 340,169,515,528 337,478,749,067 -0.79%
libraries.crossgen2.windows.x64.checked.mch 128,653,906,043 126,926,566,191 -1.34%
libraries.pmi.windows.x64.checked.mch 228,653,702,806 226,554,618,843 -0.92%
libraries_tests.pmi.windows.x64.checked.mch 531,053,530,645 525,233,144,101 -1.10%
```
Memory stats (libraries.pmi)
Before: 25961399533 bytes
After: 25770612141 bytes (-0.7%)
This refactors the JIT's representation of call arguments. It replaces
`GenTreeCall::Use` and `fgArgTabEntry` with a single class `CallArg`.
`CallArg` always contains space for ABI information and contains two
intrusive linked list nodes: one for all args (similar to current
`gtCallArgs`) where all standard arguments are always in argument order,
and one for late args (similar to current `gtCallLateArgs`) that may be
reordered. The late args list may also not contain all arguments.
`fgArgInfo` is also replaced by a new class `CallArgs` that is stored
inline in `GenTreeCall`. It encapsulates all handling of arguments
(insertion/removal). The change also begins treating the 'this' argument
as a normal argument with `CallArgs` providing convenient access to it
when necessary.
The main benefit of this change is to avoid keeping track of the side
table `fgArgInfo` and having to scan through this side table repeatedly
when we need to query argument information. In addition it gives more
convenient ways to access well known arguments like 'this', the ret
buffer, VSD cell, etc. Finally, it also serves as a nice clean-up in the JIT.
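The shape of the new representation can be sketched as below. The class names `CallArg` and `CallArgs` come from the description above, but every member, method and the `WellKnownArg` enum shown here is a simplified, hypothetical stand-in; the real classes carry much more state.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of the well-known argument kinds.
enum class WellKnownArg
{
    None,
    ThisPointer,
    RetBuffer,
};

struct CallArg
{
    std::string  name;                           // stand-in for the argument's node
    WellKnownArg wellKnown   = WellKnownArg::None;
    int          abiRegister = -1;               // placeholder for inline ABI information
    CallArg*     next        = nullptr;          // link in the all-args list (argument order)
    CallArg*     lateNext    = nullptr;          // link in the late-args list (may be reordered)
};

// Owns no memory in this sketch; callers keep the CallArg objects alive.
class CallArgs
{
    CallArg* m_head     = nullptr;
    CallArg* m_lateHead = nullptr;

    static std::string Join(const CallArg* head, CallArg* CallArg::*link)
    {
        std::string text;
        for (const CallArg* arg = head; arg != nullptr; arg = arg->*link)
        {
            if (!text.empty())
                text += ",";
            text += arg->name;
        }
        return text;
    }

public:
    // Append to the all-args list, preserving argument order.
    void PushBack(CallArg* arg)
    {
        CallArg** slot = &m_head;
        while (*slot != nullptr)
            slot = &(*slot)->next;
        arg->next = nullptr;
        *slot     = arg;
    }

    // Late args form a second list that need not contain every argument.
    void PushFrontLate(CallArg* arg)
    {
        arg->lateNext = m_lateHead;
        m_lateHead    = arg;
    }

    // Well-known arguments are found by walking the list, not via a side table.
    CallArg* FindWellKnown(WellKnownArg kind) const
    {
        for (CallArg* arg = m_head; arg != nullptr; arg = arg->next)
        {
            if (arg->wellKnown == kind)
                return arg;
        }
        return nullptr;
    }

    CallArg* GetThisArg() const { return FindWellKnown(WellKnownArg::ThisPointer); }

    std::string Order() const { return Join(m_head, &CallArg::next); }
    std::string LateOrder() const { return Join(m_lateHead, &CallArg::lateNext); }
};
```

Since both links live in the argument itself, per-argument information is reachable from whichever list is being walked, which is the property that removes the need to scan a separate table.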