Ali Mohammad Pur
cfc241f61d
LibRegex: Make the trie rewrite optimisation maintain the alt order
...
This is required by the spec.
2025-05-21 14:28:45 +02:00
Ali Mohammad Pur
2eccd68ba5
LibRegex: Document the append_alternative optimisation a bit
2025-05-21 14:28:45 +02:00
Timothy Flynn
7280ed6312
Meta: Enforce newlines around namespaces
...
This has come up several times during code review, so let's just enforce
it using a new clang-format 20 option.
2025-05-14 02:01:59 -06:00
Ali Mohammad Pur
022cd1adca
LibRegex: Use the right offset when patching jumps through fork-trees
...
Fixes #4474.
2025-04-27 12:16:15 +02:00
Ali Mohammad Pur
fca1d33fec
LibRegex: Correctly calculate the target for Repeat in table alts
...
Fixes a bunch of websites breaking because we now verify jump offsets by
trying to remove 0-offset jumps.
This has been broken for a good while; it was just rare to see Repeat
inside alternatives that lent themselves well to tree alts.
2025-04-24 01:17:27 -06:00
Ali Mohammad Pur
4b9abdb963
LibRegex: Remove useless jumps (Jump* +0) before running opts
...
This yields a further significant performance increase (~2x) on the
simple /<script|<style|<link/ regex in speedometer.
2025-04-23 22:57:49 +02:00
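A rough sketch of the idea behind 4b9abdb963, using a toy instruction list rather than LibRegex's real bytecode. The `Insn` and `Kind` types, the "offset relative to the next instruction" convention, and `remove_zero_offset_jumps` are all assumptions made for illustration; jump targets are assumed to lie within the program or exactly at its end.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction set, NOT LibRegex's real bytecode format.
enum class Kind { Compare, Jump, ForkJump };

struct Insn {
    Kind kind;
    // Relative to the following instruction: target = index + 1 + offset.
    // Ignored for Compare.
    int offset = 0;
};

static bool is_noop_jump(Insn const& insn)
{
    return insn.kind != Kind::Compare && insn.offset == 0;
}

// Drops every jump whose offset is 0 and re-targets the remaining jumps.
// Single pass: a jump that only becomes +0 because of this removal is kept.
std::vector<Insn> remove_zero_offset_jumps(std::vector<Insn> const& program)
{
    // new_index[i] is where old instruction i lands; a removed instruction
    // maps to the next surviving one. The extra slot covers "end of program".
    std::vector<int> new_index(program.size() + 1);
    int next = 0;
    for (std::size_t i = 0; i < program.size(); ++i) {
        new_index[i] = next;
        if (!is_noop_jump(program[i]))
            ++next;
    }
    new_index[program.size()] = next;

    std::vector<Insn> result;
    for (std::size_t i = 0; i < program.size(); ++i) {
        Insn insn = program[i];
        if (is_noop_jump(insn))
            continue;
        if (insn.kind != Kind::Compare) {
            int old_target = static_cast<int>(i) + 1 + insn.offset;
            int position = static_cast<int>(result.size());
            insn.offset = new_index[old_target] - (position + 1);
        }
        result.push_back(insn);
    }
    return result;
}
```

The point of the sketch is that removing an instruction shifts everything after it, so every surviving jump that spans the removed slot has to be re-targeted.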
Ali Mohammad Pur
ec0836c9ea
LibRegex: Don't blindly treat multi-target tree jumps as a single jump
...
The tree generation was broken; we just didn't notice it because it was
very rarely being picked for more complex bytecodes.
2025-04-23 22:57:49 +02:00
Ali Mohammad Pur
09eb28ee1d
LibRegex: Better estimate the cost of laying out alts as a chain
...
Previously we were counting the total number of *nodes* in the tree for
the chain cost, which greatly underestimated its cost when large
bytecode entries were present.
This commit switches to estimating it using the total bytecode *size*,
which is a closer value to the true cost than the tree node count.
This corresponds to a ~4x perf improvement on /<script|<style|<link/ in
speedometer.
2025-04-23 22:57:49 +02:00
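To illustrate why the two estimates in 09eb28ee1d can diverge, here is a minimal sketch over a hypothetical tree type. `Node`, `node_count` and `bytecode_size` are made-up names and do not reflect the optimizer's actual data structures or cost formula; they only contrast "count nodes" with "sum stored bytecode".

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tree node: the bytecode words stored at this node, plus children.
struct Node {
    std::vector<std::uint64_t> bytecode;
    std::vector<Node> children;
};

// Old-style estimate: every node costs 1, however much bytecode it holds.
std::size_t node_count(Node const& node)
{
    std::size_t total = 1;
    for (auto const& child : node.children)
        total += node_count(child);
    return total;
}

// New-style estimate: sum the bytecode actually stored in the tree.
std::size_t bytecode_size(Node const& node)
{
    std::size_t total = node.bytecode.size();
    for (auto const& child : node.children)
        total += bytecode_size(child);
    return total;
}
```

A tree with few nodes but large entries scores low on the first measure and high on the second, which is the underestimation the commit message describes.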
Ali Mohammad Pur
446a453719
LibRegex: Pull out the first compare to avoid unnecessary execution
...
This adds a fast-path to drop view indices we know will not match
immediately without going through the regex VM.
2025-04-18 17:09:27 +02:00
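The fast path from 446a453719 can be pictured roughly as follows. This is a simplified sketch that assumes the first compare is a single known character; `find_first_match` and the `run_vm_at` callback (standing in for the full regex VM) are hypothetical names, not LibRegex APIs.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string_view>

// Tries each start index, but only enters the (expensive) matcher when the
// first character already agrees with what the pattern must start with.
std::optional<std::size_t> find_first_match(
    std::string_view input,
    char first_char,
    std::function<bool(std::string_view, std::size_t)> const& run_vm_at)
{
    for (std::size_t start = 0; start < input.size(); ++start) {
        if (input[start] != first_char)
            continue; // Cannot match here, so skip the VM entirely.
        if (run_vm_at(input, start))
            return start;
    }
    return std::nullopt;
}
```

For an input that is mostly non-matching text, the matcher only runs at the few positions that pass the cheap check.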
Ali Mohammad Pur
76f5dce3db
LibRegex: Flatten capture group list in MatchState
...
This makes copying the capture group COWVector significantly cheaper,
as we no longer have to run any constructors for it - just memcpy.
2025-04-18 17:09:27 +02:00
Ali Mohammad Pur
69050da929
LibRegex: Merge inverse string table mappings separately
2025-04-06 20:21:16 +02:00
Ali Mohammad Pur
4136d8d13e
LibRegex: Use an interned string table for capture group names
...
This avoids messing around with unsafe string pointers and removes the
only non-FlyString-able user of DeprecatedFlyString.
2025-04-02 11:43:13 +02:00
Ali Mohammad Pur
5355710481
LibRegex: Don't treat single-jump blocks as noop in the optimizer
2025-03-09 14:37:57 +01:00
Ali Mohammad Pur
ea3b7efd91
LibRegex: Treat the UnicodeSets flag as Unicode
...
Fixes /.../v not being interpreted as a unicode pattern.
2025-02-28 14:31:45 -05:00
mikiubo
8a6f7b787e
LibRegex: Use depth-first search in regex optimizer
...
Use depth-first search in the optimizer code, because breadth-first
search caused a bug. Add a test case to the test library.
2025-02-25 00:09:20 +01:00
Ali Mohammad Pur
08ebfaff17
LibRegex: Take trailing inversion state into account in block comparison
...
Fixes #3421.
2025-02-01 11:30:02 +01:00
Ali Mohammad Pur
50733c564c
LibRegex: Use the *actually* correct repeat start offset for Repeat
...
Fixes #2931 and various frequent crashes.
2024-12-23 13:13:52 +01:00
Ali Mohammad Pur
358378c1c0
LibRegex: Pick the right target for OpCode_Repeat
...
Repeat's 'offset' field is a bit odd in that it is treated as a negative
offset, causing a backwards jump when positive; the optimizer didn't
correctly model this behaviour, which caused crashes and misopts when
dealing with Repeats.
This commit fixes that behaviour.
2024-12-13 10:00:16 +01:00
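The sign convention described in 358378c1c0 can be stated as a tiny sketch. The helper names are hypothetical and instruction sizes are deliberately ignored, so this shows only the direction of the jump, not the exact arithmetic LibRegex uses.

```cpp
#include <cstddef>

// An ordinary jump adds its offset: a positive offset goes forwards.
constexpr std::ptrdiff_t jump_target(std::ptrdiff_t ip, std::ptrdiff_t offset)
{
    return ip + offset;
}

// Repeat subtracts its offset: a positive offset goes backwards.
constexpr std::ptrdiff_t repeat_target(std::ptrdiff_t ip, std::ptrdiff_t offset)
{
    return ip - offset;
}

static_assert(jump_target(10, 4) == 14, "positive jump offset moves forwards");
static_assert(repeat_target(10, 4) == 6, "positive Repeat offset moves backwards");
```

An optimizer that models Repeat with `jump_target` would compute a target on the wrong side of the instruction, which matches the kind of misoptimisation the commit message mentions.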
Ali Mohammad Pur
4a8d3e35a3
LibRegex: Add some more debugging info to bytecode block ranges
...
These were getting difficult to differentiate, now they each get a
comment on where they came from to aid with future debugging.
2024-12-13 10:00:16 +01:00
Ali Mohammad Pur
5a4d657a4e
LibRegex: Avoid generating ForkJumps when jumping to the next alt block
...
Fixes #2398.
2024-11-17 20:12:39 +01:00
Ali Mohammad Pur
00bc22c332
LibRegex: Don't immediately ignore TempInverse in optimizer
...
fe46b2c141 added the reset-temp-inverse flag, but set it up so all
tempinverse ops were negated at the start of the next op; this commit
makes it so these flags actually persist for one op and not zero.
Fixes #2296.
2024-11-17 09:03:29 -05:00
Ali Mohammad Pur
dabd60180f
LibRegex: Don't ignore references that weren't bound in checked blocks
...
Fixes #2281.
2024-11-12 10:37:57 +01:00
Timothy Flynn
93712b24bf
Everywhere: Hoist the Libraries folder to the top-level
2024-11-10 12:50:45 +01:00