struct
   Pf::USet32
 
  - Pf::USet32
 - Struct
 - Value
 - Object
 
Overview
A thread-safe, persistent set of 32-bit unsigned integers.
TODO  The current implementation is far more stupid versus e.g. Pf::Set
wrt no-change semantics. In other words, you will pay even if you #add
something that already exists. Optimization of all set operations wrt
no-change remains future work.
How it works?
The most important takeaway is that with USet32, integer magnitude directly affects performance.
See the diagram below.
                (0-31)             (0-15)           (0-15)
LSB          bitmap index         n0 index         n2 index       MSB
               ─────────           ───────         ───────
   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   ───────────           ─────────         ───────         ───────
    bit index           chunk index        n1 index        n3 index
      (0-63)              (0-31)            (0-15)          (0-15)
    ┌─────┐                        ▲
    │N3   │                        │
    └──┬──┘                        │
       │  16-way branch            │
       ▼                           │
    ┌─────┐                        │
    │N2   │                        │
    └──┬──┘                        │
       │  16-way branch            │
       ▼                           │
    ┌─────┐                        │
    │N1   │                        │
    └──┬──┘                        │
       │  16-way branch            │
       ▼                           │
    ┌─────┐                        │
    │N0   │                        │
    └──┬──┘                        │
       │  16-way branch            │  instantiated on-demand
       ▼                           │  as the numbers grow
  ┌─────────┐                      │
  │wide node│                      │
  └────┬────┘                      │
       │  32-way branch            │
       ▼                           │
  ┌─────────┐                      │
  │  chunk  │                      │
  └────┬────┘                      │
       │  32-way branch            │
       ▼                           │
  ┌───────────────────────────┐    │
  │ bitmap                    │    │
  │                           │    │
  │ 0 0 0 0 0 0 0 ... 0 0 0 0 │    │
  │ ───────────────────────── │    │
  │          64 bits          │    │
  └───────────────────────────┘    │
USet32 effectively "reifies" the number tree (each number is a path through that tree), providing an indexing data structure on top of the u32 number line.
Note how this implementation penalizes large numbers or sets containing large numbers (numbers far away from origin).
Bitmaps are very cheap, so if your numbers are frequently in the 0-63 range, you're well set. The larger your number, the more allocation overhead you will have and the "taller" the tree will grow.
Everything from chunk upwards adds an allocation / a pointer dereference. On one hand, that's nice because you will copy less and share more. On the other hand, modern CPUs penalize pointer chasing so you're going to suffer: dozens of nanoseconds up to 100ns for an insertion is not unheard of.
We will probably provide support for USet64-256 in the future, because as you can see, this construction scales rather trivially; and USet256 especially would be useful for storing hashes such as SHA256 and BLAKE3. In that case, easing indirection and control overhead would become a high-priority goal.
Performance
Let's add all i32s. We can't add all u32s because BitArray crashes on that. We don't want to crash our only competitor in Crystal-land.
require "benchmark"
require "bit_array"
Benchmark.ips do |x|
  x.report("uset32") do
    _ = Pf::USet32.transaction do |set|
      (0..Int32::MAX).each do |n|
        set << n.to_u32
      end
    end
  end
  x.report("bit array") do
    bits = BitArray.new(Int32::MAX)
    (0..Int32::MAX).each do |n|
      bits.unsafe_put(n, true)
    end
  end
end
This reports the following on my machine (Ryzen 3 2200G):
   uset32   9.28m (107.77s ) (± 0.00%)  4.79GB/op  46.95× slower
bit array 435.66m (  2.30s ) (± 1.14%)   256MB/op        fastest
Horrendous, isn't it? Except remember that bit array allocates all the memory upfront, and does a neat sequ pass where it doesn't follow even a single pointer. Paradise for a modern CPU!
Now, USet32 is a dynamic, immutable set that is explicitly designed in such a way that large numbers are penalized (versus, say, Roaring bitmaps, where magnitude does not matter and instead small bitmaps are penalized). USet32 lets you work with small numbers practically for free.
If you have a different origin far from zero in your domain, you can normalize the numbers
before inserting them into USet32. To support negative numbers, create two sets: one for
negative numbers and another for positive ones. This is also the way to store Int32s efficiently,
since their raw binary representation is very far from zero if cast to UInt32.
USet32 is particularly well-suited for recording small, sequential ids, with an occasional immutable add or set operation. USet32 is suited for situations where you have thousands of small immutable sets, especially when they have a common ancestor.
In fact, USet32 works beautifully well for set operations, since, point A, it's a trie, and point B, we have bitmaps at the leaves. This is almost perfect for any kind of set operation, be it union, intersection, difference, etc.
Consider, for instance, the following micro-benchmarks:
require "benchmark"
xs_r = (0u32...100_000u32).to_a.shuffle!
xs_uset = xs_r.to_pf_uset32
xs_set = xs_r.to_set
ys_r = (0u32...20_000u32).to_a.shuffle!
ys_uset = ys_r.to_pf_uset32
ys_set = ys_r.to_set
zs_r = (80_000u32...150_000u32).to_a.shuffle!
zs_uset = zs_r.to_pf_uset32
zs_set = zs_r.to_a.shuffle!.to_set
Benchmark.ips do |x|
  x.report("USet32: create 100k") { xs_r.to_pf_uset32 }
  x.report("Set: create 100k") { xs_r.to_set }
end
Benchmark.ips do |x|
  x.report("USet32: subset") { ys_uset.subset_of?(xs_uset) }
  x.report("Set: subset") { ys_set.subset_of?(xs_set) }
end
Benchmark.ips do |x|
  x.report("USet32: xs union zs") { xs_uset | zs_uset }
  x.report("Set: xs union zs") { xs_set | zs_set }
end
Benchmark.ips do |x|
  x.report("USet32: xs intersects? zs") { xs_uset.intersects?(zs_uset) }
  x.report("Set: xs intersects? zs") { xs_set.intersects?(zs_set) }
end
Benchmark.ips do |x|
  x.report("USet32: xs intersect zs") { xs_uset & zs_uset }
  x.report("Set: xs intersect zs") { xs_set & zs_set }
end
Benchmark.ips do |x|
  x.report("USet32: xs difference zs") { xs_uset - zs_uset }
  x.report("Set: xs difference zs") { xs_set - zs_set }
end
They give the following results on my machine:
USet32: create 100k 336.02  (  2.98ms) (± 1.15%)  231kB/op   1.49× slower
   Set: create 100k 500.30  (  2.00ms) (± 2.98%)  2.0MB/op        fastest
USet32: subset  25.07M ( 39.88ns) (±12.30%)  32.0B/op          fastest
   Set: subset   2.61k (382.91µs) (± 0.71%)   0.0B/op  9600.69× slower
USet32: xs union zs 543.26k (  1.84µs) (± 3.44%)  3.38kB/op          fastest
   Set: xs union zs 189.07  (  5.29ms) (± 1.72%)   6.0MB/op  2873.25× slower
# NOTE: the quality of this benchmark needs improvement because it hits the fast
# path in both cases.
USet32: xs intersects? zs  82.90M ( 12.06ns) (± 1.57%)  0.0B/op   1.07× slower
   Set: xs intersects? zs  89.11M ( 11.22ns) (± 1.76%)  0.0B/op        fastest
USet32: xs intersect zs 592.29k (  1.69µs) (± 5.73%)  2.79kB/op          fastest
   Set: xs intersect zs 340.63  (  2.94ms) (± 7.84%)   769kB/op  1738.79× slower
USet32: xs difference zs   1.24M (804.59ns) (± 6.80%)    608B/op          fastest
   Set: xs difference zs 147.97  (  6.76ms) (± 4.26%)  3.75MB/op  8399.25× slower
TODO compare with BitArray and CRoaring. We will likely win over BitArray but lose to CRoaring due to indirection, although probably only on very large bitmaps.
USet32 makes set operations and checks so cheap one almost wants to forgive it the horrendous construction performance.
Obviously I'm not going to compare Set's dup.add with #add because that's not
how the world works; this will be as humiliating for sets as construction perf is
for USet32. Everything has its trade-offs.
NOTE  You are recommended to compile with --mcpu=native to make sure the Crystal compiler
and LLVM consider vectorization if your CPU supports it. Or maybe they consider it anyway;
however, my timings were slightly slower without the flag.
Inspired by
Included Modules
Defined in:
permafrost/uset32.crConstructors
- 
        .[] : USet32
        
          
Constructs an empty set.
 - 
        .[](*values : UInt32) : USet32
        
          
Constructs a set with the given values.
 - .additive_identity : self
 - 
        .new(ee : Enumerable(UInt32)) : USet32
        
          
Construct a set from the given enumerable ee.
 - 
        .new : USet32
        
          
Constructs an empty set.
 - .transaction(& : Commit -> ) : USet32
 
Instance Method Summary
- 
        #&(other : USet32) : USet32
        
          
Intersection: returns a new set containing integers common to this and other sets.
 - 
        #+(other : USet32) : USet32
        
          
Alias of
#|. - 
        #-(other : USet32) : USet32
        
          
Difference: returns a new set containing integers from this set that are not in the other set.
 - #==(other : USet32) : Bool
 - 
        #|(other : USet32) : USet32
        
          
Union: returns a new set containing integers from this and other sets.
 - 
        #add(value : UInt32) : USet32
        
          
Returns a copy of this set with value present.
 - 
        #add?(value : UInt32) : Tuple(USet32, Bool)
        
          
Returns a copy of this set with value present, followed by a boolean indicating whether value was added during the call.
 - 
        #delete(value : UInt32) : USet32
        
          
Returns a copy of this set without value.
 - 
        #each(& : UInt32 -> ) : Nil
        
          
Yields integers stored in this set.
 - 
        #hash(hasher)
        
          
See
Object#hash(hasher) - 
        #includes?(value) : Bool
        
          
Returns
trueif this set contains value. - #inspect(io)
 - 
        #intersects?(other : USet32) : Bool
        
          
Returns
trueif this set has common integers with other. - #pretty_print(pp)
 - 
        #proper_subset_of?(other : USet32) : Bool
        
          
Returns
trueif all integers in this set are contained in other, and other is larger than this set. - 
        #proper_superset_of?(other : USet32) : Bool
        
          
Returns
trueif this set contains all integers from other, and other is smaller than this set. - 
        #size : Int32
        
          
Returns the number of integers in this set.
 - 
        #subset_of?(other : USet32) : Bool
        
          
Returns
trueif all integers in this set are contained on other. - 
        #superset_of?(other : USet32) : Bool
        
          
Returns
trueif this set contains all integers from other. - #to_s(io)
 
Instance methods inherited from module Enumerable(UInt32)
  
  
    
      to_pf_bidi
    to_pf_bidi, 
    
  
    
      to_pf_map(& : T -> Tuple(K, V)) : Pf::Map(K, V) forall K, Vto_pf_map to_pf_map, to_pf_set : Pf::Set(T) to_pf_set, to_pf_uset32 : Pf::USet32 to_pf_uset32
Constructor Detail
Construct a set from the given enumerable ee.
Lets you build a possibly large set using Commit, which offers mutable versions
of methods from USet32.
WARNING  The yielded Commit must, under no circumstances, outlive the block or
escape it otherwise, through retained closures, instance variables, etc. It uses
pointers into stack-allocated memory, so using it after the block returns
will lead to UB.
set = Pf::USet32.transaction do |commit|
  commit << 1
  commit << 2
  if commit.includes?(2)
    commit << 3
  end
end
set # Pf::USet32[1, 2, 3]
        Instance Method Detail
Intersection: returns a new set containing integers common to this and other sets.
Pf::USet32[1, 2, 3] & Pf::USet32[2, 3, 4] # => Pf::USet32[2, 3]
        Difference: returns a new set containing integers from this set that are not in the other set.
Pf::USet32[1, 2, 3] - Pf::USet32[2, 3, 4] # => Pf::USet32[1]
        Union: returns a new set containing integers from this and other sets.
Pf::USet32[1, 2, 3] | Pf::USet32[4, 5, 6] # => Pf::USet32[1, 2, 3, 4, 5, 6]
        Returns a copy of this set with value present.
set = Pf::USet32[1, 2, 3]
set.add(4) # => Pf::USet32[1, 2, 3, 4]
set        # => Pf::USet32[1, 2, 3]
        Returns a copy of this set with value present, followed by a boolean indicating whether value was added during the call.
set = Pf::USet32[1, 2, 3]
set.add?(4) # => {Pf::USet32[1, 2, 3, 4], true}
set.add?(1) # => {Pf::USet32[1, 2, 3], false}
set # => Pf::USet32[1, 2, 3]
        Returns a copy of this set without value.
set = Pf::USet32[1, 2, 3]
set.delete(2) # => Pf::USet32[1, 3]
set           # => Pf::USet32[1, 2, 3]
        Returns true if this set contains value. Returns false otherwise.
set = Pf::USet32[1, 2, 3]
set.includes?(2)   # => true
set.includes?(100) # => false
        Returns true if this set has common integers with other. Returns
false otherwise.
xs = Pf::USet32[1, 2, 3]
ys = Pf::USet32[4, 5, 6]
zs = Pf::USet32[3, 5, 6]
xs.intersects?(ys) # => false, nothing in common
xs.intersects?(zs) # => true, they both have `3`
        Returns true if all integers in this set are contained in other, and
other is larger than this set. Returns false otherwise.
xs = Pf::USet32[1, 2, 3]
ys = Pf::USet32[1, 2, 3, 4, 5, 6]
xs.proper_subset_of?(ys) # => true
ys.subset_of?(ys)        # => true
ys.proper_subset_of?(ys) # => false
        Returns true if this set contains all integers from other, and other
is smaller than this set. Returns false otherwise.
xs = Pf::USet32[1, 2, 3]
ys = Pf::USet32[1, 2, 3, 4, 5, 6]
ys.proper_superset_of?(xs) # => true
ys.superset_of?(ys)        # => true
ys.proper_superset_of?(ys) # => false
        Returns the number of integers in this set.
set = Pf::USet32[1, 2, 3]
set.size # => 3
        Returns true if all integers in this set are contained on other. Returns
false otherwise.
xs = Pf::USet32[1, 2, 3]
ys = Pf::USet32[1, 2, 3, 4, 5, 6]
xs.subset_of?(ys) # => true
ys.subset_of?(xs) # => false
        Returns true if this set contains all integers from other. Returns
false otherwise.
xs = Pf::USet32[1, 2, 3]
ys = Pf::USet32[1, 2, 3, 4, 5, 6]
xs.superset_of?(ys) # => false
ys.superset_of?(xs) # => true