rstrategies¶
A library to implement storage strategies in VMs based on the RPython toolchain. rstrategies can be used in VMs for any language or language family.
This library has been developed as part of a Masters Thesis by Anton Gulenko.
The original paper describing the optimization “Storage Strategies for collections in dynamically typed languages” by C.F. Bolz, L. Diekmann and L. Tratt can be found here.
So far, this library has been adpoted by 3 VMs: RSqueak, Topaz (Forked here) and Pycket (Forked here).
Concept¶
Collections are often used homogeneously, i.e. they contain only objects of the same type. Primitive numeric types like ints or floats are especially interesting for optimization. These cases can be optimized by storing the unboxed data of these objects in consecutive memory. This is done by letting a special “strategy” object handle the entire storage of a collection. The collection object holds two separate references: one to its strategy and one to its storage. Every operation on the collection is delegated to the strategy, which accesses the storage when needed. The strategy can be switched to a more suitable one, which might require converting the storage array.
Usage¶
The following are the steps needed to integrated rstrategies in an RPython VM. Because of the special nature of this library it is not enough to simply call some API methods; the library must be integrated within existing VM classes using a metaclass, mixins and other meta-programming techniques.
The sequence of steps described here is something like a “setup walkthrough”, and might be a bit abstract. To see a concrete example, look at SingletonStorageStrategy, StrategyFactory and W_PointersObject from the RSqueak VM. The code is also well commented.
Basics¶
Currently the rstrategies library supports fixed sized and variable sized collections. This can be used to optimize a wide range of primitive data structures like arrays, lists or regular objects. Any of these are called ‘collections’ in this context. The VM should have a central class or class hierarchy for collections. In order to extend these classes and use strategies, the library needs accessor methods for two attributes of collection objects: strategy and storage. The easiest way is adding the following line to the body of the root collection class:
rstrategies.make_accessors(strategy='strategy', storage='storage')
This will generate the 4 accessor methods
_[get/set]_[storage/strategy]()
for the respective attributes.
Alternatively, implement these methods manually or overwrite the
getters/setters in StrategyFactory
.
Next, the strategy classes must be defined. This requires a small class hierarchy with a dedicated root class. In the definition of this root class, include the following lines:
__metaclass__ = rstrategies.StrategyMetaclass
import_from_mixin(rstrategies.AbstractStrategy)
import_from_mixin(rstrategies.SafeIndexingMixin)
import_from_mixin
can be found in rpython.rlib.objectmodel
. If
index-checking is performed safely at other places in the VM, you can
use rstrategies.UnsafeIndexingMixin
instead. If you need your own
metaclass, you can combine yours with the rstrategies one using multiple
inheritance like
here.
Also implement a storage_factory()
method, which returns an instance
of rstrategies.StorageFactory
, which is described below.
An example AbstractStrategy
class, which also stores an additional space
parameter could looks like this:
class AbstractStrategy(AbstractStrategy):
_attrs_ = ['space']
_immutable_fields_ = ['space']
__metaclass__ = rstrat.StrategyMetaclass
import_from_mixin(rstrat.AbstractStrategy)
import_from_mixin(rstrategies.SafeIndexingMixin)
def __init__(self, space):
self.space = space
def strategy_factory(self):
return self.space.strategy_factory
Strategy classes¶
Now you can create the actual strategy classes, subclassing them from the single root class. The following list summarizes the basic strategies available.
EmptyStrategy
A strategy for empty collections; very efficient, but limited. Does not allocate anything.SingleValueStrategy
A strategy for collections containing the same objectn
times. Only allocates memory to store the size of the collection.GenericStrategy
A non-optimized strategy backed by a generic python list. This is the fallback strategy, since it can store everything, but is not optimized.WeakGenericStrategy
LikeGenericStrategy
, but usesweakref
to hold on weakly to its elements.SingleTypeStrategy
Can store a single unboxed type like int or float. This is the main optimizing strategyTaggingStrategy
Extension of SingleTypeStrategy. Uses a specific value in the value range of the unboxed type to represent one additional, arbitrary object. For example, one offloat
’sNaN
representations can be used to represent special value likenil
.
There are also intermediate classes, which allow creating new, more customized strategies. For this, you should get familiar with the code.
Include one of these mixin classes using import_from_mixin
. The
mixin classes contain comments describing methods or fields which are
also required in the strategy class in order to use them. Additionally,
add the @rstrategies.strategy(generalize=alist)
decorator to all
strategy classes. The alist
parameter must contain all strategies,
which the decorated strategy can switch to, if it can not represent a
new element anymore.
Example
for an implemented strategy. See the other strategy classes behind this
link for more examples.
An example strategy class for optimized int
storage could look like this:
@rstrat.strategy(generalize=[GenericStrategy])
class IntegerOrNilStrategy(AbstractStrategy):
import_from_mixin(rstrat.TaggingStrategy)
contained_type = model.W_Integer
def wrap(self, val): return self.space.wrap_int(val)
def unwrap(self, w_val): return self.space.unwrap_int(w_val)
def wrapped_tagged_value(self): return self.space.w_nil
def unwrapped_tagged_value(self): return constants.MAXINT
Strategy Factory¶
The last part is subclassing rstrategies.StrategyFactory
,
overwriting the method instantiate_strategy
if necessary and passing
the strategies root class to the constructor. The factory provides the
methods switch_strategy
, set_initial_strategy
,
strategy_type_for
which can be used by the VM code to use the
mechanism behind strategies. See the comments in the source code.
The strategy mixins offer the following methods to manipulate the contents of the collection:
- basic API
size
- fixed size API
store
,fetch
,slice
,store_all
,fetch_all
- variable size API
insert
,delete
,append
,pop
If the collection has a fixed size, simply never use any of the variable size methods in the VM code. Since the strategies are singletons, these methods need the collection object as first parameter. For convenience, more fitting accessor methods should be implemented on the collection class itself.
An example strategy factory for the AbstractStrategy
class above could look like this:
class StrategyFactory(rstrategies.StrategyFactory):
_attrs_ = ['space']
_immutable_fields_ = ['space']
def __init__(self, space):
self.space = space
rstrat.StrategyFactory.__init__(self, AbstractStrategy)
def instantiate_strategy(self, strategy_type):
return strategy_type(self.space)
def strategy_type_for(self, list_w, weak=False):
"""
Helper method for handling weak objects specially
"""
if weak:
return WeakListStrategy
return rstrategies.StrategyFactory.strategy_type_for(self, list_w)