btgym.envs package

btgym.envs.base module

class btgym.envs.base.BTgymEnv(**kwargs)[source]

Base OpenAI Gym API shell for Backtrader backtesting/trading library.

Keyword Arguments:
 
  • filename=None (str, list) – csv data file.
  • **datafeed_args (any) – any datafeed-related args, passed through to default btgym.datafeed class.
  • dataset=None (btgym.datafeed) – BTgymDataDomain instance, overrides filename or any other datafeed-related args.
  • strategy=None (btgym.strategy) – strategy to be used by engine, any subclass of btgym.strategy.base.BTgymBaseStrategy.
  • engine=None (bt.Cerebro) – environment simulation engine, any bt.Cerebro subclass, overrides strategy arg.
  • network_address=`tcp://127.0.0.1:` (str) – BTGym_server address.
  • port=5500 (int) – network port to use for server - API_shell communication.
  • data_master=True (bool) – let this environment control the data_server;
  • data_network_address=`tcp://127.0.0.1:` (str) – data_server address.
  • data_port=4999 (int) – network port to use for server – data_server communication.
  • connect_timeout=60 (int) – server connection timeout in seconds.
  • render_enabled=True (bool) – enable rendering for this environment;
  • render_modes=[‘human’, ‘episode’] (list) – episode - plotted episode results; human - raw_state observation.
  • **render_args (any) – any render-related args, passed through to renderer class.
  • verbose=0 (int) – verbosity mode, {0 - WARNING, 1 - INFO, 2 - DEBUG}
  • log_level=None (int) – logbook level {DEBUG=10, INFO=11, NOTICE=12, WARNING=13}, overrides verbose arg;
  • log=None (logbook.Logger) – external logbook logger, overrides log_level and verbose args.
  • task=0 (int) – environment id
  • random_seed (int) – numpy random seed, def: None
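
The precedence between the verbose and log_level arguments described above can be sketched as follows. This is a minimal illustration, not btgym's own code: the helper name resolve_log_level is hypothetical, and an external log object (which would bypass both arguments) is not modelled.

```python
# Logbook numeric levels as listed in the keyword arguments above.
DEBUG, INFO, NOTICE, WARNING = 10, 11, 12, 13

# Mapping of the `verbose` argument to a level, per the description above.
VERBOSE_TO_LEVEL = {0: WARNING, 1: INFO, 2: DEBUG}


def resolve_log_level(verbose=0, log_level=None):
    """Hypothetical helper: return the effective numeric log level.

    An explicit `log_level` overrides `verbose`.
    """
    if log_level is not None:
        return log_level
    return VERBOSE_TO_LEVEL[verbose]


print(resolve_log_level())                              # 13
print(resolve_log_level(verbose=2))                     # 10
print(resolve_log_level(verbose=2, log_level=NOTICE))   # 12
```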

Environment kwargs applying logic:

if <engine> kwarg is given:
    do not use default engine and strategy parameters;
    ignore <strategy> kwarg and all strategy and engine-related kwargs.

else (no <engine>):
    use default engine parameters;
    if any engine-related kwarg is given:
        override corresponding default parameter;

    if <strategy> is given:
        do not use default strategy parameters;
        if any strategy related kwarg is given:
            override corresponding strategy parameter;

    else (no <strategy>):
        use default strategy parameters;
        if any strategy related kwarg is given:
            override corresponding strategy parameter;

if <dataset> kwarg is given:
    do not use default dataset parameters;
    ignore dataset related kwargs;

else (no <dataset>):
    use default dataset parameters;
    if any dataset related kwarg is given:
        override corresponding dataset parameter;

If any <other> kwarg is given:
    override corresponding default parameter.
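
The applying logic above can be read as a small resolution function. The sketch below is an assumption-laden illustration rather than the actual BTgymEnv implementation: resolve_params, the grouping of defaults and every parameter name and value in DEFAULTS are hypothetical. An empty dictionary means "take the passed instance as is".

```python
# Hypothetical defaults, grouped by the component they configure.
DEFAULTS = {
    'engine': {'start_cash': 10.0},
    'strategy': {'drawdown_call': 90},
    'dataset': {'filename': None},
    'other': {'port': 5500},
}


def resolve_params(defaults, engine=None, strategy=None, dataset=None, **kwargs):
    """Return effective parameters per group, following the rules above."""

    def overrides(group):
        # kwargs that correspond to a known parameter of this group
        return {k: v for k, v in kwargs.items() if k in defaults[group]}

    resolved = {}
    if engine is not None:
        # Engine given: engine and strategy defaults and kwargs are ignored.
        resolved['engine'] = {}
        resolved['strategy'] = {}
    else:
        resolved['engine'] = {**defaults['engine'], **overrides('engine')}
        if strategy is not None:
            # Strategy given: no defaults, only explicit overrides apply.
            resolved['strategy'] = overrides('strategy')
        else:
            resolved['strategy'] = {**defaults['strategy'], **overrides('strategy')}

    if dataset is not None:
        resolved['dataset'] = {}
    else:
        resolved['dataset'] = {**defaults['dataset'], **overrides('dataset')}

    resolved['other'] = {**defaults['other'], **overrides('other')}
    return resolved


print(resolve_params(DEFAULTS, start_cash=100.0, port=5555))
```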
_seed(seed=None)[source]

Sets env. random seed.

Parameters:seed – int or None
static _comm_with_timeout(socket, message)[source]

Exchanges messages via socket, timeout sensitive.

Parameters:
  • socket – zmq connected socket to communicate via;
  • message – message to send;

Note

socket zmq.RCVTIMEO and zmq.SNDTIMEO should be set to some finite number of milliseconds.

Returns: status – communication result; message – received message if status == ok, otherwise None; time – remote side response time.
Return type: dictionary
_start_server()[source]

Configures backtrader REQ/REP server instance and starts server process.

_stop_server()[source]

Stops BT server process, releases network resources.

_force_control_mode()[source]

Puts BT server to control mode.

_assert_response(response)[source]

Simple watcher: roughly checks whether we are really talking to the environment (i.e. an episode is running). Raises an exception if the given response is not as expected.

_print_space(space, _tab='')[source]

Parses observation space shape or response.

Parameters:space – gym observation space or state.
Returns:description as string.
reset(**kwargs)[source]

Implementation of OpenAI Gym env.reset method. Starts a new episode. Episode data are sampled according to data provider class logic, controlled via kwargs. Refer to BTgym_Server and data provider classes for details.

Parameters:kwargs – any kwargs; this dictionary is passed through to BTgym_server side without any checks and modifications; currently used for data sampling control;
Returns:observation space state

Notes

Currently accepted kwargs are:

episode_config=dict(
    get_new=True,
    sample_type=0,
    b_alpha=1,
    b_beta=1
),
trial_config=dict(
    get_new=True,
    sample_type=0,
    b_alpha=1,
    b_beta=1
)
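
A small helper can assemble this dictionary before it is passed to reset. The sketch below only builds the structure shown above; the function name make_reset_kwargs is hypothetical, and the reading of get_new as "request a new episode or trial" is an assumption, since the meaning of each field is defined by the data provider classes.

```python
def make_reset_kwargs(new_episode=True, new_trial=True,
                      sample_type=0, b_alpha=1, b_beta=1):
    """Hypothetical helper: build the kwargs structure accepted by reset()."""
    sampling = dict(sample_type=sample_type, b_alpha=b_alpha, b_beta=b_beta)
    return dict(
        episode_config=dict(get_new=new_episode, **sampling),
        trial_config=dict(get_new=new_trial, **sampling),
    )


reset_kwargs = make_reset_kwargs(new_trial=False)
print(reset_kwargs)
# With an existing environment instance `env` (not created here):
#     observation = env.reset(**reset_kwargs)
```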
step(action)[source]

Implementation of OpenAI Gym env.step() method. Makes a step in the environment.

Parameters:action – int or dict, action compatible to env.action_space
Returns: tuple (Observation, Reward, Done, Info), in the order used by the OpenAI Gym env.step() convention
close()[source]

Implementation of OpenAI Gym env.close method. Puts BTgym server in Control Mode.

get_stat()[source]

Returns last run episode statistics.

Note

when invoked, forces running episode to terminate.

render(mode='other_mode', close=False)[source]

Implementation of OpenAI Gym env.render method. Visualises current environment state.

Parameters:mode

str, any of these:

`human` - current state observation as price lines;
`episode` - plotted results of last completed episode.

[other_key] - corresponding to any custom observation space key

_stop()[source]

Finishes current episode if any, does nothing otherwise. Leaves server running.

_restart_server()[source]

Restarts server.

_start_data_server()[source]
For data_master environment:
  • configures backtrader REQ/REP server instance and starts server process.
For others:
  • establishes network connection to existing data_server.
_stop_data_server()[source]
For data_master:
  • stops BT server process, releases network resources.
_restart_data_server()[source]

Restarts data_server.

_get_dataset_info()[source]

Retrieves dataset configuration and descriptive statistic.

reset_data(**kwargs)[source]

Resets data provider class used, whatever it means for that class. Gets data_server ready to provide data. Supposed to be called before first env.reset().

Note

when invoked, forces running episode to terminate.

Parameters:**kwargs – data provider class .reset() method specific.

btgym.envs.multidiscrete module

class btgym.envs.multidiscrete.MultiDiscreteEnv(engine, dataset=None, **kwargs)[source]

OpenAI Gym API shell for Backtrader backtesting/trading library with support for multiple data streams (assets). Action space is a dictionary of discrete actions for every asset.

Multi-asset setup explanation:

1. This environment expects Dataset to be an instance of btgym.datafeed.multi.BTgymMultiData, which sets the number, specifications and sampling synchronisation of historic data for all assets one wants to trade jointly.

2. Internally every episodic asset data is converted to single bt.feed and added to environment strategy as separate named data-line (see backtrader docs for extensive explanation of data-lines concept). Strategy is expected to properly handle all received data-lines.

3. btgym.spaces.ActionDictSpace and order execution. Strategy expects to receive a separate action for every asset in the form of a dictionary: {asset_name_1: action, …, asset_name_K: action} for K assets added, and issues orders for all assets within a single strategy step. Actions are assumed to be discrete [for this environment] and the same for every asset. Base actions are set by strategy.params.portfolio_actions; defaults are (‘hold’, ‘buy’, ‘sell’, ‘close’), which is equivalent to gym.spaces.Discrete with depth N=4 (i.e. actions 0, 1, 2, 3). That is, for K assets the environment action space will be a shallow dictionary (DictSpace) of discrete spaces: {asset_name_1: gym.spaces.Discrete(N), …, asset_name_K: gym.spaces.Discrete(N)}

Example:

if datalines added via BTgymMultiData are: ['eurchf', 'eurgbp', 'eurgpy', 'eurusd'],
and base asset actions are ['hold', 'buy', 'sell', 'close'], then:

env.action_space will be:
    DictSpace(
        {
            'eurchf': gym.spaces.Discrete(4),
            'eurgbp': gym.spaces.Discrete(4),
            'eurgpy': gym.spaces.Discrete(4),
            'eurusd': gym.spaces.Discrete(4),
        }
    )
single environment action instance (as seen inside strategy):
    {
        'eurchf': 'hold',
        'eurgbp': 'buy',
        'eurgpy': 'hold',
        'eurusd': 'close',
    }
corresponding action integer encoding as passed to environment via .step():
    {
        'eurchf': 0,
        'eurgbp': 1,
        'eurgpy': 0,
        'eurusd': 3,
    }
vector of integers (categorical):
    (0, 1, 0, 3)
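
The conversions between the three representations in this example can be sketched in plain Python. The helper names below are hypothetical and are not part of btgym; asset and action names are copied from the example above.

```python
BASE_ACTIONS = ('hold', 'buy', 'sell', 'close')
ASSETS = ('eurchf', 'eurgbp', 'eurgpy', 'eurusd')


def names_to_ints(action_names):
    """Dictionary of action names -> dictionary of integer codes."""
    return {asset: BASE_ACTIONS.index(action_names[asset]) for asset in ASSETS}


def ints_to_vector(action_ints):
    """Dictionary of integer codes -> categorical vector, in ASSETS order."""
    return tuple(action_ints[asset] for asset in ASSETS)


def vector_to_names(vector):
    """Categorical vector -> dictionary of action names."""
    return {asset: BASE_ACTIONS[code] for asset, code in zip(ASSETS, vector)}


as_seen_by_strategy = {
    'eurchf': 'hold', 'eurgbp': 'buy', 'eurgpy': 'hold', 'eurusd': 'close',
}
as_passed_to_step = names_to_ints(as_seen_by_strategy)
print(as_passed_to_step)                  # {'eurchf': 0, 'eurgbp': 1, 'eurgpy': 0, 'eurusd': 3}
print(ints_to_vector(as_passed_to_step))  # (0, 1, 0, 3)
```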

4. Environment actions cardinality and encoding. Note that the total set of environment actions for K assets and N base actions is a Cartesian product of K sets of N elements each, i.e. N^K actions. It can be encoded as a vector of integers, a single scalar, binary or one_hot. As cardinality grows exponentially with K, the multi-discrete action setup is only suited for a small number of assets.

Example:

Setup with 4 assets and 4 base actions [hold, buy, sell, close] spawns total of 256 possible
environment actions expressed by single integer in [0, 255] or binary encoding:
    vector str :                            vector:         int:   binary:
    ('hold', 'hold', 'hold', 'hold')     -> (0, 0, 0, 0) -> 0   -> 00000000
    ('hold', 'hold', 'hold', 'buy')      -> (0, 0, 0, 1) -> 1   -> 00000001
    ...         ...         ...
    ('close', 'close', 'close', 'sell')  -> (3, 3, 3, 2) -> 254 -> 11111110
    ('close', 'close', 'close', 'close') -> (3, 3, 3, 3) -> 255 -> 11111111
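
The table above treats the action vector as a K-digit number in base N. A minimal sketch of that encoding follows; the function names are hypothetical, and btgym's internal encoders may be organised differently.

```python
def vector_to_int(vector, n_actions=4):
    """Read the vector as a base-`n_actions` number, first asset most significant."""
    value = 0
    for code in vector:
        if not 0 <= code < n_actions:
            raise ValueError('action code out of range: {}'.format(code))
        value = value * n_actions + code
    return value


def int_to_vector(value, n_assets=4, n_actions=4):
    """Inverse of vector_to_int for a fixed number of assets."""
    digits = []
    for _ in range(n_assets):
        value, digit = divmod(value, n_actions)
        digits.append(digit)
    return tuple(reversed(digits))


def int_to_binary(value, n_assets=4, n_actions=4):
    """Fixed-width binary string wide enough for the largest scalar action."""
    width = (n_actions ** n_assets - 1).bit_length()
    return format(value, '0{}b'.format(width))


print(vector_to_int((3, 3, 3, 2)))   # 254
print(int_to_binary(254))            # 11111110
print(int_to_vector(1))              # (0, 0, 0, 1)
```

With K=4 and N=4 this covers 4**4 = 256 actions; each added asset multiplies the count by N.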

Internally the encodings get somewhat awkward, as we jump back and forth between a dictionary of names or categorical encodings and binary or one-hot encodings. As a rule: the strategy operates with a dictionary of string names of actions, the environment sees an action as a dictionary of integers, while the policy estimator operates with either binary or one-hot encoding.

5. Observation space: is a nested DictSpace, where the ‘external’ part of the space should hold specifications for every asset added.

Example:

if datalines added via BTgymMultiData are:
    'eurchf', 'eurgbp', 'eurgpy', 'eurusd';

environment observation space should be DictSpace:
 {
    'raw': spaces.Box(low=-1000, high=1000, shape=(128, 4), dtype=np.float32),
    'external': DictSpace(
        {
            'eurusd': spaces.Box(low=-1000, high=1000, shape=(128, 1, num_features), dtype=np.float32),
            'eurgbp': spaces.Box(low=-1000, high=1000, shape=(128, 1, num_features), dtype=np.float32),
            'eurchf': spaces.Box(low=-1000, high=1000, shape=(128, 1, num_features), dtype=np.float32),
            'eurgpy': spaces.Box(low=-1000, high=1000, shape=(128, 1, num_features), dtype=np.float32),
        }
    ),
    'internal': spaces.Box(...),
    'datetime': spaces.Box(...),
    'metadata': DictSpace(...)
}

refer to strategies declarations for full code.

This class requires dataset, strategy, engine instances to be passed explicitly.

Parameters:
  • dataset (btgym.datafeed) – BTgymDataDomain instance;
  • engine (bt.Cerebro) – environment simulation engine, any bt.Cerebro subclass,
Keyword Arguments:
 
  • network_address=`tcp://127.0.0.1:` (str) – BTGym_server address.
  • port=5500 (int) – network port to use for server - API_shell communication.
  • data_master=True (bool) – let this environment control the data_server;
  • data_network_address=`tcp://127.0.0.1:` (str) – data_server address.
  • data_port=4999 (int) – network port to use for server – data_server communication.
  • connect_timeout=60 (int) – server connection timeout in seconds.
  • render_enabled=True (bool) – enable rendering for this environment;
  • render_modes=[‘human’, ‘episode’] (list) – episode - plotted episode results; human - raw_state observation.
  • **render_args (any) – any render-related args, passed through to renderer class.
  • verbose=0 (int) – verbosity mode, {0 - WARNING, 1 - INFO, 2 - DEBUG}
  • log_level=None (int) – logbook level {DEBUG=10, INFO=11, NOTICE=12, WARNING=13}, overrides verbose arg;
  • log=None (logbook.Logger) – external logbook logger, overrides log_level and verbose args.
  • task=0 (int) – environment id

btgym.envs.portfolio module

class btgym.envs.portfolio.PortfolioEnv(engine, dataset=None, **kwargs)[source]

OpenAI Gym API shell for Backtrader backtesting/trading library with support for multiple assets. Action space is a dictionary of continuous actions for every asset. This setup closely relates to the continuous portfolio optimisation problem definition.

Setup explanation:

0. Problem definition. Consider a setup with one riskless asset acting as broker account cash and K (by default, one) risky assets. For every risky asset there exists a track of historic price records referred to as a data-line. Apart from asset data lines, there may exist a number of exogenous data lines holding information and statistics, e.g. economic indexes, encoded news, macroeconomic indicators, weather forecasts etc., which are considered relevant and valuable for decision-making. This setup assumes that: i. there is no interest rate for the base (riskless) asset; ii. short selling is not permitted; iii. transaction costs are modelled via broker commission; iv. ‘market liquidity’ and ‘capital impact’ assumptions are met; v. time indexes match for all data lines provided.

1. Assets and datalines. This environment expects Dataset to be instance of btgym.datafeed.multi.BTgymMultiData, which sets number, specifications and sampling synchronisation for historic data for all assets and data lines.

Namely, one should define data_config dictionary of data lines and list of assets. data_config specifies all data sources used by strategy, while assets defines subset of data lines which is supposed to hold historic data for risky portfolio assets.

Internally every episodic asset data is converted to a single bt.feed and added to the environment strategy as a separate named data_line (see backtrader docs for an extensive explanation of the data_lines concept). Every non-asset data line is also added as a bt.feed, with the difference that it is not ‘tradable’, i.e. it is impossible to issue trade orders on such a line. Strategy is expected to properly handle all received data-lines.

Example:

1. Four data streams added via Dataset.data_config,
   portfolio consists of four assets, added via strategy_params, cash is EUR:

    data_config = {
        'usd': {'filename': '.../DAT_ASCII_EURUSD_M1_2017.csv'},
        'gbp': {'filename': '.../DAT_ASCII_EURGBP_M1_2017.csv'},
        'jpy': {'filename': '.../DAT_ASCII_EURJPY_M1_2017.csv'},
        'chf': {'filename': '.../DAT_ASCII_EURCHF_M1_2017.csv'},
    }
    cash_name = 'eur'
    assets_names = ['usd', 'gbp', 'jpy', 'chf']

2. Three streams added, only two of them form portfolio; DXY stream is `decision-making` only:
    data_config = {
        'usd': {'filename': '.../DAT_ASCII_EURUSD_M1_2017.csv'},
        'gbp': {'filename': '.../DAT_ASCII_EURGBP_M1_2017.csv'},
        'DXY': {'filename': '.../DAT_ASCII_DXY_M1_2017.csv'},
    }
    cash_name = 'eur'
    assets_names = ['usd', 'gbp']
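
The relation between data_config and assets_names in these examples (assets are a subset of the data lines, the remainder being decision-making only) can be checked with a short sketch. The helper split_data_lines is hypothetical and not part of btgym; file paths are kept elided as in the example.

```python
def split_data_lines(data_config, assets_names):
    """Return (tradable, exogenous) data line names.

    Raises ValueError if an asset has no corresponding data line.
    """
    unknown = [name for name in assets_names if name not in data_config]
    if unknown:
        raise ValueError('assets without a data line: {}'.format(unknown))
    exogenous = [name for name in data_config if name not in assets_names]
    return list(assets_names), exogenous


data_config = {
    'usd': {'filename': '.../DAT_ASCII_EURUSD_M1_2017.csv'},
    'gbp': {'filename': '.../DAT_ASCII_EURGBP_M1_2017.csv'},
    'DXY': {'filename': '.../DAT_ASCII_DXY_M1_2017.csv'},
}
assets_names = ['usd', 'gbp']

tradable, exogenous = split_data_lines(data_config, assets_names)
print(tradable)    # ['usd', 'gbp']
print(exogenous)   # ['DXY']
```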

2. btgym.spaces.ActionDictSpace and order execution. ActionDictSpace is an extension of OpenAI Gym DictSpace providing domain-specific functionality. Strategy expects to receive a separate action for each of the K+1 assets in the form of a dictionary: {cash_name: a[0], asset_name_1: a[1], …, asset_name_K: a[K]} for K risky assets added, where base actions are real numbers: a[i] in [0,1], 0<=i<=K, SUM{a[i]} = 1. The whole action should be interpreted as an order to adjust the portfolio to hold a share of a[i] * 100% in the i-th asset.

Therefore, base actions are gym.spaces.Box and for K assets environment action space will be a shallow DictSpace of K+1 continuous spaces: {cash_name: gym.spaces.Box(low=0, high=1), asset_name_1: gym.spaces.Box(low=0, high=1), …, asset_name_K: gym.spaces.Box(low=0, high=1)}

  1. TODO: refine order execution control, see: https://community.backtrader.com/topic/152/multi-asset-ranking-and-rebalancing/2?page=1

    Example:

    if cash asset is 'eur',
    risky assets added are: ['chf', 'gbp', 'jpy', 'usd'],
    and data lines added via BTgymMultiData are:
    {
        'chf': eurchf_hist_data_source,
        'gbp': eurgbp_hist_data_source,
        'jpy': eurjpy_hist_data_source,
        'usd': eurusd_hist_data_source,
    },
    then:
    
    env.action_space will be:
        DictSpace(
            {
                'eur': gym.spaces.Box(low=0, high=1, dtype=np.float32),
                'chf': gym.spaces.Box(low=0, high=1, dtype=np.float32),
                'gbp': gym.spaces.Box(low=0, high=1, dtype=np.float32),
                'jpy': gym.spaces.Box(low=0, high=1, dtype=np.float32),
                'usd': gym.spaces.Box(low=0, high=1, dtype=np.float32),
            }
        )
    
    single environment action instance (as seen inside strategy or passed to environment via .step()):
        {
            'eur': 0.3,
            'chf': 0.1,
            'gbp': 0.1,
            'jpy': 0.2,
            'usd': 0.3,
        }
    
    or vector (unlike multi-asset discrete setup, there is no binary/one hot encoding):
        (0.3, 0.1, 0.1, 0.2, 0.3)
    
    which says to broker: "... adjust positions to get 30% in base EUR asset (cash), and amounts of
    10%, 10%, 20% and 30% of current portfolio value in CHF, GBP, JPY and USD respectively".
    
    Note that under the hood broker uses `order_target_percent` for every risky asset and can issue
    'sell', 'buy' or 'close' orders depending on positive/negative difference of current to desired
    share of asset.
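
How a target-share action translates into order directions can be sketched without the broker. This is an illustration under stated assumptions, not btgym's or backtrader's implementation: normalize_action and order_sides are hypothetical helpers, the current shares are made-up inputs, and a value of None stands for "no order needed".

```python
def normalize_action(action):
    """Clip every share to [0, 1] and rescale so that the shares sum to 1."""
    clipped = {name: min(max(float(w), 0.0), 1.0) for name, w in action.items()}
    total = sum(clipped.values())
    if total == 0:
        raise ValueError('at least one share must be positive')
    return {name: w / total for name, w in clipped.items()}


def order_sides(target, current, cash_name='eur', tol=1e-9):
    """Compare target and current shares of each risky asset."""
    sides = {}
    for name, target_share in target.items():
        if name == cash_name:
            continue  # cash is the residual; no order is issued on it
        diff = target_share - current.get(name, 0.0)
        if abs(diff) <= tol:
            sides[name] = None       # already at the desired share
        elif target_share <= tol:
            sides[name] = 'close'    # desired share is zero
        elif diff > 0:
            sides[name] = 'buy'
        else:
            sides[name] = 'sell'
    return sides


action = {'eur': 0.3, 'chf': 0.1, 'gbp': 0.1, 'jpy': 0.2, 'usd': 0.3}
current = {'chf': 0.0, 'gbp': 0.3, 'jpy': 0.2, 'usd': 0.4}  # made-up shares
print(order_sides(normalize_action(action), current))
```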
    

3. Observation space: is a nested DictSpace, where the ‘external’ part of the space should hold specifications for every data line added (note that the cash asset does not have its own data line).

Example:

if data lines added via BTgymMultiData are:
    'chf', 'gbp', 'jpy', 'usd';

environment observation space can be DictSpace:
 {
    'external': DictSpace(
        {
            'usd': spaces.Box(low=-1000, high=1000, shape=(128, 1, num_features), dtype=np.float32),
            'gbp': spaces.Box(low=-1000, high=1000, shape=(128, 1, num_features), dtype=np.float32),
            'chf': spaces.Box(low=-1000, high=1000, shape=(128, 1, num_features), dtype=np.float32),
            'jpy': spaces.Box(low=-1000, high=1000, shape=(128, 1, num_features), dtype=np.float32),
        }
    ),
    'raw': spaces.Box(...),
    'internal': spaces.Box(...),
    'datetime': spaces.Box(...),
    'metadata': DictSpace(...)
}

refer to strategies declarations for full code.

This class requires dataset, strategy, engine instances to be passed explicitly.

Parameters:
  • dataset (btgym.datafeed) – BTgymDataDomain instance;
  • engine (bt.Cerebro) – environment simulation engine, any bt.Cerebro subclass,
Keyword Arguments:
 
  • network_address=`tcp://127.0.0.1:` (str) – BTGym_server address.
  • port=5500 (int) – network port to use for server - API_shell communication.
  • data_master=True (bool) – let this environment control the data_server;
  • data_network_address=`tcp://127.0.0.1:` (str) – data_server address.
  • data_port=4999 (int) – network port to use for server – data_server communication.
  • connect_timeout=60 (int) – server connection timeout in seconds.
  • render_enabled=True (bool) – enable rendering for this environment;
  • render_modes=[‘human’, ‘episode’] (list) – episode - plotted episode results; human - raw_state observation.
  • **render_args (any) – any render-related args, passed through to renderer class.
  • verbose=0 (int) – verbosity mode, {0 - WARNING, 1 - INFO, 2 - DEBUG}
  • log_level=None (int) – logbook level {DEBUG=10, INFO=11, NOTICE=12, WARNING=13}, overrides verbose arg;
  • log=None (logbook.Logger) – external logbook logger, overrides log_level and verbose args.
  • task=0 (int) – environment id
step(action)[source]

Implementation of OpenAI Gym env.step() method. Makes a step in the environment.

Parameters:action – int or dict, action compatible to env.action_space
Returns: tuple (Observation, Reward, Done, Info), in the order used by the OpenAI Gym env.step() convention