Tree ensembles

Note

Currently, only tree ensembles trained with EvoTrees are supported. To use a different tree ensemble training package, such as XGBoost.jl or a Python package, implement a new parameter extraction function. The code for TEModel and extract_evotrees_info is a good starting point for this task.

Formulation

First, one must create and train an EvoTrees tree ensemble model.

using EvoTrees

config = EvoTreeRegressor(nrounds=500, max_depth=5)
evo_model = fit_evotree(config; x_train, y_train)
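The call above assumes that x_train and y_train already exist. As a minimal sketch, they could be a feature matrix with one row per sample and a target vector of matching length. The synthetic target function, sample count, and feature count below are purely illustrative assumptions, not part of the package:

```julia
using Random

Random.seed!(42)  # fixed seed so the sketch is reproducible

n_samples, n_features = 1000, 3

# Feature matrix: one row per sample, values uniform in [-5, 5)
x_train = rand(n_samples, n_features) .* 10 .- 5

# Hypothetical target: a nonlinear function of the features plus small noise
y_train = x_train[:, 1] .^ 2 .+ sin.(x_train[:, 2]) .-
          0.5 .* x_train[:, 3] .+ 0.1 .* randn(n_samples)
```

Any real dataset with the same layout can be used in place of this synthetic one.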

Then the parameters can be extracted from the trained tree ensemble and used to create a JuMP model containing the tree ensemble MIP formulation.

using JuMP
using Gurobi
using Gogeta

# Extract data from EvoTrees model
universal_tree_model = extract_evotrees_info(evo_model)

# Create the JuMP model and formulate
jump = Model(() -> Gurobi.Optimizer())
set_attribute(jump, "OutputFlag", 0) # JuMP or solver-specific attributes can be changed

TE_formulate!(jump, universal_tree_model, MIN_SENSE)

Optimization

There are two ways of optimizing the JuMP model: either by 1) creating the full set of split constraints before optimizing, or 2) using lazy constraints to generate only the necessary ones during the solution process.

1) Full set of constraints

add_split_constraints!(jump, universal_tree_model)
optimize!(jump)

2) Lazy constraints

# Define a callback function. Its form may differ slightly between solvers;
# see the JuMP documentation or your solver's Julia interface documentation.
# 'tree_callback_algorithm' must be called inside the callback.
# The callback refers to the global 'jump' model, which is created below.

function split_constraint_callback_gurobi(cb_data, cb_where::Cint)

    # Only run at integer solutions
    if cb_where != GRB_CB_MIPSOL
        return
    end

    Gurobi.load_callback_variable_primal(cb_data, cb_where)
    tree_callback_algorithm(cb_data, universal_tree_model, jump)

end

jump = direct_model(Gurobi.Optimizer())
TE_formulate!(jump, universal_tree_model, MIN_SENSE)

set_attribute(jump, "LazyConstraints", 1)
set_attribute(jump, Gurobi.CallbackFunction(), split_constraint_callback_gurobi)

optimize!(jump)

The optimal solution (minimum and maximum values for each of the input variables) can be queried after the optimization.

get_solution(jump, universal_tree_model)
objective_value(jump)
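Because the solution is a box (an interval per input variable) rather than a single point, it can be convenient to pick one representative input from it, for example to evaluate the trained ensemble there and compare the prediction with the objective value. The sketch below is a hypothetical helper and not part of Gogeta. It assumes the solution is a collection of two-element [min, max] intervals, possibly with infinite bounds, and that any point strictly inside the box is an equally valid optimum:

```julia
# Hypothetical helper (not part of Gogeta): choose one point from a box
# given as a collection of [min, max] intervals, one per input variable.
function representative_point(intervals)
    map(intervals) do interval
        lo, hi = interval[1], interval[2]
        if isfinite(lo) && isfinite(hi)
            (lo + hi) / 2      # bounded interval: take the midpoint
        elseif isfinite(lo)
            lo + 1.0           # unbounded above: step inside from the lower bound
        elseif isfinite(hi)
            hi - 1.0           # unbounded below: step inside from the upper bound
        else
            0.0                # unbounded in both directions: any value works
        end
    end
end

# Illustrative input, assumed to mirror the shape of the queried solution
example_box = [[-1.0, 3.0], [2.0, Inf], [-Inf, 0.5], [-Inf, Inf]]
x_point = representative_point(example_box)
```

The resulting vector can then be reshaped into a one-row matrix and passed to the trained model's prediction function.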

Recommendations

Using the tree ensemble optimization from this package is straightforward. The only choice the user makes is the solution method: adding the full set of split constraints before optimizing, or generating them with lazy constraints. In our computational tests, lazy constraint generation almost invariably produced models that were computationally easier to solve. We therefore recommend it as the primary solution method, although depending on your use case, trying the full set of constraints might also be worthwhile.