| Title: | Regression-Based Boolean Rule Inference |
|---|---|
| Description: | Tools for regression-based Boolean rule inference in artificial intelligence studies. The package fits ridge regression models on conjunction expansions and composes interpretable rule sets. Parallel execution is supported for multi-CPU environments. |
| Authors: | Seyed Amir Malekpour [aut, cre] |
| Maintainer: | Seyed Amir Malekpour <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.0 |
| Built: | 2026-05-19 08:08:32 UTC |
| Source: | https://github.com/compbioipm/rbbr |
MAGIC data
MAGIC_data
An object of class data.frame with 19020 rows and 11 columns.
OR data
OR_data
An object of class data.frame with 1000 rows and 5 columns.
Make predictions for new datapoints by utilizing a trained RBBR model.
rbbr_predictor(trained_model, data_test, num_top_rules = 1, slope = 10, num_cores = 1, verbose = FALSE)

| Argument | Description |
|---|---|
| trained_model | Model returned by 'rbbr_train()'. |
| data_test | The new dataset for which to predict the target class or label probability. Each row is a sample and each column is a feature. |
| num_top_rules | Number of Boolean rules with the best Bayesian Information Criterion (BIC) scores to use for prediction. Default is 1. |
| slope | The slope parameter of the sigmoid activation function. Default is 10. |
| num_cores | Number of parallel workers to use for computation. Adjust according to your system. Default is 1. |
| verbose | Logical. If TRUE, progress messages are shown. Default is FALSE. |

Numeric vector of predicted probabilities (length = nrow(data_test))
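
The role of the slope argument can be illustrated with a plain logistic function. This is a minimal sketch, not the package source: the helper name sigmoid and its midpoint argument (centring at 0.5 for inputs in [0,1]) are assumptions made for illustration, and the form used internally may differ.

```r
# Hypothetical helper: logistic function with an adjustable slope.
# The midpoint of 0.5 is an assumption suited to inputs in [0,1].
sigmoid <- function(x, slope = 10, midpoint = 0.5) {
  1 / (1 + exp(-slope * (x - midpoint)))
}

x <- c(0, 0.25, 0.5, 0.75, 1)

# A larger slope pushes outputs towards 0 or 1;
# a smaller slope keeps them closer to 0.5.
steep  <- sigmoid(x, slope = 10)
gentle <- sigmoid(x, slope = 2)

print(round(steep, 3))
print(round(gentle, 3))
```

Under these assumptions, a larger slope makes the output behave more like a hard Boolean decision, while a smaller slope yields softer probabilities.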

    # Load dataset
    data(example_data)

    # Inspect loaded data
    head(XOR_data)

    # For a fast run, use the first three input features to predict the target class in column 11
    data_train <- XOR_data[1:800, c(1,2,3,11)]
    data_test <- XOR_data[801:1000, c(1,2,3,11)]

    # Train the model
    trained_model <- rbbr_train(data_train, max_feature = 2, num_cores = 1, verbose = TRUE)
    head(trained_model$boolean_rules)

    # Test the model
    data_test_x <- data_test[ ,1:(ncol(data_test)-1)]
    labels <- data_test[ ,ncol(data_test)]
    predicted_label_probabilities <- rbbr_predictor(trained_model, data_test_x, num_top_rules = 1, num_cores = 1, verbose = TRUE)
    head(predicted_label_probabilities)
    head(labels) # true labels

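
The example above ends by printing probabilities next to the true labels. One possible way to turn that comparison into a summary is sketched below. The vectors are simulated stand-ins so the snippet runs on its own, the labels are assumed to be coded 0/1, and the 0.5 cutoff is an assumed decision rule rather than something the package prescribes.

```r
# Simulated stand-ins for the objects created in the example above;
# the values are illustrative only.
predicted_label_probabilities <- c(0.92, 0.08, 0.61, 0.47, 0.99, 0.03)
labels <- c(1, 0, 1, 1, 1, 0)

# Assumed decision rule: predict class 1 when the probability exceeds 0.5.
predicted_labels <- as.integer(predicted_label_probabilities > 0.5)

# Confusion table and overall accuracy.
confusion <- table(predicted = predicted_labels, actual = labels)
accuracy <- mean(predicted_labels == labels)

print(confusion)
print(accuracy)
```

With real output from rbbr_predictor(), the two simulated vectors would be replaced by predicted_label_probabilities and labels from the example.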
Scales input features to the [0,1] interval using the 97.5th percentile of each feature. The last column (target) is not scaled.
rbbr_scaling(data)

| Argument | Description |
|---|---|
| data | A numeric dataset. Each row is a sample and each column is a feature. The target variable is expected in the last column. |

A dataset with scaled features (all columns except the last), capped at 0.9999.

    # Load dataset
    data(example_data)

    # Inspect loaded data
    head(MAGIC_data)

    # Scale features
    data_scaled <- rbbr_scaling(MAGIC_data)
    head(data_scaled)

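
To make the scaling description concrete, the following is a minimal sketch of percentile-based scaling. It is a hypothetical re-implementation, not the code of rbbr_scaling(): the function name scale_by_percentile, the division by the 97.5th percentile without shifting the minimum, and the clamping of negative values to 0 are all assumptions.

```r
# Hypothetical re-implementation for illustration; not the package source.
scale_by_percentile <- function(data, prob = 0.975, cap = 0.9999) {
  data <- as.data.frame(data)
  feature_cols <- seq_len(ncol(data) - 1)  # the last column is the target
  for (j in feature_cols) {
    # Upper reference value: the 97.5th percentile of the feature.
    upper <- stats::quantile(data[[j]], probs = prob, na.rm = TRUE, names = FALSE)
    scaled <- data[[j]] / upper
    # Clamp to [0, cap] so that outliers above the percentile do not exceed the cap.
    data[[j]] <- pmin(pmax(scaled, 0), cap)
  }
  data
}

# Toy data with one outlier in x1; y is the (unscaled) target.
toy <- data.frame(
  x1 = c(1, 2, 3, 4, 100),
  x2 = c(0, 5, 10, 15, 20),
  y  = c(0, 1, 0, 1, 1)
)
scaled_toy <- scale_by_percentile(toy)
print(scaled_toy)
```

Using a high percentile instead of the maximum limits the influence of extreme values: in the toy data, the outlier in x1 is capped rather than compressing every other value towards 0.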
Regression-Based Boolean Rule (RBBR) inference is performed on datasets where the input and target features are either binarized or continuous within the [0,1] range.
rbbr_train(data, max_feature = 3, mode = "1L", slope = 10, weight_threshold = 0, balancing = TRUE, num_cores = NA, verbose = FALSE)

| Argument | Description |
|---|---|
| data | The dataset with features scaled to the [0,1] interval. Each row is a sample and each column is a feature. The target variable must be in the last column. |
| max_feature | The maximum number of input features allowed in a Boolean rule. Default is 3. |
| mode | Either "1L" to fit 1-layered models or "2L" to fit 2-layered models. Default is "1L". |
| slope | The slope parameter of the sigmoid activation function. Default is 10. |
| weight_threshold | Conjunctions whose weights in the fitted ridge regression models exceed this threshold are printed as active conjunctions in the output. Default is 0. |
| balancing | Logical. If TRUE, the distribution of target classes in the dataset is adjusted so that each class is adequately represented. Default is TRUE. Set it to FALSE if data balancing is not needed. |
| num_cores | Number of parallel workers to use for computation. Adjust according to your system. Default is NA (automatic selection). |
| verbose | Logical. If TRUE, progress messages and a progress bar are shown. Default is FALSE. |

Returns a trained model object containing the inferred Boolean rules with the best Bayesian Information Criterion (BIC) scores; in the example, they are accessed as trained_model$boolean_rules.

    # Load dataset
    data(example_data)

    # Example for training a two-layer model
    head(OR_data)

    # For a fast run, use the first three input features to predict the target class in column five
    data_train <- OR_data[1:800, c(1,2,3,5)]
    data_test <- OR_data[801:1000, c(1,2,3,5)]

    # Train the model
    trained_model <- rbbr_train(data_train, max_feature = 2, mode = "2L", balancing = FALSE, num_cores = 1, verbose = TRUE)
    head(trained_model$boolean_rules)

    # Test the model
    data_test_x <- data_test[ ,1:(ncol(data_test)-1)]
    labels <- data_test[ ,ncol(data_test)]
    predicted_label_probabilities <- rbbr_predictor(trained_model, data_test_x, num_top_rules = 10, num_cores = 1, verbose = TRUE)
    head(predicted_label_probabilities)

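
The package description mentions ridge regression on conjunction expansions, with rules scored by BIC. The sketch below shows one plausible reading of that idea for two features; it is not the package's algorithm. The four-minterm expansion, the closed-form ridge fit without intercept, the value of lambda, the Gaussian BIC formula, and the 0.5 cutoff for "active" conjunctions are all assumptions chosen for illustration, and the data are simulated.

```r
set.seed(1)
n <- 200
A <- runif(n)
B <- runif(n)

# Simulated noisy OR-like target, clipped to [0,1].
y <- pmin(pmax(A + B - A * B + rnorm(n, sd = 0.05), 0), 1)

# Assumed conjunction expansion for two features: all four minterms.
X <- cbind(
  "A&B"   = A * B,
  "A&!B"  = A * (1 - B),
  "!A&B"  = (1 - A) * B,
  "!A&!B" = (1 - A) * (1 - B)
)

# Closed-form ridge regression without intercept; lambda is an arbitrary choice.
lambda <- 0.1
beta <- solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))

# Gaussian BIC from the residual sum of squares (k = number of coefficients).
rss <- sum((y - X %*% beta)^2)
bic <- n * log(rss / n) + ncol(X) * log(n)

# Conjunctions whose weight exceeds an illustrative threshold of 0.5.
active <- colnames(X)[drop(beta) > 0.5]

print(round(drop(beta), 2))
print(bic)
print(active)
```

Because A OR B can be written as the sum of the three minterms in which at least one input is true, those conjunctions would be expected to receive weights near 1 in this setup, while the remaining minterm stays near 0. Comparing BIC values across candidate feature subsets is one way such fits could be ranked.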
XOR data
XOR_data
An object of class data.frame with 1000 rows and 11 columns.