
CEP made easyR – Drools with R using RStudio

Introduction

[ UPDATE ] A new and improved approach to using Drools, including its CEP features, with R is to issue REST API calls to a Drools Decision Server. An approach to this is described here.

R is a functional programming language (FPL) specialising in statistical analysis. Drools is that most excellent community project to build a “universal behavioural platform”. The project includes Fusion, its complex event processing (CEP) sub-project. By mashing R and Drools CEP together with RStudio you have the makings of a very powerful visual IDE for simulating data sets and testing rules written in the Drools CEP syntax. This gives developers an agile environment for learning and applying the Drools CEP MVEL syntax, immediately accessible from a browser without any local configuration. No Java code is needed to set up a Drools test client, and the Drools rules engine is accessed using a few simple API calls. R is rich in functions for extracting, transforming and loading data from sources such as REST, spreadsheets, CSV files, databases, cURL and SOAP. This takes the tedium out of generating and preparing sample data sets as inputs to rules services.

RStudio-ScreenShot

Setup

System configuration assumes a working RStudio environment, either the client or the server version. To set one up, visit http://www.rstudio.com/ and follow the installation instructions there, or read RStudio Server on Fedora.

To integrate Drools with R, a Drools package has been written for R. This package enables you to process timestamped event streams driven by a pseudo clock. On first use of the Drools package you may need to add further R packages; just use the RStudio package installer to add whatever else is necessary as missing packages are encountered. Note that the package has been pre-loaded into the RStudio Server in the instructions for the Fedora RStudio Server set-up referenced above. To install the packages independently do:

# Install whatever R packages you need ...
$ wget http://cran.rstudio.com/src/contrib/rJava_0.9-6.tar.gz
$ sudo R CMD INSTALL rJava_0.9-6.tar.gz
# Use the GitHub "raw" URLs; the "blob" pages return HTML, not the tarball
$ wget https://github.com/StefanoPicozzi/Rdrools6/raw/master/Rdrools6jars_0.0.1.tar.gz
$ sudo R CMD INSTALL Rdrools6jars_0.0.1.tar.gz
$ wget https://github.com/StefanoPicozzi/Rdrools6/raw/master/Rdrools6_0.0.1.tar.gz
$ sudo R CMD INSTALL Rdrools6_0.0.1.tar.gz
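
Before running anything, it can help to confirm from inside R which of the packages are actually installed. The following is a minimal sketch using only base R; the package list is an assumption drawn from the install commands above and the libraries loaded by the sample script, so adjust it to your own set-up.

```r
# Packages assumed to be needed: rJava and Rdrools6 from the install steps,
# httr and rjson because the sample script loads them.
required <- c("rJava", "httr", "rjson", "Rdrools6")

# installed.packages() lists what is installed without loading anything.
available <- required %in% rownames(utils::installed.packages())
missing.packages <- required[!available]

if (length(missing.packages) > 0) {
  message("Missing packages: ", paste(missing.packages, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Anything reported as missing can then be added with the RStudio package installer or the commands above.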

Usage Pattern

Usage is based on a very simple pattern in which an input event stream is processed with an MVEL rules file. All facts are created within the rules file itself. A fact known as “output” is then used to pass content back to the client. R makes the pre- and post-processing of data much easier, leaving you more time for cycles of authoring and testing your rules syntax. To use the Drools package, follow the steps below; a code fragment follows:

  • Create an input dataframe in which you hold the input data, e.g. inputdata
  • Assign column names to the input dataframe, e.g. input.columns
  • Assign column names for the dataframe to hold the output of the rules execution, e.g. output.columns
  • Tell R where your MVEL rules file is located, e.g. rules
  • Set the rules engine to use STREAM mode, e.g. mode <- "STREAM"
  • Set up the rules session, e.g. rules.session
  • Run the rules and capture the output dataframe generated, e.g. outputdata
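
Since the steps above depend on a few things being in place, a small pre-flight check can save a debugging round trip. This is only a sketch: check_rules_inputs is a hypothetical helper written for this post, not part of the Rdrools6 package, and it uses base R alone.

```r
# Hypothetical helper (not part of Rdrools6): verifies the pieces the usage
# pattern needs before a rules session is created.
check_rules_inputs <- function(inputdata, rules.file, output.columns) {
  stopifnot(is.data.frame(inputdata), nrow(inputdata) > 0)
  stopifnot(is.character(output.columns), length(output.columns) > 0)
  if (!file.exists(rules.file)) {
    stop("Rules file not found: ", rules.file)
  }
  invisible(TRUE)
}

# Usage with a throwaway rules file so the check can run anywhere
rules.file <- tempfile(fileext = ".txt")
writeLines("// rules go here", rules.file)
inputdata <- data.frame(obsid = "1", obsdate = "2014-05-22 00:00:00", obsvalue = 10)
ok <- check_rules_inputs(inputdata, rules.file, c("rulename", "rulevalue"))
```

The helper stops with an error if the dataframe is empty, the output column names are missing, or the rules file cannot be found.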

Sample

Simple Sample

To reproduce the example shown above, create an R script file named sample.R with the following contents, and then create a rules file named rules.txt as follows.

sample.R

# Script to author and test rules files using Drools6 packages inside R
# Pull down observations and apply all rules
Sys.setenv(NOAWT = "true")

library("httr")
library("rjson")
library("Rdrools6")

setwd("~/Sample")

# Set up some sample input data
row <- data.frame(obsid = "1", obsdate = "2014-05-22 00:00:00", obsvalue = 10)
inputdata <- row
row <- data.frame(obsid = "2", obsdate = "2014-05-21 00:00:00", obsvalue = 20)
inputdata <- rbind(inputdata, row)
row <- data.frame(obsid = "2", obsdate = "2014-05-20 00:00:00", obsvalue = 30)
inputdata <- rbind(inputdata, row)
input.columns <- colnames(inputdata)

# Set up the output column names
output.columns <- c("rulename", "rulevalue")

# set up rules file
rules <- readLines("rules.txt")
mode <- "STREAM"

# Apply rules
rules.session <- rulesSession(mode, rules, input.columns, output.columns)
outputdata <- runRules(rules.session, inputdata)
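
As a variation on the data set-up in sample.R, the same three observations can be built in a single data.frame call and put into chronological order. This is a sketch only: whether the rules engine needs events in ascending timestamp order is an assumption, not something the package documentation quoted here confirms.

```r
# Same three observations as sample.R, built in one call.
inputdata <- data.frame(
  obsid    = c("1", "2", "2"),
  obsdate  = c("2014-05-22 00:00:00", "2014-05-21 00:00:00", "2014-05-20 00:00:00"),
  obsvalue = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Parse the timestamps only to obtain a sort key; obsdate itself stays a
# string, which is the form the rules file parses.
sort.key <- as.POSIXct(inputdata$obsdate, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
inputdata <- inputdata[order(sort.key), ]
rownames(inputdata) <- NULL

input.columns <- colnames(inputdata)
```

The resulting inputdata and input.columns can be passed to rulesSession and runRules exactly as in sample.R.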

rules.txt

import java.util.HashMap;
import org.json.JSONObject;
import java.util.Date; 
import java.text.SimpleDateFormat; 
import com.satimetry.nudge.Output;

global java.util.HashMap output;
global SimpleDateFormat inSDF;
global SimpleDateFormat outSDF;

function void print(String txt) {
   System.out.println(txt);
}

declare Observation
  @role( event )
  @timestamp( obsdate )
  obsid : String @key
  obsdate: Date @key
  obsvalue: Integer
end


rule "ruleInsertObservation"
  salience 1000
  when
   $input : JSONObject() from entry-point DEFAULT 
  then
      inSDF = new SimpleDateFormat("yyyy-M-d h:m:s");
      Date obsdate = inSDF.parse( $input.get("obsdate").toString() );
      Observation $observation = new Observation( $input.get("obsid").toString(), obsdate );
      $observation.setObsvalue( Integer.parseInt($input.get("obsvalue").toString()) );
      insert( $observation );
      print(drools.getRule().getName() + "->" + $observation.getObsid() + "-" + $observation.getObsdate() );
end

rule "ruleTotalValue"
  salience -1000
  no-loop true
  when
      $total : Number( intValue > 0) from accumulate(
      Observation( $obsvalue: obsvalue ) over window:time( 30d ),
      sum ( $obsvalue ) )
  then
      JSONObject joutput = new JSONObject();
      joutput.put("rulename", drools.getRule().getName());
      joutput.put("rulevalue", $total);
      Output $output = new Output(joutput.toString());
      insert($output);
      print(drools.getRule().getName() + "->" + $total);
end
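
To sanity-check what ruleTotalValue should report, the 30-day sum can be reproduced in plain R and compared by eye with the rulevalue column of outputdata. This is a rough sketch: it assumes the window is measured back from the latest observation, and it only predicts the total once all three sample events are inside the window; the rule may also emit intermediate totals as events arrive.

```r
# The three sample observations, with timestamps parsed for date arithmetic.
obs <- data.frame(
  obsdate  = as.POSIXct(c("2014-05-22 00:00:00",
                          "2014-05-21 00:00:00",
                          "2014-05-20 00:00:00"), tz = "UTC"),
  obsvalue = c(10, 20, 30)
)

# Assumed window: the 30 days ending at the most recent observation.
window.end   <- max(obs$obsdate)
window.start <- window.end - as.difftime(30, units = "days")
in.window    <- obs$obsdate > window.start & obs$obsdate <= window.end

# Total that the rule is expected to reach for the sample data.
expected.total <- sum(obs$obsvalue[in.window])
```

For the sample data all three observations fall inside the window, so expected.total is the plain sum of the obsvalue column.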

 

 

 
