First published on MSDN on Apr 30, 2017
In previous releases of Microsoft R Server, parcel installation required downloading two pre-built parcel files. The 9.1 release improves upon this experience by providing a parcel generator script, generate_mrs_parcel.sh, which generates a single MRS-9.1.0-*.parcel file. Here are the complete instructions to install the MRS parcel in a Cloudera cluster.
In this article we will look at how to make RStudio Server (both Open Source and Commercial) work with an MRS 9.1.0 parcel installation in Cloudera. RStudio Server is available under an open source license as well as a commercial license. You can view the differences here. RStudio Server with a commercial license is also called RStudio Server Pro.
The following steps assume that you already have a Cloudera cluster with the MRS-9.1.0 parcel installed and activated. Run these steps on the edge node (gateway node) of the cluster.
- Download and install RStudio Server Pro from here (OR) RStudio Server Open Source License from here.
RStudio Server Open Source License:
wget https://download2.rstudio.org/rstudio-server-rhel-1.0.143-x86_64.rpm
sudo yum install --nogpgcheck rstudio-server-rhel-1.0.143-x86_64.rpm
sudo rstudio-server verify-installation
sudo rstudio-server version
RStudio Server Pro:
wget https://download2.rstudio.org/rstudio-server-rhel-pro-1.0.143-x86_64.rpm
sudo yum install --nogpgcheck rstudio-server-rhel-pro-1.0.143-x86_64.rpm
sudo rstudio-server verify-installation
sudo rstudio-server version
- We need to set some environment variables at the start of each R session. This can be done with the Renviron file. Append the following lines to /opt/cloudera/parcels/MRS/lib64/R/etc/Renviron:
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/cloudera/parcels/MRS/lib64/R/lib
R_LIBS=/opt/cloudera/parcels/MRS/lib64/R/library
MRS_PARCEL_PATH=/opt/cloudera/parcels/MRS
Then, from the shell (this command does not go into Renviron), link libjvm.so into the MRS hadoop directory. Adjust the JDK path if your node uses a different JDK:
sudo ln -s /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so /opt/cloudera/parcels/MRS/hadoop/libjvm.so
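Appending to Renviron by hand is easy to repeat by accident, which leaves duplicate lines in the file. As a minimal sketch, assuming a POSIX shell and write access to the file, the append step could be made safe to re-run. The helper names append_line_once and configure_renviron are hypothetical and are not part of MRS or RStudio Server.

```shell
#!/bin/sh
# Sketch only: append_line_once and configure_renviron are hypothetical helpers.

# Append a line to a file unless an identical line is already present.
append_line_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Write the three settings from this step into the given Renviron file.
# The second argument is the parcel root; it defaults to the path used above.
configure_renviron() {
    renviron=$1
    mrs=${2:-/opt/cloudera/parcels/MRS}
    # \$ keeps ${LD_LIBRARY_PATH} literal so that R expands it at startup
    append_line_once "$renviron" "LD_LIBRARY_PATH=\${LD_LIBRARY_PATH}:$mrs/lib64/R/lib"
    append_line_once "$renviron" "R_LIBS=$mrs/lib64/R/library"
    append_line_once "$renviron" "MRS_PARCEL_PATH=$mrs"
}

# Example (needs root, because the parcel directory is not user-writable):
#   configure_renviron /opt/cloudera/parcels/MRS/lib64/R/etc/Renviron
```

Running configure_renviron twice leaves the file with exactly one copy of each line.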
- Copy the .RevoHadoopEnvVars.site file from your home directory into the MRS hadoop directory and rename it to RevoHadoopEnvVars.site (without the leading dot). (NOTE: If .RevoHadoopEnvVars.site is not present in your home directory, run the R command once; this generates the site file in the home directory.)
sudo cp ~/.RevoHadoopEnvVars.site /opt/cloudera/parcels/MRS/hadoop
sudo mv /opt/cloudera/parcels/MRS/hadoop/.RevoHadoopEnvVars.site /opt/cloudera/parcels/MRS/hadoop/RevoHadoopEnvVars.site
Then restart RStudio Server so that it picks up the new settings:
sudo rstudio-server stop
sudo rstudio-server restart
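The copy and rename can also be done in one step, with a check for the case where the site file has not been generated yet. This is a minimal sketch assuming a POSIX shell; install_site_file is a hypothetical helper name, and the default paths are the ones used in this article.

```shell
#!/bin/sh
# Sketch only: install_site_file is a hypothetical helper.

# Copy the per-user site file into the MRS hadoop directory, dropping the
# leading dot from the file name. Returns 1 if the source file is missing.
install_site_file() {
    src=${1:-$HOME/.RevoHadoopEnvVars.site}
    dest_dir=${2:-/opt/cloudera/parcels/MRS/hadoop}
    if [ ! -f "$src" ]; then
        echo "missing $src: run R once to generate it" >&2
        return 1
    fi
    cp "$src" "$dest_dir/RevoHadoopEnvVars.site"
}

# Example (needs root for the default destination; pass the source path
# explicitly, because $HOME may point elsewhere under sudo):
#   install_site_file /home/<user>/.RevoHadoopEnvVars.site
```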
- RStudio Server will be available at the following URL: http://<nodename>:8787. (Make sure port 8787 is open.)
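To confirm that the server answers on that port before opening a browser, a quick check from another machine can help. The following is a minimal sketch, assuming curl is installed; rstudio_url and check_rstudio are hypothetical helper names, and the 5-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: rstudio_url and check_rstudio are hypothetical helpers.

# Build the RStudio Server URL for a node; the port defaults to 8787.
rstudio_url() {
    printf 'http://%s:%s\n' "$1" "${2:-8787}"
}

# Try to fetch the URL and report whether the server responded at all.
check_rstudio() {
    url=$(rstudio_url "$@")
    if curl --silent --max-time 5 --output /dev/null "$url"; then
        echo "reachable: $url"
    else
        echo "not reachable: $url (is port ${2:-8787} open?)" >&2
        return 1
    fi
}

# Example:
#   check_rstudio <nodename>
```

A failure here usually points to a firewall rule or to the service not running, which `sudo rstudio-server verify-installation` can help narrow down.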
Let us run the Microsoft R Server examples in the local, Hadoop, and Spark compute contexts using RStudio Server:
LOCAL COMPUTE CONTEXT
LOCAL COMPUTE CONTEXT ON HDFS DATA
HADOOP COMPUTE CONTEXT
SPARK COMPUTE CONTEXT