Using RStudio Server with Microsoft R Server Parcel for Cloudera
Published Mar 23 2019 04:25 PM
Microsoft
First published on MSDN on Apr 30, 2017
In previous releases of Microsoft R Server (MRS), parcel installation required downloading two pre-built parcel files. The 9.1 release improves on this by providing a parcel generator script, generate_mrs_parcel.sh, that generates a single MRS-9.1.0-*.parcel file. Complete instructions for installing the MRS parcel on a Cloudera cluster are documented separately.

This article shows how to make RStudio Server (both the open source and commercial editions) work with an MRS 9.1.0 parcel installation on Cloudera. RStudio Server is available under an open source license and under a commercial license, and RStudio publishes a comparison of the two. The commercially licensed edition is also called RStudio Server Pro.

The following steps assume that you already have a Cloudera cluster with the MRS-9.1.0 parcel installed and activated. Run them on an edge node (gateway node) of the cluster.

  • Download and install either the open source edition of RStudio Server or RStudio Server Pro, using the matching set of commands below.


RStudio Server (open source license):
wget https://download2.rstudio.org/rstudio-server-rhel-1.0.143-x86_64.rpm
sudo yum install --nogpgcheck rstudio-server-rhel-1.0.143-x86_64.rpm
sudo rstudio-server verify-installation
sudo rstudio-server version



RStudio Server Pro:
wget https://download2.rstudio.org/rstudio-server-rhel-pro-1.0.143-x86_64.rpm
sudo yum install --nogpgcheck rstudio-server-rhel-pro-1.0.143-x86_64.rpm
sudo rstudio-server verify-installation
sudo rstudio-server version
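The two installs differ only in the package file name, so the download step can be parameterized. The following is a minimal sketch, assuming a hypothetical helper named rstudio_rpm_name (not part of RStudio or MRS) and the 1.0.143 file-name pattern shown in the commands above; other versions may be named differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the RPM file name for the open source ("open")
# or commercial ("pro") edition, following the 1.0.143 naming shown above.
rstudio_rpm_name() {
  local edition="$1" version="$2"
  case "$edition" in
    open) echo "rstudio-server-rhel-${version}-x86_64.rpm" ;;
    pro)  echo "rstudio-server-rhel-pro-${version}-x86_64.rpm" ;;
    *)    echo "unknown edition: $edition" >&2; return 1 ;;
  esac
}

# Usage (same download host as the commands above):
#   rpm="$(rstudio_rpm_name open 1.0.143)"
#   wget "https://download2.rstudio.org/${rpm}"
#   sudo yum install --nogpgcheck "${rpm}"
```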



  • Some environment variables must be set at the start of each R session. The Renviron file handles this: append the following lines to /opt/cloudera/parcels/MRS/lib64/R/etc/Renviron


LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/cloudera/parcels/MRS/lib64/R/lib
R_LIBS=/opt/cloudera/parcels/MRS/lib64/R/library
MRS_PARCEL_PATH=/opt/cloudera/parcels/MRS
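If this step is rerun, a plain append would write the same settings twice. The following is a minimal sketch of an idempotent append, assuming a hypothetical helper named append_renviron_lines and a Renviron file that already ends with a newline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append each given line to the file only if that exact
# line is not already present, so rerunning the step creates no duplicates.
append_renviron_lines() {
  local renviron="$1"; shift
  local line
  for line in "$@"; do
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -e "$line" "$renviron" 2>/dev/null || printf '%s\n' "$line" >> "$renviron"
  done
}

# Usage (from a shell that can write to the parcel directory, e.g. a root shell).
# Single quotes keep ${LD_LIBRARY_PATH} literal so that R expands it at startup:
#   append_renviron_lines /opt/cloudera/parcels/MRS/lib64/R/etc/Renviron \
#     'LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/cloudera/parcels/MRS/lib64/R/lib' \
#     'R_LIBS=/opt/cloudera/parcels/MRS/lib64/R/library' \
#     'MRS_PARCEL_PATH=/opt/cloudera/parcels/MRS'
```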

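  • Create a symlink to libjvm.so in the MRS hadoop directory. The JDK path below is the one used in this setup; adjust it if your node has a different JDK.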


sudo ln -s /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so /opt/cloudera/parcels/MRS/hadoop/libjvm.so
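The JDK path in the command above is specific to one JDK build. On a node with a different JDK, the library can be located first and then linked. The following is a minimal sketch, assuming a hypothetical helper named find_libjvm and a JDK layout that keeps libjvm.so in a directory whose path contains "server".

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first server libjvm.so found under a JDK
# root directory; return non-zero if none is found.
find_libjvm() {
  local jdk_root="$1" found
  found="$(find "$jdk_root" -name libjvm.so -path '*server*' 2>/dev/null | head -n 1)"
  [ -n "$found" ] && echo "$found"
}

# Usage (the JDK root is the one from the command above; substitute your own):
#   libjvm="$(find_libjvm /usr/java/jdk1.7.0_67-cloudera)" &&
#     sudo ln -s "$libjvm" /opt/cloudera/parcels/MRS/hadoop/libjvm.so
```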

  • Copy the .RevoHadoopEnvVars.site file from your home directory into the MRS hadoop directory and rename it to RevoHadoopEnvVars.site (without the leading dot). NOTE: If .RevoHadoopEnvVars.site is not present in your home directory, run R once; this generates the site file in your home directory.


sudo cp ~/.RevoHadoopEnvVars.site /opt/cloudera/parcels/MRS/hadoop
sudo mv /opt/cloudera/parcels/MRS/hadoop/.RevoHadoopEnvVars.site /opt/cloudera/parcels/MRS/hadoop/RevoHadoopEnvVars.site
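The copy and rename above fail with a plain "No such file" error if R has never been started for the current user. The following is a minimal sketch that performs both steps in one copy and prints a hint when the source file is missing, assuming a hypothetical helper named install_site_file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the hidden per-user site file into the target
# directory under its non-hidden name. If the source does not exist yet,
# print a hint and return non-zero.
install_site_file() {
  local src="$1" dest_dir="$2"
  if [ ! -f "$src" ]; then
    echo "missing $src: run R once to generate it, then retry" >&2
    return 1
  fi
  cp "$src" "$dest_dir/RevoHadoopEnvVars.site"
}

# Usage (from a shell that can write to the parcel directory):
#   install_site_file /home/<user>/.RevoHadoopEnvVars.site /opt/cloudera/parcels/MRS/hadoop
```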


  • Restart RStudio Server


sudo rstudio-server stop
sudo rstudio-server restart



  • RStudio Server will be available at the following URL: http://<nodename>:8787. Make sure port 8787 is open.
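One way to confirm that the port is reachable is a quick TCP connection test from the machine where the browser will run. The following is a minimal sketch, assuming a hypothetical helper named port_is_open, a bash shell (for the /dev/tcp redirection), and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if a TCP connection to host:port succeeds
# within 3 seconds, non-zero otherwise.
port_is_open() {
  local host="$1" port="$2"
  # /dev/tcp/<host>/<port> is a bash feature, so the probe runs in bash.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage (replace <nodename> with the edge node's host name):
#   port_is_open <nodename> 8787 && echo "port 8787 is reachable"
```

Running the check on the edge node itself only shows that the service is listening; a firewall between the client and the node can still block the port.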


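Let us run Microsoft R Server examples in the local, Hadoop, and Spark compute contexts using RStudio Server:

LOCAL COMPUTE CONTEXT
[Example screenshot not available]

LOCAL COMPUTE CONTEXT ON HDFS DATA
[Example screenshot not available]

HADOOP COMPUTE CONTEXT
[Example screenshot not available]

SPARK COMPUTE CONTEXT
[Example screenshot not available]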