hive {hive} | R Documentation |
High-level functions to control the Hadoop framework.
hive( new )
.hinit( hadoop_home )
hive_start( henv = hive() )
hive_stop( henv = hive() )
hive_is_available( henv = hive() )
hadoop_home | A character string pointing to the local Hadoop installation. If not given, then the directory named by the HADOOP_HOME environment variable or, alternatively, /etc/hadoop is searched. |
henv | An object containing the local Hadoop configuration. |
new | An object specifying the Hadoop environment. |
High-level functions to control the Hadoop framework.
The function hive() is used to get or set the Hadoop cluster object. This object consists of an environment holding information about the Hadoop cluster.
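The get/set behaviour described above can be illustrated with a small accessor built on an R environment. This is only a sketch of the general pattern: `make_accessor()`, `cluster()` and the `hadoop_home` field are hypothetical names chosen for the example and are not taken from the hive package's implementation.

```r
# Hypothetical illustration of a get/set accessor; NOT the hive implementation.
make_accessor <- function() {
  current <- NULL                      # the stored cluster object
  function(new) {
    if (missing(new)) {
      current                          # called without argument: get
    } else {
      stopifnot(is.environment(new))   # the cluster object is an environment
      current <<- new                  # called with argument: set
      invisible(current)
    }
  }
}

cluster <- make_accessor()

# Build a configuration environment and register it (field name is assumed).
cfg <- new.env()
assign("hadoop_home", "/etc/hadoop", envir = cfg)
cluster(cfg)

# Retrieve the stored object and read a value from it.
home <- get("hadoop_home", envir = cluster())
```

Calling the accessor without an argument returns the stored object, while passing an environment replaces it, which mirrors how hive( new ) and hive() are described here.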
The function .hinit() is used to initialize a Hadoop cluster. It retrieves most configuration options by searching the directory given in the HADOOP_HOME environment variable or, alternatively, the /etc/hadoop directory if the Cloudera distribution (https://www.cloudera.com), i.e., CDH3, is used.
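The search order described above (explicit argument, then HADOOP_HOME, then /etc/hadoop) can be sketched in plain R. The helper `find_hadoop_home()` and its `fallback` parameter are hypothetical and only approximate the lookup; what .hinit() actually reads from the directory is not shown.

```r
# Hypothetical helper; not part of the hive package.
# Returns the first existing directory among: the explicit argument,
# the HADOOP_HOME environment variable, and a fallback location.
find_hadoop_home <- function(hadoop_home = NULL, fallback = "/etc/hadoop") {
  candidates <- c(hadoop_home,
                  Sys.getenv("HADOOP_HOME", unset = NA),
                  fallback)
  # Drop unset or empty entries before touching the file system.
  candidates <- candidates[!is.na(candidates) & nzchar(candidates)]
  existing <- candidates[dir.exists(candidates)]
  if (length(existing) == 0L) NA_character_ else existing[[1L]]
}

# Result depends on the local machine: a path, or NA if nothing is found.
home <- find_hadoop_home()
```

A non-NA result could then be passed on, for example as .hinit( home ), provided the hive package and a Hadoop installation are present.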
The functions hive_start() and hive_stop() are used to start/stop the Hadoop framework. The latter is not applicable for system-wide installations like CDH3.
The function hive_is_available() is used to check the status of a Hadoop cluster.
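The status check and the start/stop functions described above are typically combined: start the framework only if it is not yet running, and stop only what was started. The wrapper below is a minimal sketch of that pattern. `with_cluster()` and its parameter names are hypothetical, and the control functions are passed in as arguments so the example runs with stand-in stubs and without a Hadoop installation.

```r
# Hypothetical wrapper; not part of the hive package.
with_cluster <- function(task, available_fun, start_fun, stop_fun) {
  started_here <- FALSE
  # Stop only a cluster that this call started, even if task() fails.
  on.exit(if (started_here) stop_fun(), add = TRUE)
  if (!isTRUE(available_fun())) {
    start_fun()
    started_here <- TRUE
  }
  task()
}

# Stand-in stubs that track a fake "running" flag in an environment.
state <- new.env()
assign("running", FALSE, envir = state)

result <- with_cluster(
  task          = function() "done",
  available_fun = function() get("running", envir = state),
  start_fun     = function() assign("running", TRUE, envir = state),
  stop_fun      = function() assign("running", FALSE, envir = state)
)
```

With the hive package loaded, the stubs would presumably be replaced by hive_is_available, hive_start and hive_stop, each of which defaults to henv = hive(). On system-wide installations such as CDH3, where stopping is described as not applicable, a no-op function could be passed as `stop_fun` instead.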
hive() returns an object of class "hive" representing the currently used cluster configuration.
hive_is_available() returns TRUE if the given Hadoop framework is running.
Stefan Theussl
Apache Hadoop: https://hadoop.apache.org/.
Cloudera's distribution including Apache Hadoop (CDH): https://www.cloudera.com/downloads/cdh.html.
## read configuration and initialize a Hadoop cluster:
## Not run: h <- .hinit( "/etc/hadoop" )
## Not run: hive( h )
## Start Hadoop cluster:
## Not run: hive_start()
## check the status of a Hadoop cluster:
## Not run: hive_is_available()
## return cluster configuration 'h':
hive()
## Stop Hadoop cluster:
## Not run: hive_stop()