Article List
HCatalog

Overview
HCatalog is a table and storage management layer for Hadoop that enables users with different data processing tools — Pig, MapReduce — to more easily read and write data on the grid. HCatalog's table abstraction presents users with a relational view of data in the Hadoop distributed file ...

Hive: HiveServer2

    Blog category:
  • Hive
HiveServer2 (HS2) is a server interface that enables remote clients to execute queries against Hive and retrieve the results. The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide bet ...

start time

1. uptime
   16:11:40 up 59 days, 4:21, 2 users, load average: 0.00, 0.01, 0.00
2. date -d "$(awk -F. '{print $1}' /proc/uptime) second ago" +"%Y-%m-%d %H:%M:%S"
3. cat /proc/uptime | awk -F. '{run_days=$1 / 86400; run_hour=($1 % 86400)/3600; run_minute=($1 % 3600)/6 ...
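The truncated third command above computes days, hours and minutes from /proc/uptime; a minimal shell sketch of the same arithmetic, using a hypothetical sample value (5113260 seconds is roughly the "59 days, 4:21" shown by uptime in item 1):

```shell
# Split an uptime in seconds into days / hours / minutes.
# 5113260 is a hypothetical sample value, not taken from the post.
uptime_secs=5113260
run_days=$((uptime_secs / 86400))
run_hours=$((uptime_secs % 86400 / 3600))
run_minutes=$((uptime_secs % 3600 / 60))
echo "up ${run_days} days, ${run_hours}:${run_minutes}"
# → up 59 days, 4:21
```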
1. configure oozie-site.xml

   <property>
       <name>oozie.db.schema.name</name>
       <value>oozie</value>
   </property>
   <property>
       <name>oozie.service.JPAService.create.db.schema</name>
       <value>true</value> ...

Oozie: Run examples

#cd /path/to/oozie-4.0.1
#tar -xzf oozie-examples.tar.gz

modify all job.properties in the examples dir:
nameNode=hdfs://192.168.122.1:2014
jobTracker=192.168.122.1:2015
queueName=default
examplesRoot=oozie/examples

#hdfs dfs -put -R examples oozie/

Run pig example
#oozie job --oozie http://lo ...
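The excerpt cuts off mid-command. A hedged guess at the full invocation, assuming Oozie's default port 11000 and the examples layout above (the URL and properties path are assumptions, not from the post); it is echoed rather than executed here, since running it needs a live Oozie server:

```shell
# Hypothetical full form of the truncated "#oozie job --oozie http://lo ..." line.
# localhost:11000 is Oozie's default server address; path is an assumption.
OOZIE_CMD="oozie job --oozie http://localhost:11000/oozie -config examples/apps/pig/job.properties -run"
echo "$OOZIE_CMD"
```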

Oozie: configuration

---conf/oozie-site.xml---
<property>
    <!--<name>oozie.service.AuthorizationService.security.enabled</name>-->
    <name>oozie.service.AuthorizationService.authorization.enabled</name>
    <value>false</value>
</propert ...

oozie: common errors

1. When running the oozie examples, there is an error:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/CommandNotFound/util.py", line 24, in crash_guard
    callback()
  File "/usr/lib/command-not-found", line 69, in main
    enable_i18n()
  File "/usr/lib/comma ...

oozie: Workflow

Workflow Definition
A workflow definition is a DAG with control flow nodes (start, end, decision, fork, join, kill) and action nodes (map-reduce, pig, etc.); nodes are connected by transition arrows. The workflow definition language is XML-based and is called hPDL (Hadoop Process Definition La ...
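As a hedged sketch of what such an hPDL definition looks like (the workflow name, node names, and the pig action body are illustrative assumptions, not taken from the post):

```xml
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="pig-node"/>
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>script.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Pig action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The control flow nodes (start, kill, end) and the action node with its ok/error transitions correspond directly to the node types listed above.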
Import data from mysql to hdfs
----------------------------------------
Export data from hdfs to mysql
----------------------------------------

The difference between sqoop1 and sqoop2
Feature | Sqoop | Sqoop2
Connect ...
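The import/export sections above are truncated; a hedged sketch of a Sqoop 1 import command line (host, credentials, table, and target directory are placeholders, not from the post), echoed rather than executed since it needs a cluster and a reachable MySQL server:

```shell
# Hypothetical Sqoop 1 invocation: MySQL table -> HDFS directory.
# All connection details are placeholder assumptions.
SQOOP_IMPORT="sqoop import --connect jdbc:mysql://192.168.122.1/testdb --username sqoop --password secret --table employees --target-dir /user/hue/employees -m 1"
echo "$SQOOP_IMPORT"
```

In Sqoop 2 the same transfer is instead defined as a job against a stored connection, which is one of the differences the flattened comparison table refers to.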
Create a new database in MySQL and grant privileges to a Hue user to manage this database.

mysql> create database hue;
Query OK, 1 row affected (0.01 sec)
mysql> grant all on hue.* to 'hue'@'localhost' identified by 'secretpassword';
Query OK, 0 rows affected (0.00 sec)

Shut down Hu ...
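The excerpt cuts off at shutting down Hue; to finish the switch, Hue itself must be pointed at the new database. A hedged sketch of the relevant hue.ini fragment, matching the database and user created above (host, port, and password are assumptions):

```ini
[desktop]
  [[database]]
    ; placeholder values matching the grant statement above
    engine=mysql
    host=localhost
    port=3306
    name=hue
    user=hue
    password=secretpassword
```

After editing, Hue is restarted so it reconnects using the new settings.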

MySQL: Data Join

Introduction
The purpose of this article is to show the power of MySQL JOIN operations, nested MySQL queries (intermediate or temporary result-set tables), and aggregate functions (like GROUP BY). One can refer to link1 or link2 for a basic understanding of MySQL join operations. In order to explain ...
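As a hedged illustration of combining the three techniques the article names (the schema and column names are invented for the example, not taken from the article):

```sql
-- Hypothetical schema: customers(id, name), orders(id, customer_id, amount).
-- The nested query builds a temporary result set with a GROUP BY aggregate;
-- the outer JOIN then attaches customer names to the per-customer totals.
SELECT c.name, t.total
FROM customers c
JOIN (SELECT customer_id, SUM(amount) AS total
      FROM orders
      GROUP BY customer_id) AS t
  ON t.customer_id = c.id
ORDER BY t.total DESC;
```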
1. When clicking a job in the 'Job Browser' panel, the log of the job doesn't appear.
R: Hadoop 2.x has the ability to aggregate logs from cluster nodes and purge the old logs. My problem was rooted in a wrong VM guest clock (the namenode runs on my local host while the other three datanodes run on centos6.4 vm ...
When I create a sqoop job to import data from mysql to hdfs and submit it to run, Hue's error.log contains an error, and sqoop.log has error info as well. The root cause is that the sqoop job syntax has an error.
1. Press 'Data Browsers' -> 'Sqoop Transfer' -> 'create new job'.
2. Create a new connection using the default connector with id 1.
3. Fill in the 'From' fields.
4. Fill in the 'To' fields and click the 'save and run' button.
5. Check the job status in the 'Job Browser' panel.
cogroup
cogroup is a generalization of group. Instead of collecting records of one input based on a key, it collects records of n inputs based on a key. The result is a record with a key and one bag for each input.

A = load 'input1' as (id:int, val:float);
B = load 'input2' as (id:int, val2 ...
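A hypothetical completion of the truncated snippet (B's second column type and the cogroup/dump lines are assumptions consistent with the description above, not the original post's code):

```pig
A = load 'input1' as (id:int, val:float);
B = load 'input2' as (id:int, val2:int);
-- one output record per key: (id, {matching A records}, {matching B records})
C = cogroup A by id, B by id;
dump C;
```

With two inputs, each result tuple carries the key plus two bags, one per input, which is exactly the "one bag for each input" structure described.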