Python - write compressed log file into HDFS for hadoop hive mapreduce

import gzip
import pyhdfs
from cStringIO import StringIO
import binascii

# -snip-

# Set hdfs connection info
hdfsaddress = "namenode"
hdfsport = 12345
hdfsfn = "filename"

# gzip compression level
clevel = 1

# -snip-
# (hdfs, logger and concatlog come from the snipped code)

logger.info("Writing compressed data into " + hdfsfn + ".gz")

# open hdfs file
fout = pyhdfs.open(hdfs, hdfsfn + ".gz", "w")

# compress the data and store it in compressed_data
buf = StringIO()
f = gzip.GzipFile(mode='wb', compresslevel=clevel, fileobj=buf)
try:
    f.write(concatlog)
finally:
    # closing the GzipFile flushes the gzip trailer into buf
    f.close()
compressed_data = buf.getvalue()

# write compressed data into hdfs
pyhdfs.write(hdfs, fout, compressed_data)

# close hdfs file
pyhdfs.close(hdfs, fout)
logger.info("Writing task finished")

# -snip-
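The compression step can be tried on its own, without an HDFS connection. The following is a minimal sketch in Python 3 (the snippet above is Python 2), assuming only the standard library; `gzip_bytes` and the sample log line are hypothetical names and data used for illustration, not part of the original script.

```python
import gzip
import io


def gzip_bytes(data: bytes, level: int = 1) -> bytes:
    """Compress data into an in-memory gzip stream and return the raw bytes."""
    buf = io.BytesIO()
    # The GzipFile has to be closed before the buffer is read, otherwise the
    # gzip trailer (CRC and length) is missing; the with-block takes care of it.
    with gzip.GzipFile(mode="wb", compresslevel=level, fileobj=buf) as f:
        f.write(data)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical log content, standing in for concatlog.
    sample = b"2012-03-01 00:00:01 GET /index.html 200\n" * 1000
    compressed = gzip_bytes(sample)
    # Round trip to confirm the buffer holds a complete gzip file.
    assert gzip.decompress(compressed) == sample
    print(len(sample), "->", len(compressed), "bytes")
```

The returned bytes are a complete .gz file, so they can be handed to whatever writer is in use in a single write call, as the script does with pyhdfs.write.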

1 March, 2012 · Logan Han

Facebook scribe with hdfs

Packages:
- libevent
- hadoop-0.20-libhdfs
- JDK for hdfs support
- Boost http://sourceforge.net/projects/boost/

./bootstrap.sh
./bjam
./bjam install
...

17 February, 2012 · Logan Han

Flume DFO local storage usage check

This can be useful when the Flume driver has failed.

#!/usr/bin/perl
# Check flume DFO directory size
use strict;
use warnings;

sub trim($);

my $exit = 0;
# du -s prints "<size> <path>"; awk keeps only the size column
my $backlog_size = `du -s flume | awk '{print \$1}'`;
$backlog_size = trim($backlog_size);

if (!$ARGV[0] || !$ARGV[1]) {
    ########################### Usage of the plugin
    print "check_flume_backlog critical_size warning_size \n";
    exit 0;
}

######################### Case 1 if State is Critical
if ($backlog_size > $ARGV[0]) {
    print "Critical: ".$backlog_size."b\n";
    exit 2;
}

######################## Case 2 if State is Warning
if ($backlog_size > $ARGV[1] || $backlog_size == 0) {
    print "Warning: ".$backlog_size."b\n";
    exit 1;
}

######################## Case 3 if State is OK
# Anything that is neither critical nor warning is OK,
# including a size exactly equal to a threshold.
print "OK: ".$backlog_size."b\n";
exit 0;

# Strip leading and trailing whitespace
sub trim($) {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

And for centralised monitoring ...
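The threshold logic of the script above can also be read as a small pure function. This is a sketch in Python, not the author's code: `classify_backlog` is a hypothetical name, and it assumes that any size which is neither critical nor warning counts as OK. The exit codes 0, 1 and 2 are the ones the Perl script uses.

```python
def classify_backlog(size, critical, warning):
    """Return (exit_code, label) for a backlog size: 0 OK, 1 warning, 2 critical."""
    if size > critical:
        return 2, "Critical"
    # An empty directory is reported as a warning too, as in the Perl script.
    if size > warning or size == 0:
        return 1, "Warning"
    # Assumption: everything else, including a size equal to a threshold, is OK.
    return 0, "OK"


if __name__ == "__main__":
    # Hypothetical sizes and thresholds, in whatever unit du -s reports.
    for size in (0, 100, 300, 500):
        code, label = classify_backlog(size, critical=400, warning=200)
        print("%s: %sb (exit %d)" % (label, size, code))
```

Keeping the decision separate from the du call makes it easy to check the boundaries without touching the filesystem.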

30 January, 2012 · Logan Han

Default Fix Version value when creating an issue in JIRA

You can try adding some JavaScript to the field to perform the required operation for you; in this case that is the 'Fix Version' field. Use this documentation as a guideline: http://confluence.atlassian.com/display/JIRACOM/Using+JavaScript+to+Set+Custom+Field+Values ...

24 January, 2012 · Logan Han

delete mplayerx history

http://code.google.com/p/mplayerx/issues/detail?id=517

Launch Terminal and use the command to clear the history (tested). ...

30 December, 2011 · Logan Han