#!/usr/bin/env python

import sys

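for line in sys.stdin:
    line = line.strip()
    # Each record is one tab-separated employee row with 11 fields
    (emp_id, fname, lname, addr, city, state, zipcode, job, email, active, salary) = line.split("\t")

    # Emit "state,1" for every employee earning at least 75,000
    if int(salary) >= 75000:
        print("%s,1" % state)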

#!/usr/bin/env python

import sys

previous_state = ''
count_for_state = 0

# Input arrives sorted by key, so all lines for one state are adjacent
for line in sys.stdin:
    line = line.strip()

    (state, number) = line.split(",")

    if state == previous_state:
        count_for_state = count_for_state + int(number)
    else:
        # The key changed: emit the total for the previous state
        if previous_state != '':
            if count_for_state >= 1:
                print("%s\t%d" % (previous_state, count_for_state))
        previous_state = state
        count_for_state = int(number)

# Emit the total for the last state
if count_for_state >= 1:
    print("%s\t%d" % (previous_state, count_for_state))

# Path of Hadoop streaming JAR library

# Directory in which we’ll store job output

# Make sure we don’t have output from a previous run.
# The -r option removes the directory recursively, and
# the -f option prevents Hadoop from warning us if the
# directory doesn’t exist.
hadoop fs -rm -r -f $OUTPUT

# Run this job
hadoop jar $STREAMJAR \
-mapper -file \
-reducer -file \
-input /dualcore/employees \
-output $OUTPUT
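
Before submitting the job, the mapper and reducer logic above can be checked locally. The following is only a sketch: it re-implements both steps as functions and uses sorted() in place of Hadoop's shuffle. The function names and the sample rows are made up for illustration; only the 11-field tab-separated layout comes from the mapper.

```python
from itertools import groupby


def map_line(line, threshold=75000):
    """Return "state,1" for a high earner, or None otherwise (mirrors the mapper)."""
    fields = line.rstrip("\n").split("\t")
    state, salary = fields[5], int(fields[10])
    return "%s,1" % state if salary >= threshold else None


def reduce_sorted(pairs):
    """Sum counts per state from sorted "state,count" strings (mirrors the reducer)."""
    keyed = (p.split(",") for p in pairs)
    for state, group in groupby(keyed, key=lambda kv: kv[0]):
        yield "%s\t%d" % (state, sum(int(n) for _, n in group))


def run_local(lines):
    """Map, sort (standing in for Hadoop's shuffle), then reduce."""
    mapped = [out for out in map(map_line, lines) if out is not None]
    return list(reduce_sorted(sorted(mapped)))


# Made-up sample rows in the 11-field layout the mapper expects
sample = [
    "1\tAnn\tLee\t1 Main St\tAustin\tTX\t73301\tEngineer\tann@example.com\tY\t90000",
    "2\tBob\tRay\t2 Oak Ave\tDallas\tTX\t75201\tClerk\tbob@example.com\tY\t40000",
    "3\tCy\tDoe\t3 Elm Rd\tFresno\tCA\t93650\tManager\tcy@example.com\tY\t80000",
    "4\tDee\tFox\t4 Pine Ln\tHouston\tTX\t77001\tAnalyst\tdee@example.com\tN\t75000",
]
for row in run_local(sample):
    print(row)
```

The sort step matters: the reducer only works because all lines for one state arrive next to each other.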

Fix conky rings in Ubuntu 13.04

After installing Ubuntu 13.04, I had to make the following change to get conky to have a transparent background:

Add the following lines to your conkyrc:


own_window_argb_visual yes
own_window_argb_value 200



partitions 101

Video guide:

Chapter 6 of the Red Hat book.


Use fdisk -l to list hard drives.
The first SATA drive is sda.
The second SATA drive is sdb.
Partitions are listed as sda1, sda2, etc.
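
The naming scheme above (drive name plus partition number) can be split apart in a script. This is a small sketch; the function name is made up, and the pattern only covers sdX-style names like the ones in these notes.

```python
import re

# Matches names such as "sda1" or "/dev/sdb2": drive letters, then a partition number
_PARTITION = re.compile(r"^(?:/dev/)?(sd[a-z]+)(\d+)$")


def split_partition_name(name):
    """Return (drive, partition_number) for an sdX-style partition name."""
    match = _PARTITION.match(name)
    if match is None:
        raise ValueError("not an sdX partition name: %r" % name)
    return match.group(1), int(match.group(2))


print(split_partition_name("sda2"))
print(split_partition_name("/dev/sdb1"))
```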

To add a new partition to /dev/sdb:
- fdisk /dev/sdb
- n (for new)
- p (for primary)
- it will ask for a partition number (1-4); we are going to say 1
- it is a new disk, so the first cylinder is at 1
- we then specify the size for the partition: +10G
- press p to print the new partition table
- press w to write the partition table
- once the partition is written, we need to either reboot or run partprobe
- then we need to create the filesystem on the partition:
  #mkfs.ext4 /dev/sdb1
- we are now going to mount this new partition, /dev/sdb1, on user1's home directory: mount /dev/sdb1 /home/user1
- we can manually mount and umount this partition
- to have it mounted automatically at boot, edit the /etc/fstab file
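
For the fstab step, each line in /etc/fstab has six whitespace-separated fields: device, mount point, filesystem type, mount options, dump flag and fsck pass number. Here is a rough sketch of a helper that builds such a line for the partition created above. The helper name and the defaults ("defaults", 0, 2) are my own choices for illustration, not something the notes prescribe.

```python
def fstab_entry(device, mount_point, fs_type, options="defaults", dump=0, fsck_pass=2):
    """Build one /etc/fstab line from its six whitespace-separated fields."""
    fields = [device, mount_point, fs_type, options, str(dump), str(fsck_pass)]
    for field in fields:
        # Whitespace separates fields, so it cannot appear inside one
        if not field or any(ch.isspace() for ch in field):
            raise ValueError("invalid fstab field: %r" % field)
    return "\t".join(fields)


# The partition and mount point used in the steps above
print(fstab_entry("/dev/sdb1", "/home/user1", "ext4"))
```

Running mount -a after editing the file is a quick way to check the new entry without rebooting.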

Swap space is created in a similar manner; however, after creating the partition, use mkswap instead of mkfs:

#mkswap /dev/sdb*

Then enable the swap space (similar to the mount syntax):

#swapon /dev/sdb*

Logical Volume

1. Create a partition on the hard drive

2. Set up the partition as a Physical Volume (PV)

3. Set up the PV in Physical Extents (PE)

4. Convert the PE into Logical Extents (LE)

5. Logical extents are grouped into Logical Volumes (LV), which can then be formatted
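
Since a logical volume is always a whole number of extents, its size gets rounded up to the next extent boundary. The arithmetic can be sketched as below, assuming a 4 MiB extent size (the LVM2 default as far as I know; it can be changed with vgcreate -s). The function name is made up.

```python
MIB = 1024 * 1024
DEFAULT_PE_SIZE = 4 * MIB  # assumption: LVM2's default physical extent size


def extents_needed(volume_bytes, pe_size=DEFAULT_PE_SIZE):
    """Number of extents needed to hold volume_bytes, rounding up."""
    if volume_bytes <= 0 or pe_size <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division without floating point
    return -(-volume_bytes // pe_size)


# A 10 GiB logical volume with 4 MiB extents
print(extents_needed(10 * 1024 * MIB))
```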


complete guide

CentOS guide

Ubuntu guide

Even more complete Debian guide:

best guide so far from /r/sysadmin


so far:

Install the dependencies:

sudo apt-get install openjdk-7-jre



dpkg -i elasticsearch-0.20.5.deb

Install the Elasticsearch Service Wrapper.

The service wrapper allows you to start, stop and restart elasticsearch using:

./elasticsearch/bin/service/elasticsearch start | stop | restart

Modify the path
export PATH=$PATH:/usr/share/elasticsearch/bin/service

then start the service
sudo elasticsearch start
Then this:
ln -s `readlink -f elasticsearch/bin/service/elasticsearch` /usr/bin/elasticsearch_ctl
sed -i -e 's|# elasticsearch| graylog2|' /etc/elasticsearch/elasticsearch.yml
/etc/init.d/elasticsearch start

Check to see if it's working:
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
It should return something like the following:

{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
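
The same health check can be scripted instead of read by eye. This is a minimal sketch that looks only at the "status" and "timed_out" fields from the response above. The function names are made up, and treating only "green" as healthy is my assumption; pass acceptable=("green", "yellow") if a yellow cluster is fine for your setup.

```python
import json
from urllib.request import urlopen

HEALTH_URL = "http://localhost:9200/_cluster/health"  # same endpoint as the curl call


def fetch_health(url=HEALTH_URL, timeout=5):
    """Fetch the raw health JSON from a running node (not called in this sketch)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def cluster_is_usable(health_text, acceptable=("green",)):
    """True if the request did not time out and the status is acceptable."""
    health = json.loads(health_text)
    return not health.get("timed_out", False) and health.get("status") in acceptable


# Trimmed copy of the response shown above
sample = '{"cluster_name": "elasticsearch", "status": "green", "timed_out": false, "number_of_nodes": 2}'
print(cluster_is_usable(sample))
```

With a node running, cluster_is_usable(fetch_health()) does the whole check in one call.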


Download/Build libevent:

Download and install the latest version (2.0 or higher) from
Extract the compressed file and run:
#make install

Untested alternate: sudo apt-get install libevent-dev

Download/Build pgbouncer:

Extract the compressed file and run:
#cd /tmp
#tar xzf pgbouncer-1.4.2.tgz
#cd pgbouncer-1.4.2
#./configure --prefix=/usr/local --with-libevent=/usr/local
#make install
The config file does not exist by default, so you need to create it: sudo cp /usr/local/share/doc/pgbouncer/pgbouncer.ini /etc/pgbouncer.ini

Change ownership of the pgbouncer binary to the postgres user.

Brief information prior to editing the .ini:

Quick searches revealed a lot of forum posts and emails from people running into problems configuring pgbouncer. I always like to understand what I'm configuring and what the settings do prior to making changes. It looks like Postgres auth for pgbouncer has changed between 8.x and 9.x. Full write up here:

In a nutshell, pgbouncer is configured to look at 8.0/main/global/pg_auth for authentication. However, this file was removed in 9.0+, so we need to create the auth file manually.

Setting up the pgbouncer.auth file

Syntax for the authfile is as follows:

"username" "password"

Multiple ways to create the auth file:

Example of psql being used to write the pgbouncer.auth file. psql can dump output to a file:

Here is the psql query to show users and passwords:

postgres=# select rolname, rolpassword from pg_authid;

touch /path/to/pgbouncer.auth

paste into /path/to/pgbouncer.auth
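
Instead of pasting by hand, the query output can be turned into auth file lines with a few lines of code. This is a sketch, not a tested procedure: the function names are made up, the sample row is fake, roles without a password are skipped, and the doubling of embedded double quotes follows my understanding of pgbouncer's auth file format.

```python
def auth_line(username, password):
    """Format one auth file line: "username" "password"."""
    def quote(value):
        # Assumption: a literal double quote is escaped by doubling it
        return '"%s"' % value.replace('"', '""')
    return "%s %s" % (quote(username), quote(password))


def render_auth_file(rows):
    """Build the file body from (rolname, rolpassword) rows, skipping NULL passwords."""
    return "".join(auth_line(user, pw) + "\n" for user, pw in rows if pw)


# Fake rows shaped like the pg_authid query result
rows = [("postgres", "md5" + "0" * 32), ("readonly", None)]
print(render_auth_file(rows), end="")
```

The result would then be written to the path used for auth_file in pgbouncer.ini.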

Editing pgbouncer config:

listen_addr = *

listen_port = 6432

auth_type = trust

auth_file =
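
To double check what ended up in the file, the settings can be read back with Python's configparser. A rough sketch, assuming the settings live in the [pgbouncer] section of the ini file; the function name is made up and the auth_file path is just the placeholder used earlier in these notes.

```python
import configparser

# Sample text mirroring the settings above; the auth_file path is a placeholder
SAMPLE_INI = """\
[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = trust
auth_file = /path/to/pgbouncer.auth
"""


def load_settings(ini_text):
    """Return the listener and auth settings from pgbouncer.ini text."""
    # interpolation=None keeps any '%' characters in values literal
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["pgbouncer"]
    return {
        "listen_addr": section.get("listen_addr"),
        "listen_port": section.getint("listen_port"),
        "auth_type": section.get("auth_type"),
        "auth_file": section.get("auth_file"),
    }


print(load_settings(SAMPLE_INI))
```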