Pseudo-distributed mode is effectively a single-node Hadoop cluster. It is the best way to get started with Hadoop, because once you have a handle on the basics it is easy to modify the config to be fully distributed.
Step 1: Update the OpenSuse packages from the software manager.
Step 2: Install the Sun JDK (see the previous post on installing the Sun JDK in OpenSuse 12.1).
Create a user "hadoop" on your SUSE machine and log in as that user to carry out the activities below.
Step 3: Set up passwordless SSH. Activate sshd and enable it at boot from a root shell.
>sudo bash
#rcsshd start
#chkconfig sshd on
Now create an SSH key so you can connect over SSH without a password, and authorize it for login:
>ssh-keygen -N '' -t dsa -q -f ~/.ssh/id_dsa
>cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
>ssh-add ~/.ssh/id_dsa
Identity added: /root/.ssh/id_dsa (/root/.ssh/id_dsa)
Test connecting over SSH without a password, using the key:
>ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 05:22:61:78:05:04:7e:d1:81:67:f2:d5:8a:42:bb:9f.
Are you sure you want to continue connecting (yes/no)? Type yes.
Step 4: Hadoop installation:
Download hadoop-0.21.0.tar.gz file from http://www.apache.org/dyn/closer.cgi/hadoop/core/
Create a directory /home/hadoop/hadoop-install:
/home/hadoop> mkdir hadoop-install
Extract the hadoop-0.21.0 tar file into this new directory:
/home/hadoop/hadoop-install> tar -zxvf /home/hadoop/Downloads/hadoop-0.21.0.tar.gz
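The extract step can be rehearsed end to end anywhere: the sketch below builds a stand-in archive on the fly so the same tar flags can be tried without downloading Hadoop. The temp-directory layout is illustrative; in practice the archive is the real hadoop-0.21.0.tar.gz from your Downloads directory.

```shell
#!/bin/sh
# Self-contained sketch of the extract step. A stand-in tarball is built
# first so the tar flags can be exercised without downloading Hadoop.
set -e
workdir="$(mktemp -d)"

# Build a stand-in archive shaped like the real release tarball.
mkdir -p "$workdir/hadoop-0.21.0/conf"
: > "$workdir/hadoop-0.21.0/conf/core-site.xml"
tar -czf "$workdir/hadoop-0.21.0.tar.gz" -C "$workdir" hadoop-0.21.0

# The actual step: extract the archive into the install directory.
mkdir -p "$workdir/hadoop-install"
tar -zxf "$workdir/hadoop-0.21.0.tar.gz" -C "$workdir/hadoop-install"
ls "$workdir/hadoop-install/hadoop-0.21.0/conf"
```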
Edit the following files in /home/hadoop/hadoop-install/hadoop-0.21.0/conf directory.
conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-install/hadoop-datastore/</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
</configuration>
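Config files pasted from a web page often pick up typographic quotes, which silently makes the XML malformed. A quick well-formedness check can be sketched with python3 (assumed to be on the PATH); the file written below is a stand-in, so point the check at your real conf/core-site.xml instead.

```shell
#!/bin/sh
# Sketch: check a Hadoop config file is well-formed XML before starting
# the daemons. The file written here is a stand-in for conf/core-site.xml.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
EOF
if python3 -c "import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1])" "$cfg"; then
  echo "OK: well-formed XML"
else
  echo "ERROR: malformed XML" >&2
fi
```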
conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>100</value>
</property>
</configuration>
conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<!-- one DataNode in pseudo-distributed mode, so keep one replica -->
<value>1</value>
</property>
</configuration>
conf/masters
localhost
conf/slaves
localhost
conf/hadoop-env.sh
Uncomment the line that sets JAVA_HOME and make sure it points to the Sun JDK, as shown below.
export JAVA_HOME=/usr/java/default
Setting the environment variables for the JDK and Hadoop
Open the ~/.bashrc file and add the two lines below at the end:
>vi ~/.bashrc
export JAVA_HOME=/usr/java/default
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-install/hadoop-0.21.0
For the change to take effect in the current shell, source the file:
$source ~/.bashrc
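After sourcing, it is worth confirming that both variables actually resolved. A minimal sketch (the paths are the ones used in this setup; adjust them to your own layout):

```shell
#!/bin/sh
# Sketch: confirm both variables are set after sourcing ~/.bashrc.
# The paths match the layout used in this article; adjust for your install.
export JAVA_HOME=/usr/java/default
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-install/hadoop-0.21.0
for v in JAVA_HOME HADOOP_COMMON_HOME; do
  eval "val=\$$v"
  if [ -n "$val" ]; then
    echo "$v=$val"
  else
    echo "$v is not set" >&2
  fi
done
```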
Starting the Hadoop processes
Format the NameNode with the following command:
bin/hdfs namenode -format
Start the dfs:
hadoop@localhost:~/hadoop/hadoop-0.21.0>bin/start-dfs.sh
Start the mapred:
hadoop@localhost:~/hadoop/hadoop-0.21.0>bin/start-mapred.sh
Check for running processes.
hadoop@localhost:~/hadoop/hadoop-0.21.0>jps
SecondaryNameNode
NameNode
DataNode
TaskTracker
JobTracker
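The check above can also be done programmatically: a small sketch that scans jps output for the five expected daemons. The sample output and PIDs below are illustrative; in practice capture the real output with jps_output="$(jps)".

```shell
#!/bin/sh
# Sketch: verify the five expected daemons appear in jps output.
# jps_output below is sample data; replace it with: jps_output="$(jps)"
jps_output="2081 SecondaryNameNode
1903 NameNode
1992 DataNode
2312 TaskTracker
2228 JobTracker"
missing=0
for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
  if printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
    echo "$daemon: running"
  else
    echo "$daemon: MISSING"
    missing=1
  fi
done
```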