INSTALLING HADOOP ON MAC OSX LION

Although you are likely to run Hadoop on a big cluster of computers, it is useful to have it installed locally for debugging and testing purposes. Here are some quick notes on how to set up Hadoop on Mac OSX Lion.

Quick Summary

  1. Install Hadoop
  2. Edit Configuration
    1. Edit hadoop-env.sh to handle SCDynamicStore-related errors
    2. Edit core-site.xml
    3. Edit mapred-site.xml
    4. Edit hdfs-site.xml
  3. Enable ssh to localhost
  4. Start and Test Hadoop

Detailed Instructions

Step 1: Installing Hadoop
If you haven’t heard about Homebrew, you should definitely give it a try. It makes installing and uninstalling software effortless and keeps your machine clean of unused files. Below I am using Homebrew to install Hadoop.

brew install hadoop
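
Homebrew installs everything under /usr/local/Cellar/hadoop/<version>. The paths in the steps below assume version 1.0.1; if brew pulled a newer release (see comment 10 below), substitute your version throughout. A quick way to confirm what you have:

brew list hadoop | head -n 3
hadoop version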

Step 2: Edit Configurations

Step 2.1: Add the following line to /usr/local/Cellar/hadoop/1.0.1/libexec/conf/hadoop-env.sh. This line is required to overcome errors related to “SCDynamicStore”, specifically “Unable to load realm info from SCDynamicStore”.

export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
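
If you prefer the terminal to an editor, appending the line with echo also works; the single quotes keep the inner double quotes intact (adjust the version in the path to match your install):

echo 'export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"' >> /usr/local/Cellar/hadoop/1.0.1/libexec/conf/hadoop-env.sh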

Step 2.2: Add the following content to /usr/local/Cellar/hadoop/1.0.1/libexec/conf/core-site.xml. One key property is hadoop.tmp.dir. Note that we are placing the HDFS data in the current user’s home folder and naming it hadoop-store. You don’t need to create this folder; it will be created automatically in a later step.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/Users/${user.name}/hadoop-store</value>
		<description>A base for other temporary directories.</description>
	</property>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://localhost:8020</value>
	</property>
</configuration>
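
Once the format and start commands in Step 4 have run, you can optionally confirm that both properties took effect:

ls ~/hadoop-store
# created automatically by 'hadoop namenode -format'
hadoop fs -ls /
# goes through fs.default.name, i.e. hdfs://localhost:8020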

Step 2.3: Add the following content to /usr/local/Cellar/hadoop/1.0.1/libexec/conf/mapred-site.xml.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>mapred.job.tracker</name>
		<value>localhost:9001</value>
	</property>

	<property>
		<name>mapred.tasktracker.map.tasks.maximum</name>
		<value>2</value>
	</property>

	<property>
		<name>mapred.tasktracker.reduce.tasks.maximum</name>
		<value>2</value>
	</property>
</configuration>

Step 2.4: Add the following content to /usr/local/Cellar/hadoop/1.0.1/libexec/conf/hdfs-site.xml.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
	  <name>dfs.replication</name>
	  <value>1</value>
	</property>
</configuration>
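
A replication factor of 1 is the only value a single-node setup can satisfy, since there is just one datanode to hold each block. If you want to verify the setting once Hadoop is running (Step 4), fsck reports the default replication factor in its summary:

hadoop fsck /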

Step 3: Enable SSH to localhost

Make sure that you have SSH private (~/.ssh/id_rsa) and public (~/.ssh/id_rsa.pub) keys already set up. If you are missing these two files, run the following command (thanks to Ryan Rosario for pointing this out). Instead of an RSA key, you can also use DSA (replace rsa with dsa in the command below); however, the instructions below assume an RSA key.

ssh-keygen -t rsa

Step 3.1: Make sure that “Remote Login” is enabled in your system preferences. Go to “System Preferences” -> “Sharing” and check “Remote Login”.

Step 3.2: From the terminal, run the following command. Make sure that authorized_keys has 0600 permissions (see Raj Bandyopadhay’s comment below).

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
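
As noted above, sshd ignores authorized_keys unless its permissions are restrictive, so set them explicitly:

chmod 0600 ~/.ssh/authorized_keys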

Step 3.3: Try logging in to localhost. If you get an error, remove (or rename) ~/.ssh/known_hosts and retry connecting to localhost.

ssh localhost
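
If the failure is a changed-host-key warning, you don’t have to delete the whole known_hosts file; ssh-keygen can remove just the stale localhost entry:

ssh-keygen -R localhost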

Step 4: Start and Test Hadoop

hadoop namenode -format
/usr/local/Cellar/hadoop/1.0.1/bin/start-all.sh
hadoop jar /usr/local/Cellar/hadoop/1.0.1/libexec/hadoop-examples-1.0.1.jar pi 10 100
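
If the pi job cannot connect, a bare HDFS round trip helps isolate whether the filesystem is answering at all (the directory name here is arbitrary):

hadoop fs -mkdir /smoke-test
hadoop fs -ls /
hadoop fs -rmr /smoke-test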

To make sure that all Hadoop processes have started, use the following command:

ps ax | grep hadoop | wc -l
# expected output is 6

There are five processes related to Hadoop; the sixth line comes from the grep in the pipeline itself. If you see fewer than six processes, check the log files, located at /usr/local/Cellar/hadoop/1.0.1/libexec/logs/*.log
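
An alternative to counting ps output is jps, which ships with the JDK and lists running Java processes by main class. On a healthy single-node Hadoop 1.x install you should see all five daemons:

jps
# expect NameNode, DataNode, SecondaryNameNode, JobTracker and
# TaskTracker, plus the Jps process itself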

Additional Notes

  • Namenode info: http://localhost:50070/dfshealth.jsp
  • Jobtracker: http://localhost:50030
  • Start hadoop cluster: /usr/local/Cellar/hadoop/1.0.1/bin/start-all.sh
  • Stop hadoop cluster: /usr/local/Cellar/hadoop/1.0.1/bin/stop-all.sh
  • Verify hadoop started properly: use ps ax | grep hadoop | wc -l and make sure you see 6 as output; there are five processes associated with Hadoop and one pertaining to the grep command itself

Common Issues

  • Unable to load realm info from SCDynamicStore: refer to Step 2.1
  • could only be replicated to 0 nodes, instead of 1: refer to Step 3. Most likely this problem occurs because SSH to localhost is not available
  • Jobtracker not starting: I stumbled across this problem and found a spelling mistake in my configuration: I had named the file mapread-site.xml (an extra a) instead of mapred-site.xml. Also see the additional notes above to make sure that all five Hadoop processes are running.

65 thoughts on “INSTALLING HADOOP ON MAC OSX LION”

  1. Thank you for the writeup. Very helpful!

    > hadoop nodename -format
    Exception in thread “main” java.lang.NoClassDefFoundError: nodename
    Caused by: java.lang.ClassNotFoundException: nodename
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

    Perhaps you meant:
    > hadoop namenode -format

  2. Great tutorial, very useful. One small caveat is that the .ssh/authorized_keys file must have permission bits set to 0600. Use 'chmod 0600 .ssh/authorized_keys' after creating that file.

  3. Thank you for the excellent tutorial. This is my first time installing on Mac — I usually use Hadoop on Ubuntu.

    One thing to note. An SSH keypair must already exist in order to do step 3.1:
    ssh-keygen -t dsa

  4. Great instructions. I went through them successfully, but I still have trouble running Hadoop directly on Java files and cannot set HADOOP_CLASSPATH properly as I work through the Hadoop book. Any ideas?

    Thanks!

    1. @kavic,
      I didn’t explicitly set HADOOP_CLASSPATH. Did you use brew to install Hadoop? Also make sure that your JAVA_HOME is properly set. In my case JAVA_HOME points to /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home.
      Also make sure you are able to run the hadoop test
      hadoop jar /usr/local/Cellar/hadoop/1.0.1/libexec/hadoop-examples-1.0.1.jar pi 10 100
      Let me know if you are able to solve this problem. I would like to update the blog based on your solution.

      1. Thanks for following up.

        I finally figured out what was wrong, since the tests were running fine and I could even run Python scripts via streaming. The problem was that HADOOP_CLASSPATH is apparently set relative to the home directory /user/hduser, and once I copied the classes over to a new directory there, things were fixed… I am still wondering exactly what went wrong though!

  5. I have followed this exactly but unfortunately receive an error when trying to run the example. It says something about a protocol mismatch: ClientProtocol version mismatch (client = 61, server = 63). Any ideas?

        1. New error occurring now

          Number of Maps = 10
          Samples per Map = 100
          12/07/18 21:10:38 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).
          12/07/18 21:10:39 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 1 time(s).
          12/07/18 21:10:40 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 2 time(s).
          12/07/18 21:10:41 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 3 time(s).
          12/07/18 21:10:42 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 4 time(s).
          12/07/18 21:10:43 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 5 time(s).
          12/07/18 21:10:44 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 6 time(s).
          12/07/18 21:10:45 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 7 time(s).
          12/07/18 21:10:46 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 8 time(s).
          12/07/18 21:10:47 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 9 time(s).
          java.lang.RuntimeException: java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
          at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:546)
          at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
          at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:265)
          at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
          at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
          at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
          at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
          at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
          Caused by: java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
          at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
          at org.apache.hadoop.ipc.Client.call(Client.java:1075)
          at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
          at $Proxy1.getProtocolVersion(Unknown Source)
          at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
          at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
          at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
          at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:238)
          at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:203)
          at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
          at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
          at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
          at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
          at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:542)
          … 17 more
          Caused by: java.net.ConnectException: Connection refused
          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
          at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
          at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
          at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
          at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
          at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
          at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
          at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
          at org.apache.hadoop.ipc.Client.call(Client.java:1050)
          … 31 more

      1. Ritesh – I have the same problem too. When I give the command ‘hadoop dfs’ I do get the help options but when I give any ‘hadoop dfs’ commands (-ls, -mkdir, copyFromLocal, etc) it creates everything on UFS where I am. (I am using MacOS). I also get the following warning: “WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable”

  6. For RSA key generation, we can make sure that it is stored under the proper file name (note the capital -P, which supplies an empty passphrase):
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

  7. This worked flawlessly on OS X 10.8 Mountain Lion. Could you please update this page to indicate that?

    Thanks,

    Parag

  8. Hello..
    I am using Mac OS 10.8…. Could you tell me what is going wrong….

    When trying out the tutorial the map seems to work, but it cannot compute the reduce.
    12/08/13 08:58:12 INFO mapred.JobClient: Running job: job_201208130857_0001
    12/08/13 08:58:13 INFO mapred.JobClient: map 0% reduce 0%
    12/08/13 08:58:27 INFO mapred.JobClient: map 20% reduce 0%
    12/08/13 08:58:33 INFO mapred.JobClient: map 30% reduce 0%
    12/08/13 08:58:36 INFO mapred.JobClient: map 40% reduce 0%
    12/08/13 08:58:39 INFO mapred.JobClient: map 50% reduce 0%
    12/08/13 08:58:42 INFO mapred.JobClient: map 60% reduce 0%
    12/08/13 08:58:45 INFO mapred.JobClient: map 70% reduce 0%
    12/08/13 08:58:48 INFO mapred.JobClient: map 80% reduce 0%
    12/08/13 08:58:51 INFO mapred.JobClient: map 90% reduce 0%
    12/08/13 08:58:54 INFO mapred.JobClient: map 100% reduce 0%
    12/08/13 08:59:14 INFO mapred.JobClient: Task Id : attempt_201208130857_0001_m_000000_0, Status : FAILED
    Too many fetch-failures
    12/08/13 08:59:14 WARN mapred.JobClient: Error reading task outputServer returned HTTP response code: 403 for URL: http://10.1.66.17:50060/tasklog?plaintext=true&attemptid=attempt_201208130857_0001_m_000000_0&filter=stdout
    12/08/13 08:59:14 WARN mapred.JobClient: Error reading task outputServer returned HTTP response code: 403 for URL: http://10.1.66.17:50060/tasklog?plaintext=true&attemptid=attempt_201208130857_0001_m_000000_0&filter=stderr
    12/08/13 08:59:18 INFO mapred.JobClient: map 89% reduce 0%
    12/08/13 08:59:21 INFO mapred.JobClient: map 100% reduce 0%
    12/08/13 09:00:14 INFO mapred.JobClient: Task Id : attempt_201208130857_0001_m_000001_0, Status : FAILED
    Too many fetch-failures

    Here is what I get when I try to see the tasklog using the links given in the output
    http://10.1.66.17:50060/tasklog?plaintext=true&attemptid=attempt_201208130857_0001_m_000000_0&filter=stderr —>
    2012-08-13 08:58:39.189 java[74092:1203] Unable to load realm info from SCDynamicStore

    http://10.1.66.17:50060/tasklog?plaintext=true&attemptid=attempt_201208130857_0001_m_000000_0&filter=stdout —>

    Also this error of Unable to load realm info from SCDynamicStore does not show up when I do ‘hadoop namenode -format’ or ‘start-all.sh’

  9. In the event that a non-admin will be running hadoop, you’ll also need to adjust permissions on the hadoop log directory. For a typical developer workstation, something like this will usually be fine:

    chmod -R a+w libexec/logs

    (from the hadoop directory).

  10. Cool Ritesh. It was a piece of cake.
    However I have a couple of observations:
    1. "/usr/local/Cellar/hadoop/1.0.1/libexec/conf/" is incorrect. The correct one should be "/usr/local/Cellar/hadoop/1.0.4/libexec/conf/"

  11. I get this error message after entering the line below into the terminal. I’m not doing something right.

    -MacBook-Pro:hadoop jonathanschaller$ /usr/local/Cellar/hadoop/1.0.4/libexec/conf/hadoop-env.sh
    -bash: /usr/local/Cellar/hadoop/1.0.4/libexec/conf/hadoop-env.sh: Permission denied

  12. Great job! I’m using it on my MacBook Air (OS X Mountain Lion) with version 1.1.0.

    Works like a charm :)

  13. Hello again Ritesh Agrawal!

    I would like to know how Hadoop works with the other examples in hadoop-examples.jar.

    There are several examples to use as a test.

    I found the WordCount example, but I would like to know how to execute it with the right syntax.

    First I need a .txt file with some words repeated twice, three times, etc.

    Is there any command to let Hadoop know, or do I just need to do something like this:

    hadoop jar /usr/local/~/hadoop-example-1.1.0.jar worldcount? 10 100 /usr/~/toto.txt

    Is there any documentation where I can find some help with the examples? Or a wiki?

    Thanks for your help,

    Kevin

    1. Let’s say you want to count the words of the file ~/Downloads/ulysse.txt.
      First copy it to HDFS:
      hadoop fs -put ~/Downloads/ulysse.txt /user/yourname/wordcount-ex
      To run the example:
      hadoop jar hadoop-examples*.jar wordcount /user/yourname/wordcount-ex/ulysse.txt /user/yourname/wordcount-ex/output
      It will write the result to /user/yourname/wordcount-ex/output.
      To see the result:
      hadoop fs -cat /user/yourname/wordcount-ex/output/part-r-00000

      Hope this helps!

  14. I followed your tutorial to install the current hadoop-1.1.2 on OS X 10.8.3 with Java 1.6.0_43.
    It seems to work pretty well; at least the pi example works fine.
    But when I run the word count example (as I explained above) it works, but I have two warnings bothering me:
    WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
    WARN snappy.LoadSnappy: Snappy native library not loaded.

    Do you know how I can solve these?

  15. Thanks Ritesh, the instructions are really good and worked without issue. I am new to Hadoop. Would you recommend any links for further examples that will help me write my own jobs?

  16. Dear Ritesh,

    I have done step 2.1; however, I still get the error. Do you know why? Thank you.

    starting namenode, logging to /usr/local/Cellar/hadoop/1.1.2/libexec/bin/../logs/hadoop-chaochen-namenode-Chaos-iMac.local.out
    2013-05-18 23:13:39.369 java[4390:1b03] Unable to load realm info from SCDynamicStore

    1. @Chao,
      Make sure that you copied the whole statement in 2.1: export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"

      Apart from that, I am not sure. I haven’t tried installing hadoop 1.1.2. Let me know if that doesn’t work and I can try installing hadoop 1.1.2 tonight.

      Ritesh

  17. Thanks, very useful, but I am trying to install hadoop 1.1.2 and I am getting the following error.

    The following two commands seem to work ok:
    ~ $ hadoop namenode -format
    ~ $ /usr/local/Cellar/hadoop/1.1.2/bin/start-all.sh

    As you can see other commands seem to work ok
    ~ $ ps ax | grep hadoop | wc -l
    5

    However the example fails miserably… ideas?
    ~ $ hadoop jar /usr/local/Cellar/hadoop/1.1.2/libexec/hadoop-examples-1.1.2.jar pi 10 100
    Number of Maps = 10
    Samples per Map = 100
    2013-05-27 15:05:30.221 java[30151:1703] Unable to load realm info from SCDynamicStore
    13/05/27 15:05:30 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/sergep/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)

  18. I am running 1.1.2 and undid the steps in 2.*, and I got the “Unable to load realm info from SCDynamicStore” error while executing hadoop namenode -format, but the command still formatted. “/usr/local/Cellar/hadoop/1.1.2/bin/start-all.sh” starts hadoop, and “hadoop jar /usr/local/Cellar/hadoop/1.1.2/libexec/hadoop-examples-1.1.2.jar pi 10 10” throws the “Unable to load realm info from SCDynamicStore” error but completes. However, I think the results are incorrect. The last 2 lines of the output are:

    Job Finished in 2.517 seconds
    Estimated value of Pi is 3.20000000000000000000

    The pi value seems way off.

    1. I just played around a bit more with the sample and it appears that the pi estimation is correct. You just have to modify the final term, samples per map, to get more decimal places. Here are more detailed results.

      hadoop jar /usr/local/Cellar/hadoop/1.1.2/libexec/hadoop-examples-1.1.2.jar pi 10 1000000000
      Number of Maps = 10
      Samples per Map = 1000000000
      ……..
      Job Finished in 321.893 seconds
      Estimated value of Pi is 3.14159266440000000000

  19. Great post! You should start a series on installing Hadoop-related components on the Mac, such as Hive, Accumulo, etc.

  20. I am unable to ssh to localhost on my Mac (OS 10.8.x). I looked around on the web but couldn’t solve it. Would greatly appreciate any help. Here is what I did:
    * tested with rsa keys, but when that didn’t work I created a new .ssh directory and created dsa files; didn’t work
    * followed the instructions and enabled ‘Remote Login’
    * disabled the firewall too
    * Following is what I get:
    ssh localhost
    Connection closed by ::1

    1. In /var/log I see the following message every time I try to test ‘ssh localhost’:
      sshd[1787]: fatal: Access denied for user by PAM account configuration [preauth]

      1. Well, I did the PAM file manipulation and edited the sshd config file. Still didn’t work. This just did it. Thanks a lot. Luvyaaa..

  21. Hi, when I run start-all.sh, I see some processes launch and their icons appear on my screen. While I am working in some window, if I launch a map-reduce job my work gets interrupted by those processes launching and coming to the foreground/focus. How can I avoid that on the Mac?

    VJ

  22. Hi, I am trying to create a smoketest user in my HDFS. When I give the command hadoop fs -chmod 757 /mapred it shows the following error:

    chmod: Call From diliprnair-VAIO/192.168.1.136 to diliprnair-VAIO:8020 failed on connection exception: java.net.ConnectException: Connection refused: no further information; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

    Could you help?

