Tuesday, April 21, 2015

Pi Cluster Week 2, Session 1: Passing around the Pi

Today was a day of working out the logistics of incorporating multiple Raspberry Pis.
This meant not only prepping the additional hardware, but also preparing the SD cards with the appropriate operating system. I thought this would mean getting a clean Raspbian download, when in fact what needed to be done was to clone the master's OS. I did this using a handy tool called Win32DiskImager on my Windows side.
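Since cloning like this is just a raw byte-for-byte copy, the same thing should be doable from a Linux machine with dd. A rough sketch, assuming the card shows up as /dev/sdb (it won't necessarily; check with lsblk first):

    # Read the master's SD card into an image file
    sudo dd if=/dev/sdb of=node1-master.img bs=4M

    # Write that image back out to each fresh SD card
    sudo dd if=node1-master.img of=/dev/sdb bs=4M
    sync

Win32DiskImager's Read and Write buttons amount to the same two operations.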


Sadly this took a lot of class time, so I made sure to set aside some time for a small catch-up session in the lab.

After I had the images up and running, I made it a point to bring the hardware together and attempt to get everyone (the Pis) talking to each other, even if it wasn't in the most official of manners. After having a box of project props to deal with for the past few weeks, I finally came up with this configuration:



With my Pi acting as Node1, the master node, and the CS department's older Model B+ acting as its slave. This is where the static IPs come into play: you can pretty much create any class of network you'd like, so long as all of the Pis know about each other. At this point the cluster is its own thing, away from the internet. I managed to get everyone able to ping each other and even ssh.
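For the record, on the Raspbian of this era the static addressing goes in /etc/network/interfaces. The subnet below is an assumption for illustration (a private 10.40.x.x range like the one that comes up below), not necessarily the exact addresses I used:

    # /etc/network/interfaces on the master (Node1)
    auto eth0
    iface eth0 inet static
        address 10.40.0.1
        netmask 255.255.255.0

    # The slave gets the same stanza with its own address, e.g. 10.40.0.2

With that in place, a ping 10.40.0.2 from the master (or an ssh to it) is the sanity check that the cluster network is alive.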



However, I noticed that my tutorial seemed to emphasize using just the hostnames as the basis of connection, so when I began to run the Hadoop scripts, the nodes were still unable to communicate. I thought it would be best to try connecting using names in the style of hduser@10.40.X.X (for specific users on each node), but that didn't seem ideal for dynamically adding and removing lots of nodes. I then looked up connecting via hostnames, and saw that I'd need to either set up a DNS server to resolve the hostnames to IPs (man, so networking!), or fudge a DNS server by telling each Pi that name X is equivalent to IP Y, or at least letting the master know. If I go the route of telling each Pi, then I perhaps should have done this before creating the image, but thankfully, because I have a low number of nodes, it won't be that big of a deal.
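That "fudging" approach boils down to adding entries to /etc/hosts on every node, so each Pi can resolve the others' names without a real DNS server. A minimal sketch, assuming hostnames node1 and node2 and the illustrative addresses from earlier:

    # /etc/hosts (identical on every node)
    127.0.0.1    localhost
    10.40.0.1    node1
    10.40.0.2    node2

This is also why Hadoop cares: its masters and slaves config files list the nodes by hostname, so every name in them has to resolve on every machine that uses them.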

I figured that this was a good place to stop, this being (seemingly) one of my last roadblocks to getting distributed programs running. Thursday I will hopefully run my first distributed program, and maybe even attach a third node! At that point we'll conclude and talk about how to expand this project and optimize it the next time it's implemented. Already I've encountered a few ways to make this process simpler and run more efficiently. Moving forward, I drafted this as a layout of the primitive cluster:


