CS61c: Proj4-2 Spec

Step 0: Set up the actual code you'll run on Amazon EC2

Go into your proj4-XX-YY folder on a hive computer (it's okay if you're ssh'd in).

Create a branch called "part2".

git checkout -b part2

Pull proj4 starter files onto that branch.

git pull proj4_starter part2

Push the branch to GitHub, so you can later pull "part2" into your EC2 instance.

git push origin part2

Step 0.5: Create folder and create an account with Amazon EC2

I like to think of the folder we're creating now as sort of a "base camp" from which to log into Amazon EC2.

git clone git@github.com:cs61c-spring2015/proj4_ec2.git
cd proj4_ec2 # Go inside the directory that was created

Donggyu has set up a cool script to automatically generate logins for us! These logins (when they work) seem to magically work with the make commands to let us do stuff on EC2 from the Hive.

make account

The problem is that the account generation is, in Donggyu's own words, a little hacky. We're told to just call make clean then make account over and over until it doesn't error. But that's what scripts were made for! Use this neat shell script Alex Fu wrote.

#!/bin/bash
i="1"
while [ $i -ne 0 ] ; do
        make clean
        make account | tee testing.txt
        i=`grep "403 InvalidClientTokenId" testing.txt | wc -l`
        # cat $testing
done

Just copy and paste it into the terminal, and it'll stop when you have a working account.

Step 1: Run a test to make sure that your code isn't going to take forever and make Donggyu bankrupt

Start renting a cluster on EC2.

make launch-test

If this screws up at any point, just call:

make resume 
until it works.
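If you'd rather not retype the command by hand, here is a minimal sketch of a retry helper. It assumes that make resume exits with a non-zero status when it fails, which I haven't verified for Donggyu's Makefile. The function name retry and the RETRY_DELAY variable are made up for this example.

```shell
#!/bin/bash
# Retry a command until it succeeds, giving up after a maximum number of attempts.
# Assumption: the command signals failure with a non-zero exit status.
# "retry" and RETRY_DELAY are hypothetical names, not part of the project.
retry() {
    local max_attempts=$1
    shift
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-5}"   # pause between attempts (seconds)
    done
}

# Example usage on the Hive machine (uncomment to run):
# retry 10 make resume
```

The cap on attempts is there so a cluster that never comes up doesn't leave the loop spinning while you're being billed.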

Now you've started renting your much smaller cluster for the sake of testing that it all works fine.

Note 1: Watch the time, because every minute you're renting it costs money.

Note 2: You can call make master and get a nice UI in the browser to look at information about the cluster you've rented. I left this step out because I never got it to work for me and I don't think it's required to complete the project. (Thanks Maya for pointing it out!)

Now you're ready to log in (basically ssh-ing in) to the cluster you rented.

make login

Now copy your code into the cluster so you can actually run it.

git clone https://github.com/cs61c-spring2015/proj4-XX-YY.git
cd proj4-XX-YY
git checkout part2

Now we need to get Cython and other dependencies onto our cluster machine. Run Donggyu's shell script to do this automatically.

./setup_ec2.sh

Finally, we can run some test code.

make ec2-cnn-test

Look at how long this takes - if it's too slow then you may need to change something before you actually run your code on the massive data sets.
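One way to put a number on "how long this takes" is to wrap the command in a small timer. This is only a sketch: the function name timed is made up here, and it measures whole seconds of wall-clock time using date.

```shell
#!/bin/bash
# Run a command, then print how many whole seconds it took and its exit status.
# "timed" is a hypothetical helper name, not part of the project.
timed() {
    local start end status
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start)) sec (exit status $status)"
    return "$status"
}

# Example usage on the cluster (uncomment to run):
# timed make ec2-cnn-test
```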

Now that you are done, you need to kill your cluster and end the rental.

make destroy

Step 2: Run the 3 benchmarks with "launch-small"

This is just the same as step 1, except that:

  • Instead of renting a small test cluster, you are now running your code on 5 instances of c3.8xlarge.
  • Instead of just running make ec2-cnn-test, you're running 3 much larger benchmarks and recording the results.

As in step 1, we need to start renting our cluster from Amazon:

make launch-small

If this screws up at any point, just call:

make resume 
until it works.

Now we go into our cluster to run the benchmarks.

make login

Now copy your code into the cluster so you can actually run it.

git clone https://github.com/cs61c-spring2015/proj4-XX-YY.git
cd proj4-XX-YY
git checkout part2

Now we need to get Cython and other dependencies onto our cluster machine. Run Donggyu's shell script to do this automatically.

./setup_ec2.sh

Now we run tests on 3 benchmarks:

  • ec2-cnn-large
  • ec2-cnn-huge
  • ec2-cnn-full
If you would like, you can call make ec2-cnn-large, make ec2-cnn-huge, and make ec2-cnn-full - or you can just call another shell script Donggyu wrote for us:

screen
./batch.sh

Donggyu's command will save all output to a log file on your cluster (not the Hive machine). The same information will pop up on your terminal so you can also just copy and paste it.

Broken Pipes: If you get a broken pipe or some other ssh-related error, don't kill your cluster, or you'll regret it like I do right now! You can still get your completed logs! You called screen (a terminal multiplexer, explained in the original spec), so you can just go back into your Hive machine and call screen -r, and you'll be brought back into your cluster like nothing ever happened.

How to Read From Logs: If you plan on reading your information from the logs instead of copying and pasting from the terminal, you'll need to copy the log files from your cluster back to your class account: scp ec2-cnn-full.log cs61c-XX@hiveY.cs.berkeley.edu:~/(destination folder). Obviously, just switch out ec2-cnn-full.log for ec2-cnn-huge.log, or whatever log file you want to copy over. (Thanks Oliver for mentioning this process!)
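To copy all the logs in one go, something like the sketch below might save some typing. It assumes the third log is named ec2-cnn-large.log by analogy with the other two, and that you fill in your own destination. The function name copy_logs and the SCP_CMD variable are made up for this example. Setting SCP_CMD=echo gives you a dry run that only prints what would be copied.

```shell
#!/bin/bash
# Copy each benchmark log that exists in the current directory to a destination.
# Assumptions: "copy_logs" and SCP_CMD are hypothetical names; the log file
# name ec2-cnn-large.log is guessed by analogy with the full and huge logs.
copy_logs() {
    local dest=$1
    local name log
    for name in large huge full; do
        log="ec2-cnn-$name.log"
        if [ -f "$log" ]; then
            ${SCP_CMD:-scp} "$log" "$dest"
        else
            echo "skipping $log (not found)" >&2
        fi
    done
}

# Example usage on the cluster, with your own login and folder filled in:
# copy_logs "cs61c-XX@hiveY.cs.berkeley.edu:~/"
```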

Either way, you will get benchmark results for each that look something like this:

cnn train-large 8000 2> ec2.log
[CS61C Project 4] start classifier training
iteration: 0, loss: X, time: X sec
iteration: 1, loss: X, time: X sec
iteration: 2, loss: X, time: X sec
[CS61C Project 4] done training
[CS61C Project 4] test the classifier
[CS61C Project 4] accuracy: X
[CS61C Project 4] training performane: X imgs / sec
[CS61C Project 4] time elapsed: X min

Record them in some way, because you'll need them to write your answers in part2.txt.
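If you go the log-file route, a quick way to pull out just the numbers you need is to grep for the summary lines. This is a sketch that assumes the log files contain the same accuracy, imgs / sec, and time elapsed lines as the terminal output above. The function name summarize_log is made up for this example.

```shell
#!/bin/bash
# Print only the summary lines (accuracy, throughput, elapsed time) from a log.
# Assumption: the log uses the same line format as the terminal output.
# "summarize_log" is a hypothetical helper name, not part of the project.
summarize_log() {
    grep -E 'accuracy:|imgs / sec|time elapsed:' "$1"
}

# Example usage (uncomment to run in the folder holding your logs):
# for log in ec2-cnn-*.log; do echo "== $log"; summarize_log "$log"; done
```

Matching on "imgs / sec" rather than the word before it avoids depending on the exact spelling in the program's output.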

Once you have your precious benchmark results, you should kill your session on the cluster (remember that every minute renting the cluster is costing the school money).

make destroy

Step 3: Run the 3 benchmarks with "launch-big"

This is almost exactly the same as step 2, except that:

  • Instead of running your code on 5 instances of c3.8xlarge, you now have 10 instances.

As with every other step, we need to start renting our cluster from Amazon:

make launch-big

Call make resume as needed.

From here, you know the drill:

make login
git clone https://github.com/cs61c-spring2015/proj4-XX-YY.git
cd proj4-XX-YY
git checkout part2
./setup_ec2.sh
screen
./batch.sh

Record your results for:

  • ec2-cnn-large
  • ec2-cnn-huge
  • ec2-cnn-full

Kill the session.

make destroy

Step 4: Submit

For this last step, I'm going to send you back to the original spec and its submission instructions. I hope this less confusing spec has helped you!