Monday, June 22, 2009

iPhone OS 3.0

Awesome update. There seems to be one issue -- syncing contacts. I have mine syncing over the web to Google Contacts. When I upgraded to 3.0, a TON of duplicate contacts appeared, as well as some that I didn't put in there in the first place (Google's "suggested contacts" -- lots of sale-######@craigslist.org, etc). There seem to be 2 workarounds:

1. If you have a Mac, create a blank group in the Mac's contacts and choose to sync with that group, as well as over the web with Google. Somehow that works, but I don't have a Mac, so I didn't pay much attention.

The one that worked for me was:
2. Choose "overwrite existing contacts" in the Info tab on your iPhone in iTunes. Also, on the iPhone itself, go to Settings - Mail, Contacts and Calendars, and select your account (xxxx@gmail.com), that says "Contacts, Calendars" under it. Turn off Contacts and Calendars, and when the pop-up comes up, choose "Delete." That will clear your iPhone of some of its contacts. Then, turn off contact syncing from within iTunes. Sync your iPhone. Go back to the menu you were at on your iPhone, and turn back on the contacts and calendars, and when the pop-up comes up, choose "Delete" again. Now, you are writing to a blank iPhone. Phew.

Thanks, http://discussions.apple.com/thread.jspa?threadID=2048666.

Edit: the most useful iPhone/gCal link: m.google.com/sync (from your iPhone).


Saturday, June 13, 2009

Kmeds Algo

This is the final project for a bioinformatics class I took last semester:

http://err.bio.nyu.edu/courses/index.php/V22.0480_Final_Project


I worked on Project 1. The goal was to find a scalable algorithm to cluster large data sets (ones too big to fit into memory all at once) with arbitrary dimensions. We wrote a SQLite adapter to grab n data points at a time (with n based on k, the number of expected clusters), and then ran a clustering algorithm on those n points, storing the results in memory (if the data set was large enough to require it, we could write them back to the SQLite DB). After num.iter iterations, we ran the algorithm again on the result set to get our final medoids. From there, it's relatively easy to assign each point to its closest medoid, forming the final clusters.


Thursday, June 11, 2009

rsync

The rsync utility is very cool. It's cool because it's fast and effective, and the syntax for the command is relatively simple. It's a great way for the uber-paranoid to avoid the Cloud. [Side note: if you are not uber-paranoid and do not mind the Cloud, check out Dropbox <-- shameless referral link. But you get extra space with that link, too! 2 GB -> 2.25 GB]

You can get rsync to do automatic, incremental backups for you, although I'm still working on that part. The best I have so far is to sync 1 or more computers with your server computer (I sync my computers at home using my dad's computer in San Francisco [static IP] as the server computer).

To do it (assuming the server is running an rsync daemon [more on that here], and your server name is in the /etc/hosts file), each computer (except for the server computer) needs 2 nearly identical scripts: one to pull from the server and one to push to it. Mine are:


rsync \
--verbose \
--archive \
--compress \
--update \
coffee:~max/test1 ~/rsync/


and


rsync \
--verbose \
--archive \
--compress \
--update \
~/rsync/test1 coffee:~max/

This keeps the folder "test1" in sync between my home directory on coffee (the server computer) and the ~/rsync directory on my home computers. The options are:
--verbose = tell me what's going on as it's happening!
--archive = this is a cocktail of options (shorthand for -rlptgoD): recurse into the directories, copy symlinks as symlinks, and preserve permissions, modification times, owner and group.
--compress = use compression to speed up the transmission, at the cost of more CPU
--update = skip any file that is newer on the receiving side (whichever computer is being copied to)

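The rest of the syntax is the same as a copy command: source first, then destination. If you use a remote computer, prefix its path with the name of the computer followed by ":" (a single colon goes over a remote shell such as ssh; an rsync daemon is addressed with a double colon, "::", or an rsync:// URL).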
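If you'd rather not keep two separate scripts around, here's a rough sketch that folds the pull and the push into one, using the short forms of the same four options. The SERVER variable and the function name two_way_sync are made up for this example; the paths are the same ones used above.

```shell
#!/bin/sh
# Sketch: pull then push in one script, with the short option forms
# (-v verbose, -a archive, -z compress, -u update).
# SERVER and two_way_sync are placeholder names, not part of rsync.
SERVER=${SERVER:-coffee}

two_way_sync() {
    # Pull anything newer from the server into ~/rsync/.
    rsync -vazu "${SERVER}:~max/test1" "$HOME/rsync/" || return 1
    # Push anything newer from this computer back to the server.
    rsync -vazu "$HOME/rsync/test1" "${SERVER}:~max/"
}
```

Call two_way_sync at the bottom of the script (or from cron) to run both directions. Because of --update, neither direction should overwrite a file that is newer on the side being copied to.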

Another very, very useful option when testing all this is --dry-run. rsync will go through all the steps it would have taken to make the sync without actually transferring any files. It's especially useful if you use the --delete flag, which deletes files on the receiving computer that are no longer present on the source computer. A trailing slash on the source path means "copy the contents of this directory" rather than the directory itself, so it's easy to delete the contents of an entire directory with one wrong "/".

Here's a simple script if you're just using rsync to back up:

#!/bin/sh
BUNAME=$(date +%A)
rsync \
--verbose \
--archive \
--compress \
--update \
--backup \
--backup-dir=~max/$BUNAME/ \
~/rsync/test1/ coffee:~max/test1/


Because of the --backup and --backup-dir=... options, when rsync is about to change a file, it first saves the existing version in the directory named by $BUNAME (which is defined as the day of the week, e.g. Thursday) and then overwrites the file in the main directory (test1, in this case). This way, if you accidentally change a file you didn't mean to, you have a backup of it. You could go so far as to make hourly directories within the day-of-the-week directories with some simple shell scripting...
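Here's one way the hourly idea might look, as a sketch rather than something battle-tested. The function names backup_subdir and hourly_backup are invented for this example, and the backup root passed as the third argument is whatever directory on the receiving computer you want the day/hour folders to live under.

```shell
#!/bin/sh
# Sketch: nest an hour-of-day directory inside the day-of-the-week directory,
# e.g. Thursday/14. Function names here are hypothetical.

# Print a backup subdirectory name such as "Thursday/14".
# LC_ALL=C keeps the day name in English whatever the local locale is.
backup_subdir() {
    LC_ALL=C date +%A/%H
}

# Run the backup.
# $1 = source, $2 = destination, $3 = backup root on the receiving side.
hourly_backup() {
    rsync \
        --verbose \
        --archive \
        --compress \
        --update \
        --backup \
        --backup-dir="$3/$(backup_subdir)/" \
        "$1" "$2"
}
```

A call would look something like hourly_backup ~/rsync/test1/ coffee:~max/test1/ /path/to/backups, where /path/to/backups is a placeholder. Run it from cron once an hour and each hour's overwritten files land in their own folder, which gets reused a week later.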
