The letters of Walt Whitman
Needless to say, Walt loved his mother.
The graphic above shows Whitman's outgoing mail: who he was writing to and how much he sent each person. The colors reflect the average letter length for a given recipient; greener means longer. Click on a letter when it is displayed and you will be linked to the source.

Last year a friend informed me that a bunch of Walt Whitman's correspondence was available in digital form on The Walt Whitman Archives. I wrote a screen scraper in Python to harvest the data we wanted (the letters) from the site. The original goal was to extract interesting information from the letters, such as locations, so I turned to Python's NLTK for text analysis. Originally I simply had a list of American cities to cross-reference against, but the problem with that approach is that many cities have ambiguous names, like "George", that also match personal names or other proper nouns.
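To make the ambiguity problem concrete, here is a minimal sketch of the city-list approach. It is not the code used for this project: the `CITIES` set, the `find_city_mentions` helper, and the sample sentence are all invented for illustration, and a real list of American cities would be far longer.

```python
import re

# Hypothetical gazetteer: a tiny stand-in for a full list of American cities.
CITIES = {"Washington", "Brooklyn", "Camden", "George"}


def find_city_mentions(text, cities=CITIES):
    """Return capitalized words in `text` that also appear in the city list."""
    # Grab single capitalized words; multi-word city names are ignored here.
    words = re.findall(r"\b[A-Z][a-z]+\b", text)
    return [word for word in words if word in cities]


# Invented sample sentence, not taken from an actual letter.
letter = "Dear George, I leave Brooklyn for Washington on Monday."
print(find_city_mentions(letter))
```

The salutation "George" is reported as a location right alongside Brooklyn and Washington, which is exactly the kind of false positive a plain list lookup cannot avoid.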
Above, locations mentioned in Whitman's letters are marked with black dots, which light up according to when the letters were sent. You can see how many more letters he sent, or at least how many more are on record, during the Civil War (early 1860s).
The named entity recognizer gave only slightly better results, I think because it was trained on more modern and less colloquial corpora, but it also classified things other than locations. Next I geotagged the cities to get a list of latitudes and longitudes that could be projected onto the map above. In the end a huge JSON file was created, with a ton of weird data about each letter. The possibilities are endless... you could make a visualization of Whitman's verb usage as he got older... As you clean data and gain more metadata, you have to discard a lot of stuff. The final dataset only has about 1000 letters left in it, but the treatment of each featured letter is comprehensive. Take a look at my GitHub and download the data to see how it is formatted. Special thanks to Jennifer Louthan for helping me make these visualizations.
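For readers curious about the geotagging step, the sketch below shows one way a letter record could be given coordinates and serialized to JSON. Everything in it is an assumption made for illustration: the `COORDS` table (with approximate coordinates), the `geotag` helper, and the field names do not reflect the actual format of the published data, which is best checked in the download itself.

```python
import json

# Hypothetical lookup table; coordinates are approximate.
COORDS = {
    "Brooklyn": (40.68, -73.94),
    "Washington": (38.91, -77.04),
}


def geotag(letter_record, coords=COORDS):
    """Return a copy of the record with lat/lon attached to each known place."""
    tagged = dict(letter_record)
    tagged["locations"] = [
        {"name": place, "lat": coords[place][0], "lon": coords[place][1]}
        for place in letter_record.get("places", [])
        if place in coords  # places without coordinates are dropped
    ]
    return tagged


# Invented record; the field names are placeholders, not the real schema.
record = {"recipient": "Mother", "year": 1863, "places": ["Washington", "George"]}
tagged = geotag(record)
print(json.dumps(tagged, indent=2))
```

Dropping places that have no coordinates mirrors the point above about discarding data as it gets cleaned: only entries that can actually be placed on the map survive.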