Pictures vs. words

A word is also a picture of a word

Remember how on March 5th I was supposed to give a presentation in 1′, 2′, 10′, but it got bumped? And then the following week’s class was canceled, and then we had spring break. So it wasn’t until three weeks later, March 26, that I finally got to take my turn squirming at the front of the room. Three extra weeks! So, naturally, I used all that time working on my project, right?

Oh, no, that wouldn’t have been fair. I did revise my slides, but I left it until the last fucking minute, as usual, so as not to have an undue advantage over my classmates. Right. That’s totally why.

What I did do in the interim, however, was stumble across this fab webcast by Nancy Duarte about how to give better presentations.

After hearing her talk, I bought and started reading her book, Slide:ology, which is a more detailed presentation of the same suggestions.

(Here’s another Nancy Duarte webcast, which I haven’t watched yet: Creating Powerful Presentations.)

So between that and taking copious notes on my classmates’ midterm presentations, especially in Wearables, I got a lot of ideas about how I should redo my slides, as well as my overall presentation style. The result is a deck that does not make any sense unless I’m standing there with a remote, explaining it to you (PDF, 1.1 MB)—and using a remote is, I decided after watching a lot of in-class presentations, a good thing to do. I got my Mac remote to turn pages in Acrobat using a program called iRed Lite. I can’t really recommend it, since it stumped me for quite a while the first time I used it, and the next time I tried, a few weeks later, I positively could not figure out how I had ever made it work in the first place. There’s something about the UI that confuses the hell out of me. But it can, theoretically, do the job, and it’s free.

Some other things I learned from watching classmates’ presentations:

  • Proofread, proofread, proofread.
  • Stand while you present, even if you don’t have a remote. Think of someone you know who’s poised and relaxed speaking in front of a group, and then try to channel that person for five minutes. Breathe between sentences. Make eye contact.
  • I really don’t care about the technical side of your project. Don’t tell me what hardware and libraries and so forth are used in it; describe it to me as though I were a normal human being who doesn’t have four Arduini in her apartment right now. Just because I have them doesn’t mean I know how to use them.
  • Those very corporate-looking system diagrams showing how information will flow through your application? They’re completely unintelligible. Skip them.
  • You don’t have to make all your graphics all slick, in Illustrator or Photoshop. Hand-drawn diagrams or sketches can be much more engaging.
  • As early as possible in the presentation, show me some kind of image of what your project will be—or, better yet, the prototype you’re working on—so that I can hold that in my mind as you go into all the background and process and detail. If I don’t know what your project is yet, I probably won’t find the rest of that information interesting. This was true even though I knew perfectly well what my classmates’ projects were. When I watched their presentations, if they didn’t show and describe what they were making early on, I was unable to hold my attention on whatever else they were saying instead. Context.
  • Don’t put a lot of text on the screen. If you’re talking and there’s a whole paragraph on the screen behind you, my attention’s going to be split. And if it turns out that you’re just repeating what that paragraph says, almost word for word, I’ll feel exasperated. People should be listening to you for the words in your presentation, not reading them off the slides.
  • If you don’t have anything sexy to put on a slide for a given portion of your talk, it’s fine to
    1. repeat a previous slide, or
    2. show a slide that contains just one word representing that moment’s topic—“research,” for instance, or “inspiration.” Treat that text as a graphic element—make it big, pay attention to how it looks.
  • Typography!
  • If you have a relevant quotation to share, don’t bury it in a whole long paragraph; give it a slide by itself.
  • Don’t try to cover too much. It’s better to give people a thoughtful, measured thumbnail-presentation of the project and stop talking early enough that there’s time for people to ask questions about the parts that actually interest them than it is to brain-dump every piece of information you have, leaving time for only a few dazed comments from your audience at the end.
  • Videos of a thing working are helpful, but you have to explain what’s going on while it’s playing. This may be a good time to unload some of those boring technical details, while there’s a moving image to spice them up.
  • Proofread, proofread, proofread. If I had a dime for every typo I saw during midterm presentations . . . I offered my services as a proofreader in the Webgrrls-style need/give session we had in 1′, 2′, 10′, but nobody seemed to think they needed such a thing. They are wrong.
  • If you’re going to read some text that appears on a slide, do it slowly, with feeling; don’t just rush through it breathlessly, making it impossible for people to either read the text for themselves or follow what you’re saying. Make it clear that you’re reading what’s on the screen so people don’t have to struggle to figure it out. If you’re not able to introduce the text with something like “I’d just like to read you this quote, which really inspired me . . .” you probably shouldn’t be giving it a slide.

So, here again are the slides I ended up using (PDF, 1.1 MB) for my midterm presentation. I suppose some day maybe I’ll write captions more or less like what I said in front of the class, but in the meantime you can read the old slides if you want to know the gist.

Photo: A WORD IS ALSO A PICTURE OF A WORD by gwalton1; some rights reserved.

Markov bookshelf

best-seller covers

Oh, well. We’re back to kill-me-now territory.

This week’s assignment was

Get some XML from a web service. Extract some interesting information from the XML and use it as input to one of your homework assignments from previous weeks.

You could do this either by piping the output of your XML parsing program to the input of your previously implemented program, or by using a class (e.g., instantiate and use the MarkovChain class).

So I thought, cool, I’ll scramble the New York Times Best Seller titles using their shiny new API. Okay, so, I got an API key and managed to pull in lists of titles using—

import org.dom4j.Document;
import org.dom4j.io.SAXReader;
import org.dom4j.Element;
import java.util.List;

public class BestSeller 
{
  public static void main(String[] args) throws Exception 
  {

/*  Input should be one of the NY Times Best Seller list categories. 
    Available categories:
    hardcover-fiction
    hardcover-nonfiction
    hardcover-advice
    paperback-nonfiction
    paperback-advice
    trade-fiction-paperback
    picture-books
    chapter-books
    paperback-books
    series-books
    mass-market-paperback
*/
    String listName = args[0];

    SAXReader reader = new SAXReader();
    EasyHTTPGet getter = new EasyHTTPGet(
      "http://api.nytimes.com/svc/books/v2/lists/" + listName + ".xml?api-key=f25de92cf1d2615621b68d2d31f81b63:4:1703697"
    );

    Document document = reader.read(getter.responseAsInputStream());
    List bookTitle = document.selectNodes("//results/book/book_details/book_detail/title");

//  If I wanted to scramble the authors, . . .
//  List bookAuthor = 
//  document.selectNodes("//results/book/book_details/book_detail/author");

    for (Object o: bookTitle) 
    {
      Element elem = (Element)o;
      String text = elem.getText();
      System.out.println(text);
    } // end for
  } // end main
} // end class

I had tried to make an array of category names and then loop through them, but I got lost after the first bit—

import org.dom4j.Document;
import org.dom4j.io.SAXReader;
import org.dom4j.Element;
import java.util.List;

public class BestSellerAll 
{
  public static void main(String[] args) throws Exception 
  {

/*  There are 11 categories of NY Times Best Seller lists. I want to download all of them, but the Times makes you download them individually. 
*/

    String [] book_cats = new String [] { 
    "hardcover-fiction", "hardcover-nonfiction", "hardcover-advice", "paperback-nonfiction", "paperback-advice", "trade-fiction-paperback", "picture-books", "chapter-books", "paperback-books", "series-books", "mass-market-paperback"};

//    String listName = args[0];

    SAXReader [] reader = new SAXReader() [];
    EasyHTTPGet [] getter = new EasyHTTPGet [];
    for (int i = 0; i < 11; i++)
    {
        getter[i] = "http://api.nytimes.com/svc/books/v2/lists/" + book_cats[n] + ".xml?api-key=f25de92cf1d2615621b68d2d31f81b63:4:1703697";
    } // end for i

// I got sadly confused somewhere in this block:
    Document [] document = new Document [];
    for (int j = 0; j < 11; j++)
    {
        document[j] = reader.read(getter[j].responseAsInputStream());
        List bookTitle = document.selectNodes("//results/book/book_details/book_detail/title");
    } // end for j

//  If I wanted to scramble the authors, . . .
//  List bookAuthor = 
//  document.selectNodes("//results/book/book_details/book_detail/author");

    for (Object o: bookTitle) 
    {
      Element elem = (Element)o;
      String text = elem.getText();
      System.out.println(text);
    } // end for bookTitle
  } // end main
} // end class
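
For what it's worth, the loop probably doesn't need arrays of readers and getters at all: each list can be fetched, parsed, and printed inside a single pass over the category names. Below is a minimal sketch of that shape, not a tested fix of the program above. To stay runnable without dom4j or EasyHTTPGet, it uses the JDK's built-in XPath classes and a made-up two-title XML snippet in place of a live response; `listUrl`, `titlesFrom`, `SAMPLE`, the `result_set` root element, and the `YOUR_API_KEY` placeholder are all invented for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class BestSellerLoopSketch
{
  // Category names copied from the post (only three shown here).
  static final String[] BOOK_CATS = {
    "hardcover-fiction", "hardcover-nonfiction", "picture-books"
  };

  // Hypothetical stand-in for one downloaded list; NOT a real API response.
  static final String SAMPLE =
    "<result_set><results><book><book_details><book_detail>"
    + "<title>FOOL</title></book_detail></book_details></book>"
    + "<book><book_details><book_detail>"
    + "<title>THE HELP</title></book_detail></book_details></book>"
    + "</results></result_set>";

  // Builds the request URL for one category, following the pattern in the post.
  static String listUrl(String category, String apiKey)
  {
    return "http://api.nytimes.com/svc/books/v2/lists/" + category
      + ".xml?api-key=" + apiKey;
  }

  // Pulls every title out of one XML document, using the post's XPath.
  static List<String> titlesFrom(String xml)
  {
    List<String> titles = new ArrayList<String>();
    try
    {
      XPath xpath = XPathFactory.newInstance().newXPath();
      NodeList nodes = (NodeList) xpath.evaluate(
        "//results/book/book_details/book_detail/title",
        new InputSource(new StringReader(xml)),
        XPathConstants.NODESET);
      for (int i = 0; i < nodes.getLength(); i++)
      {
        titles.add(nodes.item(i).getTextContent());
      }
    }
    catch (XPathExpressionException e)
    {
      throw new IllegalArgumentException("could not read the XML", e);
    }
    return titles;
  }

  public static void main(String[] args)
  {
    for (String cat : BOOK_CATS)
    {
      // A real run would download listUrl(cat, key) here and parse that.
      System.out.println("# " + listUrl(cat, "YOUR_API_KEY"));
      for (String title : titlesFrom(SAMPLE))
      {
        System.out.println(title);
      }
    }
  }
}
```

The point of the sketch is only the structure: the title list is created and consumed inside the same loop iteration, so nothing has to be indexed by category afterward.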

So, semimanually, then, I ran this on one category list at a time and used cat to agglomerate them. So then I had my list of all the current best sellers’ titles:

HANDLE WITH CARE
CORSAIR
THE ASSOCIATE
THE HOST
RUN FOR YOUR LIFE
PROMISES IN DEATH
DEAD SILENCE
HEART AND SOUL
ONE DAY AT A TIME
NIGHT AND DAY
THE GUERNSEY LITERARY AND POTATO PEEL PIE SOCIETY
WHITE WITCH, BLACK CURSE
PATHS OF GLORY
TERMINAL FREEZE
FOOL
THE HELP
DON’T LOOK TWICE
FAULT LINE
TRUE COLORS
STORM FROM THE SHADOWS
OUTLIERS
HOUSE OF CARDS
THE YANKEE YEARS
OUT OF CAPTIVITY
DEWEY
THE LOST CITY OF Z
A LION CALLED CHRISTIAN
A BOLD FRESH PIECE OF HUMANITY
MY BOOKY WOOK
THE UNFORGIVING MINUTE
INSIDE THE REVOLUTION
MELTDOWN
ARE YOU THERE, VODKA? IT’S ME, CHELSEA
JESUS, INTERRUPTED
JOKER ONE
NO ANGEL
LORDS OF FINANCE
THE NEXT 100 YEARS
MULTIPLE BLESSINGS
PICKING COTTON
ACT LIKE A LADY, THINK LIKE A MAN
THE LAST LECTURE
THE POWER OF SOUL
THE SECRET
THE ULTRAMIND SOLUTION
FLAT BELLY DIET!
THE GREAT DEPRESSION AHEAD
PEAKS AND VALLEYS
UNCOMMON
THE SURVIVORS CLUB
MAGNIFICENT MIND AT ANY AGE
THE TOTAL MONEY MAKEOVER
EMOTIONAL FREEDOM
THE 4 DAY DIET
FIGHT FOR YOUR MONEY
THREE CUPS OF TEA
THE MIDDLE PLACE
I HOPE THEY SERVE BEER IN HELL
DREAMS FROM MY FATHER
THE TIPPING POINT
EAT, PRAY, LOVE
THE AUDACITY OF HOPE
MY HORIZONTAL LIFE
90 MINUTES IN HEAVEN
TEAM OF RIVALS
SAME KIND OF DIFFERENT AS ME
BLINK
MARLEY & ME
THE FORGOTTEN MAN
THE OMNIVORE’S DILEMMA
THE ZOOKEEPER’S WIFE
BEAUTIFUL BOY
A WHOLE NEW MIND
ANIMAL, VEGETABLE, MIRACLE
INFIDEL
THE LOVE DARE
WHAT TO EXPECT WHEN YOU’RE EXPECTING
EMERGENCY
SUZE ORMAN’S 2009 ACTION PLAN
NATURALLY THIN
TWILIGHT
SKINNY BITCH
THE FIVE LOVE LANGUAGES
THE POWER OF NOW
HAPPY FOR NO REASON
THE PURPOSE-DRIVEN LIFE
A NEW EARTH
THE BIGGEST LOSER 30-DAY JUMP START
HE’S JUST NOT THAT INTO YOU
THE BIGGEST LOSER FAMILY COOKBOOK
THE SHACK
THE READER
FIREFLY LANE
AMERICAN WIFE
SUNDAYS AT TIFFANY’S
PEOPLE OF THE BOOK
A THOUSAND SPLENDID SUNS
TAKE ONE
THE ALCHEMIST
SARAH’S KEY
THE MIRACLE AT SPEEDY MOTORS
THE BRIEF WONDROUS LIFE OF OSCAR WAO
REVOLUTIONARY ROAD
WATER FOR ELEPHANTS
THE WHITE TIGER
STILL ALICE
THE ELEGANCE OF THE HEDGEHOG
LOVING FRANK
THE KITE RUNNER
LUSH LIFE
THE HOUSE IN THE NIGHT
THE COMPOSER IS DEAD
BLUEBERRY GIRL
LISTEN TO THE WIND: THE STORY OF DR. GREG AND “THREE CUPS OF TEA”
LADYBUG GIRL AND BUMBLEBEE BOY
GALLOP!
CAT
SWING!
NAKED MOLE RAT GETS DRESSED
ALL IN A DAY
MILES TO GO
THE GRAVEYARD BOOK
THIRTEEN REASONS WHY
SCAT
THE HUNGER GAMES
FADE
THREE CUPS OF TEA
3 WILLOWS
SEEKERS: GREAT BEAR LAKE
THE MYSTERIOUS BENEDICT SOCIETY AND THE PERILOUS JOURNEY
EVERMORE
THE BOY IN THE STRIPED PAJAMAS
THE BOOK THIEF
THREE CUPS OF TEA: YOUNG READERS EDITION
TWEAK
WICKED: WITCH AND CURSE
CORALINE
THE MYSTERIOUS BENEDICT SOCIETY
THE TALE OF DESPEREAUX
SLAM
THE TWILIGHT SAGA
HOUSE OF NIGHT
DIARY OF A WIMPY KID
THE 39 CLUES
THE CLIQUE
PERCY JACKSON & THE OLYMPIANS
HARRY POTTER
NIGHT WORLD
KISSED BY AN ANGEL
INKHEART
THE WHOLE TRUTH
HOLD TIGHT
BONES
THE GRAND FINALE
PLAGUE SHIP
LOST SOULS
MONTANA CREEDS: DYLAN
DANGER IN A RED DRESS
THE APPEAL
THE MACKADE BROTHERS: RAFE AND JARED
MAVERICK
SMALL FAVOR
ANGELS AND DEMONS
THE READER
SECRETS
FIRST COMES MARRIAGE
CHASING DARKNESS
SHADOW COMMAND
TEMPTATION RIDGE
CONFESSIONS OF A SHOPAHOLIC

And then, all I wanted to fucking do was run the Markov code from week 5 and make some crazy new titles. I had it working earlier in this process, when I was working with just the hardcover fiction list, and came up with these completely uninteresting results:

TERMINAL FREEZE
TERMINAL FREEZE
TERMINAL FREEZE
THE ASSOCIETY
ONE DAY AT A TIME
NIGHT AND SOUL
ONE DAY AT A TIME
FOOL
STORM FROM THE HELP
FOOL
TERMINAL FREEZE
CORSAIR
ONE DAY AT A TIME
TRUE COLORS
FOOL
ONE DAY AT A TIME

Clearly, the set of words was too small to do anything really interesting with, which is when I decided to make one file of all the lists, to get more words. But then . . . I couldn’t get the Markov code to work anymore. Kept getting this stupid error:

java.lang.StringIndexOutOfBoundsException: String index out of range: 4
at java.lang.String.substring(String.java:1765)
at Markov.feedLine(Markov.java:21)
at MarkovFilter.eachLine(MarkovFilter.java:12)
at com.decontextualize.a2z.TextFilter.internalRun(TextFilter.java:326)
at com.decontextualize.a2z.TextFilter.run(TextFilter.java:208)
at MarkovFilter.main(MarkovFilter.java:6)

And it took me one million years to determine that this was because the code was choking on the colon in

LISTEN TO THE WIND: THE STORY OF DR. GREG AND “THREE CUPS OF TEA”

and then the quotation marks, and then something else, and then I still don’t know what. It will not work if I include any lines after “GALLOP.” So, with that frustrating exception noted, here, finally, are the new titles:

I HOPE THE STORM FROM MY FATHERE, VODKA? IT’S ME, CHELSEA
THE FIVE LANE
MY HORIZONTAL LIFE OF DIFFERENT MIND THE LOVE DARE
THE BRIEF WONDROUS LIFE OF THERE, VODKA? IT’S WIFE
INSIDE THE REVOLUTION AHEAD
THE 4 DAY JUMP START
MAGNIFICENT AS ME, CHELSEA
THE ELEGANCE OF SOUL
THE ELEPHANTS
THE UNFORGOTTEN TO THE YANKEE YEARS
THE HEDGEHOG
THE COLORS
THE BOY
PROMISES IN HEAVEN
I HOPE THE TOTAL LIFE OF Z
HE’S DILEMMA
PROMISES IN THE TIPPING MIND SPLENDID SUNS
THE ZOOKEEPER’S JUST NOT THAT TO EXPECT WHEN YOUR LIFE OF NOW
TWILIGHT ANY AGE
A THOUSE OF DIFFERENT MINUTES IN HEAVEN
A BOLD FRESH PIECE OF OSCAR WAO
THE KIND OF HUMANITY
OUT OF GLORY
SAME KITE TIGER
FIGHT ANY AGE
MAGNIFICENT MIND SOUL
SAME KIND OF HUMANITY
THE FIVE LANGUAGES
THE MIDDLE WITH CARDS
TERMINAL FRESH PIE SOCIATE
90 MINUTES IN DEAD SILENCE
HE’S 2009 ACTIONARY AND SOLUTIONAL FRESH PIECE OF DIFFERENT MIND SOUL
NIGHT FOR ELEGANCE OF THERE, VODKA? IT’S WIFE
STORM FROM THERE, VODKA? IT’S JUST NOT THAT INTO YOUR LIFE OF THE LOVE DAY JUMP START
THE NIGHT ANY AGE
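
A possible explanation for the crash, going only by the stack trace: the exception comes from `substring`, and the index it complains about is 4, which is what would happen if the code asked for a four-character slice of a line with fewer than four characters. The first title after GALLOP! in the combined list is CAT. The sketch below is not the class's Markov code; it is a hypothetical character-level n-gram collector (`ngramsOf` and `ORDER` are invented names, and the order of 4 is an assumption) showing the kind of bounds guard that would sidestep that failure.

```java
import java.util.ArrayList;
import java.util.List;

public class NgramGuardSketch
{
  static final int ORDER = 4; // assumed n-gram length

  // Returns every ORDER-character slice of the line, or an empty list
  // when the line is too short to contain even one slice.
  static List<String> ngramsOf(String line)
  {
    List<String> grams = new ArrayList<String>();
    // The loop bound keeps i + ORDER from running past the end of the
    // line; for a short line such as "CAT" the body never executes.
    for (int i = 0; i + ORDER <= line.length(); i++)
    {
      grams.add(line.substring(i, i + ORDER));
    }
    return grams;
  }

  public static void main(String[] args)
  {
    System.out.println(ngramsOf("CAT"));     // too short: no slices
    System.out.println(ngramsOf("CORSAIR")); // long enough: four slices
  }
}
```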

Comparalator

date merchant

As you may recall, for my midterm project, I got stumped on several seemingly simple tasks. One of those—the most important, since upon it depends my semester-long assignment for Mainstreaming Information—was figuring out a way to compare one list of words to another and pull out the words that were unique to one of those lists. In my head, I can see very easily how this would be done. Given my special way of haphazardly flailing through code, however, I just couldn’t get it to work.

Until today!

In fiddling with the Bayesian comparison code for this week’s homework, I finally pulled out a list of unique words. Of course, this is a completely perverse misuse of that code—like using a steamroller to kill a pillbug—but as long as it works, I don’t fucking care.

So, here’s what I did. In BayesClassifier.java, I replaced the last two for loops with the following:

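    for (String word: uniqueWords) 
    {
      for (BayesCategory bcat: categories) 
      {
        double wordProb = bcat.relevance(word, categories);
        // percentage() returns 0.001 for a word the category has never seen,
        // so anything under 1 means the word is not in that category's file
        if (wordProb < 1)
        {
          println(word);
        } // end if
      } // end for bcat
    } // end for word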

    for (BayesCategory bcat: categories) 
    {
      double score = bcat.score(uniqueWords, categoryWordTotal);
      println("---The following words were not found in " + bcat.getName());
    } // end for bcat

And in BayesCategory.java I replaced the percentage and relevance blocks with

 public double percentage(String word) 
  {
    if (count.containsKey(word)) 
    {
      return count.get(word);
    } // end if
    else 
    {
      return 0.001;
    } // end else
  } // end percentage

  public double relevance(String word, ArrayList<BayesCategory> categories) 
  {
    double percentageSum = 0;
    for (BayesCategory bcat: categories) 
    {
      percentageSum += bcat.percentage(word);
    } // end for bcat
    return percentage(word);
  } // end relevance

So now, if I run the command

$ java BayesClassifier A2_unique.txt < B1_unique.txt | sort >results.txt

I get a list of words that are in B1_unique.txt (The Masada Scroll by Paul Block and Robert Vaughan, 2007) but not in A2_unique.txt (Zuleika Dobson or, An Oxford Love Story by Max Beerbohm, 1911). For example,

Akbar, Allah, Allahu, Apostolic, Ariminum, Arkadiane, Asmodeus, Astaroth, Barabbas, Beelzebub, Bellarmino, Blavatsky, Brandeis, Breviary, Byzantine, Caiaphas, Calpurnius, Catacombs, Charlemagne, Clambering, DNA, Diavolo, Franciscan, Freemasons, GPS, Gymnasium, Haddad, Hades, IDs, IRA, Jettisoning, Kathleen, Lefkovitz, MD, MRI, Masada, Masonic, Muhammad, Muhammadan, Nazarene, Nazareth, Olympics, Orthodoxy, Palatine, Palazzi, Palestine, Palestinian, Palestinians, Petrovna, Pleasant, Plenty, Plunge, Pocketing, Pontiff, Pontifical, Pontius, Praetorian, Prissy, Professors, Protestants, Rasulullaah, Ratsach, Revving, Rosicrucians, Satan, Scrolls, Seder, Shakespeare, Syracuse, Tacitus, Theosophical, Torah, Trastevere, Turkish, USB, Uzi, VAIO, VCR, Yeah, Yechida, Yeetgadal, Yiddish, adrenalin, agita, airliner, airport, ankh, awesome, bitch, bomb, bookstores, braked, breastplate, briefcase, broadsword, broiler, brotherhood, bulrushes, cellular, checkpoint, chuckling, chutzpah, combatant, computer, dashboard, database, departmental, desktop, divorce, dysentery, electricity, enabling, entrepreneurs, firearms, firestorm, fishtailed, flagon, forensics, goatskin, groggily, gunfire, gunman, gunshots, handbag, handball, handbrake, handgun, helicopter, helmets, highwaymen, hijinks, homeland, homeless, homespun, hometown, innkeeper, internship, journalist, kebob, kidnappers, kilometers, lab, laptop, lyre, mawkish, monitor, muezzin, nickname, nightfall, nonbeliever, northeaster, notebook, notepad, notepaper, numerology, paganism, password, pastries, phone, photo, photocopies, photocopy, photograph, photos, pig, pigeons, pistol, playback, police, quintessentially, recycles, redialed, roadblock, roadway, sandwich, screensaver, site, sites, submachine, superheating, synagogue, taped, taxi, terrorism, terrorist, terrorists, thousandfold, thrashing, toga, tortured, trigonometry, universe, unto, vegetables, vehicles, video, videotape, vinegar, violence, warehouses, waterfall, welfare, 
wholeheartedly, whoosh, whore, windshield, worker, workstation, worldwide, yardstick, yarmulkes, yeetkadash, zooming

And if I run the comparison in the opposite direction, I come up with words such as

Abernethy, Abiding, Abimelech, Abyssinian, Academically, Academy, Accidents, Achillem, Adam, Adieu, Admirably, Age, Agency, Agents, Alas, Albert, Alighting, America, Atlantic, Australia, Balliol, Baron, Baronet, Britannia, Broadway, Brobdingnagian, Colonials, Cossacks, Crimea, Devon, Dewlap, Duchess, Duke, Dukedom, Earl, Edwardian, Egyptians, Elizabethan, Englishmen, Englishwoman, Europe, Holbein, Ireland, Iscariot, Isis, Japanese, Kaiser, Liberals, London, Madrid, Meistersinger, Messrs, Monsieur, Napoleon, Novalis, Papist, Parnassus, President, Prince, Professor, Prussians, Romanoff, Segregate, Slavery, Socrates, Switzerland, Tzar, Victoria, Wagnerian, Waterloo, Whithersoever, Zeus, absinthes, acolyte, adventures, affrights, affront, afire, afoot, aforesaid, aggravated, album, analogy, anarchy, ankle, ape, aright, aristocracy, ataraxy, automatically, avalanche, avow, balustrade, bandboxes, bank, beastliest, beau, beauteous, billiards, biography, bodyguard, bosky, boyish, broadcast, bruited, bulldog, businesslike, bustle, calorific, casuistry, catkins, chaperons, chidden, cigarettes, clergyman, cloven, comet, compeers, coquetry, cricket, crinolines, custard, dandiacal, dapperest, decanter, devil, dialogue, diet, dipsomaniacal, disemboldened, disinfatuate, drunken, ebullitions, equipage, exigent, eyelashes, eyelids, farthingales, female, femininity, fishwife, fob, forefather, forerunners, freemasonry, furbelows, gallimaufry, goodlier, gooseberry, gorgeous, gypsy, haberdasher, halfpence, handicapped, handicraft, handiwork, handwriting, hearthrug, helpless, hip, hireling, honeymoon, housemaid, housework, hoyden, hussy, idiotic, impertinent, impudence, inasmuch, incognisant, insipid, insolence, insouciance, item, keyboard, landau, legerdemain, loathsome, luck, maid, maidens, manhood, manumission, matador, maunderers, model, mushroom, nasty, newspaper, noodle, nosegay, novel, oarsmen, omnisubjugant, ostler, otiose, parasol, pinafore, poetry, poltroonery, postprandially, 
prank, prestidigitators, propinquity, queer, romance, sackcloth, salad, sardonic, saucy, schoolmaster, seraglio, sex, skimpy, skirt, snuff, socialistic, streetsters, surcease, surcoat, swooned, teens, telegram, telegraphs, thistledown, thither, thou, threepenny, tomboyish, toys, tradesmen, treacle, ugly, uncouthly, unvexed, vassalage, waylay, welter, wigwam, witchery, withal, woe, woebegone, womanly, womenfolk, wonderfully, wonderingly, wretchedness, wrought, yacht, yesternight, zounds

Exciting!
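
The same result can be had without the classifier. As a minimal sketch, assuming each file has already been read into a list of words, the words unique to one list are just a set difference, which `Set.removeAll` does directly. `onlyIn` is a hypothetical helper name, and the two tiny word lists are made up for the demonstration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetDifferenceSketch
{
  // Returns the words that appear in 'mine' but not in 'theirs', sorted.
  static Set<String> onlyIn(List<String> mine, List<String> theirs)
  {
    Set<String> result = new TreeSet<String>(mine); // sorted copy, duplicates dropped
    result.removeAll(theirs);                        // subtract the other list
    return result;
  }

  public static void main(String[] args)
  {
    List<String> a = Arrays.asList("duke", "parasol", "taxi");
    List<String> b = Arrays.asList("laptop", "taxi", "pistol");
    System.out.println(onlyIn(b, a)); // words in b but not in a
    System.out.println(onlyIn(a, b)); // words in a but not in b
  }
}
```

Using a `TreeSet` keeps the output alphabetized, which stands in for the `sort` at the end of the shell pipeline.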

Cookalator

Five eggs

For our Context Free assignment, I chose to make a recipe generator. Mmm, don’t you want to cook this up right now?

  1. Toss 1/3 cup thinly sliced baking soda and 1 quart egg yolks with a whisk until just blended.
  2. Knead 5 cups cubed light corn syrup with a wooden spoon until golden brown.
  3. In a cast iron Dutch oven, pour 1/4 pinch egg yolks, 1/4 cup whole milk, and 2 pounds sifted egg.
  4. Simmer 3 pounds light corn syrup with 1 cup thinly sliced brown sugar and 2 teaspoons sifted heavy cream.
  5. In a small saucepan, boil 3 ounces thinly sliced egg, 4 cups peeled water, and 3/4 tablespoon water until incorporated.
  6. Whisk together 1 pound ginger and 2 cups candied egg whites until golden brown.
  7. Beat 2 quarts marzipan with a wooden spoon until the pieces are pea-sized.
  8. Grate 1/2 teaspoon sifted unsweetened cocoa powder with 2 quarts baking powder and 3/4 ounce ginger.
  9. Beat 5 pounds thinly sliced buttermilk, 5 quarts sifted whole milk, and 4 tablespoons thinly sliced egg with a wooden spoon until slightly warm.
  10. In a large bowl, boil 5 cups candied kosher salt, 2 cups diced lemon zest, and 1/2 cup sifted buttermilk until incorporated.
  11. In a shallow pan, simmer 5 tablespoons thinly sliced ginger until golden brown.
  12. Beat 2/3 teaspoon brown sugar, 2 ounces diced egg yolks, and 3 teaspoons unsalted butter with 3 teaspoons light corn syrup, 1/2 tablespoon thoroughly washed lemon zest, and 3/4 quart peeled whole milk.
  13. Brush 2 tablespoons grated kosher salt, 2/3 tablespoon sugar, and 4 pounds diced baking powder until it forms a thick syrup.

Here’s the grammar I used:

# procedures
S -> In a V Action IP with a Utensil until When
S -> In a V Action IP until When
S -> In a V Action IP .
S -> Action IP with IP .
S -> Action IP with a Utensil until When
S -> Action IP until When
IP -> IN
IP -> IN and IN
IP -> IN , IN , and IN
IN -> SQuantity SUnit Prep Ingredient
IN -> SQuantity SUnit Ingredient
IN -> PQuantity PUnit Prep Ingredient
IN -> PQuantity PUnit Ingredient
V -> Size Quality Vessel
V -> Size Vessel
V -> Quality Vessel
V -> Vessel

# components
SQuantity -> 1/4 | 1/2 | 1/3 | 3/4 | 2/3 | 1
PQuantity -> 2 | 3 | 4 | 5
SUnit -> pinch | teaspoon | tablespoon | ounce | cup | pound | quart
PUnit -> pinches | teaspoons | tablespoons | ounces | cups | pounds | quarts
Ingredient -> all-purpose flour | water | egg whites | egg yolks | heavy cream | whole milk | almond paste | marzipan | sugar | baking powder | baking soda | kosher salt | ginger | unsalted butter | egg | buttermilk | lemon zest | light corn syrup | bittersweet chocolate | unsweetened cocoa powder | brown sugar
Prep -> grated | cubed | chilled | sifted | diced | candied | thoroughly washed | peeled | thinly sliced | melted
Vessel -> bowl, | saucepan, | pot, | Dutch oven, | pan, | ramekin, | skillet,
Quality -> separate | airtight | cast iron | heatproof | nonstick | heavy-bottomed | shallow | buttered
Size -> large | small | medium
Action -> combine | whisk together | knead | brush | sprinkle | pour | stir | beat | boil | simmer | saute | brown | grate | mash | toss | blend
Utensil -> wooden spoon | whisk
When -> combined. | the pieces are pea-sized. | incorporated. | just blended. | it forms a thick syrup. | translucent. | golden brown. | slightly warm. | completely cooled.

Yummers!
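
For anyone curious how a grammar like this turns into sentences, here is a minimal sketch of the expansion step. It is not the class's context-free code: `GrammarSketch`, `addRule`, and `expand` are invented names, only a handful of the rules above are loaded, and a seeded `Random` picks among the alternatives. Any symbol with no rule is treated as a literal word.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class GrammarSketch
{
  private final Map<String, List<String[]>> rules = new HashMap<String, List<String[]>>();
  private final Random random;

  GrammarSketch(long seed)
  {
    random = new Random(seed);
  }

  // Registers one expansion, e.g. addRule("IP", "IN and IN").
  void addRule(String symbol, String expansion)
  {
    if (!rules.containsKey(symbol))
    {
      rules.put(symbol, new ArrayList<String[]>());
    }
    rules.get(symbol).add(expansion.split(" "));
  }

  // Recursively replaces a symbol with one randomly chosen expansion;
  // symbols that have no rule are emitted as-is.
  String expand(String symbol)
  {
    List<String[]> options = rules.get(symbol);
    if (options == null)
    {
      return symbol;
    }
    String[] chosen = options.get(random.nextInt(options.size()));
    StringBuilder out = new StringBuilder();
    for (String part : chosen)
    {
      if (out.length() > 0) out.append(' ');
      out.append(expand(part));
    }
    return out.toString();
  }

  public static void main(String[] args)
  {
    GrammarSketch g = new GrammarSketch(42L);
    g.addRule("S", "Action IP until When");
    g.addRule("IP", "IN");
    g.addRule("IP", "IN and IN");
    g.addRule("IN", "PQuantity PUnit Ingredient");
    for (String q : "2 3 4 5".split(" ")) g.addRule("PQuantity", q);
    for (String u : "cups pounds quarts".split(" ")) g.addRule("PUnit", u);
    for (String ing : "water sugar ginger".split(" ")) g.addRule("Ingredient", ing);
    for (String act : "beat boil simmer".split(" ")) g.addRule("Action", act);
    g.addRule("When", "golden brown.");
    System.out.println(g.expand("S"));
  }
}
```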

A2Z midterm: Vocabu-lame

vocabulap, slide 7

Apparently, I have learned absolutely nothing all semester, because what seemed like a very straightforward project proved to be completely beyond my abilities.

The overarching goal is to generate data for the visualization I’m making for Lisa Strausfeld and Christian Marc Schmidt’s Mainstreaming Information class. The following are some slides explaining the gist of the project, provisionally called Vocabulap (vocabulary + overlap; not a handsome coinage):

My specific goals for the A2Z midterm were as follows (with subsequent comments in all caps):

For A2Z midterm
===============
Prep
----
* Remove all blank lines
DONE
* Remove all extra spaces
DONE
* Break all lines
DONE
* Rename all to number consecutively: A01, A02, . . . A10 (for old books); B01, B02, . . . B10 (for new books)
DONE

Compare major sets
------------------
* Extract the text from between the body tags in each file. Dump it out as a new file with the extension body.txt in the folder ../body.
THIS IS HARDER THAN IT LOOKS (FOR ME, AT LEAST). EASIER TO JUST CUT THEM OFF BY HAND.
* Concatenate all the files in each set.
DID THIS FROM THE COMMAND LINE, USING CAT
* Make a list of unique words in each concatenated set, with the number of times the word appears.
CAN GET THE UNIQUE WORDS, BUT NOT THE COUNT.
* Strip out all words beginning with numerals.
DONE BY HAND
* Create the following lists:
- Words shared by both major sets, with frequency counts
- Words unique to set 1, with frequency counts
- Words unique to set 2, with frequency counts
I APPARENTLY CANNOT DO ANY OF THIS.

Find unique words in each book
------------------------------
For each book:
* Concatenate all the files in that major set *except* the file for that book.
* Make a list of the unique words, with frequency counts, in
- the current book
- the set of all books except the current one
* Make three lists:
- Words shared by all books in the major set, with frequency counts
- Words that appear only in the current book, with frequency counts
- Words that appear only outside the current book, with frequency counts

Return lines surrounding specific words
---------------------------------------
For each word in a given list:
* Get the line numbers on which it appears.
For each appearance,
* Print the line above
* Print the line with the word, replacing it with itself wrapped in span tags to apply color
* Print the line below
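
The last item on that list, printing each hit with a line of context on either side, is a short loop once the text is held as a list of lines. A minimal sketch, assuming a case-sensitive substring match and a made-up CSS class name (`hit`); `contextFor` is a hypothetical helper, not part of the class library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContextLinesSketch
{
  // For every line containing 'word', returns the line above, the line
  // itself with the word wrapped in a span, and the line below.
  static List<String> contextFor(List<String> lines, String word)
  {
    List<String> out = new ArrayList<String>();
    for (int i = 0; i < lines.size(); i++)
    {
      String line = lines.get(i);
      if (!line.contains(word)) continue;
      if (i > 0) out.add(lines.get(i - 1));               // line above, if any
      out.add(line.replace(word, "<span class=\"hit\">" + word + "</span>"));
      if (i + 1 < lines.size()) out.add(lines.get(i + 1)); // line below, if any
    }
    return out;
  }

  public static void main(String[] args)
  {
    List<String> text = Arrays.asList(
      "He hailed a", "taxi and sped", "toward the airport.");
    for (String s : contextFor(text, "taxi"))
    {
      System.out.println(s);
    }
  }
}
```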

The most essential piece of code that I could not get working is the comparison doodad. It almost worked for, like, five seconds, but it was generating a huge file of every unique word times however many words were in the document, or something like that. When I tried to fix it, it completely stopped working. The offending code is as follows:

/*  1. Takes in a file name from the command line.
    2. Makes a string array out of the hard-coded comparison file.
    3. Imports the contents of the file whose name was passed in.
    4. For each line of the input file (i.e., each word), changes it to 
       lowercase and checks to see if it's contained in the comparison file.
    5. If it's not in the comparison file, checks to see if it's in a hashset of 
       unique words.   
    6. If the word's not in the hashset, add it.
    7. Print the contents of the hashset.
*/

import java.util.ArrayList;
import java.util.HashSet;
import com.decontextualize.a2z.TextFilter;

public class CompareUnique extends TextFilter 
{
    public static void main(String[] args) 
    {
        new CompareUnique().run();
    } // end main

    private String filename = "body/unique/allB_uci.txt";
    private HashSet uniqueWords = new HashSet();
    private HashSet lowercaseWords = new HashSet();

    // make a String array out of the contents of the comparison file
    String[] checkAgainst = new TextFilter().collectLines(fromFile(filename));
    
  public void eachLine(String word) 
  {
    String wordLower = word.toLowerCase();
    for (int i = 0; i < checkAgainst.length; i++)
    {
        if (checkAgainst[i] != null && checkAgainst[i].contains(wordLower))
		{} // end if
		else if (checkAgainst != null)
		{
            if (lowercaseWords != null && lowercaseWords.contains(wordLower))
            { } // end if
            else if (lowercaseWords != null)
            {
                uniqueWords.add(wordLower);
                lowercaseWords.add(wordLower);
            } // end else
 		} // end else
    } // end for
  } // end eachLine

  public void end() 
  {
    for (String reallyunique: uniqueWords) {
      println(reallyunique);
    } // end for
  } // end end

} // end class

I know, it seems very simple, but you have no idea how long it took me to get this far.

So, basically, for the midterm I’ve got bupkis—just a big pile of text files, and a list of unique words for each.
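
The comparison step in the plan above (shared words, words unique to each set, all with frequency counts) can be sketched with ordinary maps. This is a hedged illustration rather than a fix for CompareUnique: it skips TextFilter entirely, takes its words from in-memory lists, and `countWords` and `keepWhere` are made-up helper names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class VocabCompareSketch
{
  // Counts how often each word occurs, ignoring case.
  static Map<String, Integer> countWords(List<String> words)
  {
    Map<String, Integer> counts = new TreeMap<String, Integer>();
    for (String w : words)
    {
      String key = w.toLowerCase();
      Integer seen = counts.get(key);
      counts.put(key, seen == null ? 1 : seen + 1);
    }
    return counts;
  }

  // Copies the entries of 'source' whose word is (wantShared = true)
  // or is not (wantShared = false) also present in 'other'.
  static Map<String, Integer> keepWhere(Map<String, Integer> source,
                                        Map<String, Integer> other,
                                        boolean wantShared)
  {
    Map<String, Integer> kept = new TreeMap<String, Integer>();
    for (Map.Entry<String, Integer> e : source.entrySet())
    {
      if (other.containsKey(e.getKey()) == wantShared)
      {
        kept.put(e.getKey(), e.getValue());
      }
    }
    return kept;
  }

  public static void main(String[] args)
  {
    Map<String, Integer> setA = countWords(Arrays.asList("Duke", "duke", "parasol", "taxi"));
    Map<String, Integer> setB = countWords(Arrays.asList("laptop", "taxi", "Taxi"));
    System.out.println("shared (counts from A): " + keepWhere(setA, setB, true));
    System.out.println("only in A: " + keepWhere(setA, setB, false));
    System.out.println("only in B: " + keepWhere(setB, setA, false));
  }
}
```

Looking a word up in a map is a single check, so there is no need to loop over every line of the comparison file for every incoming word.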

Stuffstash slides

stuffstash cover

Was supposed to present this in 1′, 2′, 10′ today, but (a) I arrived late, and (b) the other presentations ran slightly long. So I have another week to work on it, before I have my turn with the projector. In the meantime, I’m putting together a little survey so I can try to narrow down the proposed feature set a bit. Details to come . . .

Kill me now.

Stabby McKnife

Oh, honestly.

For once I actually set out to be a slacker on the homework. The assignment began,

Modify, augment, or replace one of the in-class examples. A few ideas, in order of increasing complexity:

  • Make Unique.java insensitive to case (i.e., “Foo” and “foo” should not count as different words).
  • Modify WordCount.java to count something other than just words (e.g., particular characters, bigrams, co-occurrences of words, etc.). . . .

So I thought, “Today I feel like doing the easy thing. I’ll just take that first option.”

Yeah, right. Many hours later, after trying several extremely complicated methods of doing this really fucking simple thing, I finally found the rat-simple method that I’d been looking for all along and had all but given up hope of. Goddamnit.

The original code was this:

import java.util.HashSet;
import com.decontextualize.a2z.TextFilter;

public class Unique extends TextFilter {
  public static void main(String[] args) {
    new Unique().run();
  }
  
  private HashSet<String> uniqueWords = new HashSet<String>();

  public void eachLine(String line) {
    String[] tokens = line.split("\\W+");
    for (String t: tokens) {
      uniqueWords.add(t);
    }
  }

  public void end() {
    for (String word: uniqueWords) {
      println(word);
    }
  }
}

and what I came up with after way too much beating my head against the desk is this:

import java.util.HashSet;
import com.decontextualize.a2z.TextFilter;

public class UniqueCI extends TextFilter 
{
  public static void main(String[] args) 
  {
    new UniqueCI().run();
  } // end main
  
  private HashSet<String> uniqueWords = new HashSet<String>();
  private HashSet<String> lowercaseWords = new HashSet<String>();

  public void eachLine(String line) 
  {
    String[] tokens = line.split("\\W+");
    for (String t: tokens) 
    {
      // Add t only if its lowercased form has not been seen before, so that
      // "Foo" and "foo" count as the same word. The first spelling seen wins.
      String tLower = t.toLowerCase();

      if (!lowercaseWords.contains(tLower))
      {
        uniqueWords.add(t);
        lowercaseWords.add(tLower);
      } // end if
    } // end for
  } // end eachLine

  public void end() 
  {
    for (String word: uniqueWords) {
      println(word);
    } // end for
  } // end end
} // end class UniqueCI

If you put this in,

It is a truth universally acknowledged, that a single man in possession of a large fortune must be in want of a wife.
It Is A Truth Universally Acknowledged, That A Single Man In Possession Of A Large Fortune Must Be In Want Of A Wife.

you get this out:

of
possession
wife
truth
be
large
It
fortune
universally
single
that
acknowledged
man
a
must
want
is
in
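
For comparison, here is a minimal sketch of the same idea that runs without the class's TextFilter helper. It is a variation, not the assignment code: the class name CaseInsensitiveUnique and the method uniqueWords are made up for illustration. It keeps one map from each lowercased word to its first-seen spelling instead of two sets, and it uses a LinkedHashMap so the words come out in the order they first appeared rather than in HashSet order. It also skips the empty token that split() can produce for a blank line or a line that starts with punctuation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-alone version; not part of the course library.
class CaseInsensitiveUnique {

  // Returns the first-seen spelling of each word, treating "Foo" and "foo" as the same word.
  static List<String> uniqueWords(List<String> lines) {
    // Key: lowercased word. Value: the spelling that appeared first.
    Map<String, String> firstSpelling = new LinkedHashMap<String, String>();
    for (String line : lines) {
      for (String token : line.split("\\W+")) {
        if (token.isEmpty()) {
          continue; // blank line, or a line that begins with a non-word character
        }
        // Locale.ROOT keeps the lowercasing independent of the machine's language settings.
        firstSpelling.putIfAbsent(token.toLowerCase(Locale.ROOT), token);
      }
    }
    return new ArrayList<String>(firstSpelling.values());
  }

  public static void main(String[] args) {
    List<String> lines = new ArrayList<String>();
    lines.add("It is a truth universally acknowledged");
    lines.add("It Is A Truth Universally Acknowledged");
    for (String word : uniqueWords(lines)) {
      System.out.println(word);
    }
  }
}
```

Whether the map is tidier than the two sets is a matter of taste; the lookup logic is the same either way.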

Big whoop. I wish I could say I learned a lot from this, but I think all I learned is that I’m much more lost than I thought I was.

Photo: The Downward Knife by Jill Greenseth; some rights reserved.

StuffStash proposal

fabric stash

For my 1ʹ 2ʹ 10ʹ project, I’d like to create a craft-supply shopping and inventory website, provisionally titled stuffstash.com.

There are lots of great craft sites (such as Ravelry.com, for knitting and crocheting, and PatternReview.com, Vintage Sewing Pattern Wiki, and BurdaStyle.com, for sewing) that allow registered users to catalog their material or pattern stashes, stash being the most common term for the sprawling collection of supplies that crafters tend to accumulate over time. None of these sites seems to have a dedicated mobile version, however, and none of them allows one to record all the little bits that a project requires: pattern, fabric or yarn, notions, needles. This is unfortunate, as it would be really useful to be able to look up, while one is in a store, what one has and what one needs.

The website would have uses at all three interaction distances.

The mobile application should have a clean, simple interface that’s optimized for (a) looking stuff up when you’re in the fabric or yarn store and (b) forcing other people to look at a slide show of your finished or in-progress projects. The shopping lookup would have two paths—pattern first or material first. That is, you’re either holding a pattern in your hand and trying to remember what fabrics or yarns you have at home, and how much, or you’re fondling some yarn or fabric and trying to think of what you could make out of it and how much you’d need to buy to do so.

Other handy features to have while craft-supply shopping would be a unit converter, a reference chart of yarn sizes (there are several systems for indicating yarn weight—names such as “worsted” and “fingering,” recommended stitches per inch + needle size, wraps per inch, supposedly industry-standard numbered categories, etc.), needle inventory, and notions inventory (e.g., 1 × 7˝ black zipper, 24 × 3/4˝ rhinestone buttons). For each pattern in your collection, you’d want to be able to include

  • photos or illustrations of each view;¹
  • the notions needed (thread, buttons, seam tape, beads, etc.);
  • the quantity of material or yarn needed;²
  • the kinds of material or yarn recommended by the publisher or pattern author; and
  • the URL where the complete pattern can be found, if applicable.
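
To make the pattern-first lookup a bit more concrete, here is a rough sketch of how a pattern record and a "what do I still need to buy" check might look. Everything in it is an assumption for illustration: the names StashSketch, Pattern, yardsToBuy, and notionsToBuy are invented, and the yardage is reduced to a single number rather than the full matrix of sizes, widths, and views that a real pattern envelope shows. The unit converter relies only on the fact that a yard is exactly 0.9144 meters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical data model; nothing here is an existing StuffStash API.
class StashSketch {
  static final double METERS_PER_YARD = 0.9144; // exact, by definition of the international yard

  // One pattern in the collection, simplified from the list above.
  static class Pattern {
    final String name;
    final double yardsNeeded; // simplification: one yardage instead of a size/width/view matrix
    final Map<String, Integer> notionsNeeded = new LinkedHashMap<String, Integer>(); // notion -> quantity
    final String url; // null when the pattern is not online

    Pattern(String name, double yardsNeeded, String url) {
      this.name = name;
      this.yardsNeeded = yardsNeeded;
      this.url = url;
    }
  }

  // Pattern-first lookup: yards still to buy, given what is already in the stash.
  static double yardsToBuy(Pattern pattern, double yardsInStash) {
    return Math.max(0.0, pattern.yardsNeeded - yardsInStash);
  }

  // Notions still to buy: needed quantity minus stash quantity, omitting anything fully covered.
  static Map<String, Integer> notionsToBuy(Pattern pattern, Map<String, Integer> notionsInStash) {
    Map<String, Integer> shoppingList = new LinkedHashMap<String, Integer>();
    for (Map.Entry<String, Integer> needed : pattern.notionsNeeded.entrySet()) {
      Integer owned = notionsInStash.get(needed.getKey());
      int missing = needed.getValue() - (owned == null ? 0 : owned);
      if (missing > 0) {
        shoppingList.put(needed.getKey(), missing);
      }
    }
    return shoppingList;
  }

  // Unit converter for shopping where fabric is sold by the meter.
  static double yardsToMeters(double yards) {
    return yards * METERS_PER_YARD;
  }

  public static void main(String[] args) {
    Pattern dress = new Pattern("Example dress", 3.5, null);
    dress.notionsNeeded.put("7-inch black zipper", 1);
    dress.notionsNeeded.put("3/4-inch rhinestone button", 24);

    Map<String, Integer> stash = new LinkedHashMap<String, Integer>();
    stash.put("3/4-inch rhinestone button", 10);

    System.out.println("Fabric to buy (yards): " + yardsToBuy(dress, 2.0));
    System.out.println("Notions to buy: " + notionsToBuy(dress, stash));
  }
}
```

The material-first path would run the same data the other way: start from a fabric in the store and filter the patterns whose recommended materials and yardage it could satisfy.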

All the gnarly data entry would happen in a regular browser, of course, because typing on a phone or iPod Touch completely sucks. It would be great to be able to import data from Ravelry and PatternReview, but only Ravelry seems to have an API. CSV import might also come in handy. And one should be able to print a shopping list for a given pattern, to carry to the store, for those [me] whose phones are not qualified to do anything other than make calls.

10ʹ

The project slide show function would be appropriate for TV viewing, so that one could delight one’s loved ones with a slide show of all the stuff one has been making.

  1. A sewing pattern envelope typically shows photos or stylized drawings of several options, named with letters or numbers—View A, View 1. It also includes line drawings of the front and back of the finished garment.
  2. Yardage for sewing patterns is traditionally shown as a matrix of all the options for each garment size, fabric width, fabric direction (“with nap,” i.e., monodirectional print or texture, vs. “without nap,” i.e., a fabric in which there is no difference between up and down), and view (see note 1).

India’s ITP blog