Category Archives: final project

OMFG! I can’t believe this actually works!

Book grid sorted by word count

I made a Flex application!!!1111!!!!

Okay, so what is this thing?

Well, each green block represents the total word count of a book; that count is also given by the big number beneath each book. The orange stripe represents the number of unique words in the book, case insensitive (so The and the count as one). It’s all a continuation of this data visualization project I’ve been working on all semester.

And what’s it made of? Aside from snips, snails, and puppydog tails, it comprises some really stupid code, four static XML docs, some CSS, and one . . . class? I think it’s what passes for a class in Flex. God only knows. Here’s the filthy, embarrassing source code.

Yes, I am well aware that this is some of the most fucked-up, redundant, unnecessarily hard-coded shit you’ve ever seen, and that the radio buttons don’t work right on the first click, but considering that I only started learning Flex on, like, Thursday, and that I didn’t start trying to code this thing in earnest until the wee hours of Monday morning, I think it’s Oh. Kay.

And, yeah, no, I couldn’t figure out how to get rid of the gap between the orange and green blocks. CSS in Flex is really weird and undernourished.

Next steps:

  • Make this code not suck.
  • Add more data (I’ve got about 20 more books on hand to process, and then it’s time to hit BitTorrent).
  • Make the code for pulling out the word counts not suck. Right now it’s case-sensitive, which I don’t want it to be (I’ve been batch-converting the text to lowercase before processing it), and I’ve been manually deleting all words that start with numbers or that look likely to be roman numerals. I’ve also been doing this from Terminal, one file at a time, when it really ought to be able to process in batches. This proves that I am not nearly lazy enough, otherwise I would have dealt with this weeks ago, in order to spare myself a lot of tedious busywork.
  • Make the design not suck (e.g., get rid of those gaps, and replace the radio buttons with something less nasty).
  • Add more views—for example, something should actually happen when you mouse over or click on a book thumbnail, besides it lighting up in hideous powder blue. There are a lot more ways I want to slice up this data, and I still want to be able to compare books or sets of books. That will require building the word-counting code into the Flex app somehow.
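For the word-count item in the list above, here is a minimal sketch of how a batch counter might look. It is not the code behind the app. It assumes Java 16 or later, treats a "word" as a run of letters and digits with optional straight-apostrophe contractions, and uses a crude roman-numeral regex that will also throw out real words such as "mix". The class name `WordCounter` and the record `Counts` are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCounter {
    // Assumed definition of a word: letters/digits, optionally followed by 'letters (don't, it's).
    private static final Pattern TOKEN = Pattern.compile("[\\p{L}\\p{N}]+(?:'\\p{L}+)*");

    // Crude check for lowercase roman numerals up to 3999; it also matches a few real words.
    private static final Pattern ROMAN = Pattern.compile(
            "m{0,3}(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})");

    /** Total word count plus the set of distinct words, both case-insensitive. */
    record Counts(int total, Set<String> unique) {}

    static Counts count(String text) {
        int total = 0;
        Set<String> unique = new TreeSet<>();
        // Lowercasing here replaces the separate batch-convert step.
        Matcher m = TOKEN.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) {
            String word = m.group();
            if (Character.isDigit(word.charAt(0))) {
                continue; // skip words that start with a number
            }
            if (word.length() > 1 && ROMAN.matcher(word).matches()) {
                continue; // skip likely roman numerals, but keep the pronoun "i"
            }
            total++;
            unique.add(word);
        }
        return new Counts(total, unique);
    }

    public static void main(String[] args) throws IOException {
        // Process every file named on the command line in one go.
        for (String name : args) {
            Counts c = count(Files.readString(Path.of(name)));
            System.out.println(name + "\t" + c.total() + "\t" + c.unique().size());
        }
    }
}
```

Run as `java WordCounter.java book1.txt book2.txt` (the file names are placeholders), it prints one tab-separated line per book: file name, total words, unique words.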

But in the meantime, w000t! It works!

Comparalator

date merchant

As you may recall, for my midterm project, I got stumped on several seemingly simple tasks. One of those—the most important, since upon it depends my semester-long assignment for Mainstreaming Information—was figuring out a way to compare one list of words to another and pull out the words that were unique to one of those lists. In my head, I can see very easily how this would be done. Given my special way of haphazardly flailing through code, however, I just couldn’t get it to work.

Until today!

In fiddling with the Bayesian comparison code for this week’s homework, I finally pulled out a list of unique words. Of course, this is a completely perverse misuse of that code—like using a steamroller to kill a pillbug—but as long as it works, I don’t fucking care.

So, here’s what I did. In BayesClassifier.java, I replaced the last two for loops with the following:

[java]for (String word : uniqueWords)
{
    for (BayesCategory bcat : categories)
    {
        double wordProb = bcat.relevance(word, categories);
        // below 1 means the 0.001 placeholder, i.e. the word is missing from this category
        if (wordProb < 1) { println(word); } else {}
    } // end for bcat
} // end for word

for (BayesCategory bcat : categories)
{
    double score = bcat.score(uniqueWords, categoryWordTotal);
    println("---The following words were not found in " + bcat.getName());
} // end for bcat[/java]

And in BayesCategory.java I replaced the percentage and relevance blocks with

[java]public double percentage(String word)
{
    if (count.containsKey(word)) { return count.get(word); } // end if
    else { return 0.001; } // end else
} // end percentage

public double relevance(String word, ArrayList<BayesCategory> categories)
{
    // percentageSum is still computed but no longer used
    double percentageSum = 0;
    for (BayesCategory bcat : categories)
    {
        percentageSum += bcat.percentage(word);
    } // end for bcat
    return percentage(word);
} // end relevance[/java]

So now, if I run the command

$ java BayesClassifier A2_unique.txt < B1_unique.txt | sort >results.txt

I get a list of words that are in B1_unique.txt (The Masada Scroll by Paul Block and Robert Vaughan, 2007) but not in A2_unique.txt (Zuleika Dobson or, An Oxford Love Story by Max Beerbohm, 1911). For example,

Akbar, Allah, Allahu, Apostolic, Ariminum, Arkadiane, Asmodeus, Astaroth, Barabbas, Beelzebub, Bellarmino, Blavatsky, Brandeis, Breviary, Byzantine, Caiaphas, Calpurnius, Catacombs, Charlemagne, Clambering, DNA, Diavolo, Franciscan, Freemasons, GPS, Gymnasium, Haddad, Hades, IDs, IRA, Jettisoning, Kathleen, Lefkovitz, MD, MRI, Masada, Masonic, Muhammad, Muhammadan, Nazarene, Nazareth, Olympics, Orthodoxy, Palatine, Palazzi, Palestine, Palestinian, Palestinians, Petrovna, Pleasant, Plenty, Plunge, Pocketing, Pontiff, Pontifical, Pontius, Praetorian, Prissy, Professors, Protestants, Rasulullaah, Ratsach, Revving, Rosicrucians, Satan, Scrolls, Seder, Shakespeare, Syracuse, Tacitus, Theosophical, Torah, Trastevere, Turkish, USB, Uzi, VAIO, VCR, Yeah, Yechida, Yeetgadal, Yiddish, adrenalin, agita, airliner, airport, ankh, awesome, bitch, bomb, bookstores, braked, breastplate, briefcase, broadsword, broiler, brotherhood, bulrushes, cellular, checkpoint, chuckling, chutzpah, combatant, computer, dashboard, database, departmental, desktop, divorce, dysentery, electricity, enabling, entrepreneurs, firearms, firestorm, fishtailed, flagon, forensics, goatskin, groggily, gunfire, gunman, gunshots, handbag, handball, handbrake, handgun, helicopter, helmets, highwaymen, hijinks, homeland, homeless, homespun, hometown, innkeeper, internship, journalist, kebob, kidnappers, kilometers, lab, laptop, lyre, mawkish, monitor, muezzin, nickname, nightfall, nonbeliever, northeaster, notebook, notepad, notepaper, numerology, paganism, password, pastries, phone, photo, photocopies, photocopy, photograph, photos, pig, pigeons, pistol, playback, police, quintessentially, recycles, redialed, roadblock, roadway, sandwich, screensaver, site, sites, submachine, superheating, synagogue, taped, taxi, terrorism, terrorist, terrorists, thousandfold, thrashing, toga, tortured, trigonometry, universe, unto, vegetables, vehicles, video, videotape, vinegar, violence, warehouses, waterfall, welfare, 
wholeheartedly, whoosh, whore, windshield, worker, workstation, worldwide, yardstick, yarmulkes, yeetkadash, zooming

And if I run the comparison in the opposite direction, I come up with words such as

Abernethy, Abiding, Abimelech, Abyssinian, Academically, Academy, Accidents, Achillem, Adam, Adieu, Admirably, Age, Agency, Agents, Alas, Albert, Alighting, America, Atlantic, Australia, Balliol, Baron, Baronet, Britannia, Broadway, Brobdingnagian, Colonials, Cossacks, Crimea, Devon, Dewlap, Duchess, Duke, Dukedom, Earl, Edwardian, Egyptians, Elizabethan, Englishmen, Englishwoman, Europe, Holbein, Ireland, Iscariot, Isis, Japanese, Kaiser, Liberals, London, Madrid, Meistersinger, Messrs, Monsieur, Napoleon, Novalis, Papist, Parnassus, President, Prince, Professor, Prussians, Romanoff, Segregate, Slavery, Socrates, Switzerland, Tzar, Victoria, Wagnerian, Waterloo, Whithersoever, Zeus, absinthes, acolyte, adventures, affrights, affront, afire, afoot, aforesaid, aggravated, album, analogy, anarchy, ankle, ape, aright, aristocracy, ataraxy, automatically, avalanche, avow, balustrade, bandboxes, bank, beastliest, beau, beauteous, billiards, biography, bodyguard, bosky, boyish, broadcast, bruited, bulldog, businesslike, bustle, calorific, casuistry, catkins, chaperons, chidden, cigarettes, clergyman, cloven, comet, compeers, coquetry, cricket, crinolines, custard, dandiacal, dapperest, decanter, devil, dialogue, diet, dipsomaniacal, disemboldened, disinfatuate, drunken, ebullitions, equipage, exigent, eyelashes, eyelids, farthingales, female, femininity, fishwife, fob, forefather, forerunners, freemasonry, furbelows, gallimaufry, goodlier, gooseberry, gorgeous, gypsy, haberdasher, halfpence, handicapped, handicraft, handiwork, handwriting, hearthrug, helpless, hip, hireling, honeymoon, housemaid, housework, hoyden, hussy, idiotic, impertinent, impudence, inasmuch, incognisant, insipid, insolence, insouciance, item, keyboard, landau, legerdemain, loathsome, luck, maid, maidens, manhood, manumission, matador, maunderers, model, mushroom, nasty, newspaper, noodle, nosegay, novel, oarsmen, omnisubjugant, ostler, otiose, parasol, pinafore, poetry, poltroonery, postprandially, 
prank, prestidigitators, propinquity, queer, romance, sackcloth, salad, sardonic, saucy, schoolmaster, seraglio, sex, skimpy, skirt, snuff, socialistic, streetsters, surcease, surcoat, swooned, teens, telegram, telegraphs, thistledown, thither, thou, threepenny, tomboyish, toys, tradesmen, treacle, ugly, uncouthly, unvexed, vassalage, waylay, welter, wigwam, witchery, withal, woe, woebegone, womanly, womenfolk, wonderfully, wonderingly, wretchedness, wrought, yacht, yesternight, zounds

Exciting!
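For comparison, the same "words in one list but not the other" result can be had without the Bayesian machinery, using a plain set difference. This is a sketch under assumptions, not the code used above: the class name `Comparalator` and the method `onlyIn` are hypothetical, and each input file is assumed to hold one word per line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class Comparalator {
    /** Returns the words that appear in source but not in other, in sorted order. */
    static Set<String> onlyIn(Collection<String> source, Collection<String> other) {
        Set<String> result = new TreeSet<>(source);   // TreeSet keeps the output sorted
        result.removeAll(new HashSet<>(other));        // drop everything the other list has
        return result;
    }

    public static void main(String[] args) throws IOException {
        // args[0] and args[1] are word-list files, one word per line (an assumed format).
        List<String> a = Files.readAllLines(Path.of(args[0]));
        List<String> b = Files.readAllLines(Path.of(args[1]));

        System.out.println("---The following words were not found in " + args[0]);
        onlyIn(b, a).forEach(System.out::println);

        System.out.println("---The following words were not found in " + args[1]);
        onlyIn(a, b).forEach(System.out::println);
    }
}
```

Something like `java Comparalator.java A2_unique.txt B1_unique.txt` would print both directions in one run. The comparison is case-sensitive, so lowercase the lists first if The and the should count as one.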

Stuffstash slides

stuffstash cover

Was supposed to present this in 1′, 2′, 10′ today, but (a) I arrived late, and (b) the other presentations ran slightly long. So I have another week to work on it, before I have my turn with the projector. In the meantime, I’m putting together a little survey so I can try to narrow down the proposed feature set a bit. Details to come . . .

StuffStash proposal

fabric stash

For my 1ʹ 2ʹ 10ʹ project, I’d like to create a craft-supply shopping and inventory website, provisionally titled stuffstash.com.

There are lots of great craft sites (such as Ravelry.com, for knitting and crocheting, and PatternReview.com, Vintage Sewing Pattern Wiki, and BurdaStyle.com, for sewing) that allow registered users to catalog their material or pattern stashes, stash being the most common term for the sprawling collection of supplies that crafters tend to accumulate over time. None of these sites seems to have a dedicated mobile version, however, and none of them allows one to record all the little bits that a project requires: pattern, fabric or yarn, notions, needles. This is unfortunate, as it would be really useful to be able to look up, while one is in a store, what one has and what one needs.

The website would have uses at all three interaction distances.

The mobile application should have a clean, simple interface that’s optimized for (a) looking stuff up when you’re in the fabric or yarn store and (b) forcing other people to look at a slide show of your finished or in-progress projects. The shopping lookup would have two paths—pattern first or material first. That is, you’re either holding a pattern in your hand and trying to remember what fabrics or yarns you have at home, and how much, or you’re fondling some yarn or fabric and trying to think of what you could make out of it and how much you’d need to buy to do so.

Other handy features to have while craft-supply shopping would be a unit converter, a reference chart of yarn sizes (there are several systems for indicating yarn weight—names such as “worsted” and “fingering,” recommended stitches per inch + needle size, wraps per inch, supposedly industry-standard numbered categories, etc.), needle inventory, and notions inventory (e.g., 1 × 7˝ black zipper, 24 × 3/4˝ rhinestone buttons). For each pattern in your collection, you’d want to be able to include

  • photos or illustrations of each view;¹
  • the notions needed (thread, buttons, seam tape, beads, etc.);
  • the quantity of material or yarn needed;²
  • the kinds of material or yarn recommended by the publisher or pattern author; and
  • the URL where the complete pattern can be found, if applicable.
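As a rough illustration of the pattern record listed above and the pattern-first lookup, here is a minimal sketch. Every name in it (`StuffStash`, `PatternEntry`, `notionsToBuy`) is hypothetical, it assumes Java 16 or later, and it squashes the yardage matrix down to a single number to stay short.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StuffStash {
    /** One pattern in the collection; the fields mirror the list above. */
    record PatternEntry(
            String name,
            List<String> viewImages,            // photos or illustrations of each view
            Map<String, Integer> notions,       // notion -> quantity needed
            double yardsNeeded,                 // simplification: one number, not the full matrix
            List<String> recommendedMaterials,  // as suggested by the publisher or author
            String url) {}                      // null if the pattern is not online

    /** Pattern-first lookup: which notions, and how many, are still missing from the stash? */
    static Map<String, Integer> notionsToBuy(PatternEntry pattern, Map<String, Integer> stash) {
        Map<String, Integer> toBuy = new LinkedHashMap<>();
        pattern.notions().forEach((notion, needed) -> {
            int missing = needed - stash.getOrDefault(notion, 0);
            if (missing > 0) {
                toBuy.put(notion, missing);
            }
        });
        return toBuy;
    }

    public static void main(String[] args) {
        PatternEntry blouse = new PatternEntry(
                "Example blouse",
                List.of("view-a.jpg"),
                Map.of("7in black zipper", 1, "3/4in rhinestone button", 24),
                2.5,
                List.of("cotton lawn"),
                null);
        Map<String, Integer> stash = Map.of("3/4in rhinestone button", 10);

        // Prints each missing notion with the quantity still to buy.
        notionsToBuy(blouse, stash)
                .forEach((notion, qty) -> System.out.println(qty + " x " + notion));
    }
}
```

The same map of missing items could feed the printable shopping list; a material-first lookup would go the other way, filtering patterns by what is already in the stash.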

All the gnarly data entry would happen in a regular browser, of course, because typing on a phone or iPod Touch completely sucks. It would be great to be able to import data from Ravelry and PatternReview, but only Ravelry seems to have an API. CSV import might also come in handy. And one should be able to print a shopping list for a given pattern, to carry to the store, for those [me] whose phones are not qualified to do anything other than make calls.

10ʹ

The project slide show function would be appropriate for TV viewing, so that one could delight one’s loved ones with a slide show of all the stuff one has been making.

  1. A sewing pattern envelope typically shows photos or stylized drawings of several options, named with letters or numbers—View A, View 1. It also includes line drawings of the front and back of the finished garment.
  2. Yardage for sewing patterns is traditionally shown as a matrix of all the options for each garment size, fabric width, fabric direction (“with nap,” i.e., monodirectional print or texture, vs. “without nap,” i.e., a fabric in which there is no difference between up and down), and view (see note 1).

Baby steps

grid of 61 colored squares

This is the second smidgen of the code for our final project. It pulls RGB values and color names from a tab-delimited text file (which is, itself, based on the actual Krylon color options) and outputs this grid of swatches. The swatches don’t do anything yet—just drawing them took me, like, two days, thank you very much, and that was with some very helpful help from Shawn. Partly this is because I apparently can’t keep in my head for more than thirty seconds how arrays and objects work, and partly it’s because I just. can’t. focus. And partly it’s because I apparently have no idea what the fuck I’m doing.
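The read-a-file, lay-out-a-grid step described above might be sketched like this. It is not the project's code: the column order (name, then R, G, B, separated by tabs), the class name `SwatchGrid`, and the grid dimensions are all assumptions, and it only computes where each swatch goes rather than drawing anything. It needs Java 16 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class SwatchGrid {
    /** A named color plus the top-left pixel position of its square in the grid. */
    record Swatch(String name, int r, int g, int b, int x, int y) {}

    /**
     * Parses lines of the assumed form name<TAB>r<TAB>g<TAB>b and places each swatch
     * in a grid that is `columns` squares wide, filling rows left to right.
     */
    static List<Swatch> layout(List<String> lines, int columns, int size, int gap) {
        List<Swatch> swatches = new ArrayList<>();
        for (String line : lines) {
            if (line.isBlank()) {
                continue; // tolerate empty lines in the data file
            }
            String[] f = line.split("\t");
            int index = swatches.size();
            int x = (index % columns) * (size + gap); // column -> horizontal offset
            int y = (index / columns) * (size + gap); // row -> vertical offset
            swatches.add(new Swatch(
                    f[0].trim(),
                    Integer.parseInt(f[1].trim()),
                    Integer.parseInt(f[2].trim()),
                    Integer.parseInt(f[3].trim()),
                    x, y));
        }
        return swatches;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the tab-delimited color file; 8 columns of 40px squares with a 4px gap.
        List<Swatch> grid = layout(Files.readAllLines(Path.of(args[0])), 8, 40, 4);
        grid.forEach(System.out::println);
    }
}
```

Each `Swatch` carries everything a drawing loop would need: a fill color and a position. Keeping the position on the object should also make it easier to work out later which swatch got clicked.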

I’m beginning to really like Diego’s Plan B, as proposed over the weekend:

Fake our own deaths.
