Oh, honestly.
For once I actually set out to be a slacker on the homework. The assignment began,
Modify, augment, or replace one of the in-class examples. A few ideas, in order of increasing complexity:
- Make Unique.java insensitive to case (i.e., “Foo” and “foo” should not count as different words).
- Modify WordCount.java to count something other than just words (e.g., particular characters, bigrams, co-occurrences of words, etc.). . . .
So I thought, “Today I feel like doing the easy thing. I’ll just take that first option.”
Yeah, right. Many hours later, after trying several extremely complicated methods of doing this really fucking simple thing, I finally found the rat-simple method that I’d been looking for all along and had all but given up hope of finding. Goddamnit.
The original code was this:
[java]import java.util.HashSet;
import com.decontextualize.a2z.TextFilter;

public class Unique extends TextFilter {
  public static void main(String[] args) {
    new Unique().run();
  }

  // Every distinct word seen so far (case-sensitive).
  private HashSet<String> uniqueWords = new HashSet<String>();

  public void eachLine(String line) {
    // Split on runs of non-word characters.
    String[] tokens = line.split("\\W+");
    for (String t: tokens) {
      uniqueWords.add(t);
    }
  }

  public void end() {
    for (String word: uniqueWords) {
      println(word);
    }
  }
}[/java]
and what I came up with after way too much beating my head against the desk is this:
[java]import java.util.HashSet;
import com.decontextualize.a2z.TextFilter;

public class UniqueCI extends TextFilter
{
    public static void main(String[] args)
    {
        new UniqueCI().run();
    } // end main

    // Words as first seen, with their original capitalization.
    private HashSet<String> uniqueWords = new HashSet<String>();
    // Lowercased copies, used only to check for duplicates.
    private HashSet<String> lowercaseWords = new HashSet<String>();

    public void eachLine(String line)
    {
        String[] tokens = line.split("\\W+");
        for (String t: tokens)
        {
            // If the all-lowercase hashset already contains t lowercased, don't add anything.
            String tLower = t.toLowerCase();
            if (!lowercaseWords.contains(tLower))
            {
                uniqueWords.add(t);
                lowercaseWords.add(tLower);
            } // end if
        } // end for
    } // end eachLine

    public void end()
    {
        for (String word: uniqueWords)
        {
            println(word);
        } // end for
    } // end end
} // end class UniqueCI[/java]
If you put this in,
It is a truth universally acknowledged, that a single man in possession of a large fortune must be in want of a wife.
It Is A Truth Universally Acknowledged, That A Single Man In Possession Of A Large Fortune Must Be In Want Of A Wife.
you get this out:
of possession wife truth be large It fortune universally single that acknowledged man a must want is in
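For anyone who wants to poke at the idea without the course's TextFilter class, the same check can be sketched in plain Java. This is only a rough standalone version under a few assumptions: the class name UniqueSketch and the method firstSpellings are made up for illustration, it takes its lines from a list instead of standard input, and it swaps in a LinkedHashSet so the words come out in first-seen order rather than HashSet's arbitrary order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UniqueSketch {
    // Hypothetical helper: keeps the first-seen spelling of each word, ignoring case.
    public static Set<String> firstSpellings(List<String> lines) {
        Set<String> seenLower = new HashSet<String>();        // lowercased words already seen
        Set<String> firstSeen = new LinkedHashSet<String>();  // original spellings, in order
        for (String line : lines) {
            for (String t : line.split("\\W+")) {
                if (t.isEmpty()) {
                    continue; // split() yields "" when a line starts with punctuation
                }
                // add() returns false when the lowercased word is already in the set
                if (seenLower.add(t.toLowerCase())) {
                    firstSeen.add(t);
                }
            }
        }
        return firstSeen;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "It is a truth universally acknowledged,",
            "It Is A Truth Universally Acknowledged.");
        for (String word : firstSpellings(lines)) {
            System.out.println(word);
        }
    }
}
```

Run as written, this prints It, is, a, truth, universally, and acknowledged, one per line. The only real trick is that Set.add() reports whether the element was new, so the separate contains() check isn't strictly needed.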
Big whoop. I wish I could say I learned a lot from this, but I think all I learned is that I’m much more lost than I thought I was.
Photo: The Downward Knife by Jill Greenseth; some rights reserved.