Minor Text Munger
8-14-2014

Ok, a small project is close to being complete. The project is called Err-Text, and it is one of my first ventures into analyzing and modifying text. The idea is to take text and apply styles to it (e.g. Common-Grammar-Mistakes, Misspellings, 4Chan, Orcish, Rabelaisian, etc.) to generate new text. The original motivation was to create a more advanced Dis-Emvoweler, one that could degrade text in subtler and perhaps more fitting ways than just removing the vowels. I thought it could be useful for forum moderators, or possibly for people with terrible senses of humor like myself. As I worked on it, though, I've also enjoyed adding styles that are more humorous, or interesting, or that possibly even improve the original text.
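To make the idea concrete, here is a minimal sketch of a plain disemvoweler next to a table-driven "style". It is written in Python for brevity rather than the C# that Err-Text actually uses, and the names `disemvowel`, `apply_word_style`, and `COMMON_GRAMMAR_MISTAKES` are hypothetical, made up for this illustration rather than taken from the project.

```python
import re


def disemvowel(text):
    """Strip ASCII vowels from the text, the classic blunt degradation."""
    return re.sub(r"[aeiouAEIOU]", "", text)


# Hypothetical style table: each key is a correct word, each value is the
# mistake to substitute for it. A real style would need far more entries.
COMMON_GRAMMAR_MISTAKES = {
    "you're": "your",
    "they're": "their",
    "its": "it's",
}


def apply_word_style(text, replacements):
    """Swap whole words found in the table and leave everything else alone."""
    def swap(match):
        word = match.group(0)
        # Lookup is case-insensitive; the replacement is always lowercase.
        return replacements.get(word.lower(), word)

    # Treat runs of letters and apostrophes as words so "you're" stays whole.
    return re.sub(r"[A-Za-z']+", swap, text)


if __name__ == "__main__":
    print(disemvowel("Hello World"))
    print(apply_word_style("I think you're right", COMMON_GRAMMAR_MISTAKES))
```

Even this toy version shows why a style can be subtler than vowel removal: the output stays perfectly readable, just slightly wrong.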

The project is really meant to be one of a pair; the second part will be Irr-Text, which will generate H.P. Lovecraft/Necronomicon-like texts, similar to the ones found in Annihilation. Hopefully the second part will go faster.

Oh yes, and everything is terrible. For some reason there aren't any good, simple dictionaries for C#; instead, all that's available are these massive NLP programs meant for your PhD research. I finally found a data file in WordNet that I could read and use, but it's a long way from perfect. In some ways it has too little information (it doesn't realize that "crushing" is a word, only that "crush" is), and in other ways too much (it says that "are" is a noun, since apparently "are" is a really obscure unit of measurement). (EDIT: ok, I'm an idiot, slightly different search terms produced a wealth of excellent dictionaries.)
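For the "crushing" versus "crush" problem, one common workaround is to strip a few suffixes before giving up on a lookup. The sketch below is in Python and is not what Err-Text does; `TOY_DICTIONARY` and `find_base_form` are hypothetical names, and the handful of suffix rules is an assumption that will miss irregular forms.

```python
# Toy stand-in for a dictionary that only lists base forms.
TOY_DICTIONARY = {"crush", "make", "run", "text"}


def find_base_form(word, dictionary):
    """Return the dictionary entry that word appears to inflect, or None."""
    word = word.lower()
    if word in dictionary:
        return word
    for suffix in ("ing", "ed", "es", "s"):
        # Require at least two letters to remain after removing the suffix.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            stem = word[: -len(suffix)]
            # Try the bare stem, then a restored silent "e" (making -> make).
            candidates = [stem, stem + "e"]
            # Undo a doubled final consonant (running -> run).
            if len(stem) >= 2 and stem[-1] == stem[-2]:
                candidates.append(stem[:-1])
            for candidate in candidates:
                if candidate in dictionary:
                    return candidate
    return None


if __name__ == "__main__":
    for w in ("crushing", "making", "running", "texts", "blorping"):
        print(w, "->", find_base_form(w, TOY_DICTIONARY))
```

This does nothing for the opposite problem of too much information, such as "are" being listed as a noun; that would need something like word-frequency data to decide which sense is plausible.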

And then there is Unity again. For a second time I tried using Unity to provide a web interface for my app, and for the second time it was a mistake. Unity doesn't let me highlight text changes like WinForms does, and it does not yet support .NET 4.0 (which was released years ago), so I have to downgrade all of the utility libraries that Err-Text depends on. Oh, and the web version doesn't allow the Serialization libraries, so that's another downgrade I had to make to the Utilities. Hmm, what else. I tried out another of the Unity GUI libraries, nGui, which is supposedly better. I bounced off of it though; like a lot of the Unity stuff, nGui is very WYSIWYG-oriented rather than code-oriented. For instance, here is the absurd procedure that nGui uses to create a new button through code. I did at least get Visual Studio working with Unity, though I'm still not advanced enough to actually debug using Visual Studio. Instead I've been using logging statements to debug the GUI, like some mud-smeared tribesman.