Friday, March 23, 2007

Google Translate Engine 0.8.0 released

For some of my upcoming projects, I was hoping to find a publicly available translation web service. I managed to find one, but it seemed to be under fairly heavy load and was unavailable at times.

Google has a very nice, simple web interface for performing text translations. Unfortunately, they don't offer an API for it. Since I couldn't find a suitable web service, it made sense to write my own implementation by scraping Google's site. Besides, I could always use more experience with web services.

Before I could do any of that, I had to write the core library. It's not a web service, but it's completely usable in a Java application. I haven't had an opportunity to test it very much, but I decided to share it anyway. You can find Google Translate Engine v0.8 on my software page. It appears to work fine, but I've had a really hard time getting my workstation to properly output Unicode on the console, so it definitely requires more testing.
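For anyone fighting the same console problem, here is a minimal sketch of two common workarounds. It assumes the trouble is the JVM encoding output with the platform default charset, which may not be the actual cause on every workstation. The class and method names (ConsoleUnicode, utf8, escapeNonAscii) are made up for illustration and are not part of the engine.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper, not part of the translate engine.
class ConsoleUnicode {

    /** Wraps a stream so that text is encoded as UTF-8 instead of the platform default. */
    static PrintStream utf8(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    /** Replaces each non-ASCII char with a backslash-u escape, so any console can show it. */
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 128) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String translated = "caf\u00e9"; // stand-in for a translation result

        // Only displays correctly if the terminal itself expects UTF-8.
        utf8(System.out).println(translated);

        // Console-independent fallback for checking what the string really contains.
        System.out.println(escapeNonAscii(translated));
    }
}
```

The escaped form is handy in unit tests too, since it lets you check the translated string without depending on how the terminal renders it.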

I played with lots of new stuff while making this: Maven, Javadoc, SAX, and unit testing, to name a few. I'm using John Cowan's nice TagSoup library for the web scraping. It lets me use a SAX handler even on badly formed HTML.
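To give a rough idea of what scraping with a SAX handler looks like, here is a small sketch. It is not the engine's actual code. The class name ResultScraper and the element id "result_box" are assumptions made up for the example, and it runs on the JDK's built-in parser with well-formed input so that it needs no extra jars. TagSoup's org.ccil.cowan.tagsoup.Parser implements the same XMLReader interface, so it could be passed to scrape() instead to cope with messy real-world HTML.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text inside the element with a given id.
class ResultScraper extends DefaultHandler {
    private final String targetId;
    private final StringBuilder text = new StringBuilder();
    private int depth = 0; // greater than zero while inside the target element

    ResultScraper(String targetId) {
        this.targetId = targetId;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth > 0) {
            depth++; // nested element inside the target
        } else if (targetId.equals(atts.getValue("id"))) {
            depth = 1; // entering the target element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth > 0) {
            depth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) {
            text.append(ch, start, length);
        }
    }

    String getText() {
        return text.toString().trim();
    }

    /** Runs the handler over a page using whichever XMLReader is supplied. */
    static String scrape(XMLReader reader, String page, String targetId)
            throws IOException, SAXException {
        ResultScraper handler = new ResultScraper(targetId);
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(page)));
        return handler.getText();
    }

    /** Convenience wrapper around the JDK parser; only works on well-formed markup. */
    static String scrapeWithJdkParser(String page, String targetId) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            return scrape(reader, page, targetId);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse page", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><p>ignored</p>"
                + "<div id=\"result_box\">Bonjour <b>le monde</b></div></body></html>";
        System.out.println(scrapeWithJdkParser(page, "result_box")); // prints "Bonjour le monde"
    }
}
```

The nice part of this approach is that the handler never builds a tree of the whole page. It just watches events go by and keeps the one piece of text it cares about.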

Now I'm trying to figure out how best to implement an old-school JSR-109, servlet-driven JAX-RPC web service. My target platform at the moment is JBoss. As a prerequisite exercise, I'll probably make a tiny RMI service that hooks into the translate engine. Hopefully I can get that done in the next couple of days.