Author Topic: Translator For Blockland  (Read 5994 times)

Like Chinese? Chinese has like, a million characters to it. Plus, it is a pictographic language.

I'll warn you now, sometimes even a direct translation won't work.  If I recall correctly from my Spanish class:

me gusta la mesa -> I like the table
me gustan mesas -> I like tables

Going from Spanish to English doesn't look too bad there: just map "gusta" and "gustan" to "like" and that solves it. But going backwards:

I like the table -> me gusta la mesa
I like tables -> me gustan mesas

This is a bit trickier since you have to look at words after the verb to know which word to use (gusta or gustan).
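
A rough sketch of that look-ahead in Python, just to make the idea concrete. The plural-noun set and the helper name pick_gustar_form are made up for illustration, and it only handles the "I like ..." pattern from the examples above:

```python
# Hypothetical, tiny vocabulary: English plural nouns we know about.
PLURAL_NOUNS = {"tables"}

def pick_gustar_form(english_words):
    """Return 'gustan' if a known plural noun follows 'like', else 'gusta'."""
    # Find the verb, then scan the words that come after it.
    # (Raises ValueError if the sentence has no 'like' in it.)
    verb_index = english_words.index("like")
    for word in english_words[verb_index + 1:]:
        if word in PLURAL_NOUNS:
            return "gustan"
    return "gusta"

print(pick_gustar_form("I like the table".split()))  # gusta
print(pick_gustar_form("I like tables".split()))     # gustan
```

A real translator would need a proper way to tell singular from plural rather than a hand-written set, which is part of why this gets hard quickly.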

One last example, if you had the following words mapped:

I -> Yo
like -> gusta
the -> el
table -> mesa
tables -> mesas

(I just picked the Spanish words that I figured occurred most frequently when translating these English words to Spanish.)

Then if I said this:

I like the tables

it would translate to this:

Yo gusta el mesas

Now, I don't speak Spanish fluently or anything, but I'd imagine that this would look like garbage to someone who only knows Spanish, because it should have been translated as follows:

Me gustan las mesas
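
For anyone who wants to see the word-for-word approach fail in practice, here is a minimal Python sketch using exactly the five mappings listed above. The function name translate_word_by_word is just something picked for the example, and unknown words are passed through unchanged, which is an assumption rather than anything the project is known to do:

```python
# The word map from the post above; anything not listed is passed through unchanged.
WORD_MAP = {
    "I": "Yo",
    "like": "gusta",
    "the": "el",
    "table": "mesa",
    "tables": "mesas",
}

def translate_word_by_word(sentence):
    """Replace each word independently, ignoring grammar entirely."""
    return " ".join(WORD_MAP.get(word, word) for word in sentence.split())

print(translate_word_by_word("I like the tables"))  # Yo gusta el mesas
```

Nothing in the lookup knows that "the" should agree with "mesas" or that the verb should be "gustan", which is the whole problem.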

I'd imagine that this would happen between any two languages, so good luck and have fun.

Oh yeah, this sort of thing is good times, BlueGreen. Verb conjugation in foreign languages is fun. I know in Greek, you can usually drop the personal pronoun; is this the case in Spanish? In other words, could you just write "gusta la mesa" without the "me"?

A lot of this type of stuff in Greek too has to do with literal meanings, where technically the sentence would literally translate as "the tables are liked by me" in some cases, and as such you can't do a strict word-by-word translation. This is a bad example for Greek but you get the idea.

All in all, I rate this project 10/10 in difficulty; I've written language parsers before and they are NOT fun. The hard part is the actual parsing and translation, not the support functions.

could you just write "gusta la mesa" without the "me"?
No, the "me" tells you who the subject is. If you had "Yo me gusta bla bla bla", you could drop the "Yo" since the "me" already tells you the subject. Many people drop "Nosotros" because it's a long word, for example.

Wow, it's been like a week since I last looked at this. I don't have a lot of time to read this, but I'm only going to go up to the Dutch beta. You're right, this is very hard stuff and it takes a lot of my time.

... I know in Greek, you can usually drop the personal pronoun; is this the case in Spanish?

Well, for most verbs you can, but for gustar (to like) you can't because there would be no way to know who likes what.  The conjugation of the verb gustar in Spanish is different in that its form is chosen based on what is liked, not who is liking it.

The problem of making an automatic translator breaks down to something like this:

First you need to be able to directly translate the words.

IF you manage that to any reasonable degree (around 5,000 to 10,000 words), you can then move on to conjugation of verbs and arrangement of subjects and such so that things make more sense.

IF you manage to make it translate the regular stuff, you might want to try your hand at dealing with irregular things that break the rules you just spent most of your time trying to get to work in the first place.

IF, and only IF you manage that, the last thing you might try is translating common phrases and such that break every rule thus far.  For instance, directly translating "What's up?" to other languages will cause problems because whoever receives the message might start looking at the sky wondering what you saw.
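
One common way to handle that last step is to check a table of whole phrases before falling back to single words. A minimal Python sketch, assuming a hypothetical phrase table (the Spanish rendering of "What's up?" is my own pick for illustration, not something from the project):

```python
# Hypothetical phrase table, checked before any word-by-word lookup.
PHRASES = {"what's up?": "¿Qué tal?"}
# Hypothetical single-word fallback table.
WORDS = {"table": "mesa", "tables": "mesas"}

def translate(sentence):
    """Try the whole sentence as a known phrase before going word by word."""
    key = sentence.strip().lower()
    if key in PHRASES:
        return PHRASES[key]
    # No phrase matched, so fall back to replacing words one at a time.
    return " ".join(WORDS.get(word, word) for word in sentence.split())

print(translate("What's up?"))  # ¿Qué tal?
print(translate("tables"))      # mesas
```

This only catches a phrase when it is the entire message; spotting idioms in the middle of a longer sentence is a harder matching problem.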

IF, by some unholy, wicked and twisted bit of programming you manage that, then you have somehow found the secret to eternal life.  PROOF:  Languages change all the time.  You'd have to always be updating your code to make sure it works.

Like Chinese? Chinese has like, a million characters to it. Plus, it is a pictographic language.

Actually, I believe only a few thousand Chinese characters are in everyday use, though the full set runs to tens of thousands.  Unicode was originally designed around 65,536 code points, but it has since been extended to over a million, so the characters from every language fit with a good bit of room left.

The problem with the pictographic language isn't really a problem if Unicode support is available with an appropriate font, but if I'm correct, the Blockland chat is limited to ASCII anyway, so Chinese just isn't possible (well, maybe phonetically it might be, but then they'd have to be able to read and pronounce the characters used, which would defeat the purpose).
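
If the chat really is ASCII-only (that part is my guess, not something I've confirmed), a translator could at least check its output before sending it. A small Python sketch, with fits_in_ascii as a made-up helper name:

```python
def fits_in_ascii(message):
    """Return True if every character is in the 7-bit ASCII range (0-127)."""
    return all(ord(ch) < 128 for ch in message)

print(fits_in_ascii("ni hao"))     # True
print(fits_in_ascii("你好"))        # False
print(fits_in_ascii("¿Qué tal?"))  # False: accented letters are not ASCII either
```

Note the last line: under that assumption, even Spanish output would need its accents and inverted punctuation stripped or replaced.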