Author Topic: RTB Development (Read 397138 times)

M · February 02, 2010, 07:58:16 AM

Quote from: Ephialtes on February 02, 2010, 07:52:40 AM

Your escaped code is incorrect - How are you expecting it to represent something like U+00D6? UTF-8 only uses a single byte for anything up to 7F (127); beyond that, it uses two or more bytes.

Code:

\xD6

It's not leaving the character as a single character. It's converting it into an escape sequence that Torque can convert back into the character with collapseescape().
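The byte counts mentioned in the quote above can be checked directly. The following is a minimal Python sketch, not anything from RTB or Torque; the helper name `utf8_hex` is made up for illustration. It prints the UTF-8 bytes for a few code points on either side of the 7F boundary, including U+00D6.

```python
# Hypothetical helper (not part of RTB or Torque): show the UTF-8 bytes
# of a single character as spaced upper-case hex.
def utf8_hex(ch: str) -> str:
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))

# Code points at and around the one-byte limit, plus U+00D6.
samples = ["\u0041", "\u007F", "\u0080", "\u00D6"]

for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")
# U+0041 -> 41
# U+007F -> 7F
# U+0080 -> C2 80
# U+00D6 -> C3 96
```

So U+00D6 goes over the wire as the two bytes C3 96, not as a single D6 byte.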


Ephialtes · February 02, 2010, 09:25:04 AM

You do know that \x is the unicode escape character, right? This is still not using UTF-8 character encoding, so it would constitute invalid XML.

M · February 02, 2010, 10:29:44 AM

Quote from: Ephialtes on February 02, 2010, 09:25:04 AM

You do know that \x is the unicode escape character, right? This is still not using UTF-8 character encoding, so it would constitute invalid XML.

Escape character. As in, it's not a unicode character in itself; it's a backslash and an X followed by a hexadecimal number.

Ephialtes · February 02, 2010, 10:32:49 AM

\xD6 is unicode. This needs to be turned into UTF-8. I don't know why I'm still having this discussion.

M · February 02, 2010, 10:35:43 AM

Quote from: Ephialtes on February 02, 2010, 10:32:49 AM

\xD6 is unicode. This needs to be turned into UTF-8. I don't know why I'm still having this discussion.

Then send it as \\xD6.
If you can't send a backslash followed by an X then a D then a 6, I am seeing some serious problems in the future of RTB Connect.

Edit: Snipped frustration.

Ephialtes · February 02, 2010, 10:39:59 AM

\xD6 does not constitute valid UTF-8 encoding. Java will not even interpret \xD6 as anything - that's just invalid. \x does not exist in Java.
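The two readings of "\xD6" in this exchange can be separated with a small Python sketch (Python is used here only for illustration; `is_valid_utf8` is a made-up helper). A lone D6 byte is not valid UTF-8, the two-byte encoding of U+00D6 is, and the four ASCII characters backslash, x, D, 6 are also valid UTF-8, although they decode to those four characters rather than to the letter itself.

```python
# Hypothetical helper: report whether a byte string decodes as UTF-8.
def is_valid_utf8(data: bytes) -> bool:
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

lone_byte = bytes([0xD6])            # the single byte D6 on its own
encoded = "\u00D6".encode("utf-8")   # the real UTF-8 form, bytes C3 96
literal = b"\\xD6"                   # four ASCII bytes: \ x D 6

print(is_valid_utf8(lone_byte))  # False: D6 starts a two-byte sequence that never finishes
print(is_valid_utf8(encoded))    # True
print(is_valid_utf8(literal))    # True, but it decodes to four characters, not one
```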

M · February 02, 2010, 10:43:25 AM

Quote from: Ephialtes on February 02, 2010, 10:39:59 AM

\xD6 does not constitute valid UTF-8 encoding. Java will not even interpret \xD6 as anything - that's just invalid. \x does not exist in Java.

I am saying to you, to send a backslash. Followed by an X. Followed by the sequence. As individual characters. If you have to, escape the backslash so it's its own character, not part of a sequence. Then, when the Torque client receives this sequence with its individual backslashes and other characters, you can collapse the escape sequences, and voila. Unicode characters, and all you ever sent to the server was backslashes and letters and numbers. No umlauts or accents.
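The round trip described above can be sketched in Python as follows. Both functions are hypothetical: `collapse_escapes` only stands in for what Torque's collapseescape() is described as doing in this thread, and handling only `\xHH` (code points up to U+00FF) plus an escaped backslash is an assumption of this sketch, not documented RTB behaviour.

```python
import re

def escape_non_ascii(text: str) -> str:
    """Turn non-ASCII characters into \\xHH so only ASCII is sent."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\\":
            out.append("\\\\")            # keep literal backslashes unambiguous
        elif code < 0x80:
            out.append(ch)                # plain ASCII passes through
        elif code <= 0xFF:
            out.append(f"\\x{code:02X}")  # e.g. U+00D6 becomes \xD6
        else:
            raise ValueError(f"U+{code:04X} does not fit in a \\xHH escape")
    return "".join(out)

# Matches an escaped backslash or a \xHH sequence.
_ESCAPE = re.compile(r"\\(\\|x[0-9A-Fa-f]{2})")

def collapse_escapes(text: str) -> str:
    """Reverse escape_non_ascii (stand-in for the client-side collapse step)."""
    def repl(match: re.Match) -> str:
        body = match.group(1)
        return "\\" if body == "\\" else chr(int(body[1:], 16))
    return _ESCAPE.sub(repl, text)

wire = escape_non_ascii("Ge\u00D6rge")
print(wire)                    # Ge\xD6rge
print(collapse_escapes(wire))  # GeÖrge
```

Any client that does not run the collapse step would display the `wire` string as-is.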

Ephialtes · February 02, 2010, 10:47:56 AM

Then people using other clients like Pidgin or Psi are going to see people with names like "Ge\xD6rge" - that's not acceptable.

M · February 02, 2010, 10:50:22 AM

Quote from: Ephialtes on February 02, 2010, 10:47:56 AM

Then people using other clients like Pidgin or Psi are going to see people with names like "Ge\xD6rge" - that's not acceptable.

And why would these clients not have some way to convert unicode characters as well? I'm sure that any widely-used client has some form of unicode support via XMPP. That is, however, a valid point.

Ephialtes · February 02, 2010, 10:53:13 AM

Because the XMPP specification is for UTF-8 only. There is no "oh but use Unicode or EBCDIC if you feel like it" clause.

Quote from: XMPP RFC 3920

11.5. Character Encoding

   Implementations MUST support the UTF-8 (RFC 3629 [UTF-8])
   transformation of Universal Character Set (ISO/IEC 10646-1 [UCS2])
   characters, as required by RFC 2277 [CHARSET]. Implementations MUST
   NOT attempt to use any other encoding.

M · February 03, 2010, 05:06:34 AM

Whoops. Forgot that I didn't respond to that.

Well basically from how I read that, by using XMPP you are consenting that you will not use any non-UTF-8 encoding in any way. I wasn't aware of this part of the specification, but that basically looks like as long as you're using XMPP you will have to block anyone who uses unicode characters in their name, and strip them from messages, etc. - with no option, as long as you're using XMPP, for translation of these characters in any way, escaped or otherwise.

I wasn't aware that this was part of the specification. If you'd posted that earlier, the entire discussion wouldn't have happened, as by using XMPP you are effectively agreeing (by my understanding) not to use any non-UTF-8 characters, escaped or otherwise.

Ephialtes · February 03, 2010, 05:50:09 AM

Quote from: M on February 03, 2010, 05:06:34 AM

that basically looks like as long as you're using XMPP you will have to block anyone who uses unicode characters in their name, and strip them from messages, etc. - with no option, as long as you're using XMPP, for translation of these characters in any way, escaped or otherwise.

You need to do some work with character encoding. UTF-8 can represent all the characters that unicode can. There is no compromise; UTF-8 just allows backwards compatibility with ASCII where the wider Unicode encodings (UTF-16, UTF-32) don't.
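Both halves of that claim can be checked with a short Python sketch (illustrative only; the sample characters and the `utf8_round_trip` helper are chosen here, not taken from RTB). ASCII text has identical bytes in UTF-8, and code points well outside ASCII, up to U+10FFFF, survive an encode/decode round trip.

```python
# ASCII text is byte-for-byte identical in UTF-8.
ascii_name = "George"
same_as_ascii = ascii_name.encode("utf-8") == ascii_name.encode("ascii")
print(same_as_ascii)  # True

# A 16-bit encoding of the same text is not ASCII-compatible: 2 bytes per letter.
utf16_length = len(ascii_name.encode("utf-16-le"))
print(utf16_length)  # 12

# Hypothetical helper: does a character survive UTF-8 encoding and decoding?
def utf8_round_trip(ch: str) -> bool:
    return ch.encode("utf-8").decode("utf-8") == ch

# Sample code points: U+00D6, U+20AC, U+1F600 and the last code point, U+10FFFF.
samples = ["\u00D6", "\u20AC", "\U0001F600", "\U0010FFFF"]
for ch in samples:
    print(f"U+{ord(ch):04X}: {len(ch.encode('utf-8'))} bytes, round trip {utf8_round_trip(ch)}")
```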

Quote from: M on February 03, 2010, 05:06:34 AM

If you'd posted that earlier, the entire discussion wouldn't have happened

When you decide to start proposing solutions for an issue you were never even invited to give your opinion on, it is generally considered polite to understand (or at least research) the situation correctly before leading people on some sort of wild goose chase.

Anyway, I'm going to use the extra dev time (RTB v4 will have to be released after v15) to look into sharing save files as well as getting a head start on the new website development.

Chrono · February 03, 2010, 01:48:36 PM

Quote from: Ephialtes on February 03, 2010, 05:50:09 AM

When you decide to start proposing solutions for an issue you were never even invited to give your opinion on, it is generally considered polite to understand (or at least research) the situation correctly before leading people on some sort of wild goose chase.

Anyway, I'm going to use the extra dev time (RTB v4 will have to be released after v15) to look into sharing save files as well as getting a head start on the new website development.

Good idea. I would also stop wasting time talking to people about character encoding. I'd rather let them feel ignored and carry on to more important things. If they want to know that badly they can look it up.

Butler · February 03, 2010, 05:00:47 PM

Now, don't flame me... but why AFTER v15?

Chrono · February 03, 2010, 05:26:15 PM

Quote from: Butler on February 03, 2010, 05:00:47 PM

Now, don't flame me... but why AFTER v15?

He's already explained it.

There need to be engine changes.

And he doesn't mean when v16 comes out. He just means when v15 comes out, he'll work on getting it working.