
Unicode

The Unicode code point for "Latin small letter e with acute accent" (é) is U+00E9. Why, then, does the following code print a question mark?
System.out.println("\u00e9");
Thanks,
Josh
[189 byte] By [jab630] at [2007-11-11 7:00:05]
# 1 Re: Unicode
Your code value is correct. What happens when you use a single quote ('\u00E9')? [I expect there is no difference, but since this is a CHARACTER data type, we might as well try addressing it as a character]
nspils at 2007-11-11 22:39:40 >
# 2 Re: Unicode
It works on my computer: Win98 with java 1.5
System.out.println("what is this: \u00e9");
what is this: é
Norm at 2007-11-11 22:40:45 >
# 3 Re: Unicode
nspils: It does not matter whether it's a single quote or a double quote.
Norm: I can't believe it works on a Windows 98 machine but not on either my G4 running OS X or my Debian Linux box running Sarge, both with Java 1.5.
jab630 at 2007-11-11 22:41:44 >
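The quoting style makes no difference because a char and a one-character String both go through the same encoder on their way out of a PrintStream. What can differ from machine to machine is the charset that encoder uses. Below is a minimal sketch that captures the bytes produced for each form under a few charsets. The class name CharVsString and its helper methods are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Arrays;

class CharVsString {

    // Hypothetical helper: a PrintStream that encodes into an in-memory buffer.
    private static PrintStream streamFor(ByteArrayOutputStream buffer, String charsetName) {
        try {
            return new PrintStream(buffer, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Bytes written when the value is printed as a char.
    static byte[] bytesOf(char value, String charsetName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = streamFor(buffer, charsetName);
        out.print(value);
        out.flush();
        return buffer.toByteArray();
    }

    // Bytes written when the value is printed as a String.
    static byte[] bytesOf(String value, String charsetName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = streamFor(buffer, charsetName);
        out.print(value);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // The JVM default is what varies between installations.
        System.out.println("Default charset: " + Charset.defaultCharset());
        for (String name : new String[] {"US-ASCII", "ISO-8859-1", "UTF-8"}) {
            byte[] fromChar = bytesOf('\u00e9', name);
            byte[] fromString = bytesOf("\u00e9", name);
            System.out.println(name + ": " + Arrays.toString(fromChar)
                    + ", same as String form: " + Arrays.equals(fromChar, fromString));
        }
    }
}
```

Under US-ASCII the single byte that comes out is 63, the code for '?', which matches the symptom in the original question.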
# 4 Re: Unicode
Here's a cute program that shows ASCII to Unicode conversions.
It's from "The Java Class Libraries" by Chan and Lee.

import java.awt.*;
import java.awt.event.*;
import java.awt.datatransfer.*;

public class ASCII2Unicode extends Frame implements ActionListener {

    public ASCII2Unicode() {
        super("Copy Unicode to clipboard");
        setFont(new Font("Monospaced", Font.PLAIN, 14));

        // Create the 256 buttons, one per code point from 0x00 to 0xFF.
        Panel p = new Panel(new GridLayout(16, 0));
        for (int i = 0; i < 256; i++) {
            Button b = new Button("" + (char) i);

            // If control character, display its two-digit hex value instead.
            if (Character.isISOControl((char) i)) {
                String s = "0" + Integer.toHexString(i).toUpperCase();
                b.setLabel(s.substring(s.length() - 2));
            }

            // Keep the character itself in the component name for later lookup.
            b.setName("" + (char) i);

            // Listen for events.
            b.addActionListener(this);

            // Add to panel.
            p.add(b);
        } // end for(i)

        addWindowListener(new WindowAdapter() {
            public void windowClosing(WindowEvent we) {
                dispose();
                System.exit(0);
            }
        });

        // Lay out and show components.
        add(p, BorderLayout.CENTER);
        pack();

        // Center the frame on the screen.
        Dimension ss = Toolkit.getDefaultToolkit().getScreenSize();
        setLocation((ss.width - getBounds().width) / 2,
                    (ss.height - getBounds().height) / 2);

        // setVisible(true) replaces the deprecated show().
        setVisible(true);
    } // end constructor

    public void actionPerformed(ActionEvent evt) {
        // Fetch the character stored in the button's name.
        char c = ((Component) evt.getSource()).getName().charAt(0);

        // Format the escape string for 'c', zero-padded to four hex digits
        // so that values below 0x10 also yield a valid escape.
        String result = String.format("\\u%04x", c & 0xff);

        // Place result in system clipboard.
        StringSelection contents = new StringSelection(result);
        getToolkit().getSystemClipboard().setContents(contents, null);
    }

    // Start the app.
    public static void main(String[] args) {
        new ASCII2Unicode();
    }
}
Norm at 2007-11-11 22:42:49 >
# 5 Re: Unicode
Thanks for finding that, but unfortunately it doesn't help me, for two reasons: 1) I am actually trying to display characters outside of the ASCII character set (I just used the accented e as a trivial example), and 2) I can already get Unicode characters to show up fine in AWT components. It's only when I try to System.out.print them that I get the question marks.
jab630 at 2007-11-11 22:43:53 >
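The difference between AWT and System.out is consistent with how each one handles text: AWT components draw characters with a font, while System.out is a PrintStream that must first encode characters into bytes using some charset. When a character has no representation in that charset, Java's encoders substitute the charset's replacement, which for the common single-byte charsets is '?'. The sketch below reproduces that substitution without involving a console at all. The class QuestionMarkDemo and its methods are illustrative names, and the sample text is arbitrary.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {

    // Encode to bytes and decode again with the same charset, which is
    // roughly what happens between println and a terminal.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // Render non-ASCII characters as escapes so the result is readable
    // on any console, whatever its encoding.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (ch < 128) {
                sb.append(ch);
            } else {
                sb.append(String.format("\\u%04x", (int) ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // e-acute followed by Greek capital omega.
        String sample = "\u00e9 \u03a9";
        Charset[] charsets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8
        };
        for (Charset cs : charsets) {
            System.out.println(cs.name() + " -> " + escape(roundTrip(sample, cs)));
        }
    }
}
```

US-ASCII loses both characters, ISO-8859-1 keeps the accented e but loses the omega, and UTF-8 keeps both. Which of these cases applies on a given machine depends on the charset System.out was set up with.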
# 6 Re: Unicode
So the problem is with the OS and its font tables? When presented with a character that is not in its set of printable characters, it displays a "?".
Norm at 2007-11-11 22:44:47 >
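A likely cause is the encoding step rather than the fonts: the "?" is what a Java PrintStream writes when the charset it was created with cannot represent a character. One way to test that theory is to wrap System.out in a PrintStream with an explicit encoding. The sketch below assumes the terminal itself interprets incoming bytes as UTF-8, which has to be checked separately in the terminal or locale settings. Utf8Console and its utf8 helper are hypothetical names.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Console {

    // Hypothetical helper: wrap any stream so text is encoded as UTF-8
    // instead of the platform default.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is one of the charsets every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        out.println("what is this: \u00e9");
    }
}
```

If the accented e appears with this wrapper but not with plain System.out, the default charset was the culprit. If garbage appears instead, the terminal is not decoding UTF-8 and the encoding name should be changed to match it.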