21 Oct 2013 / posted by Nikku

While working on an internationalized web application, I realized that GlassFish 4.0 still uses ISO 8859-1 encoding to serve web resources. Too bad, as UTF-8 is the de facto standard encoding on the web.

This short post shows my recipe for making GlassFish ready for non-Latin-1 languages.

TL;DR

Add a filter with the following lines of code

request.setCharacterEncoding("UTF-8");
response.setCharacterEncoding("UTF-8");

as the first filter in your web application filter chain.

Add -Dfile.encoding=UTF8 to the JVM options in case you load resources from disk.

Fix request encoding

Explicitly set the encoding to UTF-8 before the request body or the request parameters are accessed:

request.setCharacterEncoding("UTF-8");

Alternatively, add the following element to the WEB-INF/glassfish-web.xml of a web application:

<glassfish-web-app>
  <parameter-encoding default-charset="UTF-8" />
</glassfish-web-app>

Fix response encoding

Response encoding cannot be configured externally. Instead, it is always derived from either the current locale or a programmatically configured character encoding.

Explicitly set the encoding before the response writer is accessed:

response.setCharacterEncoding("UTF-8");

Otherwise, GlassFish will try to determine the encoding from the current locale, which will always result in an inappropriate ASCII derivative.

Fix loading of web resources

In case you are loading files from disk (i.e. from exploded web application archives), make sure to force Glassfish to load these files as UTF-8 by specifying the JVM property

-Dfile.encoding=UTF8

Tagged as [ glassfish, encoding, java, jee ]

24 Mar 2013 / posted by Nikku

... or how to spend unnecessary time on infrastructure problems.

Ruby is a great language and offers many tools to quickly develop (web) applications and awesome libraries. But sometimes it sucks. And that is the case when migrating applications to new Ruby versions.

Over the last few weekends I worked on switching my Ruby on Rails 2.x applications from Ruby 1.8 to Ruby 1.9. One thing I learned is that Ruby 1.9 is encoding-aware and, given a UTF-8 locale, now treats files as UTF-8. Rails 2.3 applications and libraries (e.g. the mysql gem) have a number of issues with that, and they differ depending on the platform you are running Ruby on. This is an incomplete list of things to take care of when updating.

Encoding Update

To start with, Ruby uses the LANG environment variable to determine the default file encoding. So make sure it is set to a UTF-8 locale on your system.

irb(main):001:0> ENV['LANG']
=> "en_US.UTF-8"

Next, make sure that you do not use any non-ASCII characters (åößè are good candidates) in mail templates, because the actionmailer gem for Rails 2.3 does not quite seem to understand UTF-8 yet.

Also, make sure your source files are UTF-8 encoded (adding the # encoding: utf-8 header may be required, too). Then instruct your Rails application to use UTF-8 as the default encoding:

# add to top of config/environment.rb
Encoding.default_external = Encoding.default_internal = Encoding::UTF_8
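
To check that these settings actually took effect, a small diagnostic can help. The following is only a sketch and not part of the original recipe: the helper name encoding_report is made up for illustration, and it relies on nothing but core Ruby 1.9+ calls.

```ruby
# encoding: utf-8

# Same setting as in config/environment.rb.
Encoding.default_external = Encoding.default_internal = Encoding::UTF_8

# Hypothetical helper: collects the encodings Ruby currently works with.
def encoding_report
  {
    :lang     => ENV['LANG'],                     # locale seen by the process (may be nil)
    :source   => __ENCODING__.name,               # encoding of this source file
    :external => Encoding.default_external.name,  # default for data read from files and IO
    :internal => Encoding.default_internal.name   # encoding that data is transcoded to on read
  }
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

If any of the values is not UTF-8 (or LANG is missing), that is probably the place to look first.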

If you use regular expressions, do not be surprised if they no longer match. Chances are that you have not solved all encoding problems yet.
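
One way such regular expression trouble can surface is sketched below. It assumes a string whose bytes are valid UTF-8 but which is tagged as binary, which is how data from an older driver or a socket may arrive. The helper match_position and the sample data are made up for illustration.

```ruby
# encoding: utf-8

# A UTF-8 regexp; the \u escape stands for o-umlaut and keeps this file ASCII-only.
UMLAUT = /\u00F6/

# Hypothetical helper: returns the match position, or :incompatible when Ruby
# refuses to match the pattern against a string in another encoding.
def match_position(string, pattern)
  string =~ pattern
rescue Encoding::CompatibilityError
  :incompatible
end

# The UTF-8 bytes of a word containing o-umlaut, tagged as binary.
raw = "sch\xC3\xB6n".dup.force_encoding(Encoding::BINARY)

# Same bytes, relabelled as UTF-8 (no transcoding takes place).
utf8 = raw.dup.force_encoding(Encoding::UTF_8)

p match_position(raw, UMLAUT)   # binary string
p match_position(utf8, UMLAUT)  # UTF-8 string
```

The bytes are identical in both cases; only the encoding label differs. That is why a pattern that worked on Ruby 1.8 can stop working on 1.9 without any change to the data.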

After encoding

You are done with encoding problems and now want to use all the latest UTF-8 compatible gems? Reconsider. Make sure to stick to the 1.x versions of RubyGems, because Rails 2.3 does not like RubyGems 2.0.

Check your database connectivity library (I used the mysql gem for Ruby 1.8). The mysql gem has encoding issues on Linux systems, so switching to the mysql2 gem is inevitable. Only versions < 0.3 work though. Just because. If you are on Windows, stick to the mysql gem for now, because mysql2 gives Bad file descriptor errors.

Remove deprecation warnings

Use Object#tap instead of #returning. Looks awful in code but prevents your server log from getting spammed.
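
As a rough sketch of that rewrite: the method default_tags and its contents are invented for illustration, and the old form is shown only as a comment because it depends on ActiveSupport.

```ruby
# Rails 2 style, relying on ActiveSupport's deprecated Object#returning:
#
#   def default_tags
#     returning([]) do |tags|
#       tags << "ruby"
#       tags << "encoding"
#     end
#   end

# The same method with Object#tap from core Ruby:
# tap yields the receiver to the block and then returns the receiver.
def default_tags
  [].tap do |tags|
    tags << "ruby"
    tags << "encoding"
  end
end

p default_tags
```

The block body stays the same; only the call that wraps it changes.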

Think about monkey patching your Rails 2 app for some more UTF-8 compatibility or switch directly to Rails 3.

Finally

Drink a coffee, eat a muffin and read about others going through the same kind of stuff, some even spending months to do it.

Tagged as [ ruby, ruby on rails, encoding ]