First post here. Hi everyone!
I am using hbm2java via its Maven integration in a project where we are doing database-first development. I have used a custom ReverseEngineeringStrategy to import the comments from tables and fields into the generated classes.
The project language is Spanish, so the comments in the database usually contain accented characters (á, é, í…). The generated code then fails to compile, because the files are written with the wrong encoding and these characters are mangled.
I took a look at the hbm2x sources, and it turns out that the code that does the actual writing of the files looks as follows (in the class org.hibernate.tool.hbm2x.TemplateProducer):
fileWriter = new FileWriter(destination);
fileWriter.write(tempResult);
The ‘destination’ variable is a File reference, and tempResult is the template's output.
Well, it turns out that with this code, Java writes the file using the platform's default encoding, which in my case, being a Windows user, ends up being windows-1252 (Cp1252).
So, to support international characters, I think this code should at least write UTF-8, or even better, let the user select the encoding.
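To make the proposal concrete, here is a minimal sketch of what I have in mind. It is not the actual hbm2x code: the class name EncodingAwareWriter and its write methods are hypothetical, and only the standard JDK classes (OutputStreamWriter, FileOutputStream, StandardCharsets) are real. The idea is simply to replace FileWriter with a writer that takes an explicit charset, defaulting to UTF-8.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of hbm2x.
public class EncodingAwareWriter {

    // Writes the content with the charset chosen by the caller,
    // instead of relying on the platform default as FileWriter does.
    public static void write(File destination, String content, Charset charset)
            throws IOException {
        // try-with-resources flushes and closes the writer even on failure
        try (Writer writer = new OutputStreamWriter(
                new FileOutputStream(destination), charset)) {
            writer.write(content);
        }
    }

    // Convenience overload: falls back to UTF-8 when no charset is given.
    public static void write(File destination, String content)
            throws IOException {
        write(destination, content, StandardCharsets.UTF_8);
    }
}
```

In TemplateProducer the call would then look roughly like EncodingAwareWriter.write(destination, tempResult), with the charset coming from a new, yet-to-be-defined configuration option if we go for the optional encoding.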
I can easily fork the project and make a pull request, but first I’d like some feedback so I don’t work in vain.
- Should I open a bug report instead or are you OK with a PR?
- If the latter, what testing requirements would you need me to fulfill?
- Would I have to implement optional encoding or is it fine if I just default to UTF-8?
Thanks in advance, cheers to all.