Thursday, May 19, 2011

Encoding form data in Java servlets

Today I got tricked and frustrated again by bad handling of non-ISO-8859-1 (also known as Latin-1) form data in a Java web application. Russian and German users rightfully complain about losing their localized input once they press the submit button. A few things I have heard in the past, but had to look up again because I tend to forget them easily (at least in the Java web app world):
  • explicitly indicate some Unicode encoding in response, both in HTTP header AND HTML meta data

  • set the character encoding on the incoming HttpServletRequest BEFORE reading any value
By default, most web application servers simply use Latin-1 encoding, which does NOT match the UTF-8 encoding used by the browser. The full story is rather complicated, and drags along a lot of legacy combined with pragmatic choices. But in the end, following those two rules solves the problem in most cases.
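To see what goes wrong, here is a minimal sketch (plain Java, no servlet container needed; the class name is made up) of what happens when the UTF-8 bytes sent by the browser are decoded with the server's Latin-1 default:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        String submitted = "Grüße";  // German form input

        // the browser sends the form value as UTF-8 bytes
        byte[] wire = submitted.getBytes(StandardCharsets.UTF_8);

        // a server defaulting to Latin-1 decodes every multi-byte character wrongly
        String latin1View = new String(wire, StandardCharsets.ISO_8859_1);
        System.out.println(latin1View.equals(submitted));  // false: "ü" became "Ã¼"

        // decoding with the charset the browser actually used is lossless
        String utf8View = new String(wire, StandardCharsets.UTF_8);
        System.out.println(utf8View.equals(submitted));    // true
    }
}
```

This mismatch is exactly what rule two prevents: calling request.setCharacterEncoding("UTF-8") before the first getParameter() call tells the container which decoder to use.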

In general, UTF-8 Unicode encoding seems to have the best and widest support, so I suggest sticking to it. For people using the Spring Framework, have a look at the CharacterEncodingFilter.

As a side note: one specific JSF 1.2 application does NOT exhibit the encoding problem, even without setting the character encoding manually on the request. I still need to find out WHY it seems to work fine there. Maybe the application server is set up slightly differently, making UTF-8 the default, or maybe I'm simply going blind. :-)

AJAX exception?

For some reason, our Firefox 4 browser submits AJAX POST data with an explicit character set indication (UTF-8) in the HTTP header, and therefore those AJAX submits (based on RichFaces 3.x) work fine out of the box. I should investigate why the behavior is different here. Is it a feature of XMLHttpRequest, or is it some fancy generated JavaScript code from the library used?

Some useful links

Sunday, December 19, 2010

Some Devoxx 2010 talk summaries

Roughly a month ago, I attended a few talks at Devoxx 2010. I couldn't make it to all of them, but here is my summarized impression of the things I have seen. A pity that I missed the performance talk by Joshua Bloch and the HTML5 presentation because the rooms were overcrowded, but I'm sure I can get the info elsewhere as well.

Ajax Library Smack down: Prototype vs. jQuery

Invest time in some library, and learn to use it. Different styles and focus. Test!

A few things I captured: do use some JavaScript library; it is useful and not really hard to get started with. Understand the different scopes and styles of the libraries. Some of them, like Dojo and YUI, include a large set of GUI components out of the box, while others rather serve as building blocks and provide extension points.

Both Prototype and jQuery heavily rely on the dollar sign ($) as a special keyword. While Prototype has a “smaller” focus and a Ruby style of syntax, jQuery is more feature-rich, employs a different syntax and offers some kind of “plugin” mechanism, with both advantages and disadvantages. Prototype is often combined with script.aculo.us for GUI components, while jQuery is usually bundled with jQuery UI. Documentation is pretty good in both projects.

The speaker gave a nice presentation, offering a quick overview of JavaScript libraries and comparing Prototype and jQuery. He did not clearly identify a “winner”, so you might wonder whether you know much more after attending.

Restfulie: quit pretending, use the web for real

It was a very quick talk, sometimes a little chaotic because of the speed and the navigation through the slides. The main thing that stuck in my mind is that the existing JAX-RS API can be improved on a few levels. There should be less duplication of URL paths, and less boilerplate code. Restfulie shows how this can be achieved. Also included: support for easily switching between different data formats like XML, JSON, ..

Although the presentation itself was not really polished, the referenced frameworks and ideas are probably worth further investigation. Have a look into the different REST frameworks, compare their styles and features, and learn different ways to solve similar problems. They agree that Restlet has very similar benefits, though their implementations differ thoroughly.

Scalable and RESTful web applications: Kauri and Lily

This presentation felt like a commercially oriented sales talk and an introduction to three related products from the local consultancy company Outerthought: Daisy, Kauri and Lily. Some references to REST, CMS and NoSQL storage systems were made, but too bad the talk as a whole did not give a very coherent and clear message.

The Next Big JVM Language

Stephen offered a nice overview of the current JVM situation, and explained which parts are and are not important for the future JVM language. Many of these ideas are clearly mixed with personal opinions, but they nevertheless often make sense. He suggests that programming style, ease of use, compile-time type checking, support for scripting-like behavior, closures, .. will probably be part of that successor. After eliminating a few candidates, he pitted Scala and Fantom against each other. It felt a little funny that he found Scala probably a bit too much, and Fantom a bit too little, guessing that neither of them will become the next big JVM language. Adding incompatible changes to the existing Java language was the last idea, but he did not dare to say how that would evolve ..

Nothing from this talk can be used in the immediate future or in most of our daily work, but it doesn’t hurt to reflect on it once in a while.

Vaadin - Rich Web Applications in Java

A reasonably good introduction to the Vaadin web framework, which offers GWT integration. Too bad I missed a big part of the presentation. As a complete newbie, the main idea I remember is that Vaadin provides a similar way of programming to GWT, where code is written in Java and not in JavaScript or XML templates. The main difference is that Vaadin runs the code mostly on the server side, with tight client-side integration through Ajax, whereas GWT tries to push as much state as possible to the client. GWT shows clear scaling benefits when an application has an enormous number of clients, but keeping state on the server keeps some things simpler. Definitely worth a second look when I have more time.

What's New in Scala 2.8?

Bill Venners and Dick Wall briefly discussed and showed some of the new Scala 2.8 features. Luckily, they did not repeat too much of the older Scala language syntax and structure. But I still have mixed feelings about the presentation, mostly because I had already read about and seen some of the new features in blogs and other online documentation.

Future-proofing collections: from mutable to persistent to parallel

I was a little afraid that Martin Odersky would focus on syntactical updates and changes from the Scala 2.8 release again, but my fear was unfounded. He clearly explained the benefits of having immutable collection classes, combined with the evolution of multi-core machines and avoiding thread synchronization problems. Scala happens to be the implementation language for those “parallel” collections, showing that the theory could be turned into a real language feature. Definitely worth the time, and once more a reason to invest some time in Scala.

The Future Roadmap of Java EE

A disappointing presentation about upcoming changes to the Java EE platform. I can’t remember any big announcements, nor the updates we are all waiting for. It was more a very brief introduction and a pointer to other talks of the conference.

Comparing JVM Web Frameworks

Matt Raible fluently presented his overview and opinions about several current web frameworks. But you can also find a very similar presentation by him from a few years ago, aimed mainly at a PHP and Ruby on Rails audience, and I had the feeling that this talk did not add much. The advice and opinions were also rather “soft”, and did not help me much in selecting the next framework. It was not bad at all, but halfway through I chose to run away in the hope of discovering something more useful or exciting.

The Essence of Caching

Entering the talk halfway through was not ideal, because I felt that I really missed some concepts from the first half. Although I did recognize and understand the parts about cache coherency in a clustered setup, it was not always clear what the final message was. I had the feeling it was something like “caching is useful, but be careful what you do ..”. And that was not really surprising news IMHO.

Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Actors

I followed this talk because it’s linked to Scala actors, but had the impression I heard the exact same talk the previous year at Devoxx. It was a little hard to imagine real-world applications that would strongly benefit from the Akka framework, though I do agree with the general ideas. I guess we need to set up such a high-load, multi-node cluster first and hit the performance and locking problems using “traditional” methods.

Hadoop and NoSQL at Twitter

Dmitriy Ryaboy works at Twitter on the back-end and analytical side, and gave a brief insight into their different ways of handling the massively parallel workload and high volumes of data. It is mostly a combination of multiple frameworks and tools. Some of them for online services: Snowflake, memcache, FlockDB, Cassandra, Rainbird. On the batch processing side: Hadoop, Elephant Bird, HBase, Hive, .. He advises us to have a look at those tools, but to keep in mind that the needs of Twitter might be completely different from ours. Which they probably are.

Implementing Emergent Design

During this presentation Neal Ford explained why many software projects grind to a halt after a few development cycles. Lack of clear specifications is only one possible problem, but general complexity, combined with the “broken window syndrome”, can turn programs into an unmaintainable mess. One of the key concepts is “technical debt”, as defined by Ward Cunningham, and the need for constant cleanup and refactoring. Of course it remains rather theoretical, and does not address how to convince upper management to follow “the right path”.

Does dev/ops matter for me?

Again two guys presenting here, rooted in the operations side of IT projects. They explained what happens and goes wrong when high walls are built between the development and operations teams, and gave advice on tearing down those barriers. I strongly agree with those principles, but besides mixing people from the two sides together, I can't say I learned anything really new. Pity.

Friday, August 6, 2010

Combine local Git branches with central Subversion repository

Assuming you have already heard of Git as an SCM alternative to Subversion, you might also know that it includes some integration with Subversion. That gives you the option to start using some of Git's advantages in your daily work without worrying your local operations department, possibly in a very stealthy way.

After using the gateway for a few months, I discovered and started using another nice Git feature: easy merging and working with a separate development branch. The problem was that we usually have some debug settings enabled in some of the project files, resulting in a "development" build instead of a "production" build of the program. It is possible to fix this with some extra build configuration parameters outside the project, but Git offers another easy solution: just locally fork the code with your own settings.

The "master" branch contains the production code, and the "dvl" branch my modified version. Once the customization is checked into the "dvl" branch, you can synchronize ("pull") the "master" branch without the customization, and keep your patch on top in "dvl". Switching back to your development branch, you just keep working like you did before. At any moment you can check your workspace using "git status", and your custom patch won't bother you. It also prevents you from accidentally pushing the development settings to the central Subversion repository with the commit-all command ("git commit -a").

But how do I push the wanted (!) code changes to our central repository when I'm ready? Well, by injecting one extra step before pushing code from "master" to Subversion: rebase "master" on the "dvl" branch. This involves a little extra administration, so the following little shell script fragment saves me some typing:


# move to master branch
git checkout master

# pull changes from dvl and re-apply patch on top
git rebase dvl

# fetch and rebase first, otherwise dcommit complains about being out of sync
git svn fetch
git svn rebase

# push changes to central SVN repo
git svn dcommit

# switch back to dvl branch
git checkout dvl


One of the drawbacks of this system is the visibility of the patch and undo steps in the history of the master branch, which is also propagated to the central Subversion repository. Maybe history rewriting could fix this, or I could adopt some other workflow, but for the moment I'm not really troubled by that. If anyone has a better alternative, I'd be happy to learn!

Some good links:

Tuesday, October 6, 2009

Subversive not supporting rename/move in workspace

One of the nice features of Subversion over CVS is the support for resource renaming. It gives you the opportunity to easily follow the history of a file, even if it has been renamed or moved around your directory structure.

Too bad that the Subversive plugin for Eclipse has some problems supporting this feature, and there seems to be no easy fix. The command line Subversion client works fine, but what if we want to stay in our comfortable, fancy IDE?

https://bugs.eclipse.org/bugs/show_bug.cgi?id=213991

A Subversion rename is actually a copy together with a delete. So, the workaround in Subversive: open the context menu of your file, select "Team", "Copy To .." and enter the new destination, keeping your history. The next step is to delete the file at the original location. If you check the history of the "new" file, Subversive won't show you any remote revision information. Don't be afraid, it will reappear once you have committed your changes.

On the other hand, the SVN Repositories view supports moving files around directly on the repository. This is fine if you don't need to change the file internally, or any other file in the same commit. But if you move some Java source file, you probably need to update some import references and package definitions as well. So you want to apply these changes in your workspace first and commit all changes together, instead of having a (temporarily) inconsistent state in the repository.

Now you know this, take up the challenge and fix the bug in Subversive! ;-)

Tuesday, August 5, 2008

Performance monitoring web applications with JavaScript

As end users, we often complain about slow web applications, and the same complaint is fired by users at us as web application developers. Most monitoring solutions in use are based on some request filter implementation, where the server logs the timestamps of the incoming request and the outgoing response. This works fine and provides good insight into how much time we spend at the application server.

But how do we monitor the delay between the browser and the server, a path consisting of multiple firewalls, proxies and many other slow network nodes? Especially when the browser machine is connected to the internet directly, without some kind of corporate LAN where you could introduce a monitoring proxy?

After a bit of brainstorming with a smart colleague (thanks Marc), I found a very simple solution based on JavaScript. It catches browser navigation using the load and unload events and stores the timestamps in an HTTP cookie. It does NOT work for asynchronous AJAX events, nor for separately loaded images, but it works well enough for us.


function enterPage() {
  var endTime = new Date().getTime();
  var timeIndex = document.cookie.indexOf("exitTime=");
  if (timeIndex < 0) return; // first page view: no exit timestamp stored yet
  var endPoint = document.cookie.indexOf(";", timeIndex);
  if (endPoint < 0) endPoint = document.cookie.length; // last cookie has no trailing ";"
  var startTime = parseInt(document.cookie.substring(timeIndex + 9, endPoint), 10);
  var duration = endTime - startTime;
  document.cookie = "duration=" + duration + ";";
}

function exitPage() {
  var startTime = new Date().getTime();
  document.cookie = "exitTime=" + startTime + ";";
}

<body onload="enterPage()" onunload="exitPage()">
my example page
</body>


The browser stores the page load duration in a cookie after the new page is rendered, and posts the cookie to the server in the next HTTP request. Since we use J2EE on the server, we simply use the HttpServletRequest.getCookies() method to extract the value.
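For illustration, a container-free sketch of the server-side extraction logic; the class and method names are made up, and in a real servlet you would iterate over HttpServletRequest.getCookies() instead of parsing the raw header string:

```java
public class DurationCookie {

    /** Extracts the "duration" value (ms) from a raw Cookie header, or -1 when absent. */
    static long parseDuration(String cookieHeader) {
        if (cookieHeader == null) return -1;
        // cookies arrive as "name1=value1; name2=value2; ..."
        for (String part : cookieHeader.split(";")) {
            String[] pair = part.trim().split("=", 2);
            if (pair.length == 2 && pair[0].equals("duration")) {
                try {
                    return Long.parseLong(pair[1]);
                } catch (NumberFormatException e) {
                    return -1;  // malformed value, ignore the measurement
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(parseDuration("JSESSIONID=abc; duration=742"));  // 742
        System.out.println(parseDuration("JSESSIONID=abc"));                // -1
    }
}
```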

HTML events
JavaScript and Cookies

Thursday, July 10, 2008

Using dependency injection in EJB2

Many J2EE developers are still facing legacy projects, often containing horrible EJB2 session beans. Once I understood the improvements of EJB3 and the Spring alternatives for dependency injection, I cursed the ugly dependency lookup code in old EJB2 beans.

Even if you cannot switch to an EJB3, you can still improve your existing code.
This is a sample of a typical EJB2 bean:


public class BookingBean implements SessionBean {
  private CustomerDAO customerDAO;
  private HotelDAO hotelDAO;

  public void ejbCreate() {
    customerDAO = Locator.lookup(CustomerDAO.class);
    hotelDAO = Locator.lookup(HotelDAO.class);
  }
}


If you have a lot of session beans built on top of a large set of DAO instances, all this lookup code becomes a maintenance nightmare. It is very obvious that the code in the ejbCreate() method is really superfluous.

What I want to achieve in the new version:


public class BookingBean implements SessionBean {
  @Autowired private CustomerDAO customerDAO;
  @Autowired private HotelDAO hotelDAO;

  public void ejbCreate() {
    // use automatic injection of instances from the global context
  }
}


The "global context" can be something like a Spring application context, a classic JNDI naming context, or anything you want. Using reflection and a little bit of generic code, you can move the ejbCreate() logic to a base class.
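As a sketch of the idea, assuming a simple map-based Locator as the "global context" and a hypothetical @Wired annotation standing in for @Autowired (the EJB interfaces are left out to keep it self-contained and runnable):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// hypothetical marker annotation, standing in for Spring's @Autowired
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Wired {}

// stand-in for the "global context" (Spring context, JNDI, ...)
class Locator {
    private static final Map<Class<?>, Object> registry = new HashMap<>();
    static <T> void register(Class<T> type, T instance) { registry.put(type, instance); }
    @SuppressWarnings("unchecked")
    static <T> T lookup(Class<T> type) { return (T) registry.get(type); }
}

// base class: all the lookup boilerplate lives here, exactly once
abstract class InjectingBean {
    public void ejbCreate() {
        for (Field field : getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(Wired.class)) {
                field.setAccessible(true);
                try {
                    field.set(this, Locator.lookup(field.getType()));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("cannot inject " + field.getName(), e);
                }
            }
        }
    }
}

interface CustomerDAO {}

// the session bean shrinks to annotated fields plus the extends clause
class BookingBean extends InjectingBean {
    @Wired private CustomerDAO customerDAO;
    CustomerDAO getCustomerDAO() { return customerDAO; }
}

public class InjectionDemo {
    public static void main(String[] args) {
        CustomerDAO dao = new CustomerDAO() {};
        Locator.register(CustomerDAO.class, dao);
        BookingBean bean = new BookingBean();
        bean.ejbCreate();
        System.out.println(bean.getCustomerDAO() == dao);  // true
    }
}
```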

Remaining work to do to wire a session bean to the dependencies:

  1. extend from the correct base class

  2. annotate your dependencies



Related links:

Tuesday, September 11, 2007

Container Managed Transactions in EJB

Be careful when using the more "advanced" transaction configurations in EJB. Yesterday, we spent quite some time investigating why calling a bean method marked with the RequiresNew TX attribute did NOT spawn a separate TX context.


class SomeSessionBean implements SomeSession {

  @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
  public void helperMethod() {
    // must perform some update in a separate TX
  }

  @TransactionAttribute(TransactionAttributeType.REQUIRED)
  public void mainMethod() {
    this.helperMethod();
  }
}


The code flows into mainMethod(), which was expected to perform helperMethod() in another transaction. A simple test case and a debugging run clearly showed that only the original transaction was used in the entire flow.

What happened? The container does NOT intercept the second method invocation, so it cannot create a new transaction. One way to solve it is doing a JNDI lookup (EJB2) or performing an EJB injection (EJB3). Both workarounds work as intended, but have one drawback: pollution of the original API with implementation "details".

IMHO the better solution: define a separate helper bean containing the helper methods. It requires some framework code in EJB2, but when you are "blessed" with EJB3, it can provide a rather elegant solution, where the front session bean delegates logic that needs a separate TX to another bean.
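The reason self-invocation bypasses the container can be simulated with a plain java.lang.reflect.Proxy standing in for the EJB container; all names here are made up, and the "transaction" is just a log entry:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class SelfInvocationDemo {
    interface Session { void mainMethod(); void helperMethod(); }

    static final List<String> txLog = new ArrayList<>();

    static class SessionBeanImpl implements Session {
        Session self;  // in a real container this would be the injected EJB proxy
        public void mainMethod() {
            this.helperMethod();   // plain Java call: the container never sees it
            self.helperMethod();   // call through the proxy: intercepted
        }
        public void helperMethod() {}
    }

    // stand-in for the container interceptor that would open a new transaction
    static Session wrap(Session target) {
        InvocationHandler h = (proxy, method, args) -> {
            txLog.add("TX:" + method.getName());
            return method.invoke(target, args);
        };
        return (Session) Proxy.newProxyInstance(
                Session.class.getClassLoader(), new Class<?>[]{Session.class}, h);
    }

    public static void main(String[] args) {
        SessionBeanImpl bean = new SessionBeanImpl();
        Session proxied = wrap(bean);
        bean.self = proxied;
        proxied.mainMethod();
        // only the calls routed through the proxy were intercepted
        System.out.println(txLog);  // [TX:mainMethod, TX:helperMethod]
    }
}
```

The direct this.helperMethod() call never appears in the log, which is exactly why the RequiresNew attribute is ignored on self-invocation, and why delegating to a second (proxied) bean works.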

In case you need much more complexity, it may be better to use bean-managed transactions, but I recommend trying this first. The bottom line: keep a close watch on the transaction attributes of EJB session beans, and be aware of the presence, or absence, of the surrounding container!