Thursday, March 1, 2012

Serialization and changing Java enumerations

In our latest release, we accidentally broke a web service we provide. The good thing is that testers discovered it in the acceptance system. Lesson learned: add better integration testing on top of simple unit tests.

What went wrong? Well, in the initial version of the API, the web service could return serialized Java objects via Spring HttpInvoker, including some enumeration constants. The definition of the enumeration type:

public enum Channel { phone, mail, email }

The business wants to support a fancy mobile application for people with certain smartphones, so we extended the type:

public enum Channel { phone, mail, email, mobileApp }

Well, since we forgot to pass the updated binary client to the calling application, the deserialization went wrong if the response contained the new "mobileApp" constant, which is rather logical.

But two other questions remained about compatibility with the old client. What if:
  1. the server uses the extended type, but only returns old constants?
  2. we swap the order of the constants in the enumeration type definition?
First, the old client works fine as long as the response does NOT contain the new constant. Second, changing the order has NO effect. Easy, isn't it?

All this follows from the Java Object Serialization Specification, which requires special but simple handling for enumeration constants: only the name of the constant is written to the stream. So the name is what matters, and you can ignore the position, and even the extra, optional fields of your enum "class" ...
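To make the "name only" behavior a bit more tangible, here is a minimal sketch. It is not the code of our web service: the class EnumSerializationDemo and its helper methods are made up for illustration, and the enum deliberately lists the constants in a different order than above. It serializes a constant with plain Java serialization, reads it back, and checks that the constant's name appears literally in the byte stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original service.
class EnumSerializationDemo {

    // Same constants as the extended Channel type, deliberately reordered.
    enum Channel { mobileApp, email, mail, phone }

    // Serialize any object with standard Java serialization.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read a single object back from the serialized bytes.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when the constant's name appears literally in the serialized form.
    static boolean streamContainsName(Channel channel) {
        String raw = new String(serialize(channel), StandardCharsets.ISO_8859_1);
        return raw.contains(channel.name());
    }

    public static void main(String[] args) {
        Channel restored = (Channel) deserialize(serialize(Channel.mobileApp));
        System.out.println(restored == Channel.mobileApp);          // same constant comes back
        System.out.println(streamContainsName(Channel.mobileApp));  // the name is in the stream
    }
}
```

Both lines print true. A client whose Channel type lacks the received name has nothing to map it to, which matches the failure we saw with the old client.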

More reading:

Java language on serializing enumerations

Blog warning NOT to use serialVersionUID for enums

Thursday, January 26, 2012

Yet another small Scala collection trick

Once again, my colleague Danny came up with a Java collection question. Given a data map with small lists as values, what is the shortest or best way to create a new, big list containing all individual elements from the small lists?

Best thing that we found in a few minutes, using car manufacturers and models as example data:

// setup (needs java.util.Arrays, HashMap, LinkedList, List and Map)
Map<String, List<String>> carModels =
        new HashMap<String, List<String>>();
carModels.put("Mazda", Arrays.asList("MX-5", "RX-8"));
carModels.put("Audi", Arrays.asList("A1", "A4", "A8"));

// actual work in 4 lines
List<String> allModels = new LinkedList<String>();
for (List<String> brandModels : carModels.values()) {
        allModels.addAll(brandModels);
}

Now in Scala, back home ..

It took me a little research online, because I have never used Scala on a daily basis and lack a lot of knowledge about the collections API. Luckily it's not too hard to find good pointers online, and the Scala interpreter offers code completion help.

// setup
val carModels = Map("Mazda" -> List("MX-5", "RX-8"),
        "Audi" -> List("A1", "A4", "A8"))

// actual work, one liner
val allModels = carModels.values.reduce(_ ++ _)

The end result is actually not terribly hard to understand, but let me break it into pieces before I forget it myself:

carModels.values  // easy, similar as in Java

Now we have a collection of lists, and need to put all individual model names into one big list. Explicit looping is one way, but not very elegant. Another way of operating on a list is the "reduce" function. In general, you simply supply it with a function that takes two arguments and produces a single element.

For example, sum all numbers:

def add(x: Int, y: Int) = x + y  // define simple 'add' function
List(1, 3, 5).reduce(add)   // evaluates to 9

So, this could lead to the following construction (with ++ for joining):

carModels.values.reduce( (models1, models2) => models1 ++ models2)

And thanks to implicit naming using underscores, we can end with:

carModels.values.reduce( _ ++ _ ) 
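For comparison only: Java releases from version 8 onwards (which did not exist when this post was written) ship a stream API with a similar "reduce". The sketch below assumes such a newer Java; the class FlattenCarModels and its method names are invented for illustration. It shows both a reduce that joins lists pair by pair, like the Scala one-liner, and the flatMap variant that is probably the more common choice in Java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, assuming Java 8 or later.
class FlattenCarModels {

    // Mirrors reduce(_ ++ _): join two lists into a new one, pair by pair.
    static List<String> flattenWithReduce(Map<String, List<String>> carModels) {
        List<String> empty = Collections.emptyList();
        return carModels.values().stream().reduce(empty, (left, right) -> {
            List<String> joined = new ArrayList<>(left);
            joined.addAll(right);
            return joined;
        });
    }

    // Alternative: stream every small list into one big stream and collect it.
    static List<String> flattenWithFlatMap(Map<String, List<String>> carModels) {
        return carModels.values().stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // Same sample data as above; LinkedHashMap keeps the insertion order.
    static Map<String, List<String>> sampleData() {
        Map<String, List<String>> carModels = new LinkedHashMap<>();
        carModels.put("Mazda", Arrays.asList("MX-5", "RX-8"));
        carModels.put("Audi", Arrays.asList("A1", "A4", "A8"));
        return carModels;
    }

    public static void main(String[] args) {
        System.out.println(flattenWithReduce(sampleData()));   // [MX-5, RX-8, A1, A4, A8]
        System.out.println(flattenWithFlatMap(sampleData()));  // [MX-5, RX-8, A1, A4, A8]
    }
}
```

The reduce variant copies the accumulated list at every step, so for large maps the flatMap variant is likely the cheaper of the two.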

I still have to figure out what the difference is between the "++" and ":::" operators to join lists, and have many (!) other things to learn on Scala.

Feedback or corrections are very welcome!

Friday, November 18, 2011

Short Devoxx 2011 visit summary

Got the chance from my company to visit Devoxx 2011 in Antwerp again this year. Attended only the first two conference days, but overall I have a good feeling about the time spent.

Wednesday (day one)


Keynote on Java SE and EE by Oracle people: some technical information about upcoming changes in SE and EE. Learned a few small things, no big surprises.

Continuous Delivery
Useful ideas, and the main point: agile should not stop at the development level, but commit to delivering the end result to the user as soon as possible. The end user does NOT care about all the intermediate development, QA and operations stuff, he just wants the fix running in production. Maybe I should buy and read the full book.

JRuby
A little too technical, without a clear overview or idea. Also strange that the presenter doesn't program on a day-to-day basis, but works in another area.

NoSQL for Java
Couldn't help feeling that Chris Richardson was a little bored himself. I left the room halfway, when the "Spring Data" slide popped up with a basic "Hello world"-like example, which I don't associate with the so-called advantages of NoSQL (huge volumes and speed). I much preferred the pragmatic, realistic "Hadoop and NoSQL at Twitter" talk from Devoxx 2010.

HTML5 mobile
Entered the presentation halfway, but happily listened to the enthusiastic James Pearce, who challenged us to explore the HTML5 possibilities on mobile devices. A short comparison of native versus HTML5-based web applications, with points to remember: good mobile support in HTML5 is barely starting, but already interesting to experiment with, and there is not always a clear cut between native and browser context; often something in between can work fine as well.

JAX-RS 2.0
The spec lead for JAX-RS presented the upcoming 2.0 version of the Java REST interface, including some clear examples, often with simple annotations. Looks promising, and it sounds good that they will try to avoid putting too many useless bells and whistles or unfinished pieces into the release. Definitely worth a look if you need an HTTP-based web service across Java applications. Looking forward to a good comparison with Spring HttpInvoker.

Thursday (day two)


Started the day with a "sexy" Android development publicity keynote. 1 million Android device activations every 2 days sounds impressive, but how to earn something in this huge and fierce marketplace will probably be much harder to tackle for individual developers/teams. Oh yes, a new Devoxx in Paris is upcoming, which might be an opportunity to train my French again .. :-)

Cloud is such stuff as dreams are made on
Three people with "equal" time slices pushed into a single slot was probably not the best idea. Enjoyed the first piece about the growing "platform as a service" infrastructure, slowly maturing as opposed to the wild ideas from the previous year. The following pieces about Google App Engine and Laforge's toy Gaelyk project didn't tell me much new.

What's in store for Scala?
Martin Odersky clearly listed the major technical achievements in the latest Scala 2.9 release. Although some slides from the previous year returned, along with the focus on parallel programming, he kindly added some extras, including Scala's more extensive reflection library. Personally, I haven't done much with Scala since last year besides reading "Programming in Scala", but the talk at least intrigued me once more to continue experimenting.

HTML5 with Play/Scala, CoffeeScript and Jade
After my disappointment with Matt Raible's talk last year, I wanted to give him a second chance. No luck: after 10 minutes, I only foresaw a quickly assembled list of experiments, tossing different new technologies together, which made me leave the room and head for more interesting lessons.

JBoss AS 7 on OpenShift
Missed the first 15 minutes, but having seen Pete's decent technical presentations before, I guess it probably didn't harm. Surprisingly, this was not a slick or unrealistic advertising presentation - the typical partner slot - but a long "risky" demo of deploying and managing a basic application on Red Hat's own cloud platform. The tools still look a little rough and unfinished, but usable. Speed and performance are probably a blocker for using it as a fast build and deploy development platform. Nevertheless, it seems to contain a few good ideas, including git-based configuration and running Jenkins CI in the cloud. Hopefully it gets polished well enough soon, and also open sourced, which may counter VMware Cloud Foundry.

Cloud Foundry and Spring, a marriage made in heaven
Unfortunately Patrick Chanezon repeated some pieces of the cloud talk from the morning, but we also saw interesting integration between Spring and the flexible Cloud Foundry structure. Using or wiring services in the cloud should become as easy as wiring regular Java POJOs in your application context. The sample configuration shown looked alright; next thing to wonder about: will it be as easy in a real-world project? Major plus for me: the choice between public, private and mixed cloud. Should be interesting to try out on a small cluster, and find which pieces of the puzzle are still incomplete.

DVCS For Agile Teams
Mainly sharing positive git usage experience, including pieces of suggested workflows. Not much new, but happy that my positive feeling about git, currently only used in my own small, personal projects, is confirmed by others.

Thursday, August 11, 2011

Play with Scala lists

A friendly colleague of mine asked me today what the "best" way would be to compare two separate, homogeneous lists, containing some instances of a simple data structure. The objects are actually Hibernate entities holding values from 2 database queries. The idea is to test on one particular string property from the Java entities, ignoring other, not relevant properties. The order of the objects should be ignored, and implementing a custom "equals()" method was out of the question.

We could have written something using Apache commons-collections, but I doubt we would have liked the end result. Using (inner) callback classes implementing "Transformer" always brings a lot of ugly Java boilerplate code.

I wondered how it would look in Scala, given my very limited knowledge of this language?

package tung

// Person is not shown in the original post; this minimal class is an assumed stand-in.
class Person(val id: Int, val name: String)

object Tester {
  def main(args: Array[String]): Unit = {
    /* Create two separate lists of Person objects, containing ID and name. */
    val personList1 = List(new Person(1, "john"), new Person(2, "john"), new Person(3, "daisy"), new Person(3, "bart"))
    val personList2 = List(new Person(100, "bart"), new Person(101, "john"), new Person(103, "daisy"), new Person(104, "daisy"))
    println(personList1)
    println(personList2)

    // The actual test: collect the distinct names, sort them and compare
    val names1 = personList1.map(_.name).distinct.sorted
    val names2 = personList2.map(_.name).distinct.sorted
    val equalNames = names1 == names2

    // Printing some output
    println(names1)
    println(names2)
    println("names from lists are equal = " + equalNames)
  }
}
Output:
List(tung.Person@199a0c7c, tung.Person@33f42b49, tung.Person@6345e044, tung.Person@86c347)
List(tung.Person@f7e6a96, tung.Person@3487a5cc, tung.Person@35960f05, tung.Person@eb42cbf)
List(bart, daisy, john)
List(bart, daisy, john)
names from lists are equal = true
As you can see, the actual test is 3 simple lines. Shorter variations probably exist, but I'm no Scala guru. Other languages can do something similar, but the same simplicity with pure Java 6? I don't think so.
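For comparison, a sketch of how the same check might look with the stream API that only arrived later, in Java 8. This is not what we wrote at the time: the class NameComparison, the nested Person class and both helper methods are invented here, and a Set is used instead of a sorted distinct list because the order should be ignored anyway.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical example class, assuming Java 8 or later.
class NameComparison {

    // Stand-in for the Hibernate entity; only the name matters for the comparison.
    static final class Person {
        final int id;
        final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Collect the distinct names; a Set ignores both order and duplicates.
    static Set<String> names(List<Person> people) {
        return people.stream().map(p -> p.name).collect(Collectors.toSet());
    }

    // Two lists match when they contain exactly the same set of names.
    static boolean sameNames(List<Person> first, List<Person> second) {
        return names(first).equals(names(second));
    }

    public static void main(String[] args) {
        List<Person> personList1 = Arrays.asList(new Person(1, "john"), new Person(2, "john"),
                new Person(3, "daisy"), new Person(3, "bart"));
        List<Person> personList2 = Arrays.asList(new Person(100, "bart"), new Person(101, "john"),
                new Person(103, "daisy"), new Person(104, "daisy"));
        System.out.println("names from lists are equal = " + sameNames(personList1, personList2));
    }
}
```

With the same sample data as above, this prints "names from lists are equal = true".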

Ending note: in the end, he simply chose to write a custom SQL query because of the probable hassle in Java. Would he have done the same if he could use Scala?

Thursday, May 19, 2011

Encoding form data in Java servlets

Today I got tricked and frustrated again by bad handling of non-ISO-8859-1 (also known as LATIN-1) form data in a Java web application. Russian and German users rightfully complain about losing their localized input once they press the submit button. A few things I have heard in the past, but had to look up again because I tend to forget these things easily (at least for the Java web app world):
  • explicitly indicate some Unicode encoding in the response, both in the HTTP header AND the HTML meta data
  • set the character encoding on the incoming HttpServletRequest BEFORE reading any value

By default, most web application servers simply use LATIN-1 encoding, which would NOT match the UTF-8 encoding used by the browser. The full story is rather complicated, and drags along a lot of legacy combined with pragmatic choices. But in the end, following those 2 rules brings the solution much closer in most cases.
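The mismatch itself can be reproduced without any server. The sketch below is only an illustration with invented names (FormEncodingDemo and its methods): it encodes a German word as UTF-8, the way a browser would submit it from a UTF-8 page, and then decodes the bytes once as LATIN-1 and once as UTF-8. In a servlet, the second rule above corresponds to calling request.setCharacterEncoding("UTF-8") before the first getParameter call.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; no servlet container involved.
class FormEncodingDemo {

    // What the browser puts on the wire for a form on a UTF-8 page.
    static byte[] encodeAsBrowser(String typed) {
        return typed.getBytes(StandardCharsets.UTF_8);
    }

    // What a server does when it falls back to its LATIN-1 default.
    static String decodeAsLatin1(byte[] submitted) {
        return new String(submitted, StandardCharsets.ISO_8859_1);
    }

    // What the server does once the request encoding is set to UTF-8.
    static String decodeAsUtf8(byte[] submitted) {
        return new String(submitted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "Gr\u00fc\u00dfe"; // "Gruesse" with u-umlaut and sharp s, escaped to stay ASCII-safe
        byte[] submitted = encodeAsBrowser(typed);

        // Each special character became two bytes, so LATIN-1 decoding yields two wrong characters for it.
        System.out.println(decodeAsLatin1(submitted).equals(typed)); // false
        System.out.println(decodeAsUtf8(submitted).equals(typed));   // true
    }
}
```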

In general, UTF-8 Unicode encoding seems to have the best and widest support, so I suggest sticking to it. For people using the Spring Framework, have a look at the CharacterEncodingFilter.

As a side note: one specific JSF 1.2 application does NOT show the encoding problem, even without setting the character encoding manually on the request. Still need to find out WHY it seems to work fine there. Maybe the application server is set up slightly differently, making UTF-8 the default mapping, or maybe I'm simply going blind. :-)

AJAX exception?

For some reason, our Firefox 4 browser submits AJAX POST data with an explicit character set indication (UTF-8) in the HTTP header, and therefore those AJAX submits (based on RichFaces 3.x) work fine out of the box. Should investigate why the behavior is different here. Is it a feature of XMLHttpRequest, or is it some fancy JavaScript code generated by the library used?


Sunday, December 19, 2010

Some Devoxx 2010 talk summaries

Roughly a month ago, I attended a few talks at Devoxx 2010. Couldn't attend all of them, but here is my summarized impression of the things I have seen. A pity that I missed the performance talk from Joshua Bloch and the HTML5 presentation because the rooms were too crowded, but I'm sure I can get the info elsewhere as well.

Ajax Library Smack down: Prototype vs. jQuery

Invest time in some library, learn to use. Different styles and focus. Test!

A few things I captured: do use some JavaScript library, it is useful and not really hard to get started. Understand the different scopes and styles of the libraries. Some of them, like Dojo and YUI, include a large set of GUI components out of the box, others rather serve as building blocks and provide extension points.

Both Prototype and jQuery rely heavily on the dollar sign ($) as a special function name. While Prototype has a "smaller" focus and a Ruby style of syntax, jQuery is more feature rich, employs a different syntax and offers some kind of "plugin" mechanism, with both advantages and disadvantages. Prototype is often combined with script.aculo.us for GUI components, while jQuery is usually bundled with jQuery UI. Documentation is pretty good in both projects.

The speaker gave a nice presentation, with a quick overview of JavaScript libraries and a comparison of Prototype vs. jQuery. He did not clearly identify a "winner", so you might wonder if you know much more after attending.

Restfulie: quit pretending, use the web for real

It was a very quick talk, sometimes a little chaotic because of the speed and the navigation through the slides. The main thing that stuck in my mind is that the existing JAX-RS API can be improved on a few levels. There should be less duplication of the URL paths, and less boilerplate code. Restfulie shows how this can be achieved. Also included: support for easily switching between different data formats like XML, JSON, ..

Although the presentation itself was not really polished, the frameworks and ideas referred to are probably worth further investigation. Have a look at the different REST frameworks, compare their styles and features, and learn different ways to solve similar problems. The speakers agree that Restlet has very similar benefits, but its implementation differs thoroughly.

Scalable and RESTful web applications: Kauri and Lily

This presentation felt like a commercially oriented sales talk and introduction to 3 related products from the local Outerthought consultancy company: Daisy, Kauri and Lily. Some references to REST, CMS and NoSQL storage systems were made, but too bad the talk as a whole did not give a very coherent and clear message.

The Next Big JVM Language

Stephen offered a nice overview of the current JVM situation, and explained which parts are and are not important for the future JVM language. Many of these ideas are clearly mixed with personal opinions, but nevertheless they often make sense. He suggests that programming style, ease of use, compile-time type checking, support for scripting-like behavior, closures, .. will probably be part of that successor. After eliminating a few candidates, he pitted Scala and Fantom against each other. It felt a little funny that he found Scala probably a little too much, and Fantom too little, and so guessed that neither of them will become the next big JVM language. Adding incompatible changes to the existing Java language was the last idea, but he did not dare to say how that would evolve ..

Nothing from this talk can be used in the immediate future and in most of our daily work, but it doesn't hurt to reflect on it once in a while.

Vaadin - Rich Web Applications in Java

Reasonably good introduction to the Vaadin web framework, which offers GWT integration. Too bad I missed a big part of the presentation. As a complete newbie, the main idea I remember is that Vaadin provides a similar way of programming to GWT, where code is written in Java and not in JavaScript or XML templates. The main difference is that Vaadin runs the code mostly on the server side, with tight client side integration through Ajax, where GWT tries to push as much state as possible to the client. GWT clearly shows scaling benefits when an application has an enormous number of clients, but keeping state on the server has the benefit of keeping some things simpler. Definitely worth a second look when I have more time.

What's New in Scala 2.8?

Bill Venners and Dick Wall briefly discussed and showed some of the new Scala 2.8 features. Luckily, they did not repeat too much of the older Scala language syntax and structure. But I still have mixed feelings about the presentation, mostly because I had already read about and seen some of the new features in blogs and other online documentation.

Future-proofing collections: from mutable to persistent to parallel

I was a little afraid that Martin Odersky would focus on syntactical updates and changes from the Scala 2.8 release again, but my fear was unfounded. He clearly explained the benefits of having immutable collection classes, combined with the evolution of multi-core machines and avoiding thread synchronization problems. Scala happens to be the implementation for those "parallel" collections, showing that the theory could be turned into a real language feature. Definitely worth the time, and once more a reason to invest some time in Scala.

The Future Roadmap of Java EE

Disappointing presentation about upcoming changes for the Java EE platform. Can’t remember any big announcement, nor updates which we are all waiting for. It was more a very brief introduction and pointer to other talks of the conference.

Comparing JVM Web Frameworks

Matt Raible fluently presented his overview and opinions about several current web frameworks. But you can also find a very similar presentation of his from a few years ago, mainly for a PHP and Ruby on Rails audience, and I did have the feeling that this talk did not add much. Advice and opinions were also rather "soft", and did not help me much to select the next framework. It was not bad at all, but halfway I chose to run away in the hope of discovering something more useful or exciting.

The Essence of Caching

Entering the talk halfway was not ideal, because I felt that I really missed some concepts from the first half. Although I did recognize and understand the parts about cache coherency in a clustered setup, it was not always clear what the final message was. I had the feeling it was something like "caching is useful, but be careful what you do ..". And that was not really surprising news IMHO.

Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Actors

Followed this talk because it's linked to Scala Actors, but had the impression I had seen the exact same talk the previous year at Devoxx. It was a little hard to imagine real-world applications which would strongly benefit from the Akka framework, though I do agree with the general ideas. I guess we need to set up such a high-load, multi-node cluster first and hit the performance and locking problems using "traditional" methods.

Hadoop and NoSQL at Twitter

Dmitriy Ryaboy works at Twitter on the back end and analytical side, and gave a brief insight into their different ways of handling the massively parallel workload and high volumes of data. It is mostly a combination of multiple frameworks and tools. Some of them for online services: snowflake, memcache, flockdb, cassandra, rainbird. On the batch processing side: Hadoop, Elephant Bird, HBase, Hive, .. He advises us to have a look at those tools, but to keep in mind that the needs of Twitter might be completely different from ours. Which they probably are.

Implementing Emergent Design

During this presentation Neal Ford explained why many software projects grind to a halt after a few development cycles. Lack of clear specifications is only one possible problem, but general complexity, combined with the "broken window syndrome", can turn programs into an unmaintainable mess. One of the key concepts is "technical debt", as defined by Ward Cunningham, and the need for constant cleanup and refactoring. Of course it remains rather theoretical, and does not address how to convince upper management to follow "the right path".

Does dev/ops matter for me?

Again two guys presenting here, rooted in the operations side of IT projects. They explained what happens and goes wrong when high walls are built between the development and operations teams, and gave advice on tearing down those barriers. I strongly agree with those principles, but besides mixing people from the two sides together, I haven't learned anything really new. Pity.

Friday, August 6, 2010

Combine local Git branches with central Subversion repository

Assuming you already heard of Git as SCM alternative for Subversion, you might also know that it includes some integration with Subversion. That provides you the option to start using some Git advantages in your daily work without worrying your local operations department, possibly in a very stealthy way.

After using the gateway for a few months, I discovered and started using another nice Git feature: easy merging and working with a separate development branch. Problem was that we usually have some debug settings enabled in some of the project files, resulting into some kind of "development" build, instead of a "production" build of the program. It is possible to fix this with some extra build configuration parameters outside the project, but using Git offers you another easy solution: just locally fork the code with your own settings.

The "master" branch contains the production code, and the "dvl" branch my modified version. Once the customization is committed to the "dvl" branch, you can synchronize ("pull") the "master" branch with "dvl" and undo the customization there. Now switch back to your development branch and just keep working like you did before. At any moment you can check your workspace using "git status", and your custom patch won't bother you. It also prevents you from accidentally pushing the development settings to the central Subversion repository with the commit-all command ("git commit -a").

But how do I push the wanted (!) code changes to our central repository when I'm ready? Well, by inserting one extra step before pushing code from "master" to Subversion: rebase "master" on the "dvl" branch. This involves a little extra administration, so the following little shell script fragment saves me some typing:


# move to master branch
git checkout master

# pull changes from dvl and re-apply patch on top
git rebase dvl

# required 2 steps because dcommit would complain about sync
git svn fetch
git svn rebase

# push changes to central SVN repo
git svn dcommit

# switch back to dvl branch
git checkout dvl


One of the drawbacks of this system is the visibility of the patch and undo steps in the history of the master branch, which is also propagated to the central Subversion repository. Maybe history rewriting fixes this, or I could implement some other workflow, but for the moment I'm not really troubled by that. If anyone has a better alternative, I'd be happy to learn!
