Writing tests

One thing I find time and time again is that writing tests is a controversial topic. In many of the teams I have worked in, I have found a real variety of opinions on it. I very rarely encounter a developer who is willing to say they shouldn’t be writing tests, yet many of those developers still avoid it. People seem really divided and, in my opinion, spend too much time discussing when the right time to write tests is and what kind of tests they should be writing.

As for me I don’t necessarily subscribe to TDD but am not against it either. I tend to follow a mixed approach, writing some tests before, or in concert with the application code, and writing some tests after. I guess it’s not that interesting to sit somewhere in the middle of the testing approach spectrum, but I find it useful. TDD is one tool in my toolbox. I think it’s important to work out when it’s the right tool to use.


It’s been a while

It’s been a while, but I’ve decided to make more of an effort to get back to blogging. Since I last posted, I’ve been busy writing various types of software in many different languages. Over the past year I have mainly been developing in C#, while still finding time to do some JVM-based development on personal projects too.


Scala and db4o

Recently, I’ve been looking at db4o, an open source object database. I have plenty of experience with various object-relational mapping frameworks, but wanted to try an alternative, one which should result in clean code without the performance hit and configuration complexity often associated with ORM.

As others have noted, db4o seems like a good fit with Scala. Hopefully the combination of the two should result in a more maintainable codebase.

I will show a couple of examples of the db4o API here. First I need a simple class that I will persist and later query. For this I have created a Person class that has only one field, a name. Well, I did say it was a simple class.

class Person(val name: String)

Opening a connection to the database, and persisting an instance of our Person class can be achieved with a few lines of code, as shown below:


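import com.db4o.Db4o

// Opens "test.yap", creating it if it does not already exist
val db = Db4o openFile "test.yap"

val me = new Person("Matthew")

// Persists the new instance
db set me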

This will open (or create) a file-based database called “test.yap” in the current directory and persist the created instance of Person. Persisting objects really is as simple as that: there is no schema or mapping to define. The set method can be used to store changes to existing objects as well as to store new objects.

The next obvious step is retrieving any objects we have persisted. Db4o offers three ways to do this:

  1. Query by example – This is a simple form of querying, one that can be found in most ORM frameworks too, and provides a way to query the database based on a ‘template’ instance of a class. All persisted instances matching this template will be returned.
  2. Native queries – I would imagine that most queries would be written this way. Native queries are written in ordinary code against the database, which makes them both powerful and typesafe.
  3. SODA query API – SODA is db4o’s internal query system. You wouldn’t normally need to write a query using SODA unless there were performance issues when using one of the other two query types. All types of queries actually get translated into a SODA query by the framework.

I will show an example of a native query here, as I believe this will be the more common type of query used. Writing a query class involves implementing the match method on the Predicate interface. This method takes one parameter, an instance of the class being queried, and returns a boolean – true if this query should return the given instance.

Thankfully Scala allows us to make the writing of Predicates even easier through the use of an implicit conversion. We can convert a function that takes an instance of our Person class and returns a boolean into a Predicate quite easily. An example of a simple query is shown below:


import com.db4o.query.Predicate

// Wraps a plain function in a db4o Predicate
implicit def toPredicate[T](predicate: T => Boolean): Predicate[T] =
  new Predicate[T]() { def `match`(entry: T): Boolean = predicate(entry) }

val result = db query { person: Person => person.name.contains("t") }

The example above shows our implicit conversion that allows for easy creation of (typesafe) queries, as well as a simple query that will return every Person persisted that has a letter ‘t’ in their name.
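
To experiment with the conversion without db4o on the classpath, here is a minimal sketch. The Predicate class and the in-memory query method below are hypothetical stand-ins that I have made up for illustration; they are not the db4o API. The function is bound to a val first so that the implicit conversion is what turns it into a Predicate.

```scala
import scala.language.implicitConversions

object PredicateSketch {
  // Hypothetical stand-in for db4o's Predicate, so the sketch runs on its own.
  abstract class Predicate[T] {
    def `match`(candidate: T): Boolean
  }

  class Person(val name: String)

  // Same shape as the conversion in the post: wrap a plain function in a Predicate.
  implicit def toPredicate[T](predicate: T => Boolean): Predicate[T] =
    new Predicate[T] {
      def `match`(candidate: T): Boolean = predicate(candidate)
    }

  // Hypothetical in-memory "query" that tests every stored candidate in turn.
  def query[T](stored: Seq[T], predicate: Predicate[T]): Seq[T] =
    stored.filter(candidate => predicate.`match`(candidate))

  def namesContaining(people: Seq[Person], fragment: String): Seq[String] = {
    val hasFragment: Person => Boolean = person => person.name.contains(fragment)
    // hasFragment is a function value, so toPredicate is applied implicitly here.
    query[Person](people, hasFragment).map(_.name)
  }
}
```

Calling PredicateSketch.namesContaining with a list of people and "t" returns the names that contain that letter, in their original order.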

The query method returns an ObjectSet, a db4o class that can be used to iterate over every matching instance. This ObjectSet won’t play nice with Scala’s ‘for’ loop. Thankfully conversion of the ObjectSet into an iterable object can be easily achieved as shown below:


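import com.db4o.ObjectSet

// Adapts a db4o ObjectSet to a Scala Iterator so it can be used in a for loop
class RichObjectSet[T](objectSet: ObjectSet[T]) extends Iterator[T] {
  def hasNext: Boolean = objectSet.hasNext()
  def next(): T = objectSet.next()
}

implicit def toRichObjectSet[T](objectSet: ObjectSet[T]): RichObjectSet[T] =
  new RichObjectSet[T](objectSet)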
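
The same wrapping idea can be tried without db4o. The sketch below uses java.util.Iterator as a stand-in for ObjectSet (an assumption made purely so that it runs on its own), and the names RichJavaIterator and upperCased are mine.

```scala
import scala.language.implicitConversions

object IteratorSketch {
  // Wraps a java.util.Iterator (standing in for db4o's ObjectSet)
  // so that it can be used in a Scala for expression.
  class RichJavaIterator[T](underlying: java.util.Iterator[T]) extends Iterator[T] {
    def hasNext: Boolean = underlying.hasNext
    def next(): T = underlying.next()
  }

  implicit def toRichJavaIterator[T](underlying: java.util.Iterator[T]): Iterator[T] =
    new RichJavaIterator[T](underlying)

  def upperCased(names: java.util.List[String]): List[String] = {
    // java.util.Iterator has no map method, so the implicit conversion kicks in here.
    val results = for (name <- names.iterator()) yield name.toUpperCase
    results.toList
  }
}
```

The for expression compiles only because the conversion is in scope, which is the same trick that lets the ObjectSet be used in a ‘for’ loop.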

Executing our original query and displaying the results can now be performed in a couple of lines of code as shown below:


val result = db query {person: Person => person.name.contains("t") }

for(person <- result) println(person.name)

So, as you can see persisting objects as well as running queries can be achieved with only a few lines of code.

Seeing this code, you may question the efficiency of the query. Surely db4o can’t be passing every persisted object to our match method? This would be extremely inefficient. As I hinted at previously, db4o will attempt to convert our query into its own internal SODA representation before executing it. It achieves this transformation by examining the bytecode of our query. Only if this translation to SODA fails will db4o resort to calling our match method with every instance of our class we have persisted.

Unfortunately for my experiments in using db4o with Scala, the query optimiser does not seem to cope with the bytecode generated by the Scala compiler well, resulting in most queries failing to get converted to SODA. I am currently investigating ways in which the query optimiser can be made to work with Scala, and will post my findings here.

I hope this has given an insight into how object persistence and retrieval can be achieved in a typesafe manner in a few lines of code, with no complex mapping required. I think the simplicity of the API that db4o provides is certainly a good match with Scala and if the query optimisation issue can be solved, then one that is worth looking at further.


Programming in Scala

I recently bought the Programming in Scala book and thought I would describe my experiences so far with both the book and the Scala language itself.
Before I do though I thought I would describe why I decided that Scala could be worth a look.

1. It runs on the JVM, which I am already familiar with.
2. I can access Scala code from Java, and Java code from Scala, so it’s not like I am abandoning tools, libraries and code that I already have and like using.
3. Scala is a hybrid language, i.e. it allows both a functional approach and an OO approach. To program in Scala I don’t have to throw away my existing OO design skills; instead I can build upon them, choosing to take a more functional approach when appropriate.
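
As a small illustration of the third point, here is a sketch that mixes the two styles. The Account class and the ownersWithAtLeast function are made-up names for this example only.

```scala
object HybridSketch {
  // Object-oriented part: an ordinary class with fields and a method.
  class Account(val owner: String, val balance: Int) {
    // Returns a new Account rather than mutating this one.
    def deposit(amount: Int): Account = new Account(owner, balance + amount)
  }

  // Functional part: higher-order functions over an immutable list.
  def ownersWithAtLeast(accounts: List[Account], minimum: Int): List[String] =
    accounts.filter(_.balance >= minimum).map(_.owner)
}
```

The class is plain OO design, while the query over the list is written by passing small functions to filter and map instead of writing a loop.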

The book

The book serves as an excellent tutorial to the Scala language. It flows well, with each chapter building on concepts and examples described in earlier ones. The book takes care to explain the language constructs in depth, often providing examples of how the language differs from Java. As well as the main language, there is also some coverage of libraries such as containers and actors. I have found the book really easy to work through, and it is probably one of the better written technical books I have read recently. I really would recommend this book to any programmer wanting to find out more about the Scala language.

The language

From my admittedly limited experience so far, I do like the language. I really like the fact that I can combine OO principles with functional programming. This makes Scala an extremely powerful language and allows an easier transition into programming in a functional style. Admittedly this power can be abused and some quite unreadable code can be produced, but this is true of any language when used incorrectly. One thing that stands out for me is that Scala code tends to be more concise than the corresponding Java code would be. I have often found myself looking at Scala code and thinking of how much more Java code would be required to achieve the same thing. I have already looked at some of my old Java code and found ways that it could be rewritten in far fewer lines of Scala. I am definitely going to invest more time in learning Scala.

Resources

http://www.scala-lang.org/ – The home of Scala
http://www.ibm.com/developerworks/views/java/libraryview.jsp?search_by=scala – IBM have a great collection of articles on the language
http://liftweb.net/ – Scala web framework
http://scala-blogs.org/ – Scala community blogs



Why I’m using Git with Subversion

Over the years I’ve used many version control systems: CVS, Perforce, ClearCase, Subversion being the more common ones. Most of the repositories that I have worked with recently have been Subversion repositories. There are many reasons why Subversion may not be an ideal version control system, but given that I have to work with it, I have tried to look for ways to make it more flexible.

I know there are plenty of blog posts about Git and Subversion, but I needed to summarise my findings for colleagues and thought it could be useful to share them.

What is Git?

For those of you who haven’t come across Git before, Git is a distributed version control system. What does this mean? Well, each developer who works on a project has their own local repository: a clone of the master repository that is a fully functional and independent repository. Developers can work on their local repository, committing their own changes, and decide when to push their commits to the master repository. Developers can create branches in their own repository without the master even needing to know anything about them.

What’s this got to do with Subversion?

Git-svn allows you to pull down changes from a Subversion server into a Git repository, make as many changes as you want, then push your changes back to the Subversion server. This gives me all of the benefits of local version control while still letting me be selective about when and how I push the changes back to Subversion.

What this means for me, is that when I implement a particular feature or fix a defect, I can easily create a local branch in my git repository for my changes. I can then implement and test my changes, which may involve several commits, and then only when I’m happy with it I can merge the changes from my local branch and push them back to the Subversion server. This allows me to develop a feature, getting the full benefits of version control, without having to worry about affecting the other developers until my change is complete.

Quite often when working on a particular feature I will spot an area of the code that I would like to refactor before continuing. I can very quickly create another local branch to perform the refactoring on, ensuring the tests pass. It might be that I end up pushing the refactoring changes to the Subversion server before my feature is complete. I can then easily switch back to my local feature branch, pull in only my refactoring changes and then carry on where I left off with the feature development.

Another handy feature that Git provides is the ability to stash away changes on a temporary branch. For example, if I am part way through fixing a bug and get pulled off the bug fix to help fix some build issues or test failures, then I can perform a git-stash to store my changes on a temporary branch, fix the build issues, then when I want to resume working on the bug fix simply git-stash apply to apply my previous changes to the fixed code. This approach keeps the changes separate, and there is no chance of me accidentally checking in my half-finished bug fix.

It does take a bit of getting used to, but I have found that using Git with Subversion really helps me keep track of my development, allows me to check in code more frequently, and generally makes things easier to manage. Whenever I have to use any Subversion repositories in the future, a git-svn clone will certainly be the first thing I will do.

It is surprising that although Git seems to get a lot of coverage online, most people I speak to in person seem to know little about it. I hope this has explained how Git, and Git-svn have helped me.

If you want to find out more about git and git-svn, then you may find the following links useful:

Are there any other resources that you have found useful in learning git, or using git with Subversion, or perhaps other benefits you have found when using Git / Git-svn?


The dreaded broken build

This post explains the experiences I have had with continuous integration, and the dreaded broken build. I have tried to outline some of the more common reasons that I have seen causing builds to remain in a broken state for periods of time.

Continuous Integration

Software that is built and tested automatically on a frequent basis can lead to an increase in the quality of the software and shorter release cycles. It is clear that the sooner an issue with the build is found, the quicker and therefore cheaper it is to fix. It’s far easier to fix an issue minutes, or hours after it has occurred, rather than days, weeks or even months later.

The trouble is, build issues don’t always get fixed quickly.

The broken build

If builds remain broken for long periods of time then developers will lose confidence in the continuous integration strategy. If there are numerous test failures that have been there for a period of time, then the individual developer will stop running the tests themselves, being unable to isolate any failures their particular change may have caused from the huge list of failures in the system. This over time will only make the problem worse, as the list of test failures grows, and the root causes of the failures become more difficult to debug and fix. Developers are quite often unwilling to fix issues they see as being caused by someone else.

Trying to avoid the permanently broken build

There are a few techniques that I have found can help when it comes to avoiding this kind of situation.

One is the use of pre-commit builds, that is, a build target that each developer should run before checking any code in. These targets typically perform a compile, run some or all unit tests, and may even run some form of static code analysis too. The key thing with a pre-commit build, though, is that it has to be quick to run. If the target fails for some reason, then the developer needs to be able to address the issue, then run the pre-commit build again quickly. The pre-commit build should help reduce the likelihood of the build becoming broken on the continuous integration server.

Another is to encourage developers to check in smaller changes more frequently, rather than waiting longer periods of time before making huge commits to the codebase, which make it more difficult to isolate the cause of a build failure.

Another issue that can cause the builds to remain broken for long periods of time, and one that is often overlooked, is that of build notifications i.e. how developers are notified the build has failed, and the information that is present in the build log. I have found that on build failure, having the continuous integration server emailing / instant messaging only the developers that have changed code since the last successful build can work well. The developer should then only hear from the continuous integration server if they need to look at the build result. If the continuous integration server is emailing everyone anytime it does anything, the emails will soon get treated as spam and ignored.
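
The notification rule described above can be sketched in a few lines. This is only an illustration under my own assumptions: the Commit class, the idea of numeric revisions, and the recipients function are hypothetical and not taken from any particular continuous integration server.

```scala
object NotifySketch {
  // Hypothetical model of a commit as the build server might see it.
  case class Commit(revision: Int, author: String)

  // On failure, notify only the authors of commits made after the last
  // successful build. On success, notify nobody.
  def recipients(commits: Seq[Commit], lastGoodRevision: Int, buildPassed: Boolean): Seq[String] =
    if (buildPassed) Seq.empty
    else commits.filter(_.revision > lastGoodRevision).map(_.author).distinct
}
```

Keeping the recipient list this narrow is what stops the build emails from being treated as spam.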

The information in the build log should clearly state the cause of the build failure. All too often I have seen build logs that are several pages long, containing hundreds of lines of build and test run output. You shouldn’t have to wade through all this to find the one line that tells you which test failed and why. If developers are confused as to what the cause of the build failure is, then chances are they aren’t going to be in a position to fix the failure quickly. And if they have to do this several times, they will soon get fed up and become less inclined to address build failures.

When the build breaks, developers should make sure that fixing the build is a top priority, not just something that will happen when they get a chance. In order for this to happen, ultimately every developer has to understand and believe in the reasons for continuous integration in the first place.

I hope you found this useful. There are many issues that can cause continuous integration to work against you rather than for you, but after working with several teams these were some of the common ones I experienced.


No more redeploys for me

Isn’t it annoying? You make a change to your Java web application, compile it, deploy it, start the application server, navigate to the web page you changed, and for some reason it doesn’t work. Fortunately it’s an easy fix, but in order to test it, you have to go through that build and deploy cycle again. This deployment cycle can often be time consuming. I’ve lost count of the amount of time I have spent watching JBoss / OC4J / WebLogic start up after I have made a change to my application.

I have recently started using a tool that aims to address this very problem – JavaRebel. JavaRebel is a small JVM plugin that automatically loads changes to Java .class files on the fly, avoiding the need to restart the application server. JavaRebel monitors the timestamp of the Java class file, and when it is updated, e.g. when the Java file is changed in the IDE, the changes to the class are automatically reloaded. 
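
To make the timestamp idea concrete, here is a minimal sketch of detecting that a file has been modified. It is only an illustration of the general technique: the ChangeDetector class is a made-up name, and I am not claiming this is how JavaRebel is implemented internally. Actually reloading a class is a much harder problem that this sketch does not attempt.

```scala
import java.io.File

object TimestampSketch {
  // Remembers the last-seen modification time of one file and reports changes.
  class ChangeDetector(file: File) {
    private var lastSeen: Long = file.lastModified()

    // Returns true if the modification time differs from the one seen
    // on the previous check, then remembers the new time.
    def hasChanged(): Boolean = {
      val current = file.lastModified()
      val changed = current != lastSeen
      lastSeen = current
      changed
    }
  }
}
```

A tool built on this idea would call hasChanged() for each watched .class file and trigger a reload whenever it returns true.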

In practice what this means is that when I am making changes to the Java classes of my web application, all I need to do to see the changes to my page is save the file in Eclipse, then press refresh in my browser. What previously took me anything from 10 seconds to several minutes now only takes a couple of seconds. Over the course of a day or a week this time certainly adds up. It also makes it a lot easier to follow a more test-driven approach when developing, but that’s a topic for another day.

Beyond reloading changes to classes, JavaRebel also provides a plugin system that lets framework developers provide hooks into their frameworks, allowing more than just classes to be reloaded on the fly. For example, one plugin I frequently use is the JavaRebel Spring plugin. This allows my Spring configuration files to be monitored in addition to my Java class files, so that any changes to my Spring config are also automatically reloaded on the fly. This allows me to develop my Spring-based web app almost entirely without requiring a server restart.

Java development just got a whole lot easier!
