Wednesday, November 4, 2009

Windows App Store

I'm developing applications for PC and the iPhone at the moment. It looks like the iPhone application will be the first to market for two reasons: it is a simpler application (which always helps), and there is less hassle to get it out the door. The main reason there is less hassle developing an iPhone app (ignoring Apple's approval process) is that most common things are managed for the developer.

The iPhone App Store

I can't say I'm a big fan of owning hardware that you have little or no control over but I do love both my iPhone and my Xbox 360. These controlled environments provide several benefits to the consumer:

  • People feel safe using their credit cards to buy software. I don't think people are very keen to use their credit card at any old place - they don't want to be ripped off by paying for something that doesn't arrive and they don't want their credit card details stolen for other unauthorized uses.
  • There is no fear of malware. Having all the software up for sale screened by a third party prevents malware from creeping in. Even if it does get in the vendor can release an update to disable it remotely.

There are also many benefits for the developer:

  • No need for a complex installer. This is a huge pain on Windows and Linux (I'm not sure about OSX) and having a simple standard means of deploying the application is great.
  • Distribution handled. The developer doesn't need to find or pay for hosting applications even free ones.
  • Payment handled. The payments from the customer are handled automatically.
  • Licensing handled. There is no need for a custom activating and licensing mechanism.
  • Error reports handled. On Windows there is winqual but it requires a $99 certificate; avoiding that cost requires a custom solution.
  • Single payment. The app store vendor takes a cut of sales so there is no separate cost for a winqual certificate, signing certificate, hosting, etc.

Windows App Store

An app store for Windows would address several key problems for desktop development:

  • Remove the need for custom installers.
  • Automatic updates and a single update notification. It seems like almost everyone does a poor job of updates and on my PC I have separate Adobe, Java, Apple and Google processes checking for updates.
  • Distribution. No need to find hosting.
  • Unified "feel" for applications. Consumers would have a single spot they felt safe using their credit card and knowing they weren't downloading malware. A great example of this on Windows is Steam.
  • Mandated quality standards. This would hopefully move towards removing crapplets on PCs.
  • An easy and safe way to make payments.

Microsoft already have "Windows Marketplace", which apparently supports third-party titles; however, a quick look at the site fails to show any such applications. I think at a minimum they should re-brand whatever app store they make, perhaps using Bing. "Bing App Store"?

Perhaps Valve will move into distributing applications as well as games...

Wednesday, September 30, 2009

Secure Updates?

Currently I'm looking into secure automatic updates for a .net program I'm developing. I've asked on stackoverflow for people to review my approach to the update process here. In the process I've come across some interesting articles on automated updates and I thought I'd review how other applications go about it.

Java

<java-update-map version="1.0">
<mapping>
<version>1.6.0-rc-b98</version>
<url>http://javadl-esd.sun.com/update/1.6.0/au-descriptor-1.6.0_15-b71.xml</url>
</mapping>
<mapping>
<version>1.6.0</version>
<url>http://javadl-esd.sun.com/update/1.6.0/au-descriptor-1.6.0_15-b71.xml</url>
</mapping>

...

</java-update-map>

<java-update>

<information version="1.0" xml:lang="en">
<caption>Java Update - Update Available</caption>
<title>Java Update Available</title>
<description>Java 6 Update 15 is ready to install. Click the Install button to update Java now. If you wish to update Java later, click the Later button. To get a FREE copy of OpenOffice.org, the global standard in free, Microsoft compatible office productivity software, just click the More Information link below.</description>
<moreinfo>http://java.com/infourl</moreinfo>
<AlertTitle>Java Update Available</AlertTitle>
<AlertText>A new version of Java is ready to be installed.</AlertText>
<moreinfotxt>More information...</moreinfotxt>
<url>http://javadl-alt.sun.com/u/ESD6/JSCDL/jre/6u15-b71/jre/jre-6u15-windows-i586-iftw.exe</url>
<version>1.6.0_15-b03</version>
<post-status>https://sjremetrics.java.com/b/ss//6</post-status>
<cntry-lookup>http://jal.sun.com/webapps/installstat/CountryLookup</cntry-lookup>
<predownload></predownload>
<options>/installmethod=jau SP1OFF=1 SP2OFF=1 SP3OFF=1 SP5OFF=1 SP6OFF=1 SP7OFF=1 SP8OFF=1 SP10OFF=1 MSDIR=ms4 NEWMSTB=1 SPWEB=http://javadl-esd.sun.com/update/1.6.0/sp-1.6.0_15-b71</options>
<urlinfo>6068ce6c957932593d20059bebab0dfc8b056ac3</urlinfo>
</information>

...

</java-update>


Paint.net


; 3.xx manifest

DownloadPageUrl=http://www.getpaint.net/download.html

StableVersions=3.36.3158.38068
BetaVersions=3.50.3550.40197

3.36.3158.38068_Name=Paint.NET v3.36
3.36.3158.38068_NetFxVersion=2.0.50727
3.36.3158.38068_InfoUrl=http://www.getpaint.net/roadmap.html#v3_0
3.36.3158.38068_ZipUrlList=http://www.getpaint.net/updates/zip/Paint.NET.3.36.zip
3.36.3158.38068_FullZipUrlList=http://www.getpaint.net/updates/zip/Paint.NET.3.36.zip

3.50.3550.40197_Name=Paint.NET v3.5 Beta 1 (Build 3550)
3.50.3550.40197_NetFxVersion=3.5.1
3.50.3550.40197_InfoUrl=http://paintdotnet.forumer.com/viewtopic.php?f=46&t=31684
3.50.3550.40197_ZipUrlList=http://www.getpaint.net/files/zip/preview/Paint.NET.3.5.Beta.3550.Update.zip,http://www.dotpdn.com/files/Paint.NET.3.5.Beta.3550.Update.zip
3.50.3550.40197_FullZipUrlList=http://www.getpaint.net/files/zip/preview/Paint.NET.3.5.Beta.3550.Install.zip,http://www.dotpdn.com/files/Paint.NET.3.5.Beta.3550.Install.zip


  • This manifest file and the associated binaries don't appear to be signed in any way. Update: Rick Brewster has commented to say that the downloaded binary is signed and the signature is verified.
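
The manifest above is a plain key=value file, so reading it takes very little code. Here is a minimal sketch in Java, assuming ';' starts a comment and that each line splits on its first '=' (the InfoUrl values contain '=' themselves). The UpdateManifest class and its method names are made up for illustration; this is not Paint.NET's actual updater code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: parses a key=value update manifest like the one above. */
final class UpdateManifest {
    private final Map<String, String> entries = new LinkedHashMap<>();

    static UpdateManifest parse(String text) {
        UpdateManifest manifest = new UpdateManifest();
        for (String rawLine : text.split("\\r?\\n")) {
            String line = rawLine.trim();
            // Skip blank lines and ';' comments
            if (line.isEmpty() || line.startsWith(";")) {
                continue;
            }
            // Split on the first '=' only: values such as InfoUrl may contain '='
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // not a key=value line; ignored in this sketch
            }
            manifest.entries.put(line.substring(0, eq), line.substring(eq + 1));
        }
        return manifest;
    }

    /** Returns the raw value for a key, or null if the key is absent. */
    String get(String key) {
        return entries.get(key);
    }

    /** Returns the comma-separated mirror URLs for a version, or an empty list. */
    List<String> zipUrls(String version) {
        List<String> urls = new ArrayList<>();
        String value = entries.get(version + "_ZipUrlList");
        if (value != null) {
            for (String url : value.split(",")) {
                urls.add(url.trim());
            }
        }
        return urls;
    }
}
```

With the manifest text loaded into a string, UpdateManifest.parse(text).get("StableVersions") gives the current stable version, and zipUrls(version) gives the list of mirrors to try for that version.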

Skype


4.1.0.166


Other



  • iTunes posts a big blob of data back to Apple on startup checking for updates - I didn't look into this much.
  • Google Chrome also posts back information when checking for updates. My version of Chrome was up to date so I didn't see the update process in action.
  • Firefox was already in the process of downloading a new version so I had already missed the file download negotiation.

Summary


It seems like everyone is doing automatic updates differently (no surprises there). It also looks like there is plenty of scope for man-in-the-middle and spoofing attacks if the downloaded binaries aren't signed or don't have their signatures verified. It doesn't seem like many people are checking their manifest files before downloading binaries, which could lead to Safari-style "carpet bombing" where malicious binaries are downloaded onto the system.
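
One cheap check an updater can make is to compare the downloaded bytes against a digest published in the manifest. Here is a minimal sketch in Java, assuming the manifest carries a SHA-256 digest per binary (I haven't confirmed that any of the manifests above do). DownloadVerifier and its methods are hypothetical names. A digest only helps if the manifest itself can be trusted, for example because it was fetched over HTTPS or is signed.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Hypothetical helper: checks a downloaded binary against an expected digest. */
final class DownloadVerifier {

    /** Returns the SHA-256 digest of the data as lower-case hex. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** True if the downloaded bytes hash to the digest listed in the manifest. */
    static boolean matches(byte[] downloaded, String expectedHex) {
        byte[] actual = sha256Hex(downloaded).getBytes(StandardCharsets.US_ASCII);
        byte[] expected = expectedHex.trim().toLowerCase(Locale.ROOT)
                .getBytes(StandardCharsets.US_ASCII);
        // MessageDigest.isEqual compares without bailing out at the first difference
        return MessageDigest.isEqual(actual, expected);
    }
}
```

The updater would call matches(bytes, digestFromManifest) after the download completes and refuse to run the installer if it returns false. This does not replace a code signature; it only ties the binary to the manifest.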

Tuesday, September 29, 2009

Roomba Review

Last Christmas we got a Roomba 530 (you can see a similar model here). A Roomba is a robotic vacuum cleaner; however, it mostly collects rubbish (dirt, garbage, hair, etc) using its system of brushes. Since it is battery powered it doesn't have a very strong vacuum action. Having had Roomba in our house for almost a year I thought I'd give a quick review of its pros and cons.

Firstly before I go into the details of the actual Roomba unit there are a few important background details:

  • I live in a small unit that has all hardwood floors.
  • I also have two cats who generate a lot of cat hair.

This combination is almost perfect for an automated cleaner: small amounts of dust and cat hair need cleaning every few days, there is no carpet to soak up dirt.

Pros

Roomba is awesome!!!11

It is quite cool to have an automated robot to go and clean your floors while you sit back and watch. And it is very mesmerising at first the way Roomba negotiates chair legs, people and other obstacles. It's a cliché but I do wonder if the next generation will even know what a "manual" vacuum cleaner is.

Roomba is small

In a small unit space is important, Roomba takes up a fraction of the space of a traditional vacuum cleaner.

Roomba cleans where you can't

To paraphrase the iRobot promotional video, 'Roomba cleans under stuff'. I am constantly surprised at how much gunk it is able to find underneath furniture like our couch (out of sight out of mind I guess).

Fire and forget

You can put Roomba on and go out; it will happily clean and re-dock itself once it's done. However in practice you need to clear the floor of cables, etc. Otherwise you will come back to find Roomba has eaten the cord of your curtains and is hopelessly stuck.

Quiet

Roomba is noisy, but compared to a regular vacuum cleaner it is very quiet. My cats would run away when we started to put together the old vacuum cleaner but with Roomba they are happy to hang around and watch it with disdain.

Cons

Maintenance, maintenance, maintenance

The main problem with Roomba is that while it cleans your house without much supervision you need to clean it. After every clean you need to empty its pathetically small bin and after most cleans you need to clean the main brushes. Every three or four cleans I also have to remove hair from its wheels and from the bearings of the main brushes. Occasionally you have to unscrew the 'flicking' brush to remove some long string that it's eaten. Finally after about 8 months I noticed that Roomba was no longer driving around in a straight line and I had to clean its sensors with a cotton bud.

 
Hair on the brushes

Hair in the bearings

Hair cleaned off the brushes

Gunk from the sensors

Fire and forget?

Fire and forget is sometimes like fire and get stuck: unless you prepare your floor by removing any big particles (or are insanely neat) Roomba will eat a rubber band, pen, string, cable or similar and get its main brushes stuck. Usually I put Roomba on and then start to clear things that it might get caught on.

Dark furniture

Roomba uses several sensors to work out if it is going to bump into things before it actually triggers its big bumper. Unfortunately these sensors don't pick up dark furniture very well and as a result Roomba bangs into things (quite loudly). Short of putting reflector strips at Roomba height you can always move the furniture or just ignore it.

Summary

Roomba is awesome and I love it but it does require maintenance. Having seen it in action I feel that it works best on hard floors but it seemed to work well on carpet when I tried it at my parents' place. The main advantages are being able to put it on and leave it and also that it cleans under furniture. It does have some annoying quirks but nothing that bothers me too much.

Tuesday, August 18, 2009

iPhone Reaching Critical Mass

It seems to me that the iPhone is beginning to reach critical mass. And by critical mass I mean that the iPhone is reaching the status of the iPod or Nokia 5110: everyone has one. Part of my reasoning is that everyone I know seems to be talking about their iPhone or planning to get one. (Of course my sample size is very small and probably biased, but hey).

Travelling to work and back home every day it seems like heaps of people already have them. Since it is becoming more popular it also seems to be becoming less flashy and pretentious. Admittedly I live and work in inner Sydney so the sample set may be biased again.

And what alternative is there? In the same way that PC makers are struggling to design and build PCs that match Apple's sleek designs no one seems to be making a viable iPhone competitor. Also the AppStore is a very attractive place to put an application since consumers are quite comfortable using it to purchase software. Which in turn means all the developers are looking at developing for the iPhone.

It all adds up to one thing: critical mass and unfortunately lock-in. And the scary thing is that everyone seems comfortable that Apple won't abuse its position and become an evil monopoly.

Microsoft

Since it seems like Microsoft has lost the lead in the mobile space I think the only way out of the Apple-opoly is to bring their awesome set of developer tools to the iPhone. The lure of Visual Studio and C# / VB.net shouldn't be underestimated. If developers could easily make cross platform applications that ran on both the iPhone and Windows Mobile/Blackberry/Android it would be a big step towards breaking that lock-in.

Who's working towards bringing C# and the CLR to the iPhone? Novell.

It seems like a strange move but I'm very keen to try it out...

Saturday, June 20, 2009

TWAIN Dot Net

I just released the first version of TwainDotNet, a TWAIN library for the .net framework. It is an open source, MIT licensed work built on top of Thomas Scheidegger's article from The Code Project: .NET TWAIN image scanner.

I had two main reasons for starting the project on code.google.com. Firstly, code attached to an article provides no way for people to collaborate. There is no issue tracking, no place to feed back changes, etc. Secondly, I wanted to change the code to work with WPF and feed those changes back to everyone else!

So anyway the code is now up there and an initial binary available for download. I've picked the MIT license since it seems to fit the spirit of Thomas's original public domain dedication but with limitation of liability. And hopefully now there will be a bit more collaboration! :)

 

On a side note I've tried out Mercurial for the first time. The Python-based TortoiseHg is not quite as polished as TortoiseSvn but it is still very easy to use. I will seriously be thinking about moving my other Subversion projects over to Mercurial at a point in the future...

Monday, June 1, 2009

Exchange Wedding Anniversary?

I saw this in the Microsoft "Exchange Server Protocols Master Property List Specification":

2.932 PidTagWeddingAnniversary

Canonical name: PidTagWeddingAnniversary

Property ID: 0x3A41

Data type: PtypTime, 0x0040

Area: MapiMailUser

References: [MS-OXOABK], [MS-OXOCNTC]

Alternate names: PR_WEDDING_ANNIVERSARY

It's quite funny how much detail you can put onto a contact in Exchange!

June Links - ASP.net Deployment & Misc

ASP.net

HTML Sanitisation

Friday, May 22, 2009

NSsh Progress

I've made some more progress with NSsh recently. Password based authentication is now basically complete. The next hurdle is getting public key authentication working - which is not only extremely useful but required by the SSH specification.

Surprisingly the main hurdle is not the task of validating the key that the user sends over for authentication (there are plenty of open source examples for that). Instead the Windows API is causing problems, specifically creating a process for the user.

To create a process for a user a token is required. This token can easily be created using Windows API calls that take a password. However without a password the alternatives are less attractive. If the machine is on a domain it is possible to perform Kerberos authentication using a constructor on the WindowsIdentity class. But, the only other alternative is to use the CreateToken / NtCreateToken functions. As one person puts it:

"It's possible, although it requires you to do a lot of code."

Hmmm, that doesn't sound good.

Export Processing

Here is a screen shot of Process Explorer's system information dialog while my export processing code is running. The glitches in the IO graph are the write-cache thread flushing its data in the background. Anyway I think it looks nice :-)

export

Tuesday, April 14, 2009

Smart Pointers != Managed Automatic Memory Management

I posted an answer on a stackoverflow question about C++ and garbage collection. My answer wasn't ideal and caused a bit of discussion in the comments so I thought I would just sum things up here.

Manual (Traditional) Memory Management

Manual memory management places the onus of memory management on the programmer. It is the style of memory management available to C and C++ developers. Here are some of the attributes of manual memory management:

  • Slow free-list allocation. To allocate memory in this style of memory management the runtime library consults a free list to find a free block. When allocating lots of blocks this can be slower than bump-pointer allocation.
  • No locality benefits. Blocks allocated sequentially won't usually be allocated next to each other. Later when some blocks are freed the remaining blocks are fixed in place and can't be moved.
  • Data is harder to share between modules. Module writers must manually decide who is responsible for freeing data and when it can be freed.
  • Invalid and dangling pointers are possible. Blocks can be freed multiple times. Every allocation must be checked. Etc, etc, etc..

Automatic Dynamic Memory Management

This is usually referred to as garbage collection and is the type of memory management used in Java, C#, Python, Ruby, etc, etc. Here are some memory management strategies possible with automatic GC:

  • Fast bump pointer allocation. Objects can be allocated using a bump pointer and then promoted to a free-list managed space only if they live. Alternatively a copying or compacting GC may always allocate with a bump pointer.
  • Locality benefits. Since objects can be moved and references updated objects can be allocated and later moved to improve locality. This results in better CPU cache usage.
  • No pointer problems, easy to share data between modules, etc, etc.
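
To make "bump pointer allocation" concrete, here is a toy sketch in Java. It hands out offsets into a byte array rather than real addresses, and the BumpArena name, the 8-byte alignment and the -1 "full" result are all assumptions made for the example. A real collector would evacuate or compact the space when it fills up rather than just failing.

```java
/** Toy bump-pointer arena: allocation is a bounds check plus an addition. */
final class BumpArena {
    private final byte[] space;
    private int top; // offset of the next free byte

    BumpArena(int capacity) {
        space = new byte[capacity];
    }

    /** Returns the offset of a new block, or -1 if the arena is full. */
    int allocate(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // Round up to a multiple of 8 (done in long arithmetic to avoid overflow)
        long aligned = (size + 7L) & ~7L;
        if (aligned > space.length - top) {
            return -1; // a real GC would collect or grow the space here
        }
        int offset = top;
        top += (int) aligned; // the "bump"
        return offset;
    }

    /** Number of bytes handed out so far, including alignment padding. */
    int used() {
        return top;
    }

    /** Frees everything at once, as happens after a nursery is evacuated. */
    void reset() {
        top = 0;
    }
}
```

There is no free list to search, and blocks allocated one after another end up next to each other in memory, which is where the locality benefit comes from.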

Smart Pointers

Some people argue that the answer to the problems of manual memory management is using smart pointers. They are certainly better than using nothing however they only provide the most primitive implementation of reference counting:

  • Unbounded cost on decrement. If the root of a large data structure is decremented to zero there is an unbounded cost to free all the data. (I personally find this property amusing since this problem is often an argument against GC)
  • Manual cycle collection. To prevent cyclic data structures from leaking memory the programmer must manually break any potential structures by replacing part of the cycle with a weak smart pointer. This is simply another source of potential defects.
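
Both problems show up even in a toy model. The sketch below simulates naive reference counting in Java. RcNode is a hypothetical class that only models what a C++ smart pointer does and is not taken from any real library. Releasing the head of a chain frees every node in it in one go, two nodes that point at each other never reach zero, and swapping one link for a weak one fixes the leak.

```java
/** Toy model of a naively reference-counted object with one outgoing link. */
final class RcNode {
    private int count = 1;       // the creator holds the first reference
    private RcNode strongChild;  // contributes to the child's count
    private RcNode weakChild;    // does not contribute to the child's count
    private boolean freed;

    /** Points the strong link at child, retaining it and releasing any old child. */
    void setStrong(RcNode child) {
        child.count++;           // retain the new child first
        RcNode old = strongChild;
        strongChild = child;
        if (old != null) {
            old.release();
        }
    }

    /** Points the weak link at child without touching its count. */
    void setWeak(RcNode child) {
        weakChild = child;
    }

    /** Drops one reference; frees this node and cascades while counts hit zero. */
    void release() {
        RcNode node = this;
        // One decrement can free an arbitrarily long chain: the unbounded cost
        while (node != null && --node.count == 0) {
            node.freed = true;
            RcNode child = node.strongChild;
            node.strongChild = null;
            node.weakChild = null;
            node = child; // dropping the freed node's link may free the child too
        }
    }

    boolean isFreed() {
        return freed;
    }
}
```

For a chain a -> b -> c the call a.release() frees all three nodes once nothing else holds them. For x -> y -> x, releasing both external references leaves each count at one, so neither node is ever freed unless one of the links is made weak.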

First Class Reference Counting

Reference counting garbage collection has several advantages over smart pointers (beyond the base GC advantages).

  • Changes to an object reference count are ignored for stack and register references. Instead when a GC is triggered these objects are retained by collecting a root set.
  • Changes to the reference count can be deferred and processed in batches. This results in higher throughput.
  • It is possible to coalesce changes to the reference count. This makes it possible to ignore most changes to an object's reference count, improving RC performance for frequently mutated references.
  • Incremental cycle detection cost. It is possible to process cycle detection tasks in bounded chunks at GC time.
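
As a rough illustration of deferring and coalescing, the sketch below buffers count changes and applies only the net change per object when flush() is called. This is a simplified stand-in that I've made up for the example (DeferredRc is a hypothetical name). Real coalescing collectors log which references were mutated rather than keeping per-object deltas, but the effect shown is the same: work that cancels out between collections is never done.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy deferred reference counter: changes are buffered and netted out per object. */
final class DeferredRc {
    private final Map<Object, Integer> counts = new IdentityHashMap<>();
    private final Map<Object, Integer> pending = new IdentityHashMap<>();
    private int appliedUpdates;

    /** Buffers an increment; the real count is untouched until flush(). */
    void increment(Object o) {
        pending.merge(o, 1, Integer::sum);
    }

    /** Buffers a decrement; the real count is untouched until flush(). */
    void decrement(Object o) {
        pending.merge(o, -1, Integer::sum);
    }

    /** Applies the buffered net changes; returns how many objects reached zero. */
    int flush() {
        int dead = 0;
        for (Map.Entry<Object, Integer> entry : pending.entrySet()) {
            int delta = entry.getValue();
            if (delta == 0) {
                continue; // increments and decrements cancelled out: no work
            }
            appliedUpdates++;
            int updated = counts.merge(entry.getKey(), delta, Integer::sum);
            if (updated < 0) {
                throw new IllegalStateException("reference count went negative");
            }
            if (updated == 0) {
                counts.remove(entry.getKey());
                dead++; // a real collector would reclaim the object here
            }
        }
        pending.clear();
        return dead;
    }

    /** Current applied count for an object (buffered changes not included). */
    int count(Object o) {
        return counts.getOrDefault(o, 0);
    }

    /** Number of count updates actually applied across all flushes. */
    int appliedUpdates() {
        return appliedUpdates;
    }
}
```

Five buffered changes to one object cost a single update at flush time, and a reference that is created and dropped between two flushes costs nothing at all.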

Summary

Automatic memory management has several technical advantages over manual memory management. When you factor in the reduced cost of development and maintenance when using garbage collection you should be in front.

For details of a high performance reference counting GC implementation look at this paper.

Wednesday, March 25, 2009

Net Censorship in Australia

Currently the planned Australian Internet filter and the Australian Communications and Media Authority (ACMA) are causing quite a bit of controversy. The ACMA's Internet blacklist was leaked and there were some disturbing entries on the list. Among the entries were a Queensland dentist's website and a tour operator, and after disclosing the contents, wikileaks has been added to the list.

Transparency

There are two problems with the current blacklist as it stands. Firstly, a site that links to a blacklisted or illicit site is also blacklisted. The prime example of this is wikileaks.org which was blacklisted after disclosing the ACMA's blacklist. Media outlets have also been threatened with an $11,000 a day fine for linking to banned sites. This is insanity:

  • What if you link to a site then at a later date it posts offending material?
  • What if a site is placed onto the blacklist after you've linked to it?
  • What if a user posts a black listed link onto a forum you control?
  • The link itself doesn't contain any illicit material.

It seems like a violation of free speech to prevent someone from discussing illicit material, so how is linking to it any different? Of course the linking issue is the key issue being played out in music sharing cases such as the current trial for "The Pirate Bay".

Secondly, the list is secret and thus there is no accountability. The article on ComputerWorld claims that the decision to add a site to the list can be made by a single bureaucrat. Without the list being disclosed, how can there be any accountability for the government? They may argue that disclosing the list will provide a central point for people to look for the offending illicit material (child pornography, etc). However:

  • People know how to get to illicit material anyway. Pedophile networks can be extremely sophisticated and secure.
  • If the government's filter is put in place all those sites will be blocked anyway.

Leave it to the Experts

There is already a lot of evidence that the Internet filtering plan will not work out (including ACMA's own reports). Instead of entering into an unaccountable Internet censorship setup, I feel that we should direct the funding into our state and federal police bodies. They are already doing a great job but could use the extra support. In NSW the police have just recently been given a boost to their online powers but only with a warrant - already much more accountable than a secret list.

Further Reading

Tuesday, March 17, 2009

Interesting Links

Sunday, March 8, 2009

Java Exceptions

Checked exceptions... they cause me pain. At first glance they seem to provide a way for design by contract to include exceptions. However they only increase complexity and have limited usefulness for design by contract.

SNR

Firstly, they reduce the "signal-to-noise ratio" of the code. The SNR for code determines how clear your code is (someone else coined that term - but try Google for it!). Anders Hejlsberg also talks about imperative vs declarative programming, which is a similar concept. Anyway consider the following code snippets:

Update UI from non UI-thread in Java:

try
{
    // Run the update code on the Swing thread
    SwingUtilities.invokeAndWait(new Runnable()
    {
        @Override
        public void run()
        {
            try
            {
                // Update UI value from the file system data
                FileUtility f = new FileUtility();
                uiComponent.setValue(f.readSomething());
            }
            catch (IOException e)
            {
                throw new IllegalStateException("Error performing file operation.", e);
            }
        }
    });
}
catch (InterruptedException ex)
{
    throw new IllegalStateException("Interrupted updating UI", ex);
}
catch (InvocationTargetException ex)
{
    throw new IllegalStateException("Invocation target exception updating UI", ex);
}

Update UI from non UI-thread in C#:

private void UpdateValue()
{
    // Ensure the update happens on the UI thread
    if (InvokeRequired)
    {
        Invoke(new MethodInvoker(UpdateValue));
    }
    else
    {
        // Update UI value from the file system data
        FileUtility f = new FileUtility();
        uiComponent.Value = f.ReadSomething();
    }
}

The UI threading code in the C# example still adds some extra noise so I like to use Postsharp to reduce it down to this:

[UIMethod]
private void UpdateValue()
{
    // Update UI value from the file system data
    FileUtility f = new FileUtility();
    uiComponent.Value = f.ReadSomething();
}

Which seems a lot clearer to me. When you start to do more and more UI work in Swing checked exceptions start to become really annoying and useless.


Jail Break


When implementing even the most basic of interfaces, such as Java's List, checked exceptions as a tool for design by contract fall down. Consider a list that is backed by a database or a filesystem or any other implementation that throws a checked exception. The only possible implementation is to catch the checked exception and rethrow it as an unchecked exception:

public void clear()
{
    try
    {
        backingImplementation.clear();
    }
    catch (CheckedBackingImplException ex)
    {
        throw new IllegalStateException("Error clearing underlying list.", ex);
    }
}

And now you have to ask what is the point of all that code? The checked exceptions just add noise, the exception has been caught but not handled and design by contract (in terms of checked exceptions) has broken down.


Service, s'il vous plaît?


When the cost of checked exceptions starts to kick in (wrapping in unchecked exceptions, millions of checked exceptions) you can see code like this in response:

try
{
    // Perform some action
}
catch (Exception e)
{
    log.warn("Unexpected exception processing action", e);
}

The problem with this code is well documented in the .net world. Basically if you can't handle the exception then step aside.
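
By contrast, "stepping aside" means catching only the failure you can actually recover from and letting everything else propagate. Here is a minimal sketch in Java. PortParser, DEFAULT_PORT and the fallback behaviour are hypothetical choices made for the example.

```java
/** Hypothetical example: handle the one failure we can recover from, nothing more. */
final class PortParser {
    static final int DEFAULT_PORT = 8080;

    /** Parses a port number, falling back to the default for malformed text. */
    static int parsePort(String text) {
        if (text == null) {
            // A programming error in the caller: not ours to paper over
            throw new IllegalArgumentException("text must not be null");
        }
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            // Recoverable here: malformed input falls back to a documented default
            return DEFAULT_PORT;
        }
        // No catch (Exception e): anything unexpected travels up to a caller
        // that knows what to do with it
    }
}
```

The narrow catch documents exactly which failure the method expects, and an unexpected exception still arrives with its original stack trace intact.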


Summary


In non-trivial situations all of the above problems compound making debugging and maintaining code harder. They add noise to the code and the usefulness as a design by contract mechanism is questionable.


Really, in Java the divide between checked exceptions and runtime exceptions should have been a red flag: some exceptions are unexpected. Exceptions are for exceptional circumstances and you don't always want to deal with exceptional circumstances at all levels in your code.

Tuesday, March 3, 2009

Java Suppress Finalizer

A colleague pointed me to Dr Heinz M. Kabutz's article on Java finalizers. The article compares the performance of objects that have a trivial (empty) finalizer vs those that have a very simple finalizer. The basic outcome is that the running time for the non-trivial case is orders of magnitude larger than the trivial case.

I was quite shocked when I saw this since the simple finalizer given in the example is similar to what is proposed in the .net IDisposable pattern. I was thinking "Holy Smoke, Batman! How can it really be that bad??". The answer is in Jack Shirazi's article: basically the finalizer causes the object to live through the nursery collection and thus puts much more pressure on the mature space collector.

.net?

I decided to implement the test in C# quickly and see how it performed. Here is the code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    public class ConditionalFinalizer
    {
        private static readonly bool DEBUG = false;

        // Should be volatile as it is accessed from multiple threads.
        // Thanks to Anton Muhin for pointing that out.
        private volatile bool resourceClosed;
        private readonly int id;

        public ConditionalFinalizer(int id)
        {
            this.id = id;
            resourceClosed = false;
        }

        ~ConditionalFinalizer()
        {
            if (DEBUG)
            {
                if (!resourceClosed)
                {
                    Console.Error.WriteLine("You forgot to close the resource with id " + id);
                }
                resourceClosed = true;
            }
        }

        public void close()
        {
            resourceClosed = true;
            GC.SuppressFinalize(this);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // Allow the JIT to warm up
            for (int i = 0; i < 10 * 1000 * 1000; i++)
            {
                ConditionalFinalizer cf = new ConditionalFinalizer(i);
                if (i % (1000 * 1000) != 0)
                {
                    cf.close();
                }
            }

            DateTime start = DateTime.Now;

            for (int i = 0; i < 10 * 1000 * 1000; i++)
            {
                ConditionalFinalizer cf = new ConditionalFinalizer(i);
                if (i % (1000 * 1000) != 0)
                {
                    cf.close();
                }
            }

            TimeSpan time = DateTime.Now - start;
            Console.WriteLine("time = " + time.TotalMilliseconds);

            Console.ReadKey();
        }
    }
}

The most important change is the added call to "GC.SuppressFinalize". This test case is the specific reason for this call - basically it removes the object from the finalization queue thus allowing it to be collected in the nursery collection. I also added an initial loop to allow the JIT to compile the classes in use. Here are the results:

DotnetPerformance


Calling "GC.SuppressFinalize" dramatically improves the performance for .net and my initial results line up with Dr Kabutz's. Java really does a neat optimisation for the trivial finalizer case! Unfortunately it also gets smashed in the non-trivial case...


Java


Dr Kabutz's code for showing the classes pending finalization goes 90% of the way towards providing something similar to "GC.SuppressFinalize", I decided to add that last little bit:

import java.lang.ref.Reference;
import java.lang.reflect.Field;

public class FinalizeHelper {

    static FinalizeHelper finalizeHelper = new FinalizeHelper();

    private final Class<?> finalizerClazz;
    private final Object lock;
    private final Field unfinalizedField;
    private final Field nextField;
    private final Field prevField;
    private final Field referentField;

    public FinalizeHelper() {
        try {
            finalizerClazz = Class.forName("java.lang.ref.Finalizer");

            // we need to lock on this field to avoid race conditions
            Field lockField = finalizerClazz.getDeclaredField("lock");
            lockField.setAccessible(true);
            lock = lockField.get(null);

            // the start into the linked list of finalizers
            unfinalizedField = finalizerClazz.getDeclaredField("unfinalized");
            unfinalizedField.setAccessible(true);

            // the next element in the linked list
            nextField = finalizerClazz.getDeclaredField("next");
            nextField.setAccessible(true);

            // the prev element in the linked list
            prevField = finalizerClazz.getDeclaredField("prev");
            prevField.setAccessible(true);

            // the object that the finalizer is defined on
            referentField = Reference.class.getDeclaredField("referent");
            referentField.setAccessible(true);

        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("Could not create FinalizeHelper", e);
        }
    }

    private void suppress(Object instance) {
        try {
            synchronized (lock) {
                // Get the start of the un-finalized list
                Object current = unfinalizedField.get(null);
                Object previous = null;

                while (current != null) {
                    Object value = referentField.get(current);

                    // Check if this entry refers to the instance we are interested in
                    if (value == instance) {
                        // Unlink the current entry from the queue
                        Object next = nextField.get(current);
                        if (previous == null) {
                            unfinalizedField.set(null, next);
                        } else {
                            nextField.set(previous, next);
                        }
                        // The entry may be the last in the list, in which case
                        // there is no following entry whose prev link needs fixing
                        if (next != null) {
                            prevField.set(next, previous);
                        }
                        break;
                    }

                    // Move to the next entry
                    previous = current;
                    current = nextField.get(current);
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void suppressFinalize(Object instance) {
        finalizeHelper.suppress(instance);
    }
}

Here are the updated test files:

public class ConditionalFinalizer {
    private static final boolean DEBUG = false;

    // Should be volatile as it is accessed from multiple threads.
    // Thanks to Anton Muhin for pointing that out.
    private volatile boolean resourceClosed;
    private final int id;

    public ConditionalFinalizer(int id) {
        this.id = id;
        resourceClosed = false;
    }

    protected void finalize() throws Throwable {
        if (DEBUG) {
            if (!resourceClosed) {
                System.err.println("You forgot to close the resource with id " + id);
            }
            resourceClosed = true;
            super.finalize();
        }
    }

    public void close() {
        resourceClosed = true;
        FinalizeHelper.suppressFinalize(this);
    }
}

and:

public class ConditionalFinalizerTest1 {
    public static void main(String[] args) throws InterruptedException {
        long time = System.currentTimeMillis();
        for (int i = 0; i < 10 * 1000 * 1000; i++) {
            ConditionalFinalizer cf = new ConditionalFinalizer(i);
            if (i % (1000 * 1000) != 0) {
                cf.close();
            }
        }
        time = System.currentTimeMillis() - time;
        System.out.println("time = " + time);
    }
}

And here are the updated results:


DotnetPerformance2


Suppressing the finalizer for those objects significantly increases the performance of the test. The difference in performance between .net and Java is probably because I'm using reflection to scan the finalizer list in the Java case.


Summary


There is a definite benefit to suppressing finalizers for objects that get manually disposed; it's just a pity that there is no native "GC.SuppressFinalize" method for Java. The IDisposable pattern is very useful for situations where you want to be able to:



  • Manually dispose of an item when you know it is no longer needed.
  • Run cleanup code to safely bring down connections, close files, etc.
  • Rely on the finalizer to catch situations where it is not known when an object should be cleaned up, or to catch mistakes that mean the cleanup code is never explicitly called.
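To make those three points concrete, here is a minimal sketch of the pattern in plain Java. The DisposableResource class and its messages are made up for illustration, and it leaves out the FinalizeHelper.suppressFinalize call from the earlier listing so that it stands on its own. Note that finalize is deprecated in newer Java versions, so treat this as a sketch of the idea rather than a recommendation.

```java
// Hypothetical example class; the name and messages are illustrative only.
class DisposableResource {
    // Volatile because the finalizer runs on a different thread from close().
    private volatile boolean closed;

    // Explicit cleanup; safe to call more than once.
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        // Bring down connections, close files, etc. here.
    }

    public boolean isClosed() {
        return closed;
    }

    // Safety net only: runs at some unspecified time, if at all.
    @Override
    protected void finalize() throws Throwable {
        try {
            if (!closed) {
                System.err.println("Resource was never closed explicitly");
                close();
            }
        } finally {
            super.finalize();
        }
    }

    public static void main(String[] args) {
        DisposableResource resource = new DisposableResource();
        try {
            // Use the resource here.
        } finally {
            // Manual disposal as soon as the resource is no longer needed.
            resource.close();
        }
        System.out.println("closed = " + resource.isClosed());
    }
}
```

The try/finally in main plays the role that a using block plays in C#: the cleanup runs even if the body throws, and the finalizer only has work to do when somebody forgets.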

Monday, February 2, 2009

Joel on SOLID

After listening to Jeff and Joel's Stack Overflow podcast today I felt like Joel had slightly missed the point of the SOLID principles. I can see how it is easy to miss the point of SOLID - an acronym of acronyms! But for me "SOLID" adds to design patterns and anti-patterns because it helps me to describe, more formally, when some code doesn't seem right - when code has a "code smell" to be a bit of a hippy about it.

Joel goes on to say that he doesn't think the principles apply well to real-life situations; however, I think they apply perfectly. They are all about taking a pragmatic but disciplined approach. Take the current work I'm doing as an example: I was asked to change our product's export function and make it parallel where possible. To start with it was a single class with a few thousand lines of code. It had a "unit" test to go with it that covers about 70% of the entire code base when run and takes about five minutes to complete. To even think about how it might be made parallel I had to do some serious refactoring.

Anyway here is my interpretation of the principles and how I used them when refactoring our export code:

  • Single responsibility principle: Read "keep it simple stupid". We literally had an Export class that did the whole export. It exported the raw data, PDF-ed if required, TIFF-ed if required, etc, etc. I changed the code so that the export class creates a WorkQueue class that determines the items to process; processors for processing items at each step; a report writer; and much more. Since parts of the system were no longer tightly coupled I could add tests for each part which ran faster and were more maintainable. Here is a simplified before and after:
    Classes before and after.
  • Open/closed principle: extending a module shouldn't break stuff. To be honest I don't consider this principle much during my day to day work; perhaps that is due to the type of product I'm working on.
  • Liskov substitution principle: have a good reason for using a base class. I feel that this is strongly tied in with the SRP. For example our Exporter class inherited from BaseExporter (we have multiple types of export) and common "export" code was placed there. Sometimes a better solution is to place common code in a supporting class instead of inheriting. The acid test for inheritance should be: am I ever going to use the base class independently of the subclasses? The LSP is about avoiding broken inheritance hierarchies. Uncle Bob gives the example of a set of number types which is excellent. He says inheritance is not "is a" and in the numbering example he points out that while integers are a subset of real numbers they have no commonality in terms of storage: an integer might be stored with a single 32 bit value whereas a real number has a mantissa and an exponent.
  • Interface segregation principle: again read "keep it simple stupid" but also "you aren't going to need it". This is simply a principle to avoid creating fat interfaces. The point is to define a service in terms of the clients that are going to use it. This is especially important in Java where events and delegates aren't available. For example if you have an ExportListener interface that has itemProcessedNotification and itemCompletedNotification a client will have to implement both methods even if it only cares about one of them. Fat interfaces reduce the signal to noise ratio: they obscure what you are trying to do.
  • Dependency inversion principle: don't tightly couple things. If everything depends on concrete implementation classes, if there are global static singletons, etc, things will be hard to modify and test. This is especially important in C# where methods are non-virtual by default (in Java you can mock most classes and mock out their non-final methods). In my export code I changed things so that the Exporter class only depended on abstractions such as the ExportProcessor interface. This decoupled the components and made testing easier.
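As a rough illustration of the last two principles, here is a sketch of what the listener and processor abstractions might look like. This is not our actual export code: only the names Exporter, ExportProcessor, itemProcessedNotification and itemCompletedNotification come from the description above, while the method signatures, the String item type and the ".pdf" stand-in processor are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// ISP: two narrow listener interfaces instead of one fat ExportListener,
// so a client implements only the notification it cares about.
interface ItemProcessedListener {
    void itemProcessedNotification(String item);
}

interface ItemCompletedListener {
    void itemCompletedNotification(String item);
}

// DIP: the abstraction the exporter depends on. The process method is an
// assumed signature for illustration.
interface ExportProcessor {
    String process(String item);
}

class Exporter {
    private final List<ExportProcessor> processors;
    private final List<ItemCompletedListener> completedListeners = new ArrayList<>();

    // Processors are passed in, so tests can supply simple fakes.
    Exporter(List<ExportProcessor> processors) {
        this.processors = processors;
    }

    void addCompletedListener(ItemCompletedListener listener) {
        completedListeners.add(listener);
    }

    String export(String item) {
        String result = item;
        for (ExportProcessor processor : processors) {
            result = processor.process(result);
        }
        for (ItemCompletedListener listener : completedListeners) {
            listener.itemCompletedNotification(result);
        }
        return result;
    }
}

class ExporterDemo {
    public static void main(String[] args) {
        List<ExportProcessor> processors = new ArrayList<>();
        processors.add(item -> item + ".pdf"); // stand-in for a real PDF step
        Exporter exporter = new Exporter(processors);
        exporter.addCompletedListener(item -> System.out.println("completed " + item));
        exporter.export("report");
    }
}
```

Because Exporter only sees the ExportProcessor interface, a test can hand it a one-line fake instead of a real PDF or TIFF step, and a client that only cares about completed items never has to write an empty itemProcessedNotification method.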