Archive for the ‘Software’ Category.

ICSOC Day 1 Keynote – Services for Science

The 6th International Conference on Service Oriented Computing is on in Sydney this week. NICTA is a sponsor, and I managed to score a registration to attend.  Ian Foster opened with an interesting keynote. (Preceded by a 30 minute delay fussing with Mac technology issues!)  He spoke on “Services for Science” – how SOA is being used to support knowledge creation in science. Currently there’s a surprisingly strong growth of online services providing data and analysis, in astronomy and especially in the biomedical field.  He talked about the caGrid network. Ontologies are key there for meta-data of experimental results – Ian commented that the community is very “neat” (not scruffy) in being explicit and standardised in the representation and organisation of their data.

It’s interesting that for representing scientific workflow they’ve dropped BPEL in favour of the workflow notation and supporting infrastructure in Taverna. The workflows are used not only to coordinate data and analyses, but also to communicate methods and in principle to promote reuse. But the caGrid leaders recognise that it’s hard to design for workflow reuse, and hard to achieve reuse in practice.  Ian also discussed experimental use of functional programming techniques to support provenance – to capture computations as a first class entity for scientific audit, review, and mining. He finished with some discussion of scalability and text mining of research publications.

I think there are interesting analogues of some of the issues now being explored in the e-science domain that have already been thrashed out in software engineering. They are quite similar in some ways – in the two fields of practice at an industrial scale, there are teams of knowledge workers working on complex and partly-shared electronic assets. Large scale reuse and variation has been made methodical in Software Product Line Engineering, and provenance issues are very similar to those that are well known in the established discipline of (Software) Configuration Management.

COAG Invests in a National Electronic Conveyancing System

COAG met on Saturday and decided to invest to implement a national approach to conveyancing – the National Electronic Conveyancing System (NECS).  Currently each of Australia’s eight states and territories has its own different system for dealing with the transfer of real estate.  You might not think that’s a big deal – after all, wherever you are, the house you buy is only going to be in one state!  Why does it matter to have a uniform national system?

At an abstract level from the public’s point of view, when you buy a house, there’s just a buyer, a seller, and a central land registry that maintains the “golden truth” about ownership under the standard Torrens system of title.  It’s a little more complicated than that because mortgages for housing loans are also registered with land registries.  So banks and non-bank lenders are normally involved too.  It’s more complicated again, because there’s a whole raft of other auxiliary entities involved in title exchange, such as title search companies, lawyers, property valuers, and insurers providing related services.  The whole industry (banks, non-bank lenders, and the auxiliary service organisations) operates nationally.  Currently they need to implement and maintain systems to deal with the land registry systems in each of the eight states and territories.  In the past conveyancing has been a manual process, and human processors have been able to deal with the inefficiencies of working with multiple interfaces.

However, access to land registries is starting to move online, to reduce the cost and time of buying real-estate.  When conveyancing becomes automated, there’s a large initial cost borne by everyone in the industry to integrate with the new system(s).  Companies would prefer to pay this initial overhead cost once, not eight times!

NECS is intended to address this problem.  The goal is not to create a single national land registry, but instead to create a single national interface to all of the state and territory land registries. Organisations will be able to integrate with the national interface, and gain access to the land registries in every state.

Our group at NICTA has been working with NECS, looking at issues in the definition and management of business vocabularies, business rules, and business processes.  NICTA’s research philosophy is “use-inspired research” – working on fundamental scientific advances and technology innovations in the context of, and with an understanding of, real-world problems.   The goal is to do research that has more impact, and benefits Australia.  Our work with NECS is an example of all of this.   It’s still early days, but having a deep engagement with conveyancing and e-government has already been important to motivate and direct the research we’re doing.

Computer Science vs Software Engineering

My University education is in Computer Science, but my professional life and renewed research career are in Software Engineering.  A lot of people (and perhaps some University departments!) probably think these are just the same thing, with different names.  But in my transition to Software Engineering I’ve discovered they’re very different, and I think their difference is not all down to the normal arguments about science vs engineering.

In Computer Science, the “unit of analysis” is the procedure (in the sense of effective procedure, but I also mean to include non-terminating processes).  Entities of interest include algorithms and data-structures, interfaces, ADTs, types, and languages for expressing them.

In contrast, in Software Engineering, the unit of analysis is the whole software system.  Here the entities of interest include architectures, and system models. A whole software system is not just “bigger” in size than a single procedure/process.  It also has many more different kinds of functionality, many more developers, and many different users and other stakeholders.

There are a lot of common themes across Computer Science and Software Engineering.  For example, both are concerned with issues such as specification, construction, distribution, performance analysis, and verification.

The challenges for Software Engineering are not just dealing with the scale of the system, but also dealing with the scale of the development of the system. The challenges are not just technical, they’re also socio-technical.  So although Computer Science and Software Engineering both deal with software and have many common themes, their technologies and methodologies are usually quite different because they’re dealing with different kinds of entities in different contexts.

Computer Science and Software Engineering are very different disciplines.

Lessons on Standards from Build Management

I’ve written an article for the latest edition of the CM Journal at CM Crossroads, on Four Lessons about Company Standards and Procedures from Build Management. Writing in a practitioner’s forum is very different to writing academic papers! I’m not sure I’ve completely got the hang of it yet… but it’s fun trying.

CM Best Practices – Two Lists and One of Mine

A colleague forwarded to me a pointer to the current issue of the ITMPI journal, on Best Practices in Configuration Management. They make an interesting contrast with the best CM practices in the November issue of the CM Journal on “What Best Practice Is Best?”, and specifically the article CM: The Next Generation of Top 10 Best Practices.

The CM Journal article’s list is as follows:

  • 1. Use of Change Packages
  • 2. Stream-based Branching Strategy – do not overload branching
  • 3. Status flow for all records with Clear In Box Assignments
  • 4. Data record Owner and Assignee
  • 5. Continuous integration with automated nightly builds from the CM repository
  • 6. Dumb numbering
  • 7. Main branch per release vs Main Trunk
  • 8. Enforce change traceability to Features/Problem Reports
  • 9. Automate administration to remove human error
  • 10. Tailor your user interface closely to your process
  • 11. Org chart integrated with CM tool
  • 12. Change control of requirements
  • 13. Continuous Automation
  • 14. Warm-standby disaster recovery
  • 15. Use Live data CRB/CIB meetings
  • 16. A Problem is not a problem until it’s in the CM repository
  • 17. Use tags and separate variant code into separate files
  • 18a. Separate Problems/Issues/Defects from Activities/Features/Tasks
  • 18b. Separate customer requests from Engineering problems/features
  • 19. Change promotion vs Promotion Branches
  • 20. Separate products for shared code

The ITMPI Journal article lists just 3 best practices:

The ITMPI Journal article doesn’t really cover the practices 6, 9, 10, 14, 17, and 20, partly because the CM journal article practices are at a lower level of detail and cover a few practical/administrative issues outside of the core of CM theory.

Anyway, the practices in both these lists are all “good” for “most” development groups. They aren’t rocket science, but I’d say there are a fair number of development groups out there that barely do any of them. Although version control is ubiquitous in industry, CM is still very poorly understood by most practitioners.

For what it’s worth, I think that for most commercial software development, the most important practice isn’t about sophisticated version control issues or processes – it’s a simple but critical piece of organisational structure and policy.

All Customer Deliveries Go Through the Release Team
Create a release team that’s different to the development team. Ensure all customer software deliveries happen through the release team. Have the most senior manager you can find declare it a firing offense for a developer to ship code or patches directly from their desktop to the customer. Even in “emergency” situations.

More than once I’ve seen or heard of companies that own high-end commercial SCM tools, but make those tools worthless because developers sometimes email patches directly to the customer. (Sometimes in the rush they also forget to check in their changes to version control, which just makes the nightmare worse.) The developers are usually trying to “get the job done”, and be “helpful” to a customer with a problem. Sadly, they end up causing bigger problems for everyone down the line. If you’re not managing releases properly, you don’t know what code the customer has – you degrade your ability to diagnose faults, you won’t be able to ship working patches in future, and you might easily introduce a regression problem by losing the fix itself.

Why Not is not Can Not

Our paper “An Exploratory Study of Why Organizations Do Not Adopt CMMI” has started to be cited in the literature. It’s been in the “hottest 25 downloads” for the journal since it became available 3 quarters ago, so I’m hopeful the number of citations might grow.

On one hand, it’s great to be cited! On the other hand, I worry that the results could be mis-interpreted. Specifically, people might be tempted to mis-read the paper as saying “small companies can’t do CMMI”. We don’t say that. So, I thought I’d try to clarify a few points here…

The main question we do provide some initial evidence for is:

  • When companies decide not to do CMMI, why do they say “no”?

Here are some questions we don’t provide evidence for in the paper:

  • Do most companies do CMMI?
  • Can most companies do CMMI?
  • Do most companies think they can do CMMI?

(These are alternate “complementary research questions” – by saying what my main research question is NOT, I hope you get a better understanding about the boundaries of the main question, and how to interpret the results.)

In looking at the main question, we broke down the companies by size to see if we could get hints about why companies gave the reasons they did. Small companies tended to give different kinds of reasons than medium and large companies. Roughly, small companies said they couldn’t do CMMI, whereas medium and large companies said they shouldn’t.

But obviously some small companies can do CMMI! There are fairly credible reports of it in the literature. So, what’s going on?

What’s going on is that my target population is not “all software companies”. My target population is software companies that have decided not to do CMMI. To start my research, I first exclude companies that are doing CMMI, or that want to do it.

It’s interesting to progressively partition the population of companies, and think about where most software process improvement research happens. (This looks better as an animation, so think about these pictures as a sort of flick book in your mind’s eye.)

First, there’s the population of every organisation that might reasonably use CMMI.

[Figure: whynot-1]

Some of them will not have heard of CMMI, but some will have. You could study the differences between these groups as a piece of advertising or diffusion research, but I’m not aware of anyone who’s published research on this for CMMI or any other SPI approach.

[Figure: whynot-2]

Of the organisations that have heard of CMMI, some will have decided not to adopt it, and some will have decided to adopt it. This is the point in the adoption lifecycle that we examine in our paper. It is an extremely rare kind of research in software engineering.

[Figure: whynot-3]

For interest’s sake, let’s drill down some more…

Of the organisations that have tried to adopt CMMI, some will have succeeded in rolling it out, and some will have failed.

[Figure: whynot-4]

Of those who succeeded in adopting CMMI, some organisations will have got some benefit, while others will have had no benefit.

[Figure: whynot-5]

Where do you think most CMMI/SPI research sits?

Most research papers report on how organisations benefitted from using CMMI. In the literature, you almost never hear about organisations who tried and failed to adopt CMMI, nor about organisations who adopted but then failed to benefit from CMMI. (This effect is called selection bias, or publication bias, depending on how it arises.) It’s sort of understandable – companies don’t want to let themselves be seen in a bad light, and researchers may think that success is more glamorous than failure.

Looking at successful cases is necessary, but to really broaden the impact of software engineering research, we need to learn from failures as well!

F# Being Productized

The recent announcement by Somasegar at Microsoft that F# is being “productized” has now started to be picked up by various news agencies. Being productized means some future version of the F# language and its Visual Studio integration will be officially supported by Microsoft.

However, I see a couple of people are implying that because F# is a functional language, it’s not an OO language. Wrong! It’s both.
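To make the “both” concrete, here is a minimal sketch that mixes the two styles in a few lines. The Account type and the totalBalance function are made up purely for illustration; they don’t come from the announcement or from any library.

```fsharp
// Object-oriented style: a class with a constructor, properties and a method.
// (Account is a hypothetical type, invented for this illustration.)
type Account(owner: string, balance: decimal) =
    member this.Owner = owner
    member this.Balance = balance
    // Returns a new Account instead of mutating this one.
    member this.Deposit(amount: decimal) = Account(owner, balance + amount)

// Functional style: a pipeline with a higher-order function over those objects.
let totalBalance (accounts: Account list) =
    accounts |> List.sumBy (fun a -> a.Balance)

let accounts = [ Account("Ada", 10m); Account("Alan", 5m).Deposit(2.5m) ]
printfn "Total: %M" (totalBalance accounts)
```

The class is an ordinary .NET class that C# code could call, while the function that consumes it is written as a pipeline. Neither style gets in the way of the other.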

Somasegar says F# will be appealing to developers in the domains of “financial, scientific and technical computing.” He says that’s because of the correspondence you can achieve between F# and the mathematics that those developers work with in their domains. I say, also because of the speedy execution performance you can get from F#. Strong static typing in F# gives the compiler many more opportunities to be much smarter about optimization. Speed matters when you’re working with massive data sets and data streams. Dynamic languages will always struggle in comparison.

F# also has great potential as a teaching language in Universities, and Somasegar says that’s a target. So, I really hope Microsoft puts out a free Visual Studio Express edition for F# to make sure the barriers to getting started with F# on Windows are as low as they can be. F# can also run on mono on *nix, so really there’s no excuse to not give it a try whatever platform your University uses.

F# is a particularly nice move for Microsoft as a company, because it’s such a strong story of home-grown innovation. I’d agree with Somasegar’s opinion that “[F#] is one of the best things that has happened at Microsoft ever since we created Microsoft Research over 15 years ago.”

p.s. Does anyone know how to search for “f#” on google news without finding faux-swear words? It’s f#@&!ng annoying.

A Web Services Epiphany – Accessing Confluence from F# Using SOAP

“Web services” has been a buzz word for so long you might wonder if there’s any buzz left. I’d known in principle about how web service technology was full of goodness, especially for achieving interoperability, but I’d never really taken the red pill. However, recently I had an epiphany.

I’ve been working in a team using the Confluence wiki to organise some information. Confluence is a page-based wiki, but I’ve been pushing it a little in the direction of being a semantic wiki, by using 2-column tables on some pages, to represent key-value pairs for page attributes and relationships. My problem was that, to Confluence, those tables were just ordinary textual content. There was no way to check that the special tables were well-formed, and there was no way to query the wiki using the page attributes and relationships.

My first plan was to export the wiki to XML and query that, but then I discovered that Confluence has a SOAP API. It looked promising, and it also looked like a good excuse to do some more scripting with F#.

I bumbled around the web looking for a simple guide on how to actually use SOAP, and finally came across a page from Robert Pickering’s site that laid it all out for me. For Confluence, it goes like this:

  1. Generate a C# file from the Confluence WSDL file:
    wsdl http://<your-confluence-path-here>/rpc/soap-axis/confluenceservice-v1?wsdl
  2. Compile it to produce a dll:
    csc /target:library ConfluenceSoapServiceService.cs

(My wsdl.exe is in C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin, and my csc.exe is in C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727, but yours might be somewhere else.)

Then you’re set to go. For the purposes of exposition, say you want a page called “PAGE” from a Confluence space called “SPACE”, to which you’ve got anonymous access:

  1. In the F# interactive environment, load the dll, e.g. :
    #r "ConfluenceSoapServiceService.dll"
  2. Then, create a new F# object to expose that SOAP API:
    let wiki = new ConfluenceSoapServiceService()

    (Right now, in the interactive environment, I need to call this twice to get it to work, because of some strange F# interactive bug, but Don assures me it works first time when it’s compiled.)

  3. The API functions are available as members of that new object, just like they were written natively in F#:
    let page = wiki.getPage("", "SPACE", "PAGE")
  4. But it’s not just the API functions – the values too are all just like they were written natively in F#. For example, the page returned above is of the RemotePage type. It has fields for things such as the page content:
    page.content

After that I was away – the standard libraries and seamless .net interoperability provided by F# made most things easy. The hardest part of the exercise was working out how to parse the Confluence wikicode using regexes. (Oh, the pain!)
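For the curious, here is a rough sketch of the kind of regex parsing involved. It is not the code I actually used. It assumes the attribute tables are written in Confluence wiki markup as two-cell rows like “| key | value |”, with header rows using double bars, and the names rowPattern, parseAttributes and sample are made up for the illustration.

```fsharp
open System.Text.RegularExpressions

// Matches a two-cell data row such as "| status | draft |" on a single line.
// Header rows ("|| Attribute || Value ||") and rows with more cells do not match.
// Assumption: cells contain no "|" characters and no line breaks.
let rowPattern =
    Regex(@"^\|[ \t]*([^|\r\n]+?)[ \t]*\|[ \t]*([^|\r\n]+?)[ \t]*\|[ \t]*\r?$",
          RegexOptions.Multiline)

// Collects every key/value row found in the page text into a Map.
// If a key appears more than once, the last row wins.
let parseAttributes (content: string) : Map<string, string> =
    rowPattern.Matches(content)
    |> Seq.cast<Match>
    |> Seq.map (fun m -> m.Groups.[1].Value, m.Groups.[2].Value)
    |> Map.ofSeq

// Hypothetical page text standing in for the content fetched from the wiki.
let sample =
    "h1. Some page\n|| Attribute || Value ||\n| status | draft |\n| owner | release-team |\nOrdinary text.\n"

let attrs = parseAttributes sample
printfn "%A" (Map.tryFind "status" attrs)
```

In practice the string passed to parseAttributes would be page.content from the SOAP call above. Real wiki pages are messier than this (links and macros inside cells, for a start), which is where the pain comes in.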

So, now I’ve seen the light – web services really are ace. I was accessing an API implemented in Java, running on *nix, but to me it looked like it was native to F#, running on Windows. And setting it up took almost no effort. (After I found out how it was done. :-) )

I think my epiphany was mostly about web services, or was it about F#?

Feisty Fawn

Today, version 7.04 (“Feisty Fawn”) of Ubuntu Linux was released. Yay! Earlier this week I installed the Feisty beta release, and rediscovered Linux happiness.

I first started using Linux at home in the mid/late 90s, installing Redhat Linux 4 on a second-hand PC. I faithfully followed the Redhat releases, installing Redhat 6 on a new PC, and eventually upgrading to Redhat 9. Then, Redhat went all “enterprise”, and stopped their free package update management service. My old Redhat 9 box slowly became impossible to maintain. Maybe Redhat thought focussing on the enterprise was good for them – and maybe it did help their stock price for a while. At the time there was no credible transition (e.g. to Fedora), and so as a home user I just felt abandoned. Now I think Redhat’s getting what’s coming to them.

To successfully install Redhat 4, I had to host a private “installation party”, and buy pizzas for the Linux geeks who came around to mess around with OS configuration files on the command line. For a while I thought my Feisty beta installation experience was going to be similar. The desktop install stopped half-way through (ubiquity crashed), and then the alternate install managed to corrupt my software RAID configuration before failing. But finally the alternate installation worked without software RAID. Afterwards I was left in 800×600 screen resolution for a while, until I discovered some advice on the net about using dpkg-reconfigure.

There are still some problems remaining now I’ve finished the installation. My box won’t power off when I shut down. And, it’s still hard for Windows and Linux boxes to play nicely together on my network. I can share files in one direction but not the other, and it hasn’t “just worked” setting up my Linux box as a network print server for its local printer.

But, these are minor problems that don’t diminish my renewed Linux happiness. The desktop menus are clean, the system administration tools work (apart from the problems noted above), the package update management is fantastic, I like the use of sudo for root-less security, and it’s very neat to be asked to install system features only if/when you need them. Now we’ve upgraded, our scanner finally works, and we’ve been able to install Wesnoth to keep Liam occupied these school holidays.

An Exploratory Study of Why Organizations Do Not Adopt CMMI

Our paper on why companies don’t adopt CMMI is now available online. This paper describes the details of some issues I’ve talked about previously to various industry audiences. In this study, we found that some companies (especially small ones) felt there were barriers to adopting CMMI. For example, often they didn’t get around to considering the possible benefits of adopting CMMI, because they perceived that it would just be too costly or time-consuming. The goal of this sort of research is to better understand the needs of software-developing companies, to help researchers develop software process improvement approaches that are more relevant and accessible to companies. There are more details in the paper.

Maybe of note is that the article is currently #20 in the Top 25 Hottest Articles for the journal, even before it’s appeared in print!