Archive for the ‘Software’ Category.

Reference Management

Researchers “stand on the shoulders of giants”, which in practice means reading a lot of academic papers and reports.  Lots.  You not only want to read them, but also cite them in papers you write, search them, and organise them by whatever topics you’re investigating.  How do you do that?

When I was a PhD student, I kept hard copies of the papers I read, and a collection of bibtex files containing reference information.  The bibtex files were separate, each on a different topic.  Since getting back into research in 2004, I’ve gone digital and have tried a few solutions: Reference Manager, EndNote, bibtex again, and zotero.  After each one, I kept reverting to my clumsy manual approach: storing PDF documents in directories, with each directory representing a topic.  Often, papers relate to more than one topic – sometimes then I put a copy or soft link in each topic directory.  I said it was clumsy!  But at least I can work when I’m travelling and off-line.

As a result, I now have dozens of directories stuffed with thousands of PDFs.  (I have fewer than three thousand, but a colleague has more than ten thousand.)  These calcified directories represent a fixed collection of topics that, as my research focus evolves over the years, is increasingly inappropriate.  I suspect a lot of people work like this.

So I’ve been delighted to discover Mendeley Desktop.  It’s still in beta, but I like its approach.  It lets me keep my PDFs as they are, and works with me to index them and import their bibliographic information into a database, using text recognition and bibliographic web services.  The quality of both of those mechanisms is a bit patchy right now, but it has “needs review” status tracking so I can manually check and correct that bibliographic data over time.  What’s also cool is that I can tell it to “watch” my directories – if I dump more PDFs in there, it’ll incrementally import those too. Mendeley has all the normal features: reference-importing bookmarklets, exporting in bibtex/whatever, Word and Open Office plugins for creating reference lists, etc.  And of course it also has tagging: so now I can create tag topics for my references – organised into as many overlapping topic areas as I need.

I feel like my reference collection has opened up to me, and is becoming a much more useful resource.  That’s fun and exciting but I have to make sure I don’t spend so much time organising my references that I end up not actually doing research!

Breaking the Fractal V Lifecycle?

Liming has raised three points in reference to my Fractal V Lifecycle.  His questions are probing the limits of the model in interesting ways. Before I discuss them, I’d like to introduce a concept and some terminology from an earlier paper I wrote.

The V model can accommodate as many levels of design abstraction as you like.  For example, if you do high-level design/integration testing, add a layer for that between system architecture/system testing and low-level design/component testing. At each level of design abstraction, you have the same schema of activities: design the artifact (called “Plan” in the paper), pass it on for elaboration to a lower level of abstraction (called “Do” in the paper) and then verify that the design and implementation are consistent (called “Check” in the paper).  There is also another kind of activity in the V model, “Plan-to-Check”, which can proceed concurrently with the “Do” step.

So, the left-hand side of the V is “Plan”, the horizontal dotted lines are “Plan-to-Check”, the pointy bit is “Do”, and the right-hand side is “Check”.
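To make the schema concrete, here is a minimal Python sketch of the idea that each level runs “Plan”, hands off to the level below as its “Do”, and then runs “Check”. The function name `v_lifecycle` and the three level names are made up for illustration, and the concurrent “Plan-to-Check” activity is left out to keep it short.

```python
def v_lifecycle(levels):
    """List the activities of a V lifecycle in order.

    `levels` names the design-abstraction levels from highest to lowest.
    (Hypothetical helper, not taken from the paper.)
    """
    if not levels:
        return []
    top, lower = levels[0], levels[1:]
    # "Do" at one level is the whole V of the levels beneath it;
    # at the lowest level it is the implementation work itself.
    do = v_lifecycle(lower) if lower else [("Do", top)]
    return [("Plan", top)] + do + [("Check", top)]


steps = v_lifecycle(["system", "architecture", "component"])
for activity, level in steps:
    print(f"{activity:<5} {level}")
```

Reading the printed list top to bottom walks down the left-hand side of the V, through the pointy bit, and back up the right-hand side.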

On to Liming’s points about situations quoted or paraphrased as follows…

1. “Left side of the V bends up before reaching the bottom”

Liming wonders – how can you “Check” a developed artifact if it hasn’t been built yet!?  As Liming suggests, I want to argue that if you don’t have an implementation yet, you of course can’t test it, but you can certainly analyse the plan/design.  Your “Check”ing activity will be a requirements review, or a model-based design simulation, or perhaps testing with stubs (or “mock objects” if you prefer).  It depends on your level of design abstraction and what artifacts you already have from any previous iterations.

Liming thinks analysis is implicit in the left-hand-side of the V, because designers always carry out some sort of analysis immediately after creating artifacts.  I agree designers often do that sort of analysis, but I disagree that it’s implicit in the left-hand-side of the V!  I think it’s simpler and more consistent to think about it as being part of the right-hand-side of the V, with a null “Do” activity.  So designers would do many mini design/analyse (“Plan”/“Check”) iterations at their level of design abstraction before sending it off to the lower level of abstraction for elaboration.

2. “Right side of the V dips down before reaching the top”

Liming observes that this isn’t just about avoiding end-customer contact.  I didn’t highlight that in my original blog article, but I fully agree.  For example, you might have unit tested iterations that you don’t send up for integration testing just yet. These are “internal iterations” at some level of design abstraction, avoiding the “customer” at the next higher level of design abstraction.

3. “Early or continuous testing breaks the V”

Liming claims that as testing happens on the right-hand side of the V, early or continuous testing means you can’t have a V any more.  I agree early and continuous testing breaks the V, but it doesn’t break the Fractal V!  In the Fractal V, yes you do still carry out the testing on the right-hand side of the V, but the whole point of the Fractal V is that it can accommodate a variety of different kinds of internal iterations.

I’m not claiming that all software development will fit the Fractal V Lifecycle.  For example, imagine a lifecycle with long-running requirements analysis performed concurrently with ongoing development, where you could pre-emptively abort a development iteration depending on the intermediate outcomes of that ongoing requirements analysis.  I don’t know of any lifecycle that would capture that exactly.  Maybe this is Liming’s point about continuous integration – the hairy reality is that developers won’t stop entirely while they’re waiting for feedback from continuous integration – they’ll probably already have started the next unit of work.  I would rebut that the aim of continuous integration is to give “immediate” unit and integration test feedback to developers – to let them do many micro-iterations as quickly as possible.  I say you should probably just work around that hairy reality for the sake of simplicity.  No model is perfect, but some models are still useful.

Lifecycle models are used because they make it easier for large groups to understand and coordinate complex projects.  The Fractal V provides more flexibility for iteration than the normal V lifecycle, but is still reasonably simple, and importantly retains the Plan/Do/Check schema that so many systems engineering disciplines and tools rely on.

It will be interesting to see the un-V lifecycle Liming mentions (I hope they find a better name!), and see how it deals with all of these issues.

Fractal V Lifecycle for Pre-Project Activities

Louis has got a new article version of my earlier blog post on the Fractal V Lifecycle up on the alinement.net magazine/community website. He’s also added some thoughts of his own on extending the concept into pre-project activities.  For these sorts of activities I tend to favor putting them on top, i.e. on the left but at a higher level – giving you a taller V.  But there’s no hard and fast rule… use it whatever way helps!

Academic Academy Awards

I had to laugh at Liming’s latest micro-blog posting, Why are papers in top conferences very boring (these days)? It’s funny, but I’m not sure I entirely agree – I think top conferences do have interesting papers.  Liming is saying interesting ideas won’t necessarily have had time to be well validated, and by the time you have validated and published your idea in a top conference, it’s no longer new (and interesting).  However, I don’t want to see completely unvalidated ideas.  Ideas are cheap.  I want to see ideas that are realisable, and whose value has been described and justified somehow.

To the extent that Liming’s wry diagram is true, I think it’s more true of journals than conferences. In most academic disciplines, journals are regarded as the “proper” place to publish significant results.  Computer science is different – top conferences in computer science (and software engineering) can be more important than journals.  Citeseer statistics show most of the highest impact compsci venues are conferences, and even some workshops have more impact than some top journals!  But (or perhaps because!) in computer science, journals have longer review and publication lead times than conferences, so the results there can be more out-of-date and so less interesting.  (That is a bit odd when you think about it – journals are published several times a year, whereas each conference happens at most once a year – surely journals should be able to be more responsive than conferences in publishing new results?!)

Anyway, it makes me wonder how Ricky’s citemine system would work in the conference milieu.  I guess for maximum market efficiency in citemine, the evaluation for new papers should take place in public. So, no workshop or conference would ever have “new” results – everything that made it through “review” (weighted average market price over the period since the last conference greater than some threshold for papers within some discipline boundary?) would have been published for the best part of a year.  Conferences would be more like the Oscars – glorifying new exciting productions – rather than a way of learning about recent results.  Maybe that’s OK – I think the greatest value of conferences is networking and nuance, and you would still get that at a Computer Science discipline’s “Academy Awards”.  But these would be very different events, and norms of academic precedence would need to be re-conceptualised.
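As a rough illustration of what such a “review” rule could look like, here is a small Python sketch that computes a volume-weighted average price from the trades since the last conference and compares it with a threshold. This is purely a guess at one possible rule, not how citemine works: the function names, the trade format, and the numbers are all invented for the example.

```python
def weighted_average_price(trades):
    """Volume-weighted average price of (price, quantity) trades."""
    total_quantity = sum(quantity for _, quantity in trades)
    if total_quantity == 0:
        raise ValueError("no traded volume in the period")
    return sum(price * quantity for price, quantity in trades) / total_quantity


def passes_review(trades, threshold):
    """Hypothetical acceptance rule: average price must exceed the threshold."""
    return weighted_average_price(trades) > threshold


# Invented trades for one paper since the last conference: (price in Reals, quantity)
trades = [(2.0, 10), (4.0, 30)]
average = weighted_average_price(trades)
print(average, passes_review(trades, threshold=3.0))
```

Weighting by volume is only one choice; weighting by how long each price was the market price would be another reasonable reading of “weighted average”.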

Developing Whole Verified Embedded Systems

NICTA’s recent Techfest in Sydney saw a flurry of news around the announcement of a significant research achievement – the formal verification of the seL4 microkernel. The team developed a mathematical proof of the functional correctness of the microkernel down to the level of the C source code.

The achievement is important for two reasons. Firstly, it makes it possible to prove code-level functional correctness of whole computer systems based on the L4 microkernel. This means computer systems can carry a new kind of assurance, supporting strong arguments for safety and security for high-integrity software systems.

Secondly, it makes the creation of these proofs more feasible in practice. It should now be possible to formally verify properties of whole systems where only the key parts of the system are formally verified. This is critical for the practical application of formal verification. The verification of the L4 microkernel took more than 20 person years of effort to verify just 7500 lines of C source code. Modern embedded computer systems can have millions of lines of source – it’s not practical to formally verify all the code in such systems at these levels of productivity.

However, you don’t need to verify all the code! L4 provides rigorous separation between processes running in the microkernel, and that separation is guaranteed by the recent proof. If you can isolate the safety-critical or security-critical parts of an entire system to one small formally verified component running in an L4 process, it should be possible to lift the guarantees for that component to the whole system, even if the other components in the system haven’t been verified.

That is the challenge for a new project at NICTA – Trustworthy Embedded Systems. The plan is to develop technologies to support the creation and verification of entire systems running on top of the microkernel. This is a large project involving several NICTA labs and researchers from many disciplines (operating systems, formal methods, software architecture). I’m part-allocated to the project for the next few years. My PhD was in formal methods, and this is really the first time I’ve dipped my toe back into that area in the last decade. However, I won’t be doing too much theorem proving myself – my focus in the project will instead largely be on other software engineering issues in this context, such as configuration management, and how to use the component architecture to support product line development.

Reflections at WICSA

WICSA was fun.  I usually find the most I can hope for in a conference is 1 or 2 papers that are really interesting, but I think WICSA cleared 5, so it was well worthwhile.  What I particularly enjoy about conferences is hearing how people verbally describe the ideas and challenges in the field.  You can get so much more nuance and emphasis from hearing people talk about their research, compared to just reading papers.

A great example was the final keynote for the conference, by Alexander Wolf.  He covered reflections on his personal history working in software architecture, but as one of the “fathers” of the field, his talk was also a history of early software architecture research.  It was fun to play spot the co-authors in the audience and also among other acquaintances.

He talked about the importance of simulation and experimentation for architecture, and called for more work to be done in the area.  At NICTA, Jenny Liu and Paul Brebner have been leading work in these areas, particularly for performance analysis of enterprise architectures.  They’ve been getting huge interest from industry.  It’s a very promising approach and I can support the observation that simulation and experimentation are critically important to the discipline of software architecture.

Alexander Wolf was also previously involved with Software Configuration Management research, which is an interest of mine.  He didn’t really elaborate on that line of work, but he did mention a paper of his discussing the relatedness of software architecture and configuration management.  I think there’s still a lot more that can be said in this area, particularly concerning architecture evolution.

The BASE of CREST

Yesterday WICSA 2009 finished. There were a number of interesting talks over the three days of the conference. One was by Richard Taylor on Architectural Styles for Runtime Software Adaptation. He was discussing a framework (BASE) for comparing approaches to dynamic runtime adaptation. The model classifies how various architectural styles deal with Behavior, Asynchrony, State, and Execution Context for adaptation.

One of the frameworks being analysed with BASE was CREST – Computational REST. In CREST, pieces of computation are represented as URLs and can be moved around the web just as static content is. Richard gave a demo of CREST in action – showing pieces of independent computation running and serving dynamic content to multiple distributed browsers. It certainly had a “wow” factor. It reminded me very strongly of the Google Wave demo. But CREST is a more general architecture – it’s not committed to the threaded content model that’s deeply built into Google Wave. Could you reimplement Google Wave on top of the CREST framework? It looks plausible, and it might also help you create and share a much richer variety of dynamic content – to put yourself ahead of the Wave (pun intended).

I had a few questions (some of which were prompted by discussions with Liming Zhu) but I didn’t get a chance to pin down Richard after the talk…

The first question is about CREST, but not BASE. We can observe that REST is “broken” on the web. For example, cookies aren’t part of (and violate!) the REST principles, but they are nonetheless essential to the workings of the Web. That’s fine – pragmatics will almost always get in the way of a naive realisation of an abstract model. So my question is – how (or if?) does CREST need to be “broken” for it to be workable?

My second set of questions is about the BASE framework discussed in the paper. What limitations do the various architectural styles carry on the scope of adaptation? How do you get assurance about invariant functionality? Why doesn’t BASE consider security? Dynamic adaptation is great, but not everything will be dynamically variable, and you probably want to know that some functionality won’t vary, and won’t be subverted at runtime. How do the various architectural styles enable that?

The Next Big Thing?

How can you tell what the next big thing is going to be? Google’s PageRank algorithm will tell you what web pages have been important enough in the past for other people to have linked to.   Google Trends will tell you what search terms people have been using recently, again in the past. What about the future?

Some predictions about the future are doomed to failure.   For example, Popper’s Poverty of Historicism is largely about the futility of predicting future society. However, some aspects of the future are largely predictable – science and technology work because they accurately predict the behavior of the physical world.  There’s a large middle ground of futures that aren’t easy to predict.  Prediction markets have been proposed as a way of getting better-than-chance predictions of these events.

Ricky Robinson at NICTA has recently launched citemine – a prediction market for academic papers.  The predictions being made are about how much each paper will be cited by other papers.  Ironically for citemine, one of the poverties of historicism that Popper identifies is a poverty of imagination about the possibilities of the impact of future science and technology! (Still, I imagine that Popper’s criticism only applies to long-term predictions of the impacts of science on society, not the shorter-term predictions of the importance of recently published scientific papers.)

The benefit of citemine is that it can be a leading indicator of the quality of publications, whereas existing citation metrics are very lagging indicators of the quality of publications and researchers.   Ricky’s hope is that academics will care enough to trade in citemine to acquire its “Reals” which may become a widely recognised measure of academic reputation.  Your personal worth in Reals is a measure of two things: your ability to have written highly cited papers, and how much better you have been than others at spotting papers that will be highly cited.  You can tell how much of your worth is due to each different source.  (Interestingly, I think both of these are lagging indicators, even though the market price of a publication is a leading indicator.)

Even if such a market could work well if universally adopted and in a steady state, it’s a challenge to launch it.   It’s a chicken and egg problem – activity is required to make the market function, but a functioning market is required to generate interest in being active in the market.  The market has to bootstrap Reals into having value in the real world somehow.

citemine is “very beta”, and there are certainly a few issues at the moment:

  • Some matches aren’t being made in the market – there are buyers and sellers at the same price who aren’t doing a deal.   (Looks like a bug?)
  • There’s currently very low market depth, especially among sellers.
  • There’s no sophisticated market overview mechanism – just a list of papers at their current prices.
  • There are no market metrics for papers – e.g. historical returns, price volatility, etc.
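
On the first issue in the list, the behaviour I would expect is that a bid and an ask at the same price cross and trade. Here is a minimal Python sketch of that expectation. It is not citemine’s actual matching code (which I haven’t seen); the `match_orders` name, the `(price, quantity)` order format, and the choice to trade at the ask price are all assumptions.

```python
def match_orders(bids, asks):
    """Cross buy and sell orders, each given as (price, quantity).

    Returns the trades as (price, quantity). Trading at the ask price
    is an arbitrary choice made for this sketch.
    """
    bids = sorted(([p, q] for p, q in bids), key=lambda o: -o[0])  # best bid first
    asks = sorted(([p, q] for p, q in asks), key=lambda o: o[0])   # best ask first
    trades = []
    i = j = 0
    # Keep trading while the best bid is at or above the best ask.
    while i < len(bids) and j < len(asks) and bids[i][0] >= asks[j][0]:
        quantity = min(bids[i][1], asks[j][1])
        trades.append((asks[j][0], quantity))
        bids[i][1] -= quantity
        asks[j][1] -= quantity
        if bids[i][1] == 0:
            i += 1
        if asks[j][1] == 0:
            j += 1
    return trades


same_price = match_orders(bids=[(10, 5)], asks=[(10, 3)])
no_overlap = match_orders(bids=[(9, 5)], asks=[(10, 3)])
print(same_price, no_overlap)
```

The important detail is the `>=` in the loop condition: with a strict `>`, a buyer and seller at exactly the same price would never do a deal, which is the symptom described above.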

Ricky’s paper explains citemine.  I have two queries, and two observations…

Is citemine a zero-sum game?  In citemine, Reals are given to shareholders as dividends based on citations, but those Reals come from the previously-paid cost of submitting the citing papers.  So it looks like a zero-sum game. In my limited understanding of the economics of real stock markets, value gets created through primary production and through productivity improvements in other sectors.   I don’t see how that happens in citemine. Which leads me to my next query…

Is citemine a pyramid/ponzi scheme?  In citemine, the only source of new Reals is from the registration of new users, whose initial allocation of Reals is used to submit papers to pay dividends for existing users.  This question is more stark because there’s no leverage in citemine (debt, shorts).  Maybe I’m just confusing value with liquidity.
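To check my own reasoning, here is a toy Python ledger for the flow of Reals as I understand it: Reals are created only when a user registers, and a submission fee is simply passed on as dividends. Everything here is an assumption made for illustration (the `Ledger` class, the allocation of 100, the equal split of the fee), not a description of citemine’s real rules.

```python
class Ledger:
    """Toy model of Reals moving between users (hypothetical)."""

    def __init__(self, initial_allocation=100):
        self.initial_allocation = initial_allocation
        self.balances = {}

    def register(self, user):
        # The only place in this model where new Reals are created.
        self.balances[user] = self.initial_allocation

    def submit_paper(self, author, fee, cited_shareholders):
        """The author pays `fee`, split equally among the shareholders
        of the papers being cited (paid out as dividends)."""
        if not cited_shareholders:
            raise ValueError("this sketch needs at least one cited shareholder")
        if self.balances[author] < fee:
            raise ValueError("author cannot afford the submission fee")
        self.balances[author] -= fee
        share = fee / len(cited_shareholders)
        for holder in cited_shareholders:
            self.balances[holder] += share

    def total(self):
        return sum(self.balances.values())


ledger = Ledger()
for user in ["alice", "bob", "carol"]:
    ledger.register(user)
ledger.submit_paper("carol", fee=10, cited_shareholders=["alice", "bob"])
print(ledger.balances, ledger.total())
```

In this model submissions only move Reals around, so the total always equals the number of registered users times the initial allocation. That is the zero-sum property in question, and it is why new registrations look like the only source of growth.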

My intuition is manuscripts in citemine will behave more like mining stocks than industrial stocks in real stock markets. (Is that why it’s called citemine? :-) )  Mines have a finite quantity of ore, and the value of the stocks for that mine decreases as the ore is removed from the mine.   The value of a manuscript in citemine derives from future citations, but for almost all scientific papers, there is a finite time horizon for possible citation.  At some point people lose interest in moderately influential papers and cite later derived works.  Even very influential papers become part of assumed/background knowledge and get cited less.  I think that in citemine, most manuscripts will trend to a near-zero market price.
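
The mining-stock intuition can be sketched numerically. The Python below assumes, purely for illustration, that a paper’s citations per year shrink by a constant factor and stop at a fixed horizon, and that each citation pays a fixed dividend. None of these parameters come from citemine, and there is no discounting of future dividends.

```python
def remaining_value(citations_first_year, decay, dividend_per_citation,
                    years_elapsed, horizon):
    """Sum of dividends still to come from year `years_elapsed`
    up to (but not including) year `horizon`."""
    return sum(
        dividend_per_citation * citations_first_year * decay ** year
        for year in range(years_elapsed, horizon)
    )


# Invented numbers: 8 citations in the first year, halving every year,
# 1 Real per citation, and no citations at all after 4 years.
values = [
    remaining_value(8, 0.5, 1, years_elapsed=year, horizon=4)
    for year in range(5)
]
print(values)
```

Under these assumptions the remaining value can only fall as the “ore” is used up, and it reaches zero at the horizon, which is the trend toward a near-zero market price described above.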

Finally, there’s a “meta-gaming” anomaly currently at play in the citemine market. If it turns out to be a successful market, then Reals get real value, and Ricky’s citemine paper (and closely related papers by other authors) will also inevitably be highly cited.  If the market turns out to fade into obscurity, then the free Reals you get on joining stay as play money, so it doesn’t matter how you will have spent them.   Ricky’s paper (and related papers) are a safe bet – you can’t lose!  I would have bought some, but no one was selling – and I have no idea about how to pick a good price to offer!

Goannas Eat Bugs

After my PhD, I worked in industry on the verification and development of software for a safety-critical environmental control system.  The project used a variety of tools and processes to improve and demonstrate product quality.  However, the only static analysis tool being used was lint.  I thought there had to be something better.

As an intern at SQI I had worked one summer hacking Prolog to extend PASS-C, an extensible source code static analysis tool.  However, most of the checks in that tool were about programming “style”.  From academia, I knew that model checking had potential to support more powerful semantic checks.  Model checking was increasingly being used for high-level system and hardware verification, but I wanted a software model checking tool for C and assembly source code.

The most promising tool I could find at the time was PREfast.  Frustratingly, at the time I was looking, PREfast had been acquired by Microsoft to be used by them internally, and had been taken off the market!  There was much gnashing of teeth.  (Microsoft has more recently made it available again.)   PREfast did deliver more powerful analyses, but it didn’t use model checking per se.

Now the tool I had been looking for is finally available: Goanna.  It delivers many of the powerful and precise analyses I had dreamed of, and works for C, C++, and (unusually) embedded assembly.  Goanna comes out of years of research and development at NICTA.  It is packaged as an Eclipse plugin, and is available as a free trial.

Fractal V Lifecycle

At the drinks after Ivar Jacobson’s talk, I was speaking with a project manager from Honeywell who’s about to adopt a more agile development approach.  Honeywell is in the industrial automation business – they do systems engineering to deliver solutions for things like building automation and factory process control.  Their business context is one of the most challenging for agile methodologies:

  • Mature technology/problem space where customers can accurately define most of their requirements up-front.
  • High integrity systems that have regulatory demands for “heavy” process documentation to provide assurance.
  • Many/large development teams.
  • Large systems where no single customer can represent all stakeholders.
  • Customers who don’t have time to be “on site” with the development team full-time during development.
  • Customers who want (or who are required!) to only sign contracts where they know what they’ll be getting.

The last four are standard problems for agile.  Ivar had actually discussed how agile approaches don’t fit some business conditions, but that nonetheless you can often adopt “30%” of the agile practices.  One of these practices is iterative development: it’s certainly been popularized in recent times by agile methodologists, but it’s not unique to agile methodologies.

Ivar presented the waterfall lifecycle as a strawman “unsmart practice” in his talk.  I was surprised by the number of hands that went up in the audience when he asked how many people were using it!  However, Honeywell (like most systems engineering companies I know) doesn’t use the waterfall lifecycle – instead they use a “V” development lifecycle.  The V lifecycle is like the waterfall, but is bent upwards in the middle, with coding and unit testing down at the bottom (pointy end) of the V.   The well-known advantages of the V lifecycle are that it shows how testing lines up with planning at each level of design abstraction, and shows how you can progress test planning early in development, concurrently with design/coding.

Classic V Lifecycle, showing three levels here for illustrative purposes - test planning can progress as a parallel activity along the dotted lines

The Honeywell project manager had a problem – how could he do agile development in his business context?  I suggested that he adapt the V lifecycle he was already using.  The structure of the V lifecycle can easily support iterations. Normally people think of the V lifecycle as a “big V”, spanning an entire development project.   The first obvious way to have an iterative V lifecycle is to have lots of sequential “little Vs” – each cycle up and down could be done as short V sprints.

Naive whole-cycle iterations of the V Lifecycle

But that exposes the customer to each iteration, and the Honeywell project manager told me that his customers don’t want to run an acceptance testing and sign-off process every 2-4 weeks during development!  To address this, you can use a less obvious approach, which here I’m calling the Fractal V Lifecycle.

Let’s work our way there in small steps.   So to start with, consider that instead of iterating the whole V in each iteration, you can instead iterate some of the lower parts of the V more frequently.  So if you had two low level iterations, you’d have a W lifecycle!

W Lifecycle! - a V Lifecycle with two low-level internal iterations

To generalize to the full Fractal V Lifecycle, you can see that it’s possible to have many internal iterations at various levels of design abstraction, giving you a (quasi-)fractal V.

Fractal V Lifecycle - the flexibility to iterate at various levels to suit your business, while keeping a lifecycle structure that accommodates traditional assurance processes

I call this “fractal” because it reminds me of the Sierpinski Triangle.

Sierpinski Triangle

The Fractal V Lifecycle is really only quasi-fractal, because there are only a few levels of recursion in a development lifecycle, and because the internal iterations don’t have to be regular or symmetric over the course of the development lifetime.
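
For readers who prefer code to pictures, here is a minimal Python sketch of the same idea. It lists the activities of a V in order, and lets each level of design abstraction repeat its own little V a chosen number of times. The `fractal_v` function, the level names, and the iteration counts are invented for illustration, and the sketch uses regular iteration counts even though real projects needn’t.

```python
def fractal_v(levels, iterations=None):
    """List the activities of a (quasi-)fractal V in order.

    `levels` runs from highest to lowest design abstraction.
    `iterations` maps a level name to how many times the V at and
    below that level is repeated (default: once).
    """
    iterations = iterations or {}
    if not levels:
        return []
    top, lower = levels[0], levels[1:]
    trace = []
    for _ in range(iterations.get(top, 1)):
        trace.append(("Plan", top))
        # The work below this level is itself a (fractal) V.
        trace.extend(fractal_v(lower, iterations) if lower else [("Do", top)])
        trace.append(("Check", top))
    return trace


# A W lifecycle: one big V with two low-level iterations inside it.
w = fractal_v(["system", "component"], {"component": 2})

# Three levels, iterating more often the lower you go.
levels = ["system", "architecture", "component"]
trace = fractal_v(levels, {"architecture": 2, "component": 3})
checks = {level: trace.count(("Check", level)) for level in levels}
print(checks)
```

With these made-up counts the customer-facing system check happens once, while there are two architecture-level checks and six component-level checks hidden inside the V.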

The Fractal V Lifecycle solves a problem – it lets you do iterative development when your customers only want to be involved at large infrequent milestones.   It gives you the flexibility to adapt your iterations to suit your business conditions and technical environment.   But it also retains the shape of the V, which lets you keep using your existing systems engineering disciplines to comply with customer/regulatory requirements for process assurance.

There’s one thing that the Fractal V Lifecycle doesn’t explicitly decide for you – when should your iterations finish?  This is a major difference between iterations in agile and traditional plan-driven methodologies: agile development approaches have time-boxed iterations (usually between 1 and 4 weeks long), but traditional development approaches have iterations defined by scope.  The Fractal V Lifecycle is consistent with either approach.