discussion / Open Source Solutions  / 28 January 2019

Sustainability of open source projects - a look at Octobox.io

The Changelog released this interesting piece on Octobox.io's journey to sustainability. Octobox began as an open source project to improve notifications on GitHub, and its creators decided to work on it full time last year. The folks involved are both well known and respected when it comes to open source sustainability efforts, having previously been part of Tidelift, one of the better-known players in helping open source maintainers get paid.

To summarise the article:

  • Octobox is now one of the most popular open source tools on GitHub, with over 11,000 developers using it
  • It's used internally by big companies such as Shopify, and even GitHub itself
  • Despite that, it is not yet sustainable, with the income not covering the time of the 2 project founders
  • To that end, they are introducing new paid features, but offering users the choice of whether the money goes to the company behind it or back to the community

Whilst a lot of software in conservation occupies a very different space from developer tools like Octobox, I think there's a lot to learn from experiments like this. The conversation around open source sustainability seems particularly advanced in developer tools, perhaps simply because the products target other developers, who can be expected to understand the burden of open source maintainership.

I'd love to start a conversation here around the sustainability challenges faced by open source projects within this community, and conservation technology more generally. Are you working on an open source project? Have you managed to build a community around it? Is that community contributing to the project and making it easier to manage, or are contributions taking up more and more of your time with no relief in sight?
 

PS. Hi! Hello! This is my first post on WILDLABS. I joined the WILDLABS team last week to help out with project management. I come from GitHub Education, where I helped students take their first steps in open source and industry, and I'm very excited to learn more about conservation and how tech is being used. Thank you for being a part of this wonderful community, and I look forward to getting to know you!




Hi Joe,

A great topic to unpack, and a theme that the SCB Conservation Technology Working Group has also been discussing recently.

As an open source advocate myself, I've been working to act as the "middle ground" in the conservation technology realm by providing services and support via the Arribada Initiative to help developers of open source hardware (and software) become sustainable. There are, however, a number of complexities in doing so, some of which align with the Octobox story:

1) Octobox's experiment to allow you to support the community or the company - I think this is really refreshing to see. At present Arribada pools funds raised through the sale of our own and other developers' hardware and solutions (the AudioMoth by Open Acoustic Devices is a prime example). In fact, we make 80% of any proceeds available to the original developer, which, in this case, Open Acoustic Devices can use as they see fit. They could continue to develop the device, pay others to fix bugs, host servers, and essentially fund themselves and their community. We take 10% to keep Arribada's lights on, and the remaining 10% pays for payment gateway costs using GroupGets. This is in itself an experiment too. The case remains that open source models are still in flux, and there isn't a golden answer, but it's good to see popular open source services explain their methods and reasoning.

2) Taking an open source service, device or product forward from a prototype to an actual business / operational service isn't easy. Octobox are correct to state this too: "Most open source projects struggle to navigate the legislative, legal, and social issues around sustaining their projects."

3) In conservation technology, open source software and hardware development is a different kettle of fish. Hardware development is a long and winding road. Take for example the effort needed to actually field test - developers (often with limited budgets) first need to create a solution, then partner with field programmes and then inevitably, tweak designs based on feedback, potentially lose hardware due to field damage (animal or otherwise), and find that 12 months later they are still working on the "final" release or hoping that someone is flying out to the field to carry their latest release. How best to support those 12 months of costs / salary?
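
To make the split described in point 1 concrete, here is a minimal sketch of the arithmetic. The 80/10/10 shares come from the description above; the function name, the share labels and the 1000 figure are hypothetical and purely illustrative, and the real accounting is likely to involve details not covered here.

```python
from decimal import Decimal

# Shares as described in point 1: 80% to the original developer,
# 10% to Arribada, 10% towards payment gateway costs.
SHARES = {
    "developer": Decimal("0.80"),
    "arribada": Decimal("0.10"),
    "payment_gateway": Decimal("0.10"),
}


def split_proceeds(proceeds):
    """Divide a proceeds figure between the three parties, to 2 decimal places."""
    amount = Decimal(str(proceeds))
    return {
        name: (amount * share).quantize(Decimal("0.01"))
        for name, share in SHARES.items()
    }


# Hypothetical example: 1000 (in any currency) of proceeds.
split = split_proceeds(1000)
for name, value in split.items():
    print(f"{name}: {value}")
```

Decimal is used rather than floats so that the three parts add back up to the original figure exactly in this example.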

I think it would be beneficial for the Wildlabs community to continue to unpack how we can better work together as a community to try and help each other.

The creation of field conservation testing sites (Ol Pejeta's Lab) is a great move forward, as we are starting to focus on supporting field testing with actual viable infrastructure that can be made available to open source developers - but the same issue remains, in that field testing sites will need to generate sustainable funds too to keep their lights on. Perhaps the answer lies in driving funding through the sale of successful open source software and devices, developed and purchased by the community (AudioMoth) to also support the infrastructure we all need to exist.

Alasdair
arribada.org

This has been a problem with open source software—even very popular software—since the beginning. These are the solutions I have seen work:

  • Aligning goals with the needs of a company with money. Apple (and others) contribute to Clang/LLVM, Google has TensorFlow (and dozens more), Facebook has PyTorch (and dozens more), etc.
  • A "freemium" model with a free, open source community edition, and a for-pay edition. Often the community edition is restricted to non-commercial use.
  • "Open core." Similar to the last, selling closed-source add-ons. MySQL, Red Hat, GitLab.
  • Selling a packaged, easy to use product based on the open source software. CodeWeavers, Ingres.
  • The services and support model. CoCalc, Red Hat.
  • "Adware," or support from some kind of advertisement. Gmail (infamously), Mozilla, websites, mobile games.

These strategies are not necessarily mutually exclusive, of course.

I have never seen the Patreon/donationware model work, ever. Even very large, widely used projects (like Ruby) don't take in enough for even a single full-time developer.

Grant-funded software is infamous for having short lifespans and poor code quality. The grant runs out, or the lab changes focus, or the student graduates, or... The few exceptions I am aware of are projects housed in federal research facilities like Oak Ridge or Lawrence Livermore. At least part of the problem is that the grant funding isn't for the software, it's for some project that needs the software. There just isn't grant money for it. What's more, tenure and promotion in academia (in the US) do not value scientific software. It doesn't count as valuable scholarship in an environment whose only currency is the peer-reviewed journal article.

It is absolutely appalling how much of our scientific infrastructure rests upon the work of only a small handful of developers. 

I've also been invested in trying to untangle this problem lately and have reached much the same conclusions as @Robert+Jacobson. I'm currently attempting to transition to a development model which is used by some very successful open source projects and that I think fits the bill here.

For background, I'm one of the authors (though sole engineer) of Camelot, which is open source software for camera trap data management and is developed entirely on a voluntary basis. We've achieved good mindshare over the last couple of years, but have also found that the time available to work on this has slowly dropped off. We're not going to stop working on it anytime soon, but are now looking at how we can leave the community in good stead to empower themselves in the future.

Sadly we've seen no attempt by the community to contribute code to the project. Free open source is fantastic for libraries: (typically) small units of code whose audience is other software developers. However, it is very difficult to contribute to a large codebase. This is true for professional software developers, let alone for researchers in other fields.

To add to this, even if you did decide to build on top of an existing application, you inherit both the technical risks (technical debt, "here-be-dragons") and people risks (conflicting visions & priorities with other contributors) of doing so. I'm aware of new projects being spun up (and yes, grant funded :-) ) which are entirely greenfields, but duplicate 95% of the functionality of existing software in the space. It's such a waste, but it's not hard to see why this happens given these barriers and risks to building upon existing projects.

So to solve this we're in the process of transitioning Camelot from an application to a platform. I don't believe that the need to develop software in the field is diminishing; if anything it's greater than ever. So we're creating pressure from the other side: to make it as easy as possible to develop software for the field.

In a nutshell the goal is to make it dead-simple to write software which:

  • is technology agnostic: build with whatever you're best able to solve your particular problem with (incl. low-code/no-code).
  • is purpose-specific: only worry about the problem you want to solve; use the platform to do the rest.
  • can be built with complete autonomy: there's no existing development model to buy in to, and no one needs to approve your changes; you build it how you want.

This is obviously no small order, but we'll build out platform support incrementally & learn as we go.  I'd be very keen to know if anyone is aware of similar projects which have attempted this that we could learn from.

I'm the other Camelot author, and came into it from the conservation product manager angle, a job title which doesn't seem to exist, but that's what I am. And I think that is where part of the problem lies. Conservation science researchers know that they need software to process the huge amounts of data, but they don't always approach it from the iterative mindset. And so they create a spec and it gets grant funded, and it gets developed, and then doesn't get used after that. They don't consider the user experience outside of their single use case. They don't consider how their software could be used. And for whatever reason they don't think about collaborating with other people outside. Maybe it's being time-poor???

We are going through this now, where we are looking for standards and looking for collaboration, but the engagement is not there. We suspect it's because we came from "outside" in, but maybe it's because there are just so many projects that no one has time to connect and establish a shared set of standards. We just don't know. 

I really appreciate this discussion because through this I have found the listserv for Conservation Technology Working Group. 

Separately, we keep getting asked when will Camelot process video. The sad thing is that there are no metadata standards for video, and so all this fantastic video footage that researchers are taking BEFORE they have found what software they will use to process it, cannot be processed. Software would have to be created per video camera make/model. I feel like they are going into the project ass-over-backwards. They get the footage and then think.. hmm how will we analyse it. Maybe they are waiting for the magic AI to process it. 

Chris and I at least have each other to bounce off, but we are both time poor, so having a way to make some money from Camelot would help to that end. BUT we always said that we are doing this to make the world better in some small part.

Heidi