Choosing Open Source Software for PCI DSS

I’m researching a longer article on some specific software solutions, and it made me realize there’s value in a checklist of what to look for when trawling GitHub for open source software to use with your payments platform. In doing so, I think I can make a couple of safe assumptions:

  1. If you are reading this here, you are likely an information security professional; and
  2. If you are looking for open source software for a payments platform, you are involved in a commercial endeavor.

Together, these mean you need to be conservative about what you bring into your software world.

License

Perhaps the biggest issue is the license under which the software is released. A surprising amount of software on GitHub has no explicit license defined. If you see software with no license, you should steer clear of it. Just because it’s published online does not mean you have a right to use it! Software without a license is effectively software you have no right to use. For a personal website, you may decide you can live with that risk. For a commercial project, it’s a big issue.

The GPL family of licenses is what’s known as copyleft. Under certain circumstances, these licenses oblige you to share the proprietary code you combine with them. Those circumstances vary by license: with the regular GPL, the obligation to release source is triggered by distributing the combined software outside your organization; with the AGPL, simply letting users interact over a network with a service built on the AGPL code can be enough to oblige you to offer them your source.

Always review the README or LICENSE file to confirm the license terms are what GitHub claims they are. If you can’t find the terms, beware!

There are also many more-permissive open source licenses, such as the BSD licenses, Apache 2.0, and MIT. Under these you can typically use the software commercially, subject to some basic requirements such as crediting the original authors. You still need to read each license carefully and do whatever it requires; otherwise you do not have the right to use the software.
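
As a first-pass triage before legal review, the license families above can be sketched as a lookup keyed on SPDX license identifiers. The function name and category labels are my own for illustration, the tables are deliberately incomplete, and the result is a prompt for a conversation with counsel, not a legal conclusion.

```python
from typing import Optional

# SPDX identifiers grouped by the families discussed above.
# These sets are illustrative and deliberately incomplete.
COPYLEFT = {"GPL-2.0-only", "GPL-2.0-or-later", "GPL-3.0-only", "GPL-3.0-or-later"}
NETWORK_COPYLEFT = {"AGPL-3.0-only", "AGPL-3.0-or-later"}
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"}


def triage_license(spdx_id: Optional[str]) -> str:
    """Return a coarse triage label for a dependency's declared license."""
    if not spdx_id:
        # No declared license means no grant of rights at all.
        return "no-license: do not use"
    if spdx_id in NETWORK_COPYLEFT:
        return "network-copyleft: legal review before any hosted use"
    if spdx_id in COPYLEFT:
        return "copyleft: legal review before distribution"
    if spdx_id in PERMISSIVE:
        return "permissive: comply with attribution and notice terms"
    # Anything unrecognized gets read by a human, not guessed at.
    return "unknown: read the license text and ask counsel"
```

Feed it the identifier you confirmed by reading the LICENSE file yourself, not the one shown in the repository sidebar.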

I am not a lawyer, but I know to consult with one when using open source code for commercial applications.

Is the source available?

Just because something is published on GitHub does not automatically mean that its source code is available for review. I was able to find a payment HSM simulator where only the Java JAR files, that is, compiled bytecode rather than source, were published to GitHub. Having access to the source code is not a guarantee of security. But having the source does at least mean you have the opportunity to review it for vulnerabilities and outright malicious code.

If you only have access to compiled code, you have to ask why. If somebody isn’t willing to publish source, but is willing to publish the object code, at best they don’t really have the interests of the open source community at heart. At worst, they are hiding malicious code in there, waiting for someone to bite. The risk is higher when the code is for something esoteric, such as an interface to a payment HSM, because the only people likely to use it are exactly the people an attacker is interested in.
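
A quick way to spot a binary-only repository is to walk a local checkout and compare compiled artifacts against source files. The sketch below assumes you have already cloned the repository; the function name and both suffix lists are my own illustrative choices and should be adjusted to the languages you care about.

```python
from pathlib import Path

# Suffixes treated as compiled artifacts and as source code.
# Both sets are illustrative assumptions, not a complete list.
BINARY_SUFFIXES = {".jar", ".class", ".so", ".dll", ".exe", ".pyc"}
SOURCE_SUFFIXES = {".java", ".py", ".c", ".go", ".rs", ".kt"}


def summarize_checkout(root: str) -> dict:
    """List compiled artifacts and source files found under a checkout."""
    binaries, sources = [], []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        if ".git" in rel.parts:
            continue  # skip version-control metadata
        suffix = path.suffix.lower()
        if suffix in BINARY_SUFFIXES:
            binaries.append(rel.as_posix())
        elif suffix in SOURCE_SUFFIXES:
            sources.append(rel.as_posix())
    return {
        "binaries": sorted(binaries),
        "sources": sorted(sources),
        # Compiled artifacts with no source at all is the red flag.
        "binary_only": bool(binaries) and not sources,
    }
```

A `binary_only` result of `True` does not prove malice, but it does mean nobody on your team can review what you are about to run.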

Software is rarely bug free, and it’s important that you can get the bugs fixed in the software you depend on. For commercial software, you can request an update from the vendor. Resolution is not guaranteed, but that is what you’re paying for. For open source software, you have the option to fix it yourself, or work with the community for a fix. But for compiled blobs, you may not have any option to get fixes incorporated.

Vulnerability Management

If you work in the PCI space, a mature vulnerability management program is effectively mandatory. When dealing with open source software, you have to consider vulnerability management both for the software you intend to use, and also for any dependencies this software has.

The intent of vulnerability management is to significantly lower the risks inherent in using software. If an open source library and its dependencies are well known, then vulnerability management is fairly straightforward.

Issues are constantly being found in open source software. These vulnerabilities are typically reported through responsible disclosure programs, assigned CVE identifiers through the CVE Program, and catalogued in databases such as the National Vulnerability Database (NVD) run by NIST. Scanners exist that identify the versions of software you use. Some are even free to use! These scanners match software versions against the NVD to find and report on known vulnerabilities.
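
The matching step a scanner performs can be sketched in a few lines. This is a toy illustration, not how any particular scanner works: the advisory table is a hypothetical stand-in for a real feed, the component names, advisory ids and version ranges are invented, and versions are assumed to be plain dotted numbers.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn a dotted numeric version such as '9.4.11' into (9, 4, 11)."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical advisory feed: component -> [(id, first affected, first fixed)].
# Every name, id and range here is invented for illustration only.
ADVISORIES = {
    "example-web-server": [("EXAMPLE-0001", "9.0.0", "9.4.12")],
    "example-crypto-lib": [("EXAMPLE-0002", "1.0.0", "1.2.0")],
}


def find_known_issues(inventory: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (component, version, advisory id) for each affected component."""
    hits = []
    for name, version in inventory.items():
        installed = parse_version(version)
        for advisory_id, affected_from, fixed_in in ADVISORIES.get(name, []):
            # Affected when first_affected <= installed < first_fixed.
            if parse_version(affected_from) <= installed < parse_version(fixed_in):
                hits.append((name, version, advisory_id))
    return hits
```

Note what happens to a component the feed has never heard of: it produces no hits. An empty result means "nothing known", which is not the same as "nothing wrong".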

By exclusively using well known components such as Eclipse Jetty, any publicly known issues will quickly be visible to you through your vulnerability management software. But if you use some oddball software from GitHub with no forks or issues reported, and therefore no public visibility, you will basically be on your own.

High Value Oddballs

So you’ve found a rarity: open source software that solves exactly your problem, but that nobody else seems to be using. Reimplementing this functionality would take you enough time and effort that you really would rather just use the code you’ve found. How do you manage this to an acceptable level of risk?

Verify the license to ensure you’re entitled to fork it, change it, and distribute it. Assuming you are able to fork the software, do so. Maintain your own version of the software, and bring it in house. Treat this new fork as your own software, while respecting the license terms. If you need to, publish changes back to the community. If you must provide attribution to the original authors in your documentation, create a process to do so.

What does treating it as your own software mean? Run it through a PCI DSS-compliant process. Have your developers review the code for security issues. Run the code through static and dynamic analysis tools, and take their findings very seriously. Address those findings by fixing the code, not by accepting the risk. After doing so, don’t blindly accept upstream changes into your fork without applying the same level of rigor.
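
The rule above can be written down as a simple gate. This is a minimal sketch under my own assumptions: the `Finding` record, its status values and the idea of a set of reviewed upstream commits are hypothetical conventions, not part of PCI DSS or of any analysis tool.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Finding:
    tool: str        # e.g. "static" or "dynamic"; labels are illustrative
    identifier: str  # the tool's own reference for the finding
    status: str      # "open", "fixed", or "risk-accepted"


def release_blockers(findings: List[Finding]) -> List[Finding]:
    """Findings that block a release: anything not fixed in code.

    A risk-accepted finding still blocks, matching the rule that
    findings are addressed by fixing the code.
    """
    return [f for f in findings if f.status != "fixed"]


def may_merge_upstream(
    upstream_commit: str,
    reviewed_commits: Set[str],
    findings: List[Finding],
) -> bool:
    """Allow an upstream merge only if it was reviewed and nothing blocks."""
    return upstream_commit in reviewed_commits and not release_blockers(findings)
```

The point of encoding it is that "we will review upstream changes" becomes something a build can enforce rather than something a team intends to do.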