Browsing all articles in DCM4CHEE

Important DCM4CHEE security fix

Posted by Martin Peacock in DCM4CHEE on Jul 28.  No comments.

Stephen Wheat of Emory University has pointed out that a JBOSS vulnerability affects DCM4CHEE.  Click through for the details if they are of interest, but the upshot is that while the HTTP GET and POST verbs are security restricted, other verbs (such as HEAD) are not.  This means remote users can run arbitrary code under the jboss user (very often root) without user credentials.

It has been patched for dcm4chee-2.17.1 but the fix is easy enough to apply to previous versions.  In the file server/default/deploy/jmx-console.war/WEB-INF/web.xml find the following block of code (probably towards the bottom of the file):

<security-constraint>
  <web-resource-collection>
    <web-resource-name>HtmlAdaptor</web-resource-name>
    <description>An example security config that only allows users with the
      role JBossAdmin to access the HTML JMX console web application
    </description>
    <url-pattern>/*</url-pattern>
    <http-method>GET</http-method>
    <http-method>POST</http-method>
  </web-resource-collection>
  <auth-constraint>
    <role-name>JBossAdmin</role-name>
  </auth-constraint>
</security-constraint>

.. and remove the lines:

<http-method>GET</http-method>
<http-method>POST</http-method>

This ensures that all verbs are routed through the security checks by default.
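
For reference, the edited block should end up looking something like the sketch below. This assumes your web.xml matches the block quoted above; the indentation and the comment are additions for readability. With no http-method elements listed, the Servlet specification applies the constraint to every HTTP method, which is what closes the hole.

```xml
<security-constraint>
  <web-resource-collection>
    <web-resource-name>HtmlAdaptor</web-resource-name>
    <description>An example security config that only allows users with the
      role JBossAdmin to access the HTML JMX console web application
    </description>
    <url-pattern>/*</url-pattern>
    <!-- no http-method elements: the constraint now covers every verb -->
  </web-resource-collection>
  <auth-constraint>
    <role-name>JBossAdmin</role-name>
  </auth-constraint>
</security-constraint>
```

The change is only picked up once the jmx-console application is redeployed or the server restarted.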

Using XML/XSLT for Dynamic User Access Protocols

Posted by Martin P in DCM4CHEE, Development on Sep 24.  No comments.

I have a small role in the setting up of the Irish national RIS/PACS system.  I hope I provide more solutions than problems but I’m sure that could be debated.  One of the questions that has come up belongs to a category that appears in the implementation of pretty much any clinical system:  Can access to (some entity) be restricted by (some data element/property)?  An example of this might be:

Can access be limited to the originating institution  for 24 hours after data generation?

The stock developer’s answer is, of course:

Yes.  But.  You have to define that element/property in advance and we can schedule it into a future release, based on available development time.

.. which isn’t terribly helpful.

Implementing a user-accessible scripting language into the system is one way of providing such a feature dynamically.  But it’s a tough job to retrofit a scripting language into a system that was not originally designed with that in mind.  There may be another way: using data transformation.

XSLT is an XML-based language which provides transformation services for XML-based datasets.  So a dataset, when expressed in XML, can be transformed into some other representation, say HTML.  Importantly, the transformed dataset may contain some subset of the original data, but need not. It could equally be a dataset that is driven by the contents of the original dataset, but actually contains none of it.

The DCM4CHEE archive uses this idea, for example in defining forwarding rules based on the contents of DICOM fields in incoming images.  It applies an XSLT template such that, based on some set of criteria within the DICOM headers, the output is (in XML form), a parameterised list of destinations to which that image should be forwarded.

So how can this be used as a User Access Protocol?  Easy.  For each data item which is to be subject to User Access Control, package up its own element values, and element values from the wider context which are relevant, as an XML stream.  Run that through an XSLT processor with a stylesheet whose output defines what access the current user has.

So, for example, the dataset for an ‘order’ object may include, from its own data fields as well as wider context, the following data:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="Order-Access.xsl"?>
<order>
  <attr tag="procedure">CT Head</attr>
  <attr tag="orderplacer">Jane Doe</attr>
  <attr tag="orderinstitution">St Elsewhere</attr>
  <attr tag="placeddatetime">2010-09-23T12:30-04:10</attr>
  <attr tag="hours-since-order">23</attr>
  <attr tag="currentuser">mpeacock</attr>
  <attr tag="userinstitution">not St Elsewhere</attr>
</order>

.. that can be processed via XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="no"/>
  <xsl:template match="/order">
    <xsl:variable name="orderinstitution" select="attr[@tag='orderinstitution']"/>
    <xsl:variable name="userinstitution" select="attr[@tag='userinstitution']"/>
    <xsl:variable name="hours-since-order" select="attr[@tag='hours-since-order']"/>
    <access>
      <xsl:choose>
        <xsl:when test="$hours-since-order &gt; 24">
          <read>True</read>
          <write>True</write>
          <delete>True</delete>
        </xsl:when>
        <xsl:otherwise>
          <xsl:if test="$userinstitution=$orderinstitution">
            <read>True</read>
            <write>True</write>
            <delete>True</delete>
          </xsl:if>
          <xsl:if test="$userinstitution!=$orderinstitution">
            <read>False</read>
            <write>False</write>
            <delete>False</delete>
          </xsl:if>
        </xsl:otherwise>
      </xsl:choose>
    </access>
  </xsl:template>
</xsl:stylesheet>

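…to form the output XML.  For the sample order above (23 hours old, viewed by a user from a different institution), the result is:

<?xml version="1.0" encoding="UTF-8"?>
<access>
  <read>False</read>
  <write>False</write>
  <delete>False</delete>
</access>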

So in this case, we can dynamically define quite complex access rules to restrict access to an order to the originating institution only, for the first 24 hours of its lifespan. Cool.

Note there is a bit of a kludge in the XML-packaged dataset.  I’ve put in a derived field (hours-since-order), which in itself requires some of the requirements to be anticipated in advance.  The main reason is that, since I’ve used DCM4CHEE as an example, DCM4CHEE straight out of the box is limited (by virtue of JBOSS, itself by virtue of Apache Xalan) to XSLT version 1.0.  The really useful date-processing functions are, alas, defined in XSLT version 2.0.  So the next logical stage, I guess, is to work out how to upgrade!
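
For what it is worth, here is a rough sketch of how the same rule might look once an XSLT 2.0 processor is available. It is untested against DCM4CHEE and assumes a 2.0-capable processor (Saxon is one) could be wired in. It drops the derived hours-since-order field and works from placeddatetime directly. It also assumes the timestamp carries seconds (e.g. 2010-09-23T12:30:00-04:10), since xs:dateTime will not accept the shorter form used in the sample dataset. The variable names placed, age, same-institution and allowed are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs" version="2.0">
  <xsl:output method="xml" indent="no"/>
  <xsl:template match="/order">
    <!-- Assumption: placeddatetime is a full xs:dateTime, seconds included -->
    <xsl:variable name="placed" select="xs:dateTime(attr[@tag='placeddatetime'])"/>
    <!-- Subtracting two xs:dateTime values yields an xs:dayTimeDuration -->
    <xsl:variable name="age" select="current-dateTime() - $placed"/>
    <xsl:variable name="same-institution"
                  select="attr[@tag='userinstitution'] = attr[@tag='orderinstitution']"/>
    <!-- Open access after 24 hours; before that, originating institution only -->
    <xsl:variable name="allowed"
                  select="if ($age gt xs:dayTimeDuration('PT24H') or $same-institution)
                          then 'True' else 'False'"/>
    <access>
      <read><xsl:value-of select="$allowed"/></read>
      <write><xsl:value-of select="$allowed"/></write>
      <delete><xsl:value-of select="$allowed"/></delete>
    </access>
  </xsl:template>
</xsl:stylesheet>
```

The point of the design is that the 24-hour window becomes a literal in the stylesheet (PT24H), so changing it no longer needs any change to the code that packages the dataset.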

DCM4CHEE Log file analysis with Chainsaw

Posted by Martin P in DCM4CHEE on Jul 25.  No comments.

DCM4CHEE is quite a complex beast, consisting of around 70 discrete services, on top of the many services delivered as part of the JBOSS application server DCM4CHEE relies upon.  Many of those services are multi-threaded, so even on a server with a modest workload there can be periods when a lot is going on, all of which is being logged to the central log file.

The log file is the first recourse when problems occur – or even if something happens that is simply unexpected.  It can, however, be quite a challenge to untangle the log messages originating from different services or threads, and that is made even worse if there are stack dumps throughout the file.  There is a way to make troubleshooting a little easier, thankfully.

Apache Chainsaw is a Java GUI application that accepts input from the log4j logging framework that DCM4CHEE uses, and offers a couple of features which make the log output easier to manage:

  • A single entry for each event.  While in the standard log file, events may run to several lines (and stack dumps run to several pages), Chainsaw displays each event as a single line with drill-down to more details if appropriate.
  • Potentially sophisticated filtering abilities which allow (for example), display of events originating from a specific thread.

Installation

Installation couldn’t be easier – given an appropriate Java runtime is in place.  On my Ubuntu laptop, unzipping the ‘standalone’ package and running chainsaw.sh was all that was needed.  There is a chainsaw.bat for Windows users and a .dmg easy-to-install package for Mac users.

Log File Configuration

The default configuration that comes with DCM4CHEE doesn’t work with Chainsaw, though.  We need to tweak it just a little.  In the DCM4CHEE configuration file server/default/conf/jboss-log4j.xml, below the lines

<appender name="FILE" class="org.jboss.logging.appender.RollingFileAppender">
……
……
</appender>

.. add in the following lines:

<appender name="XML" class="org.jboss.logging.appender.RollingFileAppender">
  <errorHandler class="org.jboss.logging.util.OnlyOnceErrorHandler"/>
  <param name="File" value="${jboss.server.log.dir}/server.xml"/>
  <param name="Append" value="false"/>
  <param name="MaxFileSize" value="10000KB"/>
  <param name="MaxBackupIndex" value="1"/>

  <layout class="org.apache.log4j.xml.XMLLayout"/>

</appender>

.. which will result in a second log file being produced.  The normal ‘server.log’ will be untouched, but you will also have a server.xml which can be opened in Chainsaw.  The XML formatting is less efficient than the standard formatting – so you won’t get as much in the way of time coverage in the same 10MB space but that can be adjusted if necessary.
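
If server.xml does not appear after a restart, it is worth checking that the new appender is actually attached to a logger, since log4j only sends events to appenders that a logger references. A minimal sketch, assuming the root element in your jboss-log4j.xml looks like the stock JBoss one; the CONSOLE and FILE references are an assumption about your file, and only the last appender-ref is new:

```xml
<root>
  <appender-ref ref="CONSOLE"/>
  <appender-ref ref="FILE"/>
  <!-- attach the XML appender defined above -->
  <appender-ref ref="XML"/>
</root>
```

The ref value has to match the name attribute given to the appender, "XML" in this case.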

Once in Chainsaw, the events can be filtered using the ‘Refine focus’ field by adding in rules against which any of the event fields can be tested.  Even easier, any of the event fields displayed in the main window has a right-click context menu, which includes ‘Add to refine focus field’.  Picking this re-filters the display according to the value in that field (e.g. a specific Thread ID).

A new book for MySQL, and a toolkit for MySQL AND PostgreSQL

Posted by Martin P in DCM4CHEE, High Availability on Jul 14.  No comments.

So I came across a new book on MySQL replication, which looks pretty useful although a little expensive. The author biographies are particularly impressive – these are the guys who really should know about replication.  One of those is the architect of MySQL’s row-based replication which set me thinking.

A while back I was implementing replication on a couple of DCM4CHEE servers and noticed that the master and slave became increasingly out of sync.  At the time, the only option in MySQL was statement-based replication and I attributed (with no real evidence) the issue to known limitations of that form of replication.  In any case – the alternative (DRBD) turned out to be quite adequate and that is the strategy I’ve used since.

However, more recently (November 2008), MySQL 5.1 included row-based replication and I never went back to review the strategy.  Seeing this new book prompted me to spend a little time checking what the state of MySQL row-based replication is.

To cut a long story short, I ended up discovering Maatkit – “power tools for open source databases”, which, while written primarily with MySQL in mind, supports other open source databases (like PostgreSQL for example) and is certainly going to join the rest of my arsenal.  The first use will be mk-table-checksum – designed to allow comparison of master and slave to ensure replication consistency.

I’ll try out row-based replication and, with a new toolset, can be comfortable that it works the way it should (or indeed, not).

BTW the O’Reilly site has a sample chapter which describes nicely the process of setting up simple replication – worth a look if you haven’t tried it before.

Good advice for PACS administrators

Posted by Martin P in DCM4CHEE, PACS General on Jun 30.  No comments.

Via AuntMinnie.

Some highlights:

Kennedy recommended that institutions consider a business continuity system as an alternate mechanism to maintaining essential functions during downtime. This can be as simple as deploying a small public domain miniPACS, or as sophisticated as using a fully redundant primary PACS network, according to Kennedy.

DCM4CHEE is good for lots of things – including just that!

As a result, multiple methods of communication should be utilized, including phone calls and even posters. “You cannot overcommunicate,” he said. Also, anticipate misinformation and manage around it, he added.

I.e. Chinese whispers (at best).

“One of the CT applications people I worked with many years ago gave me a wonderful piece of advice, and that’s to always carry a stopwatch,” Kennedy said. “I do that still, because sometimes it’s very hard to convince [a person who thinks it takes 35 seconds to load a CT scan]. But the stopwatch says five [seconds], and people will believe stopwatches.”

I’ve never resorted to carrying a stopwatch but the principle works for lots of user reports, with “the system is down!” or “the system can’t do …..” as examples.  Never assume end users have the same ability to distinguish between two elements that you do.  Always be prepared to move around to the user’s perspective and translate.

There are things that cost a million dollars not that many years ago that now can be done for 1/100th or 1/1000th of that price. Many of our PACS vendors haven’t really leveraged that, …

Jeez I’ve been saying that for years.  For additional archive storage to be costing the same as 5 years ago is criminal.  Folk still pay, though.  Like voting, people get the government they deserve.

When it’s time for a new PACS, even more issues come up, including PACS-to-PACS migration issues and the debate of whether to consider a PACS with a vendor-neutral archive.

There’s no such thing as a vendor-neutral archive, only vendor-different (except, of course, for fully Open Source solutions).  Let’s take the Carestream product as an example. If Carestream goes belly-up or decides to drop the product, would you get support and further development elsewhere?  No. I thought not.

Update: If relying on user reports to determine performance problems isn’t always a good idea, basing a study on them mightn’t be either. I dare say it may well have the right conclusion but I’d worry about the quality of the raw data.

New release for DCM4CHEE: 2.14.8

Posted by Martin P in DCM4CHEE on Mar 22.  No comments.

So we now see the release of DCM4CHEE 2.14.8.  While in terms of version number it is a minor update from 2.14.7, in many ways it contains some major feature additions.  As one might expect, there are a number of bug fixes and background changes, but there are a good handful of changes that may well have positive impact on installations:

  • DCMEE-1343   Improvement of tolerance of illegally/inappropriately encoded DICOM datasets (yes, even today, it still happens!).
  • DCMEE-1292   Improved timer handling for periodic tasks.
  • DCMEE-1340   Sync the filesystem before sending C-STORE-RSP.  A success response is not sent until the filesystem actually writes to disk to mitigate against a server crash while data is in filesystem cache.
  • DCMEE-1373   Improved performance in XSLT processing.

But along with these are two very significant elements:

  • DCMEE-1356   Storage of lossy compressed images on a different FS group to the original image.  This is significant in two ways.  Firstly, while DCM4CHEE has to date supported lossy compression on previously compressed images, it has not actually compressed images itself.  Secondly, and more importantly, is how this could potentially be used.  I haven’t played with this myself just yet (I will), but for the moment I read this as follows.  Browser-based delivery of images has proved to be tough in one important respect: Window/Levelling.  Just about any scheme for getting a browser-based application to window/level images requires some aspect of server-side processing, and image resizing is an important element of that.  It is of course possible for images to be resized from the full quality image each time, but for browser delivery that is a massive waste of server resources, and makes performance a real issue.  Having a lossy compressed version of the image available makes this scenario so much more viable.  More on this after I have had a chance to work through the new feature.
  • DCMEE-1358   Migrate from JBOSS MQ to JBOSS Messaging.  This is an important upgrade.  JBOSS MQ was deprecated in favour of Messaging some years ago, and ongoing development was discontinued not long after.  As a result, MQ is severely limited in many ways, including message instance management.  The DCM4CHEE forums have a number of threads around monitoring and management of (for instance) the Forwarding Message Queue.  MQ has never had the capability to do this with any great effectiveness and the move to Messaging can only improve the situation.  Again, more of this later.

Add High Availability to your PACS on a shoestring: Part II

Posted by Martin P in DCM4CHEE, High Availability, Infrastructure, PACS General on Oct 19.  3 comments.

Getting the Hardware

Previous entry in this series: Introduction and Architecture.

To do this, we need 4 boxes.  We need two servers and two NAS storage arrays.  Actually, we’ll need at least 5 – since we’ll need some kind of network infrastructure to connect them, but I’m going to consider that outside the scope of these articles.  There will be a need for at least 4 network ports (see discussion on servers, below).  There are some wrinkles anyone not familiar with a datacentre environment might not think of, so I’ll try to spell them out.

The title of this series suggests that High Availability can be achieved on a shoestring.  I believe that to be true, but there is something that should not ever be skimped on – the hardware.  All of the software involved in this solution is free but hardware is not.  Putting this solution in place presents a potential point of failure for your workflow, so you should pony up, and get the best hardware that you can afford.


Add High Availability to your PACS on a shoestring: Part I

Posted by Martin P in DCM4CHEE, High Availability, Infrastructure, PACS General on Sep 21.  3 comments.

Introduction and Architecture

In most clinical organisations, PACS is a critical part of the operational workflow.  RIS is important but PACS must stand firm even when all around it have failed.  In environments which include Emergency, Intensive Care and Theatre (to name a few), the non-availability of PACS can be a danger to patient safety.

Most vendors, however, cost HA not just as an extra, but as a luxurious extra, adding a significant percentage to the total cost of a system.  It needn’t be that way.  HA is a well understood practice and the tools and materials needed can be easily worked into a standalone package that can provide HA for almost any PACS (see section on limitations, below).

I’m going to describe a solution for HA based on free, open source software (note: NOT ‘freeware’.  Know the difference).  All of the software in this solution is mature, proven software used by major organisations the world over.  Where there are choices I have selected software components that have full support options available, just in case you need a little more assurance.


Background: What is Open Source Software

Posted by Martin P in DCM4CHEE on Apr 10.  1 comment.

First installment in a user manual for DCM4CHEE.

DCM4CHEE is “Open Source Software”. It is important to have an appreciation for what that means.  In particular it is not the same as “freeware”. Freeware is software which, while being provided for no monetary charge, remains as proprietary and restricted as software for which considerable money changes hands.

DCM4CHEE is also “Free software”, which equally, is very much NOT freeware.  The differences between “Open Source” and “Free” software are in subtle and generally irrelevant details within the licenses under which software is published.  It can be said that the term “Open Source” (as defined by the Open Source Initiative) is a slightly less dogmatic perspective than “Free software” (as defined by the Free Software Foundation) on the vision of freedom in software:

From fsf.org:

Free software is a matter of liberty, not price. To understand the concept, you should think of free as in free speech, not as in free beer.

Free software is a matter of the users’ freedom to run, copy, distribute, study, change and improve the software. More precisely, it refers to four kinds of freedom, for the users of the software:

  • The freedom to run the program, for any purpose (freedom 0).
  • The freedom to study how the program works, and adapt it to your needs (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to improve the program, and release your improvements (and modified versions in general) to the public, so that the whole community benefits (freedom 3). Access to the source code is a precondition for this.

A program is free software if users have all of these freedoms. Thus, you should be free to redistribute copies, either with or without modifications, either gratis or charging a fee for distribution, to anyone anywhere. Being free to do these things means (among other things) that you do not have to ask or pay for permission.

That does not mean these freedoms extend to blind freedom.  As in “free speech”, Free Software entails responsibilities to go alongside the freedoms.  In the case of both “Free” and “Open Source” software, those responsibilities are enforced by the licenses under which the software is published.

DCM4CHEE is published under 3 different licenses – you can pick one according to your needs, although all 3 enforce the freedoms as above and some basic responsibilities:

  • If you redistribute the software you must:
    • Make the source code available.
    • Make any improvements you have made to the software available to the community.
    • Publish the software under a license which protects the same freedoms and enforces the same responsibilities (generally, THE same license under which you received it).

It is important also to appreciate that intellectual property and copyright do not cease to apply.  Unless specifically transferred, the copyright remains with the original owner – it is simply that the owner is licensing the use/distribution of the software given the associated responsibilities.

It has been said that free software is developed by ‘by definition, unpaid amateurs’. This could not be further from the truth. If we take the example of Linux, free software that has had an enormous impact, it has been estimated that at least 70% of the contributions are from folk who are paid specifically to do so.  It is also true that a very large percentage of the remainder are contributors who are nonetheless professional developers but have other motivations.  Some may be motivated by simple benevolence, some by ego, and some by the opportunity for self-promotion and peer recognition.

In this way, DCM4CHEE is not unlike many other Open Source projects: many (all?) of the contributors to the code, the documentation and the support forums are software professionals with specific (if not always financial) motivation.

The benefits of developing software in such an open environment are many, but can be summarised as:

  • Community. While proprietary software has long maintained ‘user groups’ which can support each other, a successful Open Source community is far more powerful.  Between support forums, documentation, bug reports and code contribution, a community has the power to offer far more than a proprietary vendor can.  Very often this also feeds through to the quality of the software itself.
  • Value re-use. Although the idea was first coined in the 12th century, Isaac Newton phrased it succinctly:

If I have seen a little further it is by standing on the shoulders of Giants.

..although I understand his was in Latin.  The Open Source development model facilitates the reuse of work done elsewhere without complex and debilitating cross-vendor arrangements.  DCM4CHEE is built on JBOSS (OSS), which in turn is built on JAVA (admittedly only recently OSS), and which uses TOMCAT (OSS) as a web server.  In turn, other OSS projects (e.g. Mirth, amongst others) use DCM4CHE for DICOM functionality.

  • Transparency.  When the code to software is not made available, claims of specific quality factors, or adherence to standards, must be taken at face value.  Not so with Open Source Software.  Such claims can be tested, increasing users’ confidence in the software.
  • Flexibility.  It is a common enough response to the common enough request of a vendor:

Request: Can we have feature XYZ added to this software?

Response: We’ll submit your request to product management. If it is approved it may make the backlog for version Current+2.

With Open Source software, a feature request may be considered sympathetically or not, but a user always has the option of developing (or having developed on her behalf) the feature herself.

Next up: DCM4CHE and Background: XML / XSL / XSLT